Plugins now require explicit consent to load. Discovery still finds every
plugin — user-installed, bundled, and pip — so they all show up in
`hermes plugins` and `/plugins`, but the loader only instantiates
plugins whose name appears in `plugins.enabled` in config.yaml. This
removes the previous ambient-execution risk where a newly-installed or
bundled plugin could register hooks, tools, and commands on first run
without the user opting in.
The three-state model is now explicit:
enabled — in plugins.enabled, loads on next session
disabled — in plugins.disabled, never loads (wins over enabled)
not enabled — discovered but never opted in (default for new installs)
`hermes plugins install <repo>` prompts "Enable 'name' now? [y/N]"
(defaults to no). New `--enable` / `--no-enable` flags skip the prompt
for scripted installs. `hermes plugins enable/disable` manage both lists
so a disabled plugin stays explicitly off even if something later adds
it to enabled.
Config migration (schema v20 → v21): existing user plugins already
installed under ~/.hermes/plugins/ (minus anything in plugins.disabled)
are auto-grandfathered into plugins.enabled so upgrades don't silently
break working setups. Bundled plugins are NOT grandfathered — even
existing users have to opt in explicitly.
Also: HERMES_DISABLE_BUNDLED_PLUGINS env var removed (redundant with
opt-in default), cmd_list now shows bundled + user plugins together with
their three-state status, interactive UI tags bundled entries
[bundled], docs updated across plugins.md and built-in-plugins.md.
Validation: 442 plugin/config tests pass. E2E: fresh install discovers
disk-cleanup but does not load it; `hermes plugins enable disk-cleanup`
activates hooks; migration grandfathers existing user plugins correctly
while leaving bundled plugins off.
The original name was cute but non-obvious; disk-cleanup says what it
does. Plugin directory, script, state path, log lines, slash command,
and test module all renamed. No user-visible state exists yet, so no
migration path is needed.
New website page "Built-in Plugins" documents the <repo>/plugins/<name>/
source, how discovery interacts with user/project plugins, the
HERMES_DISABLE_BUNDLED_PLUGINS escape hatch, disk-cleanup's hook
behaviour and deletion rules, and guidance on when a plugin belongs
bundled vs. user-installable. Added to the Features → Core sidebar next
to the main Plugins page, with a cross-reference from plugins.md.
* fix(docs): unbreak ascii-guard lint on github-pr-review-agent diagram
The intro diagram used 4 side-by-side boxes in one row. ascii-guard can't
parse that layout — it reads the whole thing as one 80-wide outer box and
flags the inner box borders at columns 17/39/60 as 'extra characters after
right border'. Per the ascii-guard-lint-fixing skill, the only fix is to
merge into a single outer box.
Rewritten as one 69-char outer box with four labeled regions separated by
arrows. Same semantic content, lint-clean.
Was blocking docs-site-checks CI as 'action_required' across multiple PRs
(see e.g. run 24661820677).
* fix(docs): backtick-wrap `<1%` to avoid MDX JSX parse error
Docusaurus MDX parses `<1%` as the start of a JSX tag, but `1` isn't a
valid tag-name start so compilation fails with 'Unexpected character `1`
(U+0031) before name'. Wrap in backticks so MDX treats it as literal code
text.
Found by running Build Docusaurus step on the PR that unblocked the
ascii-guard step; full docs tree scanned for other `<digit>` patterns
outside backticks/fences, only this one was unsafe.
OpenAI-compatible clients (Open WebUI, LobeChat, etc.) can now send vision
requests to the API server. Both endpoints accept the canonical OpenAI
multimodal shape:
Chat Completions: {type: text|image_url, image_url: {url, detail?}}
Responses: {type: input_text|input_image, image_url: <str>, detail?}
The server validates and converts both into a single internal shape that the
existing agent pipeline already handles (Anthropic adapter converts,
OpenAI-wire providers pass through). Remote http(s) URLs and data:image/*
URLs are supported.
Uploaded files (file, input_file, file_id) and non-image data: URLs are
rejected with 400 unsupported_content_type.
Changes:
- gateway/platforms/api_server.py
- _normalize_multimodal_content(): validates + normalizes both Chat and
Responses content shapes. Returns a plain string for text-only content
(preserves prompt-cache behavior on existing callers) or a canonical
[{type:text|image_url,...}] list when images are present.
- _content_has_visible_payload(): replaces the bare truthy check so a
user turn with only an image no longer rejects as 'No user message'.
- _handle_chat_completions and _handle_responses both call the new helper
for user/assistant content; system messages continue to flatten to text.
- Codex conversation_history, input[], and inline history paths all share
the same validator. No duplicated normalizers.
- run_agent.py
- _summarize_user_message_for_log(): produces a short string summary
('[1 image] describe this') from list content for logging, spinner
previews, and trajectory writes. Fixes AttributeError when list
user_message hit user_message[:80] + '...' / .replace().
- _chat_content_to_responses_parts(): module-level helper that converts
chat-style multimodal content to Responses 'input_text'/'input_image'
parts. Used in _chat_messages_to_responses_input for Codex routing.
- _preflight_codex_input_items() now validates and passes through list
content parts for user/assistant messages instead of stringifying.
- tests/gateway/test_api_server_multimodal.py (new, 38 tests)
- Unit coverage for _normalize_multimodal_content, including both part
formats, data URL gating, and all reject paths.
- Real aiohttp HTTP integration on /v1/chat/completions and /v1/responses
verifying multimodal payloads reach _run_agent intact.
- 400 coverage for file / input_file / non-image data URL.
- tests/run_agent/test_run_agent_multimodal_prologue.py (new)
- Regression coverage for the prologue no-crash contract.
- _chat_content_to_responses_parts round-trip coverage.
- website/docs/user-guide/features/api-server.md
- Inline image examples for both endpoints.
- Updated Limitations: files still unsupported, images now supported.
Validated live against openrouter/anthropic/claude-opus-4.6:
POST /v1/chat/completions → 200, vision-accurate description
POST /v1/responses → 200, same image, clean output_text
POST /v1/chat/completions [file] → 400 unsupported_content_type
POST /v1/responses [input_file] → 400 unsupported_content_type
POST /v1/responses [non-image data URL] → 400 unsupported_content_type
Closes#5621, #8253, #4046, #6632.
Co-authored-by: Paul Bergeron <paul@gamma.app>
Co-authored-by: zhangxicen <zhangxicen@example.com>
Co-authored-by: Manuel Schipper <manuelschipper@users.noreply.github.com>
Co-authored-by: pradeep7127 <pradeep7127@users.noreply.github.com>
Replaces the permanent "OK" receipt reaction with a 3-phase visual
lifecycle:
- Typing animation appears when the agent starts processing.
- Cleared when processing succeeds — the reply message is the signal.
- Replaced with CrossMark when processing fails.
- Cleared when processing is cancelled or interrupted.
When Feishu rejects the reaction-delete call, we keep the Typing in
place and skip adding CrossMark. Showing both at once would leave the
user seeing both "still working" and "done/failed" simultaneously,
which is worse than a stuck Typing.
A FEISHU_REACTIONS env var (default on) disables the whole lifecycle.
User-added reactions with the same emoji still route through to the
agent; only bot-origin reactions are filtered to break the feedback
loop.
Change-Id: I527081da31f0f9d59b451f45de59df4ddab522ba
The example-skin.yaml was removed as part of the stale docs cleanup.
Docusaurus features/skins.md covers the same material.
Also update AUTHOR_MAP for balyan.sid@gmail.com → alt-glitch (actual
GitHub login; balyansid returns 404).
Smart model routing (auto-routing short/simple turns to a cheap model
across providers) was opt-in and disabled by default. This removes the
feature wholesale: the routing module, its config keys, docs, tests, and
the orchestration scaffolding it required in cli.py / gateway/run.py /
cron/scheduler.py.
The /fast (Priority Processing / Anthropic fast mode) feature kept its
hooks into _resolve_turn_agent_config — those still build a route dict
and attach request_overrides when the model supports it; the route now
just always uses the session's primary model/provider rather than
running prompts through choose_cheap_model_route() first.
Also removed:
- DEFAULT_CONFIG['smart_model_routing'] block and matching commented-out
example sections in hermes_cli/config.py and cli-config.yaml.example
- _load_smart_model_routing() / self._smart_model_routing on GatewayRunner
- self._smart_model_routing / self._active_agent_route_signature on
HermesCLI (signature kept; just no longer initialised through the
smart-routing pipeline)
- route_label parameter on HermesCLI._init_agent (only set by smart
routing; never read elsewhere)
- 'Smart Model Routing' section in website/docs/integrations/providers.md
- tip in hermes_cli/tips.py
- entries in hermes_cli/dump.py + hermes_cli/web_server.py
- row in skills/autonomous-ai-agents/hermes-agent/SKILL.md
Tests:
- Deleted tests/agent/test_smart_model_routing.py
- Rewrote tests/agent/test_credential_pool_routing.py to target the
simplified _resolve_turn_agent_config directly (preserves credential
pool propagation + 429 rotation coverage)
- Dropped 'cheap model' test from test_cli_provider_resolution.py
- Dropped resolve_turn_route patches from cli + gateway test_fast_command
— they now exercise the real method end-to-end
- Removed _smart_model_routing stub assignments from gateway/cron test
helpers
Targeted suites: 74/74 in the directly affected test files;
tests/agent + tests/cron + tests/cli pass except 5 failures that
already exist on main (cron silent-delivery + alias quick-command).
Follow-up to 93fe4b35. The behavior (free-response channels bypass
auto-threading so the channel stays a lightweight inline chat) was
intentional but never documented, causing user confusion ("is this a
bug?" reports).
Adds one line to the behavior table, one paragraph under
discord.free_response_channels, and a cross-reference under
discord.auto_thread.
AWS Bedrock paths (bedrock_converse + AnthropicBedrock SDK) use boto3
with its own timeout config and are not wired to the per-provider knob.
Documented in cli-config.yaml.example and website configuration.md so
users don't expect it to take effect there.
Live test with timeout_seconds: 0.5 on claude-sonnet-4.6 proved the
initial wiring was insufficient: run_agent.py was overriding the
client-level timeout on every call via hardcoded per-request kwargs.
Root cause: run_agent.py had two sites that pass an explicit timeout=
kwarg into chat.completions.create() — api_kwargs['timeout'] at line
7075 (HERMES_API_TIMEOUT=1800s default) and the streaming path's
_httpx.Timeout(..., read=HERMES_STREAM_READ_TIMEOUT=120s, ...) at line
5760. Both override the per-provider config value the client was
constructed with, so a 0.5s config timeout would silently not enforce.
This commit:
- Adds AIAgent._resolved_api_call_timeout() — config > HERMES_API_TIMEOUT env > 1800s default.
- Uses it for the non-streaming api_kwargs['timeout'] field.
- Uses it for the streaming path's httpx.Timeout(connect, read, write, pool)
so both connect and read respect the configured value when set.
Local-provider auto-bump (Ollama/vLLM cold-start) only applies when
no explicit config value is set.
- New test: test_resolved_api_call_timeout_priority covers all three
precedence cases (config, env, default).
Live verified: 0.5s config on claude-sonnet-4.6 now triggers
APITimeoutError at ~3s per retry, exhausts 3 retries in ~15s total
(was: 29-47s success with timeout ignored). Positive case (60s config
+ gpt-4o-mini) still succeeds at 1.3s.
Follow-up on top of mvanhorn's cherry-picked commit. Original PR only
wired request_timeout_seconds into the explicit-creds OpenAI branch at
run_agent.py init; router-based implicit auth, native Anthropic, and the
fallback chain were still hardcoded to SDK defaults.
- agent/anthropic_adapter.py: build_anthropic_client() accepts an optional
timeout kwarg (default 900s preserved when unset/invalid).
- run_agent.py: resolve per-provider/per-model timeout once at init; apply
to Anthropic native init + post-refresh rebuild + stale/interrupt
rebuilds + switch_model + _restore_primary_runtime + the OpenAI
implicit-auth path + _try_activate_fallback (with immediate client
rebuild so the first fallback request carries the configured timeout).
- tests: cover anthropic adapter kwarg honoring; widen mock signatures
to accept the new timeout kwarg.
- docs/example: clarify that the knob now applies to every transport,
the fallback chain, and rebuilds after credential rotation.
Adds optional providers.<id>.request_timeout_seconds and
providers.<id>.models.<model>.timeout_seconds config, resolved via a new
hermes_cli/timeouts.py helper and applied where client_kwargs is built
in run_agent.py. Zero default behavior change: when both keys are unset,
the openai SDK default takes over.
Mirrors the existing _get_task_timeout pattern in agent/auxiliary_client.py
for auxiliary tasks - the primary turn path just never got the equivalent
knob.
Cross-project demand: openclaw/openclaw#43946 (17 reactions) asks for
exactly this config - specifically calls out Ollama cold-start hanging
the client.
Adds two complementary GitHub PR review guides from contest submissions:
- Cron-based PR review agent (from PR #5836 by @dieutx) — polls on a
schedule, no server needed, teaches skills + memory authoring
- Webhook-based PR review (from PR #6503 by @gaijinkush) — real-time via
GitHub webhooks, documents previously undocumented webhook feature
Both guides are cross-linked so users can pick the approach that fits.
Reworks quickstart.md by integrating the best content from PR #5744
by @aidil2105:
- Opinionated decision table ('The fastest path')
- Common failure modes table with causes and fixes
- Recovery toolkit sequence
- Session lifecycle verification step
- Better first-chat guidance with example prompts
Slims down installation.md:
- Removes 10-step manual/dev install section (already covered in
developer-guide/contributing.md)
- Links to Contributing guide for dev setup
- Keeps focused on the automated installer + prerequisites + troubleshooting
find-nearby and the (new) maps optional skill both used OpenStreetMap's
Overpass + Nominatim to answer the same question — 'what's near this
location?' — so shipping both would be duplicate code for overlapping
capability. Consolidate into one active-by-default skill at
skills/productivity/maps/ that is a strict superset of find-nearby.
Moves + deletions:
- optional-skills/productivity/maps/ → skills/productivity/maps/ (active,
no install step needed)
- skills/leisure/find-nearby/ → DELETED (fully superseded)
Upgrades to maps_client.py so it covers everything find-nearby did:
- Overpass server failover — tries overpass-api.de then
overpass.kumi.systems so a single-mirror outage doesn't break the skill
(new overpass_query helper, used by both nearby and bbox)
- nearby now accepts --near "<address>" as a shortcut that auto-geocodes,
so one command replaces the old 'search → copy coords → nearby' chain
- nearby now accepts --category (repeatable) for multi-type queries in
one call (e.g. --category restaurant --category bar), results merged
and deduped by (osm_type, osm_id), sorted by distance, capped at --limit
- Each nearby result now includes maps_url (clickable Google Maps search
link) and directions_url (Google Maps directions from the search point
— only when a ref point is known)
- Promoted commonly-useful OSM tags to top-level fields on each result:
cuisine, hours (opening_hours), phone, website — instead of forcing
callers to dig into the raw tags dict
SKILL.md:
- Version bumped 1.1.0 → 1.2.0, description rewritten to lead with
capability surface
- New 'Working With Telegram Location Pins' section replacing
find-nearby's equivalent workflow
- metadata.hermes.supersedes: [find-nearby] so tooling can flag any
lingering references to the old skill
External references updated:
- optional-skills/productivity/telephony/SKILL.md — related_skills
find-nearby → maps
- website/docs/reference/skills-catalog.md — removed the (now-empty)
'leisure' section, added 'maps' row under productivity
- website/docs/user-guide/features/cron.md — find-nearby example
usages swapped to maps
- tests/tools/test_cronjob_tools.py, tests/hermes_cli/test_cron.py,
tests/cron/test_scheduler.py — fixture string values swapped
- cli.py:5290 — /cron help-hint example swapped
Not touched:
- RELEASE_v0.2.0.md — historical record, left intact
E2E-verified live (Nominatim + Overpass, one query each):
- nearby --near "Times Square" --category restaurant --category bar → 3 results,
sorted by distance, all with maps_url, directions_url, cuisine, phone, website
where OSM had the tags
All 111 targeted tests pass across tests/cron/, tests/tools/, tests/hermes_cli/.
External services can now push plain-text notifications to a user's chat
via the webhook adapter without invoking the agent. Set deliver_only=true
on a route and the rendered prompt template becomes the literal message
body — dispatched directly to the configured target (Telegram, Discord,
Slack, GitHub PR comment, etc.).
Reuses all existing webhook infrastructure: HMAC-SHA256 signature
validation, per-route rate limiting, idempotency cache, body-size limits,
template rendering with dot-notation, home-channel fallback. No new HTTP
server, no new auth scheme, no new port.
Use cases: Supabase/Firebase webhooks → user notifications, monitoring
alert forwarding, inter-agent pings, background job completion alerts.
Changes:
- gateway/platforms/webhook.py: new _direct_deliver() helper + early
dispatch branch in _handle_webhook when deliver_only=true. Startup
validation rejects deliver_only with deliver=log.
- hermes_cli/main.py + hermes_cli/webhook.go: --deliver-only flag on
subscribe; list/show output marks direct-delivery routes.
- website/docs/user-guide/messaging/webhooks.md: new Direct Delivery
Mode section with config example, CLI example, response codes.
- skills/devops/webhook-subscriptions/SKILL.md: document --deliver-only
with use cases (bumped to v1.1.0).
- tests/gateway/test_webhook_deliver_only.py: 14 new tests covering
agent bypass, template rendering, status codes, HMAC still enforced,
idempotency still applies, rate limit still applies, startup
validation, and direct-deliver dispatch.
Validation: 78 webhook tests pass (64 existing + 14 new). E2E verified
with real aiohttp server + real urllib POST — agent not invoked, target
adapter.send() called with rendered template, duplicate delivery_id
suppressed.
Closes the gap identified in PR #12117 (thanks to @H1an1 / Antenna team)
without adding a second HTTP ingress server.
Agents can now send arbitrary CDP commands to the browser. The tool is
gated on a reachable CDP endpoint at session start — it only appears in
the toolset when BROWSER_CDP_URL is set (from '/browser connect') or
'browser.cdp_url' is configured in config.yaml. Backends that don't
currently expose CDP to the Python side (Camofox, default local
agent-browser, cloud providers whose per-session cdp_url is not yet
surfaced) do not see the tool at all.
Tool schema description links to the CDP method reference at
https://chromedevtools.github.io/devtools-protocol/ so the agent can
web_extract specific method docs on demand.
Stateless per call. Browser-level methods (Target.*, Browser.*,
Storage.*) omit target_id. Page-level methods attach to the target
with flatten=true and dispatch the method on the returned sessionId.
Clean errors when the endpoint becomes unreachable mid-session or
the URL isn't a WebSocket.
Tests: 19 unit (mock CDP server + gate checks) + E2E against real
headless Chrome (Target.getTargets, Browser.getVersion,
Runtime.evaluate with target_id, Page.navigate + re-eval, bogus
method, bogus target_id, missing endpoint) + E2E of the check_fn
gate (tool hidden without CDP URL, visible with it, hidden again
after unset).
Setup wizard now always writes dialecticCadence=2 on new configs and
surfaces the reasoning level as an explicit step with all five options
(minimal / low / medium / high / max), always writing
dialecticReasoningLevel.
Code keeps a backwards-compat fallback of 1 when dialecticCadence is
unset so existing honcho.json configs that predate the setting keep
firing every turn on upgrade. New setups via the wizard get 2
explicitly; docs show 2 as the default.
Also scrubs editorial lines from code and docs ("max is reserved for
explicit tool-path selection", "Unset → every turn; wizard pre-fills 2",
and similar process-exposing phrasing) and adds an inline link to
app.honcho.dev where the server-side observation sync is mentioned in
honcho.md. Recommended cadence range updated to 1-5 across docs and
wizard copy.
- cli: setup wizard pre-fills dialecticCadence=2 (code default stays 1
so unset → every turn)
- honcho.md: fix stale dialecticCadence default in tables, add
Session-Start Prewarm subsection (depth runs at init), add
Query-Adaptive Reasoning Level subsection, expand Observation
section with directional vs unified semantics and per-peer patterns
- memory-providers.md: fix stale default, rename Multi-agent/Profiles
to Multi-peer setup, add concrete walkthrough for new profiles and
sync, document observation toggles + presets, link to honcho.md
- SKILL.md: fix stale defaults, add Depth at session start callout
- Revert website/docs and SKILL.md changes; docs unification handled separately
- Scrub commit/PR refs and process narration from code comments and test
docstrings (no behavior change)
Several correctness and cost-safety fixes to the Honcho dialectic path
after a multi-turn investigation surfaced a chain of silent failures:
- dialecticCadence default flipped 3 → 1. PR #10619 changed this from 1 to
3 for cost, but existing installs with no explicit config silently went
from per-turn dialectic to every-3-turns on upgrade. Restores pre-#10619
behavior; 3+ remains available for cost-conscious setups. Docs + wizard
+ status output updated to match.
- Session-start prewarm now consumed. Previously fired a .chat() on init
whose result landed in HonchoSessionManager._dialectic_cache and was
never read — pop_dialectic_result had zero call sites. Turn 1 paid for
a duplicate synchronous dialectic. Prewarm now writes directly to the
plugin's _prefetch_result via _prefetch_lock so turn 1 consumes it with
no extra call.
- Prewarm is now dialecticDepth-aware. A single-pass prewarm can return
weak output on cold peers; the multi-pass audit/reconcile cycle is
exactly the case dialecticDepth was built for. Prewarm now runs the
full configured depth in the background.
- Silent dialectic failure no longer burns the cadence window.
_last_dialectic_turn now advances only when the result is non-empty.
Empty result → next eligible turn retries immediately instead of
waiting the full cadence gap.
- Thread pile-up guard. queue_prefetch skips when a prior dialectic
thread is still in-flight, preventing stacked races on _prefetch_result.
- First-turn sync timeout is recoverable. Previously on timeout the
background thread's result was stored in a dead local list. Now the
thread writes into _prefetch_result under lock so the next turn
picks it up.
- Cadence gate applies uniformly. At cadence=1 the old "cadence > 1"
guard let first-turn sync + same-turn queue_prefetch both fire.
Gate now always applies.
- Restored query-length reasoning-level scaling, dropped in 9a0ab34c.
Scales dialecticReasoningLevel up on longer queries (+1 at ≥120 chars,
+2 at ≥400), clamped at reasoningLevelCap. Two new config keys:
`reasoningHeuristic` (bool, default true) and `reasoningLevelCap`
(string, default "high"; previously parsed but never enforced).
Respects dialecticDepthLevels and proportional lighter-early passes.
- Restored short-prompt skip, dropped in ef7f3156. One-word
acknowledgements ("ok", "y", "thanks") and slash commands bypass
both injection and dialectic fire.
- Purged dead code in session.py: prefetch_dialectic, _dialectic_cache,
set_dialectic_result, pop_dialectic_result — all unused after prewarm
refactor.
Tests: 542 passed across honcho_plugin/, agent/test_memory_provider.py,
and run_agent/test_run_agent.py. New coverage:
- TestTrivialPromptHeuristic (classifier + prefetch/queue skip)
- TestDialecticCadenceAdvancesOnSuccess (empty-result retry, pile-up guard)
- TestSessionStartDialecticPrewarm (prewarm consumed, sync fallback)
- TestReasoningHeuristic (length bumps, cap clamp, interaction with depth)
- TestDialecticLifecycleSmoke (end-to-end 8-turn session walk)
- Remove orphan skills/creative/touchdesigner/references/pitfalls.md
left over from the rename commit (git add-then-edit instead of git mv
meant the old file never got deleted).
- Honour $HERMES_HOME in setup.sh and SKILL.md setup invocation so
profile-aware installs work correctly.
- Fix troubleshooting.md config path to use $HERMES_HOME instead of
hardcoding ~/.hermes/.
- Add touchdesigner-mcp entries to skills-catalog.md and
optional-skills-catalog.md for parity with blender-mcp/meme-generation.
Swap the social-media/xitter skill (third-party wrapper around
Infatoshi/x-cli) for a new social-media/xurl skill wrapping
xdevplatform/xurl — the official X API CLI from the X developer
platform team.
Why:
- xurl is officially maintained by the X dev platform team
- OAuth 2.0 PKCE with auto-refresh + multi-app / multi-user support
(vs. xitter's 5-env-var OAuth 1.0a + single account)
- Credentials stored in ~/.xurl managed by xurl itself — no manual
env var juggling for users
- Substantially larger API surface: DMs, follows, blocks, mutes,
media upload, streaming, and raw v2 endpoint access
- Ships stronger agent-safety guardrails (forbidden-flag list,
no --verbose in agent mode, never-read-~/.xurl rule)
Adaptation:
- Ported the openclaw SKILL.md (which the xdevplatform team seeded)
to Hermes frontmatter conventions (prerequisites.commands, platforms,
metadata.hermes.tags/homepage) — dropped openclaw-specific metadata
- Added a Hermes-oriented one-time user setup section so the agent
knows to direct the user to run auth commands themselves, never
execute them with inline secrets
- Preserved the mandatory secret-safety rules verbatim
- Attribution block credits xdevplatform, openclaw, and the Hermes
port
Docs: updated website/docs/reference/skills-catalog.md to replace
the xitter row with xurl.
- Note that /browser connect is CLI-only and won't work in gateways (WebUI, Telegram, Discord).
- Update the Chrome launch command to use a dedicated --user-data-dir, so port 9222 actually comes up even when Chrome is already running with the user's regular profile.
- Add --no-first-run --no-default-browser-check to skip the fresh-profile wizard.
- Explain why the dedicated user-data-dir matters.
Community tip via Karamjit Singh.
Co-authored-by: teknium1 <teknium@noreply.github.com>
Follow-up to PR #11971. Documents the new code_execution.mode config
key and what each mode actually does.
- user-guide/configuration.md: add mode: project to the yaml example,
explain project vs strict and call out that security invariants are
identical across modes.
- user-guide/features/code-execution.md: new 'Execution Mode' section
with a comparison table and usage guidance; update the 'temporary
directory' note so it reflects that script.py runs in the session
CWD in project mode (staging dir stays on PYTHONPATH for imports);
drop stale 'sandboxed' framing from the intro and skill-passthrough
paragraph.
- getting-started/learning-path.md: update the one-line Code Execution
summary to match (no longer 'sandboxed environments' — the default
runs in the session's real working directory).
No code changes.
Comprehensive audit of every reference/messaging/feature doc page against the
live code registries (PROVIDER_REGISTRY, OPTIONAL_ENV_VARS, COMMAND_REGISTRY,
TOOLSETS, tool registry, on-disk skills). Every fix was verified against code
before writing.
### Wrong values fixed (users would paste-and-fail)
- reference/environment-variables.md:
- DASHSCOPE_BASE_URL default was `coding-intl.dashscope.aliyuncs.com/v1` \u2192
actual `dashscope-intl.aliyuncs.com/compatible-mode/v1`.
- MINIMAX_BASE_URL and MINIMAX_CN_BASE_URL defaults were `/v1` \u2192 actual
`/anthropic` (Hermes calls MiniMax via its Anthropic Messages endpoint).
- reference/toolsets-reference.md MCP example used the non-existent nested
`mcp: servers:` key \u2192 real key is the flat `mcp_servers:`.
- reference/skills-catalog.md listed ~20 bundled skills that no longer exist
on disk (all moved to `optional-skills/`). Regenerated the whole bundled
section from `skills/**/SKILL.md` \u2014 79 skills, accurate paths and names.
- messaging/slack.md ":::info" callout claimed Slack has no
`free_response_channels` equivalent; both the env var and the yaml key are
in fact read.
- messaging/qqbot.md documented `QQ_MARKDOWN_SUPPORT` as an env var, but the
adapter only reads `extra.markdown_support` from config.yaml. Removed the
env var row and noted config-only nature.
- messaging/qqbot.md `hermes setup gateway` \u2192 `hermes gateway setup`.
### Missing coverage added
- Providers: AWS Bedrock and Qwen Portal (qwen-oauth) \u2014 both in
PROVIDER_REGISTRY but undocumented everywhere. Added sections to
integrations/providers.md, rows to quickstart.md and fallback-providers.md.
- integrations/providers.md "Fallback Model" provider list now includes
gemini, google-gemini-cli, qwen-oauth, xai, nvidia, ollama-cloud, bedrock.
- reference/cli-commands.md `--provider` enum and HERMES_INFERENCE_PROVIDER
enum in env-vars now include the same set.
- reference/slash-commands.md: added `/agents` (alias `/tasks`) and `/copy`.
Removed duplicate rows for `/snapshot`, `/fast` (\u00d72), `/debug`.
- reference/tools-reference.md: fixed "47 built-in tools" \u2192 52. Added
`feishu_doc` and `feishu_drive` toolset sections.
- reference/toolsets-reference.md: added `feishu_doc` / `feishu_drive` core
rows + all missing `hermes-<platform>` toolsets in the platform table
(bluebubbles, dingtalk, feishu, qqbot, wecom, wecom-callback, weixin,
homeassistant, webhook, gateway). Fixed the `debugging` composite to
describe the actual `includes=[...]` mechanism.
- reference/optional-skills-catalog.md: added `fitness-nutrition`.
- reference/environment-variables.md: added NOUS_BASE_URL,
NOUS_INFERENCE_BASE_URL, NVIDIA_API_KEY/BASE_URL, OLLAMA_API_KEY/BASE_URL,
XAI_API_KEY/BASE_URL, MISTRAL_API_KEY, AWS_REGION/AWS_PROFILE,
BEDROCK_BASE_URL, HERMES_QWEN_BASE_URL, DISCORD_ALLOWED_CHANNELS,
DISCORD_PROXY, TELEGRAM_REPLY_TO_MODE, MATRIX_DEVICE_ID, MATRIX_REACTIONS,
QQBOT_HOME_CHANNEL_NAME, QQ_SANDBOX.
- messaging/discord.md: documented DISCORD_ALLOWED_CHANNELS, DISCORD_PROXY,
HERMES_DISCORD_TEXT_BATCH_DELAY_SECONDS and HERMES_DISCORD_TEXT_BATCH_SPLIT
_DELAY_SECONDS (all actively read by the adapter).
- messaging/matrix.md: documented MATRIX_REACTIONS (default true).
- messaging/telegram.md: removed the redundant second Webhook Mode section
that invented a `telegram.webhook_mode: true` yaml key the adapter does
not read.
- user-guide/features/hooks.md: added `on_session_finalize` and
`on_session_reset` (both emitted via invoke_hook but undocumented).
- user-guide/features/api-server.md: documented GET /health/detailed, the
`/api/jobs/*` CRUD surface, POST /v1/runs, and GET /v1/runs/{id}/events
(10 routes that were live but undocumented).
- user-guide/features/fallback-providers.md: added `approval` and
`title_generation` auxiliary-task rows; added gemini, bedrock, qwen-oauth
to the supported-providers table.
- user-guide/features/tts.md: "seven providers" \u2192 "eight" (post-xAI add
oversight in #11942).
- user-guide/configuration.md: TTS provider enum gains `xai` and `gemini`;
yaml example block gains `mistral:`, `gemini:`, `xai:` subsections.
Auxiliary-provider enum now enumerates all real registry entries.
- reference/faq.md: stale AIAgent/config examples bumped from
`nous/hermes-3-llama-3.1-70b` and `claude-sonnet-4.6` to
`claude-opus-4.7`.
### Docs-site integrity
- guides/build-a-hermes-plugin.md referenced two nonexistent hooks
(`pre_api_request`, `post_api_request`). Replaced with the real
`on_session_finalize` / `on_session_reset` entries.
- messaging/open-webui.md and features/api-server.md had pre-existing
broken links to `/docs/user-guide/features/profiles` (actual path is
`/docs/user-guide/profiles`). Fixed.
- reference/skills-catalog.md had one `<1%` literal that MDX parsed as a
JSX tag. Escaped to `<1%`.
### False positives filtered out (not changed, verified correct)
- `/set-home` is a registered alias of `/sethome` \u2014 docs were fine.
- `hermes setup gateway` is valid syntax (`hermes setup \<section\>`);
changed in qqbot.md for cross-doc consistency, not as a bug fix.
- Telegram reactions "disabled by default" matches code (default `"false"`).
- Matrix encryption "opt-in" matches code (empty env default \u2192 disabled).
- `pre_api_request` / `post_api_request` hooks do NOT exist in current code;
documented instead the real `on_session_finalize` / `on_session_reset`.
- SIGNAL_IGNORE_STORIES is already in env-vars.md (subagent missed it).
Validation:
- `docusaurus build` \u2014 passes (only pre-existing nix-setup anchor warning).
- `ascii-guard lint docs` \u2014 124 files, 0 errors.
- 22 files changed, +317 / \u2212158.
Three tightly-scoped built-in skill consolidations to reduce redundancy in
the available_skills listing injected into every system prompt:
1. gguf-quantization → llama-cpp (merged)
GGUF is llama.cpp's format; two skills covered the same toolchain. The
merged llama-cpp skill keeps the full K-quant table + imatrix workflow
from gguf and the ROCm/benchmarks/supported-models sections from the
original llama-cpp. All 5 reference files preserved.
2. grpo-rl-training → fine-tuning-with-trl (folded in)
GRPO isn't a framework, it's a trainer inside TRL. Moved the 17KB
deep-dive SKILL.md to references/grpo-training.md and the working
template to templates/basic_grpo_training.py. TRL's GRPO workflow
section now points to both. Atropos skill's related_skills updated.
3. guidance → optional-skills/mlops/
Dropped from built-in. Outlines (still built-in) covers the same
structured-generation ground with wider adoption. Listed in the
optional catalog for users who specifically want Guidance.
Net: 3 fewer built-in skill lines in every system prompt, zero content
loss. Contributor authorship preserved via git rename detection.
hermes update no longer dies when the controlling terminal closes
(SSH drop, shell close) during pip install. SIGHUP is set to SIG_IGN
for the duration of the update, and stdout/stderr are wrapped so writes
to a closed pipe are absorbed instead of cascading into process exit.
All update output is mirrored to ~/.hermes/logs/update.log so users can
see what happened after reconnecting.
SIGINT (Ctrl-C) and SIGTERM (systemd) are intentionally still honored —
those are deliberate cancellations, not accidents. In gateway mode the
helper is a no-op since the update is already detached.
POSIX preserves SIG_IGN across exec(), so pip and git subprocesses
inherit hangup protection automatically — no changes to subprocess
spawning needed.
The existing 'Persistent browser sessions' section had the correct config
snippet but users still hit the flag at the wrong config path, assumed
Hermes could force persistence when the server was ephemeral, and had no
way to verify the flag was actually taking effect.
Adds to that section:
- Warning admonition calling out the nested path vs top-level mistake.
- Explicit 'What Hermes does / does not do' split so users understand
Hermes can only send a stable userId; the Camofox server must map it
to a persistent profile.
- 5-step verification flow for confirming persistence works end-to-end.
- Reminder to restart Hermes after editing config.yaml.
- Where Hermes derives the stable userId (~/.hermes/browser_auth/camofox/)
so users can reset or back up state.
Docs-only change.
Extend forum support from PR #10145:
- REST path (_send_discord): forum thread creation now uploads media
files as multipart attachments on the starter message in a single
call. Previously media files were silently dropped on the forum
path.
- Websocket media paths (_send_file_attachment, send_voice, send_image,
send_animation — covers send_image_file, send_video, send_document
transitively): forum channels now go through a new _forum_post_file
helper that creates a thread with the file as starter content,
instead of failing via channel.send(file=...) which forums reject.
- _send_to_forum chunk follow-up failures are collected into
raw_response['warnings'] so partial-send outcomes surface.
- Process-local probe cache (_DISCORD_CHANNEL_TYPE_PROBE_CACHE) avoids
GET /channels/{id} on every uncached send after the first.
- Dedup of TestSendDiscordMedia that the PR merge-resolution left
behind.
- Docs: Forum Channels section under website/docs/user-guide/messaging/discord.md.
Tests: 117 passed (22 new for forum+media, probe cache, warnings).
- AI Cards: how to configure ``card_template_id`` for streaming rich replies
- Emoji reactions: 🤔Thinking → 🥳Done lifecycle
- Per-platform display settings (streaming, tool_progress, reasoning, etc.)
- Installation: switch to the ``hermes-agent[dingtalk]`` extra (adds
alibabacloud-dingtalk alongside dingtalk-stream)
- Messaging capability matrix updated to reflect images, audio, video,
and threading support
Adds NVIDIA NIM as a first-class provider: ProviderConfig in
auth.py, HermesOverlay in providers.py, curated models
(Nemotron plus other open source models hosted on
build.nvidia.com), URL mapping in model_metadata.py, aliases
(nim, nvidia-nim, build-nvidia, nemotron), and env var tests.
Docs updated: providers page, quickstart table, fallback
providers table, and README provider list.
- stop rewriting markdown tables, headings, and links before delivery
- keep markdown table blocks and headings together during chunking
- update Weixin tests and docs for native markdown rendering
Closes#10308
discord.py does not apply a default AllowedMentions to the client, so any
reply whose content contains @everyone/@here or a role mention would ping
the whole server — including verbatim echoes of user input or LLM output
that happens to contain those tokens.
Set a safe default on commands.Bot: everyone=False, roles=False,
users=True, replied_user=True. Operators can opt back in via four
DISCORD_ALLOW_MENTION_* env vars or discord.allow_mentions.* in
config.yaml. No behavior change for normal user/reply pings.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When a user edits a bundled skill, sync flags it as user_modified and
skips it forever. The problem: if the user later tries to undo the edit
by copying the current bundled version back into ~/.hermes/skills/, the
manifest still holds the old origin hash from the last successful
sync, so the fresh bundled hash still doesn't match and the skill stays
stuck as user_modified.
Adds an escape hatch for this case.
hermes skills reset <name>
Drops the skill's entry from ~/.hermes/skills/.bundled_manifest and
re-baselines against the user's current copy. Future 'hermes update'
runs accept upstream changes again. Non-destructive.
hermes skills reset <name> --restore
Also deletes the user's copy and re-copies the bundled version.
Use when you want the pristine upstream skill back.
Also available as /skills reset in chat.
- tools/skills_sync.py: new reset_bundled_skill(name, restore=False)
- hermes_cli/skills_hub.py: do_reset() + wired into skills_command and
handle_skills_slash; added to the slash /skills help panel
- hermes_cli/main.py: argparse entry for 'hermes skills reset'
- tests/tools/test_skills_sync.py: 5 new tests covering the stuck-flag
repro, --restore, unknown-skill error, upstream-removed-skill, and
no-op on already-clean state
- website/docs/user-guide/features/skills.md: new 'Bundled skill updates'
section explaining the origin-hash mechanic + reset usage
* feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro
Upstream asked for these two upgrades ASAP — the old entries show
stale models when newer, higher-quality versions are available on FAL.
Recraft V3 → Recraft V4 Pro
ID: fal-ai/recraft-v3 → fal-ai/recraft/v4/pro/text-to-image
Price: $0.04/image → $0.25/image (6x — V4 Pro is premium tier)
Schema: V4 dropped the required `style` enum entirely; defaults
handle taste now. Added `colors` and `background_color`
to supports for brand-palette control. `seed` is not
supported by V4 per the API docs.
Nano Banana → Nano Banana Pro
ID: fal-ai/nano-banana → fal-ai/nano-banana-pro
Price: $0.08/image → $0.15/image (1K); $0.30 at 4K
Schema: Aspect ratio family unchanged. Added `resolution`
(1K/2K/4K, default 1K for billing predictability),
`enable_web_search` (real-time info grounding, +$0.015),
and `limit_generations` (force exactly 1 image).
Architecture: Gemini 2.5 Flash → Gemini 3 Pro Image. Quality
and reasoning depth improved; slower (~6s → ~8s).
Migration: users who had the old IDs in `image_gen.model` will
fall through the existing 'unknown model → default' warning path
in `_resolve_fal_model()` and get the Klein 9B default on the next
run. Re-run `hermes tools` → Image Generation to pick the new
version. No silent cost-upgrade aliasing — the 2-6x price jump
on these tiers warrants explicit user re-selection.
Portal note: both new model IDs need to be allowlisted on the
Nous fal-queue-gateway alongside the previous 7 additions, or
users on Nous Subscription will see the 'managed gateway rejected
model' error we added previously (which is clear and
self-remediating, just noisy).
* docs: wrap '<1s' in backticks to unblock MDX compilation
Docusaurus's MDX parser treats unquoted '<' as the start of JSX, and
'<1s' fails because '1' isn't a valid tag-name start character. This
was broken on main since PR #11265 (never noticed because
docs-site-checks was failing on OTHER issues at the time and we
admin-merged through it).
Wrapping in backticks also gives the cell monospace styling which
reads more cleanly alongside the inline-code model ID in the same row.
The other '<1s' occurrence (line 52) is inside a fenced code block
and is already safe — code fences bypass MDX parsing.
* docs: fix ascii-guard border alignment errors
Three docs pages had ASCII diagram boxes with off-by-one column
alignment issues that failed docs-site-checks CI:
- architecture.md: outer box is 71 cols but inner-box content lines
and border corners were offset by 1 col, making content-line right
border at col 70/72 while top/bottom border was at col 71. Inner
boxes also had border corners at cols 19/36/53 but content pipes
at cols 20/37/54. Rewrote the diagram with consistent 71-col width
throughout, aligned inner boxes at cols 4-19, 22-37, 40-55 with
2-space gaps and 15-space trailing padding.
- gateway-internals.md: same class of issue — outer box at 51 cols,
inner content lines varied 52-54 cols. Rewrote with consistent
51-col width, inner boxes at cols 4-15, 18-29, 32-43. Also
restructured the bottom-half message flow so it's bare text
(not half-open box cells) matching the intent of the original.
- agent-loop.md line 112-114: box 2 (API thread) content lines had
one extra space pushing the right border to col 46 while the top
and bottom borders of that box sat at col 45. Trimmed one trailing
space from each of the three content lines.
All 123 docs files now pass `npm run lint:diagrams`:
✓ Errors: 0 (warnings: 6, non-fatal)
Pre-existing failures on main — unrelated to any open PR.
* test(setup): accept description kwarg in prompt_choice mock lambdas
setup.py's `_curses_prompt_choice` gained an optional `description`
parameter (used for rendering context hints alongside the prompt).
`prompt_choice` forwards it via keyword arg. The two existing tests
mocked `_curses_prompt_choice` with lambdas that didn't accept the
new kwarg, so the forwarded call raised TypeError.
Fix: add `description=None` to both mock lambda signatures so they
absorb the new kwarg without changing behavior.
* test(matrix): update stale audio-caching assertion
test_regular_audio_has_http_url asserted that non-voice audio
messages keep their HTTP URL and are NOT downloaded/cached. That
was true when the caching code only triggered on
`is_voice_message`. Since bec02f37 (encrypted-media caching
refactor), matrix.py caches all media locally — photos, audio,
video, documents — so downstream tools can read them as real
files via media_urls. This applies to regular audio too.
Renamed the test to `test_regular_audio_is_cached_locally`,
flipped the assertions accordingly, and documented the
intentional behavior change in the docstring. Other tests in
the file (voice-specific caching, message-type detection,
reply-to threading) continue to pass.
* test(413): allow multi-pass preflight compression
run_agent.py's preflight compression runs up to 3 passes in a loop
for very large sessions (each pass summarizes the middle N turns,
then re-checks tokens). The loop breaks when a pass returns a
message list no shorter than its input (can't compress further).
test_preflight_compresses_oversized_history used a static mock
return value that returned the same 2 messages regardless of input,
so the loop ran pass 1 (41 -> 2) and pass 2 (2 -> 2 -> break),
making call_count == 2. The assert_called_once() assertion was
strictly wrong under the multi-pass design.
The invariant the test actually cares about is: preflight ran, and
its first invocation received the full oversized history. Replaced
the count assertion with those two invariants.
* docs: drop '...' from gateway diagram, merge side-by-side boxes
ascii-guard 2.3.0 flagged two remaining issues after the initial fix
pass:
1. gateway-internals.md L33: the '...' suffix after inner box 3's
right border got parsed as 'extra characters after inner-box right
border'. Dropped the '...' — the surrounding prose already conveys
'and more platforms' without needing the visual hint.
2. agent-loop.md: ascii-guard can't cleanly parse two side-by-side
boxes of different heights (main thread 7 rows, API thread 5 rows).
Even equalizing heights didn't help — the linter treats the left
box's right border as the end of the diagram. Merged into a single
54-char-wide outer box with both threads labeled as regions inside,
keeping the ▶ arrow to preserve the main→API flow direction.
* feat(image_gen): multi-model FAL support with picker in hermes tools
Adds 8 FAL text-to-image models selectable via `hermes tools` →
Image Generation → (FAL.ai | Nous Subscription) → model picker.
Models supported:
- fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP)
- fal-ai/flux-2-pro (previous default, kept backward-compat upscaling)
- fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN)
- fal-ai/nano-banana (Gemini 2.5 Flash Image)
- fal-ai/gpt-image-1.5 (with quality tier: low/medium/high)
- fal-ai/ideogram/v3 (best typography)
- fal-ai/recraft-v3 (vector, brand styles)
- fal-ai/qwen-image (LLM-based)
Architecture:
- FAL_MODELS catalog declares per-model size family, defaults, supports
whitelist, and upscale flag. Three size families handled uniformly:
image_size_preset (flux family), aspect_ratio (nano-banana), and
gpt_literal (gpt-image-1.5).
- _build_fal_payload() translates unified inputs (prompt + aspect_ratio)
into model-specific payloads, merges defaults, applies caller overrides,
wires GPT quality_setting, then filters to the supports whitelist — so
models never receive rejected keys.
- IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen
providers (Replicate, Stability, etc.); each provider entry tags itself
with imagegen_backend: 'fal' to select the right catalog.
- Upscaler (Clarity) defaults off for new models (preserves <1s value
prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS.
Config:
image_gen.model = fal-ai/flux-2/klein/9b (new)
image_gen.quality_setting = medium (new, GPT only)
image_gen.use_gateway = bool (existing)
Agent-facing schema unchanged (prompt + aspect_ratio only) — model
choice is a user-level config decision, not an agent-level arg.
Picker uses curses_radiolist (arrow keys, auto numbered-fallback on
non-TTY). Column-aligned: Model / Speed / Strengths / Price.
Docs: image-generation.md rewritten with the model table and picker
walkthrough. tools-reference, tool-gateway, overview updated to drop
the stale "FLUX 2 Pro" wording.
Tests: 42 new in tests/tools/test_image_generation.py covering catalog
integrity, all 3 size families, supports filter, default merging, GPT
quality wiring, model resolution fallback. 8 new in
tests/hermes_cli/test_tools_config.py for picker wiring (registry,
config writes, GPT quality follow-up prompt, corrupt-config repair).
* feat(image_gen): translate managed-gateway 4xx to actionable error
When the Nous Subscription managed FAL proxy rejects a model with 4xx
(likely portal-side allowlist miss or billing gate), surface a clear
message explaining:
1. The rejected model ID + HTTP status
2. Two remediation paths: set FAL_KEY for direct access, or
pick a different model via `hermes tools`
5xx, connection errors, and direct-FAL errors pass through unchanged
(those have different root causes and reasonable native messages).
Motivation: new FAL models added to this release (flux-2-klein-9b,
z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3,
qwen-image) are untested against the Nous Portal proxy. If the portal
allowlists model IDs, users on Nous Subscription will hit cryptic
4xx errors without guidance on how to work around it.
Tests: 8 new cases covering status extraction across httpx/fal error
shapes and 4xx-vs-5xx-vs-ConnectionError translation policy.
Docs: brief note in image-generation.md for Nous subscribers.
Operator action (Nous Portal side): verify that fal-queue-gateway
passes through these 7 new FAL model IDs. If the proxy has an
allowlist, add them; otherwise Nous Subscription users will see the
new translated error and fall back to direct FAL.
* feat(image_gen): pin GPT-Image quality to medium (no user choice)
Previously the tools picker asked a follow-up question for GPT-Image
quality tier (low / medium / high) and persisted the answer to
`image_gen.quality_setting`. This created two problems:
1. Nous Portal billing complexity — the 22x cost spread between tiers
($0.009 low / $0.20 high) forces the gateway to meter per-tier per
user, which the portal team can't easily support at launch.
2. User footgun — anyone picking `high` by mistake burns through
credit ~6x faster than `medium`.
This commit pins quality at medium by baking it into FAL_MODELS
defaults for gpt-image-1.5 and removes all user-facing override paths:
- Removed `_resolve_gpt_quality()` runtime lookup
- Removed `honors_quality_setting` flag on the model entry
- Removed `_configure_gpt_quality_setting()` picker helper
- Removed `_GPT_QUALITY_CHOICES` constant
- Removed the follow-up prompt call in `_configure_imagegen_model()`
- Even if a user manually edits `image_gen.quality_setting` in
config.yaml, no code path reads it — always sends medium.
Tests:
- Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium
(5 tests) — proves medium is baked in, config is ignored, flag is
removed, helper is removed, non-gpt models never get quality.
- Replaced test_picker_with_gpt_image_also_prompts_quality with
test_picker_with_gpt_image_does_not_prompt_quality — proves only 1
picker call fires when gpt-image is selected (no quality follow-up).
Docs updated: image-generation.md replaces the quality-tier table
with a short note explaining the pinning decision.
* docs(image_gen): drop stale 'wires GPT quality tier' line from internals section
Caught in a cleanup sweep after pinning quality to medium. The
"How It Works Internally" walkthrough still described the removed
quality-wiring step.
The status bar was showing stale lifecycle text ("running…") while the
face+verb stream flickered through the thinking panel as Python pushed
thinking.delta events. That's backwards — the face ticker is the
primary "I'm alive" signal, it belongs in the status bar; the thinking
panel is for substantive reasoning and tool activity.
Status bar now reads `ui.busy`: when true, renders a local `<FaceTicker>`
cycling FACES × VERBS on a 2.5s interval, unaffected by server events.
When false, the bar shows the actual status string (ready, starting
agent…, interrupted, etc.).
Side effect: `scheduleThinkingStatus` still patches `ui.status` with
Python's face text, but while busy the bar ignores that string and uses
the ticker instead. No server-side changes needed — Python keeps
emitting thinking.delta as a liveness heartbeat, the TUI just doesn't
let it fight the status bar.
"Ink" is the React reconciler — implementation detail, not branding.
Consistent naming: the classic CLI is the CLI, the new one is the TUI.
Updated docs: user-guide/tui.md, user-guide/cli.md cross-link, quickstart,
cli-commands reference, environment-variables reference.
Updated code: main.py --tui help text, server.py user-visible setup
error, AGENTS.md "TUI Architecture" section.
Kept "Ink" only where it is literally the library (hermes-ink internal
source comments, AGENTS.md tree note flagging ui-tui/ as a React/Ink dir).
New primary guide at `user-guide/tui.md` covering launch, requirements,
keybindings, slash commands, status line, configuration, sessions, and
the revert path. Matches the voice of `user-guide/cli.md`.
Cross-links:
- `user-guide/cli.md`: tip callout pointing readers at the Ink TUI
- `getting-started/quickstart.md`: shows both `hermes` and `hermes --tui`
under "Start Chatting" so first-run users know they have the choice
- `reference/environment-variables.md`: new "Interface" section with
`HERMES_TUI` and `HERMES_TUI_DIR`
- `reference/cli-commands.md`: `--tui` and `--dev` added to global options
Sidebar: `user-guide/tui` slotted right after `user-guide/cli`.
* - make buffered streaming
- fix path naming to expand `~` for agent.
- fix stripping of matrix ID to not remove other mentions / localports.
* fix(matrix): register MembershipEventDispatcher for invite auto-join
The mautrix migration (#7518) broke auto-join because InternalEventType.INVITE
events are only dispatched when MembershipEventDispatcher is registered on the
client. Without it, _on_invite is dead code and the bot silently ignores all
room invites.
Closes#10094Closes#10725
Refs: PR #10135 (digging-airfare-4u), PR #10732 (fxfitz)
* fix(matrix): preserve _joined_rooms reference for CryptoStateStore
connect() reassigned self._joined_rooms = set(...) after initial sync,
orphaning the reference captured by _CryptoStateStore at init time.
find_shared_rooms() returned [] forever, breaking Megolm session rotation
on membership changes.
Mutate in place with clear() + update() so the CryptoStateStore reference
stays valid.
Refs #8174, PR #8215
* fix(matrix): remove dual ROOM_ENCRYPTED handler to fix dedup race
mautrix auto-registers DecryptionDispatcher when client.crypto is set.
The adapter also registered _on_encrypted_event for the same event type.
_on_encrypted_event had zero awaits and won the race to mark event IDs
in the dedup set, causing _on_room_message to drop successfully decrypted
events from DecryptionDispatcher. The retry loop masked this by re-decrypting
every message ~4 seconds later.
Remove _on_encrypted_event entirely. DecryptionDispatcher handles decryption;
genuinely undecryptable events are logged by mautrix and retried on next
key exchange.
Refs #8174, PR #8215
* fix(matrix): re-verify device keys after share_keys() upload
Matrix homeservers treat ed25519 identity keys as immutable per device.
share_keys() can return 200 but silently ignore new keys if the device
already exists with different identity keys. The bot would proceed with
shared=True while peers encrypt to the old (unreachable) keys.
Now re-queries the server after share_keys() and fails closed if keys
don't match, with an actionable error message.
Refs #8174, PR #8215
* fix(matrix): encrypt outbound attachments in E2EE rooms
_upload_and_send() uploaded raw bytes and used the 'url' key for all
rooms. In E2EE rooms, media must be encrypted client-side with
encrypt_attachment(), the ciphertext uploaded, and the 'file' key
(with key/iv/hashes) used instead of 'url'.
Now detects encrypted rooms via state_store.is_encrypted() and
branches to the encrypted upload path.
Refs: PR #9822 (charles-brooks)
* fix(matrix): add stop_typing to clear typing indicator after response
The adapter set a 30-second typing timeout but never cleared it.
The base class stop_typing() is a no-op, so the typing indicator
lingered for up to 30 seconds after each response.
Closes#6016
Refs: PR #6020 (r266-tech)
* fix(matrix): cache all media types locally, not just photos/voice
should_cache_locally only covered PHOTO, VOICE, and encrypted media.
Unencrypted audio/video/documents in plaintext rooms were passed as MXC
URLs that require authentication the agent doesn't have, resulting
in 401 errors.
Refs #3487, #3806
* fix(matrix): detect stale OTK conflict on startup and fail closed
When crypto state is wiped but the same device ID is reused, the
homeserver may still hold one-time keys signed with the previous
identity key. Identity key re-upload succeeds but OTK uploads fail
with "already exists" and a signature mismatch. Peers cannot
establish new Olm sessions, so all new messages are undecryptable.
Now proactively flushes OTKs via share_keys() during connect() and
catches the "already exists" error with an actionable log message
telling the operator to purge the device from the homeserver or
generate a fresh device ID.
Also documents the crypto store recovery procedure in the Matrix
setup guide.
Refs #8174
* docs(matrix): improve crypto recovery docs per review
- Put easy path (fresh access token) first, manual purge second
- URL-encode user ID in Synapse admin API example
- Note that device deletion may invalidate the access token
- Add "stop Synapse first" caveat for direct SQLite approach
- Mention the fail-closed startup detection behavior
- Add back-reference from upgrade section to OTK warning
* refactor(matrix): cleanup from code review
- Extract _extract_server_ed25519() and _reverify_keys_after_upload()
to deduplicate the re-verification block (was copy-pasted in two
places, three copies of ed25519 key extraction total)
- Remove dead code: _pending_megolm, _retry_pending_decryptions,
_MAX_PENDING_EVENTS, _PENDING_EVENT_TTL — all orphaned after
removing _on_encrypted_event
- Remove tautological TestMediaCacheGate (tested its own predicate,
not production code)
- Remove dead TestMatrixMegolmEventHandling and
TestMatrixRetryPendingDecryptions (tested removed methods)
- Merge duplicate TestMatrixStopTyping into TestMatrixTypingIndicator
- Trim comment to just the "why"
Adds Google Gemini TTS as the seventh voice provider, with 30 prebuilt
voices (Zephyr, Puck, Kore, Enceladus, Gacrux, etc.) and natural-language
prompt control. Integrates through the existing provider chain:
- tools/tts_tool.py: new _generate_gemini_tts() calls the
generativelanguage REST endpoint with responseModalities=[AUDIO],
wraps the returned 24kHz mono 16-bit PCM (L16) in a WAV RIFF header,
then ffmpeg-converts to MP3 or Opus depending on output extension.
For .ogg output, libopus is forced explicitly so Telegram voice
bubbles get Opus (ffmpeg defaults to Vorbis for .ogg).
- hermes_cli/tools_config.py: exposes 'Google Gemini TTS' as a provider
option in the curses-based 'hermes tools' UI.
- hermes_cli/setup.py: adds gemini to the setup wizard picker, tool
status display, and API key prompt branch (accepts existing
GEMINI_API_KEY or GOOGLE_API_KEY, falls back to Edge if neither set).
- tests/tools/test_tts_gemini.py: 15 unit tests covering WAV header
wrap correctness, env var fallback (GEMINI/GOOGLE), voice/model
overrides, snake_case vs camelCase inlineData handling, HTTP error
surfacing, and empty-audio edge cases.
- docs: TTS features page updated to list seven providers with the new
gemini config block and ffmpeg notes.
Live-tested against api key against gemini-2.5-flash-preview-tts: .wav,
.mp3, and Telegram-compatible .ogg (Opus codec) all produce valid
playable audio.
llm-wiki was the only shipped skill using metadata.hermes.config, which
caused 'hermes update' and 'hermes config migrate' to prompt for a wiki
directory on every run — even for users who have never touched the skill
— because 'enabled' is opt-out (all shipped skills count as enabled unless
explicitly disabled). Declining the prompt didn't persist anything, so
the nag fired again on every update.
Switch llm-wiki to the env var + runtime default pattern that obsidian and
google-workspace already use: WIKI_PATH env var, default $HOME/wiki. No
prompting infrastructure, no config.yaml touch, no nag loop.
Changes:
- skills/research/llm-wiki/SKILL.md: remove metadata.hermes.config,
document WIKI_PATH env var in the Wiki Location section, update the
orientation snippet and initialization guidance.
- Docs: replace llm-wiki's wiki.path examples with a generic 'myplugin.path'
placeholder across configuration.md, features/skills.md, and
creating-skills.md so users don't try to set skills.config.wiki.path
expecting llm-wiki to use it.
- skills-catalog.md: mention WIKI_PATH instead of skills.config.wiki.path.
E2E verified: discover_all_skill_config_vars() and get_missing_skill_config_vars()
both return 0 entries after this change, so the prompt branch in migrate_config()
no longer fires.
The metadata.hermes.config feature stays in place for third-party skills
that genuinely need structured config, but built-ins now prefer env vars.
- New page: user-guide/features/tool-gateway.md covering eligibility,
setup (hermes model, hermes tools, manual config), how use_gateway
works, precedence, switching back, status checking, self-hosted
gateway env vars, and FAQ
- Added to sidebar under Features (top-level, before Core category)
- Cross-references from: overview.md, tools.md, browser.md,
image-generation.md, tts.md, providers.md, environment-variables.md
- Added Nous Tool Gateway subsection to env vars reference with
TOOL_GATEWAY_DOMAIN, TOOL_GATEWAY_SCHEME, TOOL_GATEWAY_USER_TOKEN,
and FIRECRAWL_GATEWAY_URL
Camofox automatically maps each userId to a persistent Firefox profile
on the server side — no CAMOFOX_PROFILE_DIR env var exists. Our docs
incorrectly told users to configure this on the server.
Removed the fabricated env var from:
- browser docs (:::note block)
- config.py DEFAULT_CONFIG comment
- test docstring
Pass platform_env_var="TELEGRAM_PROXY" to resolve_proxy_url() in both
telegram.py (main connect) and telegram_network.py (fallback transport),
so a Telegram-specific proxy takes priority over the generic HTTPS_PROXY.
Also bridge telegram.proxy_url from config.yaml to the TELEGRAM_PROXY
env var (env var takes precedence if both are set), add OPTIONAL_ENV_VARS
entry, docs, and tests.
Composite salvage of four community PRs:
- Core approach (both call sites): #9414 by @leeyang1990
- config.yaml bridging + docs: #6530 by @WhiteWorld
- Naming convention: #9074 by @brantzh6
- Earlier proxy work: #7786 by @ten-ltw
Closes#9414, closes#9074, closes#7786, closes#6530
Co-authored-by: WhiteWorld <WhiteWorld@users.noreply.github.com>
Co-authored-by: brantzh6 <brantzh6@users.noreply.github.com>
Co-authored-by: ten-ltw <ten-ltw@users.noreply.github.com>
* feat: implement register_command() on plugin context
Complete the half-built plugin slash command system. The dispatch
code in cli.py and gateway/run.py already called
get_plugin_command_handler() but the registration side was never
implemented.
Changes:
- Add register_command() to PluginContext — stores handler,
description, and plugin name; normalizes names; rejects conflicts
with built-in commands
- Add _plugin_commands dict to PluginManager
- Add commands_registered tracking on LoadedPlugin
- Add get_plugin_command_handler() and get_plugin_commands()
module-level convenience functions
- Fix commands.py to use actual plugin description in Telegram
bot menu (was hardcoded 'Plugin command')
- Add plugin commands to SlashCommandCompleter autocomplete
- Show command count in /plugins display
- 12 new tests covering registration, conflict detection,
normalization, handler dispatch, and introspection
Closes#10495
* docs: add register_command() to plugin guides
- Build a Plugin guide: new 'Register slash commands' section with
full API reference, comparison table vs register_cli_command(),
sync/async examples, and conflict protection docs
- Features/Plugins page: add slash commands to capabilities table
and plugin types summary
Extract resolve_channel_prompt() shared helper into
gateway/platforms/base.py. Refactor Discord to use it.
Wire channel_prompts into Telegram (groups + forum topics),
Slack (channels), and Mattermost (channels).
Config bridging now applies to all platforms (not just Discord).
Added channel_prompts defaults to telegram/slack/mattermost
config sections.
Docs added to all four platform pages with platform-specific
examples (topic inheritance for Telegram, channel IDs for Slack,
etc.).
Users are confused about the difference between `hermes model` (terminal
command for full provider setup) and `/model` (session command for switching
between already-configured providers). This distinction was not documented
anywhere.
Changes across 4 doc pages:
- cli-commands.md: Added warning callout explaining the difference, added
--global flag docs, added 'only see OpenRouter models?' info box
- slash-commands.md: Added notes on both TUI and messaging /model entries
that /model only switches between configured providers
- providers.md: Added 'Two Commands for Model Management' comparison table
near top of page, added warning callout in switching section
- faq.md: Added new FAQ entry '/model only shows one provider' with quick
reference table
Prompted by user feedback in Discord — new users consistently hit this
confusion when trying to add providers from inside a session.
Update the Termux guide to mention that the browser tool now
automatically discovers Termux directories, and add the missing
pkg install nodejs-lts step.
Adds --from flag to gmail send and gmail reply commands, allowing agents
to customize the From header display name when sharing the same email
account. Usage: --from '"Agent Name" <user@example.com>'
Also syncs repo google_api.py with the deployed standalone implementation
(replaces outdated gws_bridge thin wrapper), adds dedicated docs page
under Features > Skills, and updates sidebar navigation.
Requested by community user @Maxime44.
- Running in gateway mode: expose port 8642 for the API server and
health endpoint, with a note on when it's needed.
- New 'Running the dashboard' section: docker run command with
GATEWAY_HEALTH_URL and env var reference table.
- Docker Compose example: updated to include both gateway and dashboard
services with internal network connectivity (hermes-net), so the
dashboard probes the gateway via http://hermes:8642.
- Concurrent access warning: clarified that running a read-only
dashboard alongside the gateway is safe.
* feat(skills): add fitness-nutrition skill to optional-skills
Cherry-picked from PR #9177 by @haileymarshall.
Adds a fitness and nutrition skill for gym-goers and health-conscious users:
- Exercise search via wger API (690+ exercises, free, no auth)
- Nutrition lookup via USDA FoodData Central (380K+ foods, DEMO_KEY fallback)
- Offline body composition calculators (BMI, TDEE, 1RM, macros, body fat %)
- Pure stdlib Python, no pip dependencies
Changes from original PR:
- Moved from skills/ to optional-skills/health/ (correct location)
- Fixed BMR formula in FORMULAS.md (removed confusing -5+10, now just +5)
- Fixed author attribution to match PR submitter
- Marked USDA_API_KEY as optional (DEMO_KEY works without signup)
Also adds optional env var support to the skill readiness checker:
- New 'optional: true' field in required_environment_variables entries
- Optional vars are preserved in metadata but don't block skill readiness
- Optional vars skip the CLI capture prompt flow
- Skills with only optional missing vars show as 'available' not 'setup_needed'
* docs: add automation templates gallery and comparison post
- New docs page: guides/automation-templates.md with 15+ ready-to-use
automation recipes covering development workflow, devops, research,
GitHub events, and business operations
- Comparison post (hermes-already-has-routines.md) showing Hermes has
had schedule/webhook/API triggers since March 2026
- Added automation-templates to sidebar navigation
---------
Co-authored-by: haileymarshall <haileymarshall@users.noreply.github.com>
- Matrix docs: full Proxy Mode section with architecture diagram,
step-by-step setup (host + Docker), docker-compose.yml/Dockerfile
examples, configuration reference, and limitations notes
- API Server docs: add Proxy Mode section explaining the api_server
serves as the backend for gateway proxy mode
- Environment variables reference: add GATEWAY_PROXY_URL and
GATEWAY_PROXY_KEY entries
Add ctx.register_skill() API so plugins can ship SKILL.md files under
a 'plugin:skill' namespace, preventing name collisions with built-in
Hermes skills. skill_view() detects the ':' separator and routes to
the plugin registry while bare names continue through the existing
flat-tree scan unchanged.
Key additions:
- agent/skill_utils: parse_qualified_name(), is_valid_namespace()
- hermes_cli/plugins: PluginContext.register_skill(), PluginManager
skill registry (find/list/remove)
- tools/skills_tool: qualified name dispatch in skill_view(),
_serve_plugin_skill() with full guards (disabled, platform,
injection scan), bundle context banner with sibling listing,
stale registry self-heal
- Hoisted _INJECTION_PATTERNS to module level (dedup)
- Updated skill_view schema description
Based on PR #9334 by N0nb0at. Lean P1 salvage — omits autogen shim
(P2) for a simpler first merge.
Closes#8422
- Rename platform from 'qq' to 'qqbot' across all integration points
(Platform enum, toolset, config keys, import paths, file rename qq.py → qqbot.py)
- Add PLATFORM_HINTS for QQBot in prompt_builder (QQ supports markdown)
- Set SUPPORTS_MESSAGE_EDITING = False to skip streaming on QQ
(prevents duplicate messages from non-editable partial + final sends)
- Add _send_qqbot() standalone send function for cron/send_message tool
- Add interactive _setup_qq() wizard in hermes_cli/setup.py
- Restore missing _setup_signal/email/sms/dingtalk/feishu/wecom/wecom_callback
functions that were lost during the original merge
Add a second light-mode skin option with warm brown/parchment tones,
adapted from ygd58's contribution in PR #4811. Includes completion
menu and status bar color keys for full light-terminal support.
Co-authored-by: buray <78954051+ygd58@users.noreply.github.com>
Adds Arcee AI as a standard direct provider (ARCEEAI_API_KEY) with
Trinity models: trinity-large-thinking, trinity-large-preview, trinity-mini.
Standard OpenAI-compatible provider checklist: auth.py, config.py,
models.py, main.py, providers.py, doctor.py, model_normalize.py,
model_metadata.py, setup.py, trajectory_compressor.py.
Based on PR #9274 by arthurbr11, simplified to a standard direct
provider without dual-endpoint OpenRouter routing.
Cherry-picked from PR #7637 by hcshen0111.
Adds kimi-coding-cn provider with dedicated KIMI_CN_API_KEY env var
and api.moonshot.cn/v1 endpoint for China-region Moonshot users.
- New docs page: user-guide/features/web-dashboard.md covering
quick start, prerequisites, all three pages (Status, Config, API Keys),
the /reload slash command, REST API endpoints, CORS config, and
development workflow
- Added 'Management' category in sidebar for web-dashboard
- Added 'hermes web' to CLI commands reference with options table
- Added '/reload' to slash commands reference (both CLI and gateway tables)
Follow-up for cherry-picked PR #8272:
- Add MATRIX_RECOVERY_KEY to module docstring header in matrix.py
- Register in OPTIONAL_ENV_VARS (config.py) with password=True, advanced=True
- Add to _NON_SETUP_ENV_VARS set
- Document cross-signing verification in matrix.md E2EE section
- Update migration guide with recovery key step (step 3)
- Add to environment-variables.md reference
- Add rebrand_text() that replaces OpenClaw, Open Claw, Open-Claw,
ClawdBot, and MoltBot with Hermes (case-insensitive, word-boundary)
- Apply rebranding to memory entries (MEMORY.md, USER.md, daily memory)
- Apply rebranding to SOUL.md and workspace instructions via new
transform parameter on copy_file()
- Fix moldbot -> moltbot typo across codebase (claw.py, migration
script, docs, tests)
- Add unit tests for rebrand_text and integration tests for memory
and soul migration rebranding
Fixes#7952 — Matrix E2EE completely broken after mautrix migration.
- Replace MemoryCryptoStore + pickle/HMAC persistence with mautrix's
PgCryptoStore backed by SQLite via aiosqlite. Crypto state now
persists reliably across restarts without fragile serialization.
- Add handle_sync() call on initial sync response so to-device events
(queued Megolm key shares) are dispatched to OlmMachine instead of
being silently dropped.
- Add _verify_device_keys_on_server() after loading crypto state.
Detects missing keys (re-uploads), stale keys from migration
(attempts re-upload), and corrupted state (refuses E2EE).
- Add _CryptoStateStore adapter wrapping MemoryStateStore to satisfy
mautrix crypto's StateStore interface (is_encrypted,
get_encryption_info, find_shared_rooms).
- Remove redundant share_keys() call from sync loop — OlmMachine
already handles this via DEVICE_OTK_COUNT event handler.
- Fix datetime vs float TypeError in session.py suspend_recently_active()
that crashed gateway startup.
- Add aiosqlite and asyncpg to [matrix] extra in pyproject.toml.
- Update test mocks for PgCryptoStore/Database and add query_keys mock
for key verification. 174 tests pass.
- Add E2EE upgrade/migration docs to Matrix user guide.
The check_interval parameter on terminal_tool sent periodic output
updates to the gateway chat, but these were display-only — the agent
couldn't see or act on them. This added schema bloat and introduced
a bug where notify_on_complete=True was silently dropped when
check_interval was also set (the not-check_interval guard skipped
fast-watcher registration, and the check_interval watcher dict
was missing the notify_on_complete key).
Removing check_interval entirely:
- Eliminates the notify_on_complete interaction bug
- Reduces tool schema size (one fewer parameter for the model)
- Simplifies the watcher registration path
- notify_on_complete (agent wake-on-completion) still works
- watch_patterns (output alerting) still works
- process(action='poll') covers manual status checking
Closes#7947 (root cause eliminated rather than patched).
* feat(nix): container-aware CLI — auto-route all subcommands into managed container
When container.enable = true, the host `hermes` CLI transparently execs
every subcommand into the managed Docker/Podman container. A symlink
bridge (~/.hermes -> /var/lib/hermes/.hermes) unifies state between host
and container so sessions, config, and memories are shared.
CLI changes:
- Global routing before subcommand dispatch (all commands forwarded)
- docker exec with -u exec_user, env passthrough (TERM, COLORTERM,
LANG, LC_ALL), TTY-aware flags
- Retry with spinner on failure (TTY: 5s, non-TTY: 10s silent)
- Hard fail instead of silent fallback
- HERMES_DEV=1 env var bypasses routing for development
- No routing messages (invisible to user)
NixOS module changes:
- container.hostUsers option: lists users who get ~/.hermes symlink
and automatic hermes group membership
- Activation script creates symlink bridge (with backup of existing
~/.hermes dirs), writes exec_user to .container-mode
- Cleanup on disable: removes symlinks + .container-mode + stops service
- Warning when hostUsers set without addToSystemPackages
* fix: address review — reuse sudo var, add chown -h on symlink update
- hermes_cli/main.py: reuse the existing `sudo` variable instead of
redundant `shutil.which("sudo")` call that could return None
- nix/nixosModules.nix: add missing `chown -h` when updating an
existing symlink target so ownership stays consistent with the
fresh-create and backup-replace branches
* fix: address remaining review items from cursor bugbot
- hermes_cli/main.py: move container routing BEFORE parse_args() so
--help, unrecognised flags, and all subcommands are forwarded
transparently into the container instead of being intercepted by
argparse on the host (high severity)
- nix/nixosModules.nix: resolve home dirs via
config.users.users.${user}.home instead of hardcoding /home/${user},
supporting users with custom home directories (medium severity)
- nix/nixosModules.nix: gate hostUsers group membership on
container.enable so setting hostUsers without container mode doesn't
silently add users to the hermes group (low severity)
* fix: simplify container routing — execvp, no retries, let it crash
- Replace subprocess.run retry loop with os.execvp (no idle parent process)
- Extract _probe_container helper for sudo detection with 15s timeout
- Narrow exception handling: FileNotFoundError only in get_container_exec_info,
catch TimeoutExpired specifically, remove silent except Exception: pass
- Collapse needs_sudo + sudo into single sudo_path variable
- Simplify NixOS symlink creation from 4 branches to 2
- Gate NixOS sudoers hint with "On NixOS:" prefix
- Full test rewrite: 18 tests covering execvp, sudo probe, timeout, permissions
---------
Co-authored-by: Hermes Agent <hermes@nousresearch.com>
Add display.interim_assistant_messages config (enabled by default) that
forwards completed assistant commentary between tool calls to the user
as separate chat messages. Models already emit useful status text like
'I'll inspect the repo first.' — this surfaces it on Telegram, Discord,
and other messaging platforms instead of swallowing it.
Independent from tool_progress and gateway streaming. Disabled for
webhooks. Uses GatewayStreamConsumer when available, falls back to
direct adapter send. Tracks response_previewed to prevent double-delivery
when interim message matches the final response.
Also fixes: cursor not stripped from fallback prefix in stream consumer
(affected continuation calculation on no-edit platforms like Signal).
Cherry-picked from PR #7885 by asheriif, default changed to enabled.
Fixes#5016
- Fix auto list (was only gpt, actually includes codex/gemini/gemma/grok)
- Document the three guidance layers (general, OpenAI-specific, Google-specific)
- Add 'When to turn it on' section for users on non-default models
- Clarify that substring matching is case-insensitive
Three root causes of the 'agent stops mid-task' gateway bug:
1. Compression threshold floor (64K tokens minimum)
- The 50% threshold on a 100K-context model fired at 50K tokens,
causing premature compression that made models lose track of
multi-step plans. Now threshold_tokens = max(50% * context, 64K).
- Models with <64K context are rejected at startup with a clear error.
2. Budget warning removal — grace call instead
- Removed the 70%/90% iteration budget warnings entirely. These
injected '[BUDGET WARNING: Provide your final response NOW]' into
tool results, causing models to abandon complex tasks prematurely.
- Now: no warnings during normal execution. When the budget is
actually exhausted (90/90), inject a user message asking the model
to summarise, allow one grace API call, and only then fall back
to _handle_max_iterations.
3. Activity touches during long terminal execution
- _wait_for_process polls every 0.2s but never reported activity.
The gateway's inactivity timeout (default 1800s) would fire during
long-running commands that appeared 'idle.'
- Now: thread-local activity callback fires every 10s during the
poll loop, keeping the gateway's activity tracker alive.
- Agent wires _touch_activity into the callback before each tool call.
Also: docs update noting 64K minimum context requirement.
Closes#7915 (root cause was agent-loop termination, not Weixin delivery limits).
Add the missing 'Adding a Platform Adapter' developer guide — a
comprehensive step-by-step checklist covering all 20+ integration
points (enum, adapter, config, runner, CLI, tools, toolsets, cron,
webhooks, tests, and docs). Includes common patterns for long-poll,
callback/webhook, and token-lock adapters with reference implementations.
Also adds full docs coverage for the WeCom Callback platform:
- New docs page: user-guide/messaging/wecom-callback.md
- Environment variables reference (9 WECOM_CALLBACK_* vars)
- Toolsets reference (hermes-wecom-callback)
- Messaging index (comparison table, architecture diagram, toolsets,
security, next-steps links)
- Integrations index listing
- Sidebar entries for both new pages
Wire Signal, Email, SMS (Twilio), DingTalk, Feishu/Lark, and WeCom into
the hermes setup gateway interactive wizard. These platforms all had
working adapters and _PLATFORMS entries in gateway.py but were invisible
in the setup checklist — users had to manually edit .env to configure them.
Changes:
- gateway.py: Add _setup_email/sms/dingtalk/feishu/wecom functions
delegating to _setup_standard_platform (Signal already had a custom one)
- setup.py: Add wrapper functions for all 6 new platforms
- setup.py: Add all 6 to _GATEWAY_PLATFORMS checklist registry
- setup.py: Add missing env vars to any_messaging check
- setup.py: Add all missing platforms to _get_section_config_summary
(was also missing Matrix, Mattermost, Weixin, Webhooks)
- docs: Add FEISHU_ALLOWED_USERS and WECOM_ALLOWED_USERS examples
Incorporates and extends the work from PR #7918 by bugmaker2.
The Weixin adapter was splitting responses at every top-level newline,
causing notification spam (up to 70 API calls for a single long markdown
response). This salvages the best aspects of six contributor PRs:
Compact mode (new default):
- Messages under the 4000-char limit stay as a single bubble even with
multiple lines, paragraphs, and code blocks
- Only oversized messages get split at logical markdown boundaries
- Inter-chunk delay (0.3s) between chunks prevents WeChat rate-limit drops
Legacy mode (opt-in):
- Set split_multiline_messages: true in platforms.weixin.extra config
- Or set WEIXIN_SPLIT_MULTILINE_MESSAGES=true env var
- Restores the old per-line splitting behavior
Salvaged from PRs #7797 (guantoubaozi), #7792 (luoxiao6645),
#7838 (qyx596), #7825 (weedge), #7784 (sherunlock03), #7773 (JnyRoad).
Core fix unanimous across all six; config toggle from #7838; inter-chunk
delay from #7825.
hermes claw migrate now always shows a full dry-run preview before
making any changes. The user reviews what would be imported, then
confirms to proceed. --dry-run stops after the preview. --yes skips
the confirmation prompt.
This matches the existing setup wizard flow (_offer_openclaw_migration)
which already did preview-then-confirm.
Docs updated across both docs/migration/openclaw.md and
website/docs/guides/migrate-from-openclaw.md to reflect:
- New preview-first UX flow
- workspace-main/ fallback paths
- accounts.default channel token layout
- TTS edge/microsoft rename
- openclaw.json env sub-object as API key source
- Hyphenated provider API types
- Matrix accessToken field
- SecretRef file/exec warnings
- Skills session restart note
- WhatsApp re-pairing note
- Archive cleanup step
The summary model used for context compaction must have a context window
at least as large as the main agent model. If it's smaller, the
summarization API call fails and middle turns are dropped without a
summary, silently losing conversation context.
Promoted the existing note in configuration.md to a visible warning
admonition, and added a matching warning in the developer guide's
context compression page.
Matrix gateway: fix sync loop never dispatching events (#5819)
- _sync_loop() called client.sync() but never called handle_sync()
to dispatch events to registered callbacks — _on_room_message was
registered but never fired for new messages
- Store next_batch token from initial sync and pass as since= to
subsequent incremental syncs (was doing full initial sync every time)
- 17 comments, confirmed by multiple users on matrix.org
Feishu docs: add interactive card configuration for approvals (#6893)
- Error 200340 is a Feishu Developer Console configuration issue,
not a code bug — users need to enable Interactive Card capability
and configure Card Request URL
- Added required 3-step setup instructions to feishu.md
- Added troubleshooting entry for error 200340
- 17 comments from Feishu users
Copilot provider drift: detect GPT-5.x Responses API requirement (#3388)
- GPT-5.x models are rejected on /v1/chat/completions by both OpenAI
and OpenRouter (unsupported_api_for_model error)
- Added _model_requires_responses_api() to detect models needing
Responses API regardless of provider
- Applied in __init__ (covers OpenRouter primary users) and in
_try_activate_fallback() (covers Copilot->OpenRouter drift)
- Fixed stale comment claiming gateway creates fresh agents per message
(it caches them via _agent_cache since the caching was added)
- 7 comments, reported on Copilot+Telegram gateway
The System Overview ASCII diagram had inconsistent box widths:
- Entry Points box bottom border was 73 chars instead of 71
This caused the docs-site-checks CI to fail on every docs-only PR
due to pre-existing errors in the diagram.
Fix: normalize Entry Points bottom border to 71 characters,
matching the top border width.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add shared is_wsl() to hermes_constants (like is_termux)
- Update supports_systemd_services() to verify systemd is actually
running on WSL before returning True
- Add WSL-specific guidance in gateway install/start/setup/status
for both cases: WSL+systemd and WSL without systemd
- Improve help strings: 'run' now says recommended for WSL/Docker,
'start'/'install' now mention systemd/launchd explicitly
- Add WSL gateway FAQ section with tmux/nohup/Task Scheduler tips
- Update CLI commands docs with WSL tip
- Deduplicate _is_wsl() from clipboard.py to shared hermes_constants
- Fix clipboard tests to reset hermes_constants cache
- 20 new WSL-specific tests covering detection, systemd check,
supports_systemd_services integration, and command output
Motivated by user feedback: took 1 hour to figure out run vs start
on WSL, Telegram bot kept disconnecting due to flaky WSL systemd.
Add is_network_accessible() helper using Python's ipaddress module to
robustly classify bind addresses (IPv4/IPv6 loopback, wildcards,
mapped addresses, hostname resolution with DNS-failure-fails-closed).
The API server connect() now refuses to start when the bind address is
network-accessible and no API_SERVER_KEY is set, preventing RCE from
other machines on the network.
Co-authored-by: entropidelic <entropidelic@users.noreply.github.com>
When enabled, @mentioning the bot in a DM creates a thread (default:
false). Supports both env var and YAML config (matrix.dm_mention_threads).
6 new tests, docs updated.
From #6957
- Remove sys.path.insert hack (leftover from standalone dev)
- Add token lock (acquire_scoped_lock/release_scoped_lock) in
connect()/disconnect() to prevent duplicate pollers across profiles
- Fix get_connected_platforms: WEIXIN check must precede generic
token/api_key check (requires both token AND account_id)
- Add WEIXIN_HOME_CHANNEL_NAME to _EXTRA_ENV_KEYS
- Add gateway setup wizard with QR login flow
- Add platform status check for partially configured state
- Add weixin.md docs page with full adapter documentation
- Update environment-variables.md reference with all 11 env vars
- Update sidebars.ts to include weixin docs page
- Wire all gateway integration points onto current main
Salvaged from PR #6747 by Zihan Huang.
**Problem:**
The BlueBubbles iMessage gateway was not receiving incoming messages even though:
1. BlueBubbles Server was properly configured and running
2. Hermes gateway started without errors
3. Webhook listener was started on the configured port
The root cause was that the BlueBubbles adapter only started a local webhook
listener but never registered the webhook URL with the BlueBubbles server via
the API. Without registration, the server doesn't know where to send events.
**Fix:**
1. Added _register_webhook() method that POSTs to /api/v1/webhook with the
listener URL and event types (new-message, updated-message, message)
2. Added _unregister_webhook() method for clean shutdown
3. Both methods handle the case where webhook listens on 0.0.0.0/127.0.0.1
by using 'localhost' as the external hostname
4. Fixed documentation: 'hermes gateway logs' → 'hermes logs gateway'
**API Reference:**
https://docs.bluebubbles.app/server/developer-guides/rest-api-and-webhooks
**Testing:**
- Webhook registration is now automatic when gateway starts
- Failed registration logs a warning but doesn't prevent startup
- Clean shutdown unregisters the webhook
Closes: iMessage gateway not working issue
The delivery tuple in webhook.py only had 5 of 14 platforms with
gateway adapters. Adds whatsapp, matrix, mattermost, homeassistant,
email, dingtalk, feishu, wecom, and bluebubbles so webhooks can
deliver to any connected platform.
Updates docs delivery options table to list all platforms.
Follow-up to cherry-picked fix from olafthiele (PR #7035).
- Add HERMES_CRON_TIMEOUT and HERMES_CRON_SCRIPT_TIMEOUT to env vars reference
- Add script timeout and provider recovery sections to cron features page
- Add timeout resolution chain and credential pool details to cron internals
Add streaming timeout documentation to three pages:
- guides/local-llm-on-mac.md: New 'Timeouts' section with table of all
three timeouts, their defaults, local auto-adjustments, and env var
overrides
- reference/faq.md: Tip box in the local models FAQ section
- user-guide/configuration.md: 'Streaming Timeouts' subsection under
the agent config section
Follow-up to #6967.
Raise the default httpx stream read timeout from 60s to 120s for all
providers. Additionally, auto-detect local LLM endpoints (Ollama,
llama.cpp, vLLM) and raise the read timeout to HERMES_API_TIMEOUT
(1800s) since local models can take minutes for prefill on large
contexts before producing the first token.
The stale stream timeout already had this local auto-detection pattern;
the httpx read timeout was missing it — causing a hard 60s wall that
users couldn't find (HERMES_STREAM_READ_TIMEOUT was undocumented).
Changes:
- Default HERMES_STREAM_READ_TIMEOUT: 60s -> 120s
- Auto-detect local endpoints -> raise to 1800s (user override respected)
- Document HERMES_STREAM_READ_TIMEOUT and HERMES_STREAM_STALE_TIMEOUT
- Add 10 parametrized tests
Reported-by: Pavan Srinivas (@pavanandums)
* feat: API server model name derived from profile name
For multi-user setups (e.g. OpenWebUI), each profile's API server now
advertises a distinct model name on /v1/models:
- Profile 'lucas' -> model ID 'lucas'
- Profile 'admin' -> model ID 'admin'
- Default profile -> 'hermes-agent' (unchanged)
Explicit override via API_SERVER_MODEL_NAME env var or
platforms.api_server.model_name config for custom names.
Resolves friction where OpenWebUI couldn't distinguish multiple
hermes-agent connections all advertising the same model name.
* docs: multi-user setup with profiles for API server + Open WebUI
- api-server.md: added Multi-User Setup section, API_SERVER_MODEL_NAME
to config table, updated /v1/models description
- open-webui.md: added Multi-User Setup with Profiles section with
step-by-step guide, updated model name references
- environment-variables.md: added API_SERVER_MODEL_NAME entry
When the API returns "max_tokens too large given prompt" (input tokens
are within the context window, but input + requested output > window),
the old code incorrectly routed through the same handler as "prompt too
long" errors, calling get_next_probe_tier() and permanently halving
context_length. This made things worse: the window was fine, only the
requested output size needed trimming for that one call.
Two distinct error classes now handled separately:
Prompt too long — input itself exceeds context window.
Fix: compress history + halve context_length (existing behaviour,
unchanged).
Output cap too large — input OK, but input + max_tokens > window.
Fix: parse available_tokens from the error message, set a one-shot
_ephemeral_max_output_tokens override for the retry, and leave
context_length completely untouched.
Changes:
- agent/model_metadata.py: add parse_available_output_tokens_from_error()
that detects Anthropic's "available_tokens: N" error format and returns
the available output budget, or None for all other error types.
- run_agent.py: call the new parser first in the is_context_length_error
block; if it fires, set _ephemeral_max_output_tokens (with a 64-token
safety margin) and break to retry without touching context_length.
_build_api_kwargs consumes the ephemeral value exactly once then clears
it so subsequent calls use self.max_tokens normally.
- agent/anthropic_adapter.py: expand build_anthropic_kwargs docstring to
clearly document the max_tokens (output cap) vs context_length (total
window) distinction, which is a persistent source of confusion due to
the OpenAI-inherited "max_tokens" name.
- cli-config.yaml.example: add inline comments explaining both keys side
by side where users are most likely to look.
- website/docs/integrations/providers.md: add a callout box at the top
of "Context Length Detection" and clarify the troubleshooting entry.
- tests/test_ctx_halving_fix.py: 24 tests across four classes covering
the parser, build_anthropic_kwargs clamping, ephemeral one-shot
consumption, and the invariant that context_length is never mutated
on output-cap errors.
/pr <anything> silently resolved to /prompt via the shortest-match
tiebreaker in prefix expansion, permanently overwriting the system
prompt and persisting to config. The command's functionality (setting
agent.system_prompt) is available via config.yaml and /personality
covers the common use case.
Removes: CommandDef, dispatch branch, _handle_prompt_command handler,
docs references, and updates subcommand extraction test.
* fix(nix): switch nixpkgs input from nixos-24.11 to nixos-unstable
nixos-24.11 reached EOL on 2025-06-30. For a dev tool, tracking a
frozen release branch causes dependency versions to go stale.
nixos-unstable provides rolling updates and is the conventional
choice for development packages.
* docs(website): update nix flake example
---------
Co-authored-by: sk <sk@mercury>
Documents both debugging commands with full option tables,
examples, and usage guidance. Adds both to the top-level
commands table and as detailed sections with subsections for
log files, filtering behavior, and log rotation.
The docs incorrectly referenced 'hermes pairing generate bluebubbles'
which doesn't exist. The existing reactive pairing flow already handles
this — when an unknown user messages the bot, it sends them a code
automatically, and the owner approves with 'hermes pairing approve'.
The old setup wizard (pre-March 2026) wrote LLM_MODEL to ~/.hermes/.env
across 12 provider flows. Commit 9302690e removed the writes but never
cleaned up existing .env files, leaving a dead variable that:
- Nothing in the codebase reads (zero os.getenv calls)
- The docs incorrectly claimed the gateway still used as fallback
- Caused user confusion when debugging model resolution issues
Changes:
- config.py: Bump _config_version 12 → 13, add migration to clear
LLM_MODEL and OPENAI_MODEL from .env (both dead since March 2026)
- environment-variables.md: Remove LLM_MODEL row, fix HERMES_MODEL
description to stop referencing it
- providers.md: Update deprecation notice from 'deprecated' to 'removed'
Port missing features from the hindsight-hermes external integration
package into the native plugin. Only touches plugin files — no core
changes.
Features:
- Tags on retain/recall (tags, recall_tags, recall_tags_match)
- Recall config (recall_max_tokens, recall_max_input_chars, recall_types,
recall_prompt_preamble)
- Retain controls (retain_every_n_turns, auto_retain, auto_recall,
retain_async via aretain_batch, retain_context)
- Bank config via Banks API (bank_mission, bank_retain_mission)
- Structured JSON retain with per-message timestamps
- Full session accumulation with document_id for dedup
- Custom post_setup() wizard with curses picker
- Mode-aware dep install (hindsight-client for cloud, hindsight-all for local)
- local_external mode and openai_compatible LLM provider
- OpenRouter support with auto base URL
- Auto-upgrade of hindsight-client to >=0.4.22 on session start
- Comprehensive debug logging across all operations
- 46 unit tests
- Updated README and website docs
Documents the per-platform tool_progress_overrides config key added in
PR #6348. Shows example YAML with Signal set to 'off' while Telegram
stays on 'verbose'. Lists all valid platform keys.
Add configurable reply-reference behavior for Discord, matching the
existing Telegram (TELEGRAM_REPLY_TO_MODE) and Mattermost
(MATTERMOST_REPLY_MODE) implementations.
Modes:
- 'off': never reply-reference the original message
- 'first': reply-reference on first chunk only (default, current behavior)
- 'all': reply-reference on every chunk
Set DISCORD_REPLY_TO_MODE=off in .env to disable reply-to messages.
Changes:
- gateway/config.py: parse DISCORD_REPLY_TO_MODE env var
- gateway/platforms/discord.py: read reply_to_mode from config, respect
it in send() — skip fetch_message entirely when 'off'
- hermes_cli/config.py: add to OPTIONAL_ENV_VARS for hermes setup
- 23 tests covering config, send behavior, env var override
- docs: discord.md env var table + environment-variables.md reference
Closes community request from Stuart on Discord.
Users were setting model.provider to "main" after reading the auxiliary
provider docs, causing "Unknown provider" errors. The "main" alias is
only valid inside auxiliary:, compression:, and fallback_model: configs
where it means "use the same provider as my main agent chat."
Added warning admonitions and inline clarifications to:
- configuration.md: Auxiliary Models provider list and Provider Options table
- fallback-providers.md: Provider Options for Auxiliary Tasks table
Reported by community member cn on Discord.
Documents the proxy env var support added in PR #3591 (salvage of #3411
by @kufufu9). Covers HTTPS_PROXY/HTTP_PROXY/ALL_PROXY precedence,
configuration methods, and scope.
Add documentation for cocktailpeanut's hermes-mod community tool —
a web UI for creating and managing Hermes skins visually. Covers
installation (Pinokio, npx, manual), usage walkthrough, and feature
overview including ASCII art generation from images.
Ref: https://github.com/cocktailpeanut/hermes-mod
* fix(telegram): replace substring caption check with exact line-by-line match
Captions in photo bursts and media group albums were silently dropped when
a shorter caption happened to be a substring of an existing one (e.g.
"Meeting" lost inside "Meeting agenda"). Extract a shared _merge_caption
static helper that splits on "\n\n" and uses exact match with whitespace
normalisation, then use it in both _enqueue_photo_event and
_queue_media_group_event.
Adds 13 unit tests covering the fixed bug scenarios.
Cherry-picked from PR #2671 by Dilee.
* fix: extend caption substring fix to all platforms
Move _merge_caption helper from TelegramAdapter to BasePlatformAdapter
so all adapters inherit it. Fix the same substring-containment bug in:
- gateway/platforms/base.py (photo burst merging)
- gateway/run.py (priority photo follow-up merging)
- gateway/platforms/feishu.py (media batch merging)
The original fix only covered telegram.py. The same bug existed in base.py
and run.py (pure substring check) and feishu.py (list membership without
whitespace normalization).
* fix(auxiliary): resolve named custom providers and 'main' alias in auxiliary routing
Two bugs caused auxiliary tasks (vision, compression, etc.) to fail when
using named custom providers defined in config.yaml:
1. 'provider: main' was hardcoded to 'custom', which only checks legacy
OPENAI_BASE_URL env vars. Now reads _read_main_provider() to resolve
to the actual provider (e.g., 'custom:beans', 'openrouter', 'deepseek').
2. Named custom provider names (e.g., 'beans') fell through to
PROVIDER_REGISTRY which doesn't know about config.yaml entries.
Now checks _get_named_custom_provider() before the registry fallback.
Fixes both resolve_provider_client() and _normalize_vision_provider()
so the fix covers all auxiliary tasks (vision, compression, web_extract,
session_search, etc.).
Adds 13 unit tests. Reported by Laura via Discord.
---------
Co-authored-by: Dilee <uzmpsk.dilekakbas@gmail.com>
Based on PR #5413 spec by MaheshtheDev (Mahesh Sanikommu).
Changes:
- Add search_mode config (hybrid/memories/documents) passed to SDK
- Add {identity} template support in container_tag for profile-scoped containers
- Add SUPERMEMORY_CONTAINER_TAG env var override (priority over config)
- Add multi-container mode: enable_custom_container_tags, custom_containers,
custom_container_instructions in supermemory.json
- Dynamic tool schemas when multi-container enabled (optional container_tag param)
- Whitelist validation for custom container tags in tool calls
- Simplify get_config_schema() to only prompt for API key during setup
- Defer container_tag sanitization to initialize() (after template resolution)
- Add custom_id support to documents.add calls
- Update README with multi-container docs, search_mode, identity template,
support links (Discord, email)
- Update memory-providers.md with new features and multi-container example
- Update memory-provider-plugin.md with minimal vs full schema guidance
- Add 12 new tests covering identity template, search_mode, multi-container,
config schema, and env var override
* refactor: remove browser_close tool — auto-cleanup handles it
The browser_close tool was called in only 9% of browser sessions (13/144
navigations across 66 sessions), always redundantly — cleanup_browser()
already runs via _cleanup_task_resources() at conversation end, and the
background inactivity reaper catches anything else.
Removing it saves one tool schema slot in every browser-enabled API call.
Also fixes a latent bug: cleanup_browser() now handles Camofox sessions
too (previously only Browserbase). Camofox sessions were never auto-cleaned
per-task because they live in a separate dict from _active_sessions.
Files changed (13):
- tools/browser_tool.py: remove function, schema, registry entry; add
camofox cleanup to cleanup_browser()
- toolsets.py, model_tools.py, prompt_builder.py, display.py,
acp_adapter/tools.py: remove browser_close from all tool lists
- tests/: remove browser_close test, update toolset assertion
- docs/skills: remove all browser_close references
* fix: repeat browser_scroll 5x per call for meaningful page movement
Most backends scroll ~100px per call — barely visible on a typical
viewport. Repeating 5x gives ~500px (~half a viewport), making each
scroll tool call actually useful.
Backend-agnostic approach: works across all 7+ browser backends without
needing to configure each one's scroll amount individually. Breaks
early on error for the agent-browser path.
* feat: auto-return compact snapshot from browser_navigate
Every browser session starts with navigate → snapshot. Now navigate
returns the compact accessibility tree snapshot inline, saving one
tool call per browser task.
The snapshot captures the full page DOM (not viewport-limited), so
scroll position doesn't affect it. browser_snapshot remains available
for refreshing after interactions or getting full=true content.
Both Browserbase and Camofox paths auto-snapshot. If the snapshot
fails for any reason, navigation still succeeds — the snapshot is
a bonus, not a requirement.
Schema descriptions updated to guide models: navigate mentions it
returns a snapshot, snapshot mentions it's for refresh/full content.
* refactor: slim cronjob tool schema — consolidate model/provider, drop unused params
Session data (151 calls across 67 sessions) showed several schema
properties were never used by models. Consolidated and cleaned up:
Removed from schema (still work via backend/CLI):
- skill (singular): use skills array instead
- reason: pause-only, unnecessary
- include_disabled: now defaults to true
- base_url: extreme edge case, zero usage
- provider (standalone): merged into model object
Consolidated:
- model + provider → single 'model' object with {model, provider} fields.
If provider is omitted, the current main provider is pinned at creation
time so the job stays stable even if the user changes their default.
Kept:
- script: useful data collection feature
- skills array: standard interface for skill loading
Schema shrinks from 14 to 10 properties. All backend functionality
preserved — the Python function signature and handler lambda still
accept every parameter.
* fix: remove mixture_of_agents from core toolsets — opt-in only via hermes tools
MoA was in _HERMES_CORE_TOOLS and composite toolsets (hermes-cli,
hermes-messaging, safe), which meant it appeared in every session
for anyone with OPENROUTER_API_KEY set. The _DEFAULT_OFF_TOOLSETS
gate only works after running 'hermes tools' explicitly.
Now MoA only appears when a user explicitly enables it via
'hermes tools'. The moa toolset definition and check_fn remain
unchanged — it just needs to be opted into.
- Add full Supermemory section to memory-providers.md with config table,
tools, setup instructions, and key features
- Update provider count from 7 to 8 across memory.md and memory-providers.md
- Add SUPERMEMORY_API_KEY to environment-variables.md
- Add Supermemory to integrations/providers.md optional API keys table
- Add supermemory to cli-commands.md provider list
- Add Supermemory to profile isolation section (config file providers)
signal-cli is not available via apt or snap. Replace the incorrect
'sudo apt install signal-cli' with the official install method:
downloading from GitHub releases (Linux) or brew (macOS).
Updated both signal.md docs and the gateway.py setup hint.
Inspired by PR #4225 (which proposed snap, also incorrect).
Comprehensive guide covering:
- llama.cpp and MLX (omlx) setup on Apple Silicon
- Model selection and memory optimization (quantized KV cache)
- Real benchmarks on M5 Max comparing both backends
- Hermes connection instructions
Cherry-picked from PR #2590.
Adds a concise post-update validation checklist (git status, hermes
doctor, version check, gateway status). Adapted from PR #3050 with
corrections — removed inaccurate submodule claim (hermes update
already handles submodules) and tightened the checklist.
Cherry-picked and adapted from PR #3050.
The docker container needs the explicit 'setup' subcommand to launch
the setup wizard. Without it, the container starts in default mode.
Co-authored-by: Omar <omar2535@users.noreply.github.com>
Cherry-picked from PR #4896 (also submitted independently as PR #5532).
Two fixes:
1. Replace all stale 'hermes login' references with 'hermes auth' across
auth.py, auxiliary_client.py, delegate_tool.py, config.py, run_agent.py,
and documentation. The 'hermes login' command was deprecated; 'hermes auth'
now handles OAuth credential management.
2. Fix credential removal not persisting for singleton-sourced credentials
(device_code for openai-codex/nous, hermes_pkce for anthropic).
auth_remove_command already cleared env vars for env-sourced credentials,
but singleton credentials stored in the auth store were re-seeded by
_seed_from_singletons() on the next load_pool() call. Now clears the
underlying auth store entry when removing singleton-sourced credentials.
Updates the plugin build guide and features page to reflect the
interactive env var prompting added in PR #5470. Documents the rich
manifest format (name/description/url/secret) alongside the simple
string format.
* feat(tools): add Firecrawl cloud browser provider
Adds Firecrawl (https://firecrawl.dev) as a cloud browser provider
alongside Browserbase and Browser Use. All browser tools route through
Firecrawl's cloud browser via CDP when selected.
- tools/browser_providers/firecrawl.py — FirecrawlProvider
- tools/browser_tool.py — register in _PROVIDER_REGISTRY
- hermes_cli/tools_config.py — add to onboarding provider picker
- hermes_cli/setup.py — add to setup summary
- hermes_cli/config.py — add FIRECRAWL_BROWSER_TTL config
- website/docs/ — browser docs and env var reference
Based on #4490 by @developersdigest.
Co-Authored-By: Developers Digest <124798203+developersdigest@users.noreply.github.com>
* refactor: simplify FirecrawlProvider.emergency_cleanup
Use self._headers() and self._api_url() instead of duplicating
env-var reads and header construction.
* fix: recognize Firecrawl in subscription browser detection
_resolve_browser_feature_state() now handles "firecrawl" as a direct
browser provider (same pattern as "browser-use"), so hermes setup
summary correctly shows "Browser Automation (Firecrawl)" instead of
misreporting as "Local browser".
Also fixes test_config_version_unchanged assertion (11 → 12).
---------
Co-authored-by: Developers Digest <124798203+developersdigest@users.noreply.github.com>
Skills can now declare config.yaml settings via metadata.hermes.config
in their SKILL.md frontmatter. Values are stored under skills.config.*
namespace, prompted during hermes config migrate, shown in hermes config
show, and injected into the skill context at load time.
Also adds the llm-wiki skill (Karpathy's LLM Wiki pattern) as the first
skill to use the new config interface, declaring wiki.path.
Skill config interface (new):
- agent/skill_utils.py: extract_skill_config_vars(), discover_all_skill_config_vars(),
resolve_skill_config_values(), SKILL_CONFIG_PREFIX
- agent/skill_commands.py: _inject_skill_config() injects resolved values
into skill messages as [Skill config: ...] block
- hermes_cli/config.py: get_missing_skill_config_vars(), skill config
prompting in migrate_config(), Skill Settings in show_config()
LLM Wiki skill (skills/research/llm-wiki/SKILL.md):
- Three-layer architecture (raw sources, wiki pages, schema)
- Three operations (ingest, query, lint)
- Session orientation, page thresholds, tag taxonomy, update policy,
scaling guidance, log rotation, archiving workflow
Docs: creating-skills.md, configuration.md, skills.md, skills-catalog.md
Closes#5100
Windows users running Hermes in WSL2 with model servers on the Windows
host hit 'connection refused' because WSL2's NAT networking means
localhost points to the VM, not Windows.
Covers:
- Mirrored networking mode (Win 11 22H2+) — makes localhost work
- NAT mode fallback using the host IP via ip route
- Per-server bind address table (Ollama, LM Studio, llama-server,
vLLM, SGLang)
- Detailed Ollama Windows service config for OLLAMA_HOST
- Windows Firewall rules for WSL2 connections
- Quick verification steps
- Cross-reference from Troubleshooting section
Add full Configuration Reference section to Discord docs covering all
env vars (10 total) and config.yaml options with types, defaults, and
detailed explanations. Previously undocumented: DISCORD_AUTO_THREAD,
DISCORD_ALLOW_BOTS, DISCORD_REACTIONS, discord.auto_thread,
discord.reactions, display.tool_progress, display.tool_progress_command.
Cleaned up manual setup flow to show only required vars.
As the agent navigates into subdirectories via tool calls (read_file,
terminal, search_files, etc.), automatically discover and load project
context files (AGENTS.md, CLAUDE.md, .cursorrules) from those directories.
Previously, context files were only loaded from the CWD at session start.
If the agent moved into backend/, frontend/, or any subdirectory with its
own AGENTS.md, those instructions were never seen.
Now, SubdirectoryHintTracker watches tool call arguments for file paths
and shell commands, resolves directories, and loads hint files on first
access. Discovered hints are appended to the tool result so the model
gets relevant context at the moment it starts working in a new area —
without modifying the system prompt (preserving prompt caching).
Features:
- Extracts paths from tool args (path, workdir) and shell commands
- Loads AGENTS.md, CLAUDE.md, .cursorrules (first match per directory)
- Deduplicates — each directory loaded at most once per session
- Ignores paths outside the working directory
- Truncates large hint files at 8K chars
- Works on both sequential and concurrent tool execution paths
Inspired by Block/goose SubdirectoryHintTracker.
Plugin context from pre_llm_call hooks was injected into the system
prompt, breaking the prompt cache prefix every turn when content
changed (typical for memory plugins). Now all plugin context goes
into the current turn's user message — the system prompt stays
identical across turns, preserving cached tokens.
The system prompt is reserved for Hermes internals. Plugins
contribute context alongside the user's input.
Also adds comprehensive documentation for all 6 plugin hooks:
pre_tool_call, post_tool_call, pre_llm_call, post_llm_call,
on_session_start, on_session_end — each with full callback
signatures, parameter tables, firing conditions, and examples.
Supersedes #5138 which identified the same cache-busting bug
and proposed an uncached system suffix approach. This fix goes
further by removing system prompt injection entirely.
Co-identified-by: OutThisLife (PR #5138)
Bring Matrix feature parity with Discord by adding mention gating and
auto-threading. Both default to true, matching Discord behavior.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Audit found 24+ discrepancies between docs and code. Fixed:
HIGH severity:
- Remove honcho toolset from tools-reference, toolsets-reference, and tools.md
(converted to memory provider plugin, not a built-in toolset)
- Add note that Honcho is available via plugin
MEDIUM severity:
- Add hermes memory command family to cli-commands.md (setup/status/off)
- Add --clone-all, --clone-from to profile create in cli-commands.md
- Add --max-turns option to hermes chat in cli-commands.md
- Add /btw slash command to slash-commands.md
- Fix profile show example output (remove nonexistent disk usage,
add .env and SOUL.md status lines)
- Add missing hermes-webhook toolset to toolsets-reference.md
- Add 5 missing providers to fallback-providers.md table
- Add 7 missing providers to providers.md fallback list
- Fix outdated model examples: glm-4-plus→glm-5, moonshot-v1-auto→kimi-for-coding
- 7 new tests covering skill binding, fallthrough, coercion
- Docs section in telegram.md with config format, field reference,
comparison table, and thread_id discovery tip
Addresses common questions from the Nous Research community Discord:
- Multi-model workflows via delegation config
- WhatsApp per-chat binding limitations and workarounds
- Controlling tool progress display on Telegram
- Per-platform skills config and Telegram 100-command limit
- Shared thread sessions across multiple users
- Exporting/migrating Hermes to a new machine
- Permission denied on shell reload after install
- HTTP 400 on first agent run
* feat(memory): add pluggable memory provider interface with profile isolation
Introduces a pluggable MemoryProvider ABC so external memory backends can
integrate with Hermes without modifying core files. Each backend becomes a
plugin implementing a standard interface, orchestrated by MemoryManager.
Key architecture:
- agent/memory_provider.py — ABC with core + optional lifecycle hooks
- agent/memory_manager.py — single integration point in the agent loop
- agent/builtin_memory_provider.py — wraps existing MEMORY.md/USER.md
Profile isolation fixes applied to all 6 shipped plugins:
- Cognitive Memory: use get_hermes_home() instead of raw env var
- Hindsight Memory: check $HERMES_HOME/hindsight/config.json first,
fall back to legacy ~/.hindsight/ for backward compat
- Hermes Memory Store: replace hardcoded ~/.hermes paths with
get_hermes_home() for config loading and DB path defaults
- Mem0 Memory: use get_hermes_home() instead of raw env var
- RetainDB Memory: auto-derive profile-scoped project name from
hermes_home path (hermes-<profile>), explicit env var overrides
- OpenViking Memory: read-only, no local state, isolation via .env
MemoryManager.initialize_all() now injects hermes_home into kwargs so
every provider can resolve profile-scoped storage without importing
get_hermes_home() themselves.
Plugin system: adds register_memory_provider() to PluginContext and
get_plugin_memory_providers() accessor.
Based on PR #3825. 46 tests (37 unit + 5 E2E + 4 plugin registration).
* refactor(memory): drop cognitive plugin, rewrite OpenViking as full provider
Remove cognitive-memory plugin (#727) — core mechanics are broken:
decay runs 24x too fast (hourly not daily), prefetch uses row ID as
timestamp, search limited by importance not similarity.
Rewrite openviking-memory plugin from a read-only search wrapper into
a full bidirectional memory provider using the complete OpenViking
session lifecycle API:
- sync_turn: records user/assistant messages to OpenViking session
(threaded, non-blocking)
- on_session_end: commits session to trigger automatic memory extraction
into 6 categories (profile, preferences, entities, events, cases,
patterns)
- prefetch: background semantic search via find() endpoint
- on_memory_write: mirrors built-in memory writes to the session
- is_available: checks env var only, no network calls (ABC compliance)
Tools expanded from 3 to 5:
- viking_search: semantic search with mode/scope/limit
- viking_read: tiered content (abstract ~100tok / overview ~2k / full)
- viking_browse: filesystem-style navigation (list/tree/stat)
- viking_remember: explicit memory storage via session
- viking_add_resource: ingest URLs/docs into knowledge base
Uses direct HTTP via httpx (no openviking SDK dependency needed).
Response truncation on viking_read to prevent context flooding.
* fix(memory): harden Mem0 plugin — thread safety, non-blocking sync, circuit breaker
- Remove redundant mem0_context tool (identical to mem0_search with
rerank=true, top_k=5 — wastes a tool slot and confuses the model)
- Thread sync_turn so it's non-blocking — Mem0's server-side LLM
extraction can take 5-10s, was stalling the agent after every turn
- Add threading.Lock around _get_client() for thread-safe lazy init
(prefetch and sync threads could race on first client creation)
- Add circuit breaker: after 5 consecutive API failures, pause calls
for 120s instead of hammering a down server every turn. Auto-resets
after cooldown. Logs a warning when tripped.
- Track success/failure in prefetch, sync_turn, and all tool calls
- Wait for previous sync to finish before starting a new one (prevents
unbounded thread accumulation on rapid turns)
- Clean up shutdown to join both prefetch and sync threads
* fix(memory): enforce single external memory provider limit
MemoryManager now rejects a second non-builtin provider with a warning.
Built-in memory (MEMORY.md/USER.md) is always accepted. Only ONE
external plugin provider is allowed at a time. This prevents tool
schema bloat (some providers add 3-5 tools each) and conflicting
memory backends.
The warning message directs users to configure memory.provider in
config.yaml to select which provider to activate.
Updated all 47 tests to use builtin + one external pattern instead
of multiple externals. Added test_second_external_rejected to verify
the enforcement.
* feat(memory): add ByteRover memory provider plugin
Implements the ByteRover integration (from PR #3499 by hieuntg81) as a
MemoryProvider plugin instead of direct run_agent.py modifications.
ByteRover provides persistent memory via the brv CLI — a hierarchical
knowledge tree with tiered retrieval (fuzzy text then LLM-driven search).
Local-first with optional cloud sync.
Plugin capabilities:
- prefetch: background brv query for relevant context
- sync_turn: curate conversation turns (threaded, non-blocking)
- on_memory_write: mirror built-in memory writes to brv
- on_pre_compress: extract insights before context compression
Tools (3):
- brv_query: search the knowledge tree
- brv_curate: store facts/decisions/patterns
- brv_status: check CLI version and context tree state
Profile isolation: working directory at $HERMES_HOME/byterover/ (scoped
per profile). Binary resolution cached with thread-safe double-checked
locking. All write operations threaded to avoid blocking the agent
(curate can take 120s with LLM processing).
* fix(memory): thread remaining sync_turns, fix holographic, add config key
Plugin fixes:
- Hindsight: thread sync_turn (was blocking up to 30s via _run_in_thread)
- RetainDB: thread sync_turn (was blocking on HTTP POST)
- Both: shutdown now joins sync threads alongside prefetch threads
Holographic retrieval fixes:
- reason(): removed dead intersection_key computation (bundled but never
used in scoring). Now reuses pre-computed entity_residuals directly,
moved role_content encoding outside the inner loop.
- contradict(): added _MAX_CONTRADICT_FACTS=500 scaling guard. Above
500 facts, only checks the most recently updated ones to avoid O(n^2)
explosion (~125K comparisons at 500 is acceptable).
Config:
- Added memory.provider key to DEFAULT_CONFIG ("" = builtin only).
No version bump needed (deep_merge handles new keys automatically).
* feat(memory): extract Honcho as a MemoryProvider plugin
Creates plugins/honcho-memory/ as a thin adapter over the existing
honcho_integration/ package. All 4 Honcho tools (profile, search,
context, conclude) move from the normal tool registry to the
MemoryProvider interface.
The plugin delegates all work to HonchoSessionManager — no Honcho
logic is reimplemented. It uses the existing config chain:
$HERMES_HOME/honcho.json -> ~/.honcho/config.json -> env vars.
Lifecycle hooks:
- initialize: creates HonchoSessionManager via existing client factory
- prefetch: background dialectic query
- sync_turn: records messages + flushes to API (threaded)
- on_memory_write: mirrors user profile writes as conclusions
- on_session_end: flushes all pending messages
This is a prerequisite for the MemoryManager wiring in run_agent.py.
Once wired, Honcho goes through the same provider interface as all
other memory plugins, and the scattered Honcho code in run_agent.py
can be consolidated into the single MemoryManager integration point.
* feat(memory): wire MemoryManager into run_agent.py
Adds 8 integration points for the external memory provider plugin,
all purely additive (zero existing code modified):
1. Init (~L1130): Create MemoryManager, find matching plugin provider
from memory.provider config, initialize with session context
2. Tool injection (~L1160): Append provider tool schemas to self.tools
and self.valid_tool_names after memory_manager init
3. System prompt (~L2705): Add external provider's system_prompt_block
alongside existing MEMORY.md/USER.md blocks
4. Tool routing (~L5362): Route provider tool calls through
memory_manager.handle_tool_call() before the catchall handler
5. Memory write bridge (~L5353): Notify external provider via
on_memory_write() when the built-in memory tool writes
6. Pre-compress (~L5233): Call on_pre_compress() before context
compression discards messages
7. Prefetch (~L6421): Inject provider prefetch results into the
current-turn user message (same pattern as Honcho turn context)
8. Turn sync + session end (~L8161, ~L8172): sync_all() after each
completed turn, queue_prefetch_all() for next turn, on_session_end()
+ shutdown_all() at conversation end
All hooks are wrapped in try/except — a failing provider never breaks
the agent. The existing memory system, Honcho integration, and all
other code paths are completely untouched.
Full suite: 7222 passed, 4 pre-existing failures.
* refactor(memory): remove legacy Honcho integration from core
Extracts all Honcho-specific code from run_agent.py, model_tools.py,
toolsets.py, and gateway/run.py. Honcho is now exclusively available
as a memory provider plugin (plugins/honcho-memory/).
Removed from run_agent.py (-457 lines):
- Honcho init block (session manager creation, activation, config)
- 8 Honcho methods: _honcho_should_activate, _strip_honcho_tools,
_activate_honcho, _register_honcho_exit_hook, _queue_honcho_prefetch,
_honcho_prefetch, _honcho_save_user_observation, _honcho_sync
- _inject_honcho_turn_context module-level function
- Honcho system prompt block (tool descriptions, CLI commands)
- Honcho context injection in api_messages building
- Honcho params from __init__ (honcho_session_key, honcho_manager,
honcho_config)
- HONCHO_TOOL_NAMES constant
- All honcho-specific tool dispatch forwarding
Removed from other files:
- model_tools.py: honcho_tools import, honcho params from handle_function_call
- toolsets.py: honcho toolset definition, honcho tools from core tools list
- gateway/run.py: honcho params from AIAgent constructor calls
Removed tests (-339 lines):
- 9 Honcho-specific test methods from test_run_agent.py
- TestHonchoAtexitFlush class from test_exit_cleanup_interrupt.py
Restored two regex constants (_SURROGATE_RE, _BUDGET_WARNING_RE) that
were accidentally removed during the honcho function extraction.
The honcho_integration/ package is kept intact — the plugin delegates
to it. tools/honcho_tools.py registry entries are now dead code (import
commented out in model_tools.py) but the file is preserved for reference.
Full suite: 7207 passed, 4 pre-existing failures. Zero regressions.
* refactor(memory): restructure plugins, add CLI, clean gateway, migration notice
Plugin restructure:
- Move all memory plugins from plugins/<name>-memory/ to plugins/memory/<name>/
(byterover, hindsight, holographic, honcho, mem0, openviking, retaindb)
- New plugins/memory/__init__.py discovery module that scans the directory
directly, loading providers by name without the general plugin system
- run_agent.py uses load_memory_provider() instead of get_plugin_memory_providers()
CLI wiring:
- hermes memory setup — interactive curses picker + config wizard
- hermes memory status — show active provider, config, availability
- hermes memory off — disable external provider (built-in only)
- hermes honcho — now shows migration notice pointing to hermes memory setup
Gateway cleanup:
- Remove _get_or_create_gateway_honcho (already removed in prev commit)
- Remove _shutdown_gateway_honcho and _shutdown_all_gateway_honcho methods
- Remove all calls to shutdown methods (4 call sites)
- Remove _honcho_managers/_honcho_configs dict references
Dead code removal:
- Delete tools/honcho_tools.py (279 lines, import was already commented out)
- Delete tests/gateway/test_honcho_lifecycle.py (131 lines, tested removed methods)
- Remove if False placeholder from run_agent.py
Migration:
- Honcho migration notice on startup: detects existing honcho.json or
~/.honcho/config.json, prints guidance to run hermes memory setup.
Only fires when memory.provider is not set and not in quiet mode.
Full suite: 7203 passed, 4 pre-existing failures. Zero regressions.
* feat(memory): standardize plugin config + add per-plugin documentation
Config architecture:
- Add save_config(values, hermes_home) to MemoryProvider ABC
- Honcho: writes to $HERMES_HOME/honcho.json (SDK native)
- Mem0: writes to $HERMES_HOME/mem0.json
- Hindsight: writes to $HERMES_HOME/hindsight/config.json
- Holographic: writes to config.yaml under plugins.hermes-memory-store
- OpenViking/RetainDB/ByteRover: env-var only (default no-op)
Setup wizard (hermes memory setup):
- Now calls provider.save_config() for non-secret config
- Secrets still go to .env via env vars
- Only memory.provider activation key goes to config.yaml
Documentation:
- README.md for each of the 7 providers in plugins/memory/<name>/
- Requirements, setup (wizard + manual), config reference, tools table
- Consistent format across all providers
The contract for new memory plugins:
- get_config_schema() declares all fields (REQUIRED)
- save_config() writes native config (REQUIRED if not env-var-only)
- Secrets use env_var field in schema, written to .env by wizard
- README.md in the plugin directory
* docs: add memory providers user guide + developer guide
New pages:
- user-guide/features/memory-providers.md — comprehensive guide covering
all 7 shipped providers (Honcho, OpenViking, Mem0, Hindsight,
Holographic, RetainDB, ByteRover). Each with setup, config, tools,
cost, and unique features. Includes comparison table and profile
isolation notes.
- developer-guide/memory-provider-plugin.md — how to build a new memory
provider plugin. Covers ABC, required methods, config schema,
save_config, threading contract, profile isolation, testing.
Updated pages:
- user-guide/features/memory.md — replaced Honcho section with link to
new Memory Providers page
- user-guide/features/honcho.md — replaced with migration redirect to
the new Memory Providers page
- sidebars.ts — added both new pages to navigation
* fix(memory): auto-migrate Honcho users to memory provider plugin
When honcho.json or ~/.honcho/config.json exists but memory.provider
is not set, automatically set memory.provider: honcho in config.yaml
and activate the plugin. The plugin reads the same config files, so
all data and credentials are preserved. Zero user action needed.
Persists the migration to config.yaml so it only fires once. Prints
a one-line confirmation in non-quiet mode.
* fix(memory): only auto-migrate Honcho when enabled + credentialed
Check HonchoClientConfig.enabled AND (api_key OR base_url) before
auto-migrating — not just file existence. Prevents false activation
for users who disabled Honcho, stopped using it (config lingers),
or have ~/.honcho/ from a different tool.
* feat(memory): auto-install pip dependencies during hermes memory setup
Reads pip_dependencies from plugin.yaml, checks which are missing,
installs them via pip before config walkthrough. Also shows install
guidance for external_dependencies (e.g. brv CLI for ByteRover).
Updated all 7 plugin.yaml files with pip_dependencies:
- honcho: honcho-ai
- mem0: mem0ai
- openviking: httpx
- hindsight: hindsight-client
- holographic: (none)
- retaindb: requests
- byterover: (external_dependencies for brv CLI)
* fix: remove remaining Honcho crash risks from cli.py and gateway
cli.py: removed Honcho session re-mapping block (would crash importing
deleted tools/honcho_tools.py), Honcho flush on compress, Honcho
session display on startup, Honcho shutdown on exit, honcho_session_key
AIAgent param.
gateway/run.py: removed honcho_session_key params from helper methods,
sync_honcho param, _honcho.shutdown() block.
tests: fixed test_cron_session_with_honcho_key_skipped (was passing
removed honcho_key param to _flush_memories_for_session).
* fix: include plugins/ in pyproject.toml package list
Without this, plugins/memory/ wouldn't be included in non-editable
installs. Hermes always runs from the repo checkout so this is belt-
and-suspenders, but prevents breakage if the install method changes.
* fix(memory): correct pip-to-import name mapping for dep checks
The heuristic dep.replace('-', '_') fails for packages where the pip
name differs from the import name: honcho-ai→honcho, mem0ai→mem0,
hindsight-client→hindsight_client. Added explicit mapping table so
hermes memory setup doesn't try to reinstall already-installed packages.
* chore: remove dead code from old plugin memory registration path
- hermes_cli/plugins.py: removed register_memory_provider(),
_memory_providers list, get_plugin_memory_providers() — memory
providers now use plugins/memory/ discovery, not the general plugin system
- hermes_cli/main.py: stripped 74 lines of dead honcho argparse
subparsers (setup, status, sessions, map, peer, mode, tokens,
identity, migrate) — kept only the migration redirect
- agent/memory_provider.py: updated docstring to reflect new
registration path
- tests: replaced TestPluginMemoryProviderRegistration with
TestPluginMemoryDiscovery that tests the actual plugins/memory/
discovery system. Added 3 new tests (discover, load, nonexistent).
* chore: delete dead honcho_integration/cli.py and its tests
cli.py (794 lines) was the old 'hermes honcho' command handler — nobody
calls it since cmd_honcho was replaced with a migration redirect.
Deleted tests that imported from removed code:
- tests/honcho_integration/test_cli.py (tested _resolve_api_key)
- tests/honcho_integration/test_config_isolation.py (tested CLI config paths)
- tests/tools/test_honcho_tools.py (tested the deleted tools/honcho_tools.py)
Remaining honcho_integration/ files (actively used by the plugin):
- client.py (445 lines) — config loading, SDK client creation
- session.py (991 lines) — session management, queries, flush
* refactor: move honcho_integration/ into the honcho plugin
Moves client.py (445 lines) and session.py (991 lines) from the
top-level honcho_integration/ package into plugins/memory/honcho/.
No Honcho code remains in the main codebase.
- plugins/memory/honcho/client.py — config loading, SDK client creation
- plugins/memory/honcho/session.py — session management, queries, flush
- Updated all imports: run_agent.py (auto-migration), hermes_cli/doctor.py,
plugin __init__.py, session.py cross-import, all tests
- Removed honcho_integration/ package and pyproject.toml entry
- Renamed tests/honcho_integration/ → tests/honcho_plugin/
* docs: update architecture + gateway-internals for memory provider system
- architecture.md: replaced honcho_integration/ with plugins/memory/
- gateway-internals.md: replaced Honcho-specific session routing and
flush lifecycle docs with generic memory provider interface docs
* fix: update stale mock path for resolve_active_host after honcho plugin migration
* fix(memory): address review feedback — P0 lifecycle, ABC contract, honcho CLI restore
Review feedback from Honcho devs (erosika):
P0 — Provider lifecycle:
- Remove on_session_end() + shutdown_all() from run_conversation() tail
(was killing providers after every turn in multi-turn sessions)
- Add shutdown_memory_provider() method on AIAgent for callers
- Wire shutdown into CLI atexit, reset_conversation, gateway stop/expiry
Bug fixes:
- Remove sync_honcho=False kwarg from /btw callsites (TypeError crash)
- Fix doctor.py references to dead 'hermes honcho setup' command
- Cache prefetch_all() before tool loop (was re-calling every iteration)
ABC contract hardening (all backwards-compatible):
- Add session_id kwarg to prefetch/sync_turn/queue_prefetch
- Make on_pre_compress() return str (provider insights in compression)
- Add **kwargs to on_turn_start() for runtime context
- Add on_delegation() hook for parent-side subagent observation
- Document agent_context/agent_identity/agent_workspace kwargs on
initialize() (prevents cron corruption, enables profile scoping)
- Fix docstring: single external provider, not multiple
Honcho CLI restoration:
- Add plugins/memory/honcho/cli.py (from main's honcho_integration/cli.py
with imports adapted to plugin path)
- Restore full hermes honcho command with all subcommands (status, peer,
mode, tokens, identity, enable/disable, sync, peers, --target-profile)
- Restore auto-clone on profile creation + sync on hermes update
- hermes honcho setup now redirects to hermes memory setup
* fix(memory): wire on_delegation, skip_memory for cron/flush, fix ByteRover return type
- Wire on_delegation() in delegate_tool.py — parent's memory provider
is notified with task+result after each subagent completes
- Add skip_memory=True to cron scheduler (prevents cron system prompts
from corrupting user representations — closes#4052)
- Add skip_memory=True to gateway flush agent (throwaway agent shouldn't
activate memory provider)
- Fix ByteRover on_pre_compress() return type: None -> str
* fix(honcho): port profile isolation fixes from PR #4632
Ports 5 bug fixes found during profile testing (erosika's PR #4632):
1. 3-tier config resolution — resolve_config_path() now checks
$HERMES_HOME/honcho.json → ~/.hermes/honcho.json → ~/.honcho/config.json
(non-default profiles couldn't find shared host blocks)
2. Thread host=_host_key() through from_global_config() in cmd_setup,
cmd_status, cmd_identity (--target-profile was being ignored)
3. Use bare profile name as aiPeer (not host key with dots) — Honcho's
peer ID pattern is ^[a-zA-Z0-9_-]+$, dots are invalid
4. Wrap add_peers() in try/except — was fatal on new AI peers, killed
all message uploads for the session
5. Gate Honcho clone behind --clone/--clone-all on profile create
(bare create should be blank-slate)
Also: sanitize assistant_peer_id via _sanitize_id()
* fix(tests): add module cleanup fixture to test_cli_provider_resolution
test_cli_provider_resolution._import_cli() wipes tools.*, cli, and
run_agent from sys.modules to force fresh imports, but had no cleanup.
This poisoned all subsequent tests on the same xdist worker — mocks
targeting tools.file_tools, tools.send_message_tool, etc. patched the
NEW module object while already-imported functions still referenced
the OLD one. Caused ~25 cascade failures: send_message KeyError,
process_registry FileNotFoundError, file_read_guards timeouts,
read_loop_detection file-not-found, mcp_oauth None port, and
provider_parity/codex_execution stale tool lists.
Fix: autouse fixture saves all affected modules before each test and
restores them after, matching the pattern in
test_managed_browserbase_and_modal.py.
* docs: add Configuration Options section to Slack docs
Documents all config.yaml options for the Slack bot:
- Thread & reply behavior (reply_to_mode, reply_broadcast)
- Session isolation (group_sessions_per_user)
- Mention & trigger behavior (require_mention, mention_patterns, reply_prefix)
- Unauthorized user handling (unauthorized_dm_behavior)
- Voice transcription (stt_enabled)
- Full example config showing all options together
Includes a note about Slack's hardcoded @mention requirement in channels
(no free_response_channels equivalent like Discord/Telegram).
* docs: consolidate reply_in_thread into Configuration Options section
Folds the standalone Reply Threading subsection from PR #4643 into
the Thread & Reply Behavior subsection, keeping all config options
in one place. Adds reply_in_thread to the table and full example.
Adds a Skills Hub page to the documentation site with browsable/searchable catalog of all skills (built-in, optional, and community from cached hub indexes).
- Python extraction script (website/scripts/extract-skills.py) parses SKILL.md frontmatter and hub index caches into skills.json
- React page (website/src/pages/skills/) with search, category filtering, source filtering, and expandable skill cards
- CI workflow updated to run extraction before Docusaurus build
- Deploy trigger expanded to include skills/ and optional-skills/ changes
Authored by @IAvecilla
Adds two Camofox features:
1. Persistent browser sessions: new `browser.camofox.managed_persistence`
config option. When enabled, Hermes sends a deterministic profile-scoped
userId to Camofox so the server maps it to a persistent browser profile
directory. Cookies, logins, and browser state survive across restarts.
Default remains ephemeral (random userId per session).
2. VNC URL discovery: Camofox /health endpoint returns vncPort when running
in headed mode. Hermes constructs the VNC URL and includes it in navigate
responses so the agent can share it with users.
Also fixes camofox_vision bug where call_llm response object was passed
directly to json.dumps instead of extracting .choices[0].message.content.
Changes from original PR:
- Removed browser_evaluate tool (separate feature, needs own PR)
- Removed snapshot truncation limit change (unrelated)
- Config.yaml only for managed_persistence (no env var, no version bump)
- Rewrote tests to use config mock instead of env var
- Reverted package-lock.json churn
Co-authored-by: analista <psikonetik@gmail.com.com>
* feat(file_tools): harden read_file with size guard, dedup, and device blocking
Three improvements to read_file_tool to reduce wasted context tokens and
prevent process hangs:
1. Character-count guard: reads that produce more than 100K characters
(≈25-35K tokens across tokenisers) are rejected with an error that
tells the model to use offset+limit for a smaller range. The
effective cap is min(file_size, 100K) so small files that happen to
have long lines aren't over-penalised. Large truncated files also
get a hint nudging toward targeted reads.
2. File-read deduplication: when the same (path, offset, limit) is read
a second time and the file hasn't been modified (mtime unchanged),
return a lightweight stub instead of re-sending the full content.
Writes and patches naturally change mtime, so post-edit reads always
return fresh content. The dedup cache is cleared on context
compression — after compression the original read content is
summarised away, so the model needs the full content again.
3. Device path blocking: paths like /dev/zero, /dev/random, /dev/stdin
etc. are rejected before any I/O to prevent process hangs from
infinite-output or blocking-input devices.
Tests: 17 new tests covering all three features plus the dedup-reset-
on-compression integration. All 52 file-read tests pass (35 existing +
17 new). Full tool suite (2124 tests) passes with 0 failures.
* feat: make file_read_max_chars configurable, add docs
Add file_read_max_chars to DEFAULT_CONFIG (default 100K). read_file_tool
reads this on first call and caches for the process lifetime. Users on
large-context models can raise it; users on small local models can lower it.
Also adds a 'File Read Safety' section to the configuration docs
explaining the char limit, dedup behavior, and example values.
* docs: update llama.cpp section with --jinja flag and tool calling guide
The llama.cpp docs were missing the --jinja flag which is required for
tool calling to work. Without it, models output tool calls as raw JSON
text instead of structured API responses, making Hermes unable to
execute them.
Changes:
- Add --jinja and -fa flags to the server startup example
- Replace deprecated env vars (OPENAI_BASE_URL, LLM_MODEL) with
hermes model interactive setup
- Add caution block explaining the --jinja requirement and symptoms
- List models with native tool calling support
- Add /props endpoint verification tip
* docs+feat: comprehensive local LLM provider guides and context length warning
Docs (providers.md):
- Rewrote Ollama section with context length warning (defaults to 4k on
<24GB VRAM), three methods to increase it, and verification steps
- Rewrote vLLM section with --max-model-len, tool calling flags
(--enable-auto-tool-choice, --tool-call-parser), and context guidance
- Rewrote SGLang section with --context-length, --tool-call-parser,
and warning about 128-token default max output
- Added LM Studio section (port 1234, context length defaults to 2048,
tool calling since 0.3.6)
- Added llama.cpp context length flag (-c) and GPU offload (-ngl)
- Added Troubleshooting Local Models section covering:
- Tool calls appearing as text (with per-server fix table)
- Silent context truncation and diagnosis commands
- Low detected context at startup
- Truncated responses
- Replaced all deprecated env vars (OPENAI_BASE_URL, LLM_MODEL) with
hermes model interactive setup and config.yaml examples
- Added deprecation warning for legacy env vars in General Setup
Code (cli.py):
- Added context length warning in show_banner() when detected context
is <= 8192 tokens, with server-specific fix hints:
- Ollama (port 11434): suggests OLLAMA_CONTEXT_LENGTH env var
- LM Studio (port 1234): suggests model settings adjustment
- Other servers: suggests config.yaml override
Tests:
- 9 new tests covering warning thresholds, server-specific hints,
and no-warning cases
* docs: clarify WhatsApp allowlist behavior and document WHATSAPP_ALLOW_ALL_USERS
- Add WHATSAPP_ALLOW_ALL_USERS and WHATSAPP_DEBUG to env vars reference
- Warn that * is not a wildcard and silently blocks all messages
- Show WHATSAPP_ALLOWED_USERS as optional, not required
- Update troubleshooting with the * trap and debug mode tip
- Fix Security section to mention the allow-all alternative
Prompted by a user report in Discord where WHATSAPP_ALLOWED_USERS=*
caused all incoming messages to be silently dropped at the bridge level.
* feat: support * wildcard in platform allowlists
Follow the precedent set by SIGNAL_GROUP_ALLOWED_USERS which already
supports * as an allow-all wildcard.
Bridge (allowlist.js): matchesAllowedUser() now checks for * in the
allowedUsers set before iterating sender aliases.
Gateway (run.py): _is_authorized() checks for * in allowed_ids after
parsing the allowlist. This is generic — works for all platforms, not
just WhatsApp.
Updated docs to document * as a supported value instead of warning
against it. Added WHATSAPP_ALLOW_ALL_USERS and WHATSAPP_DEBUG to
the env vars reference.
Tests: JS allowlist test + 2 Python gateway tests (WhatsApp + Telegram
to verify cross-platform behavior).
* feat(auth): add same-provider credential pools and rotation UX
Add same-provider credential pooling so Hermes can rotate across
multiple credentials for a single provider, recover from exhausted
credentials without jumping providers immediately, and configure
that behavior directly in hermes setup.
- agent/credential_pool.py: persisted per-provider credential pools
- hermes auth add/list/remove/reset CLI commands
- 429/402/401 recovery with pool rotation in run_agent.py
- Setup wizard integration for pool strategy configuration
- Auto-seeding from env vars and existing OAuth state
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
Salvaged from PR #2647
* fix(tests): prevent pool auto-seeding from host env in credential pool tests
Tests for non-pool Anthropic paths and auth remove were failing when
host env vars (ANTHROPIC_API_KEY) or file-backed OAuth credentials
were present. The pool auto-seeding picked these up, causing unexpected
pool entries in tests.
- Mock _select_pool_entry in auxiliary_client OAuth flag tests
- Clear Anthropic env vars and mock _seed_from_singletons in auth remove test
* feat(auth): add thread safety, least_used strategy, and request counting
- Add threading.Lock to CredentialPool for gateway thread safety
(concurrent requests from multiple gateway sessions could race on
pool state mutations without this)
- Add 'least_used' rotation strategy that selects the credential
with the lowest request_count, distributing load more evenly
- Add request_count field to PooledCredential for usage tracking
- Add mark_used() method to increment per-credential request counts
- Wrap select(), mark_exhausted_and_rotate(), and try_refresh_current()
with lock acquisition
- Add tests: least_used selection, mark_used counting, concurrent
thread safety (4 threads × 20 selects with no corruption)
* feat(auth): add interactive mode for bare 'hermes auth' command
When 'hermes auth' is called without a subcommand, it now launches an
interactive wizard that:
1. Shows full credential pool status across all providers
2. Offers a menu: add, remove, reset cooldowns, set strategy
3. For OAuth-capable providers (anthropic, nous, openai-codex), the
add flow explicitly asks 'API key or OAuth login?' — making it
clear that both auth types are supported for the same provider
4. Strategy picker shows all 4 options (fill_first, round_robin,
least_used, random) with the current selection marked
5. Remove flow shows entries with indices for easy selection
The subcommand paths (hermes auth add/list/remove/reset) still work
exactly as before for scripted/non-interactive use.
* fix(tests): update runtime_provider tests for config.yaml source of truth (#4165)
Tests were using OPENAI_BASE_URL env var which is no longer consulted
after #4165. Updated to use model config (provider, base_url, api_key)
which is the new single source of truth for custom endpoint URLs.
* feat(auth): support custom endpoint credential pools keyed by provider name
Custom OpenAI-compatible endpoints all share provider='custom', making
the provider-keyed pool useless. Now pools for custom endpoints are
keyed by 'custom:<normalized_name>' where the name comes from the
custom_providers config list (auto-generated from URL hostname).
- Pool key format: 'custom:together.ai', 'custom:local-(localhost:8080)'
- load_pool('custom:name') seeds from custom_providers api_key AND
model.api_key when base_url matches
- hermes auth add/list now shows custom endpoints alongside registry
providers
- _resolve_openrouter_runtime and _resolve_named_custom_runtime check
pool before falling back to single config key
- 6 new tests covering custom pool keying, seeding, and listing
* docs: add Excalidraw diagram of full credential pool flow
Comprehensive architecture diagram showing:
- Credential sources (env vars, auth.json OAuth, config.yaml, CLI)
- Pool storage and auto-seeding
- Runtime resolution paths (registry, custom, OpenRouter)
- Error recovery (429 retry-then-rotate, 402 immediate, 401 refresh)
- CLI management commands and strategy configuration
Open at: https://excalidraw.com/#json=2Ycqhqpi6f12E_3ITyiwh,c7u9jSt5BwrmiVzHGbm87g
* fix(tests): update setup wizard pool tests for unified select_provider_and_model flow
The setup wizard now delegates to select_provider_and_model() instead
of using its own prompt_choice-based provider picker. Tests needed:
- Mock select_provider_and_model as no-op (provider pre-written to config)
- Call _stub_tts BEFORE custom prompt_choice mock (it overwrites it)
- Pre-write model.provider to config so the pool step is reached
* docs: add comprehensive credential pool documentation
- New page: website/docs/user-guide/features/credential-pools.md
Full guide covering quick start, CLI commands, rotation strategies,
error recovery, custom endpoint pools, auto-discovery, thread safety,
architecture, and storage format.
- Updated fallback-providers.md to reference credential pools as the
first layer of resilience (same-provider rotation before cross-provider)
- Added hermes auth to CLI commands reference with usage examples
- Added credential_pool_strategies to configuration guide
* chore: remove excalidraw diagram from repo (external link only)
* refactor: simplify credential pool code — extract helpers, collapse extras, dedup patterns
- _load_config_safe(): replace 4 identical try/except/import blocks
- _iter_custom_providers(): shared generator for custom provider iteration
- PooledCredential.extra dict: collapse 11 round-trip-only fields
(token_type, scope, client_id, portal_base_url, obtained_at,
expires_in, agent_key_id, agent_key_expires_in, agent_key_reused,
agent_key_obtained_at, tls) into a single extra dict with
__getattr__ for backward-compatible access
- _available_entries(): shared exhaustion-check between select and peek
- Dedup anthropic OAuth seeding (hermes_pkce + claude_code identical)
- SimpleNamespace replaces class _Args boilerplate in auth_commands
- _try_resolve_from_custom_pool(): shared pool-check in runtime_provider
Net -17 lines. All 383 targeted tests pass.
---------
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
setup_model_provider() now delegates to select_provider_and_model()
from main.py, so new providers only need to be wired in main.py.
Removed setup.py from file checklists, replaced the setup.py section
with a tip explaining the automatic inheritance.
Update docs to reflect that tool progress now streams inline during
SSE responses. Previously docs said tool calls were invisible.
- api-server.md: add 'Tool progress in streams' note to streaming docs
- open-webui.md: update 'How It Works' steps, add Tool Progress tip
Major reorganization of the documentation site for better discoverability
and navigation. 94 pages across 8 top-level sections (was 5).
Structural changes:
- Promote Features from 3-level-deep subcategory to top-level section
with new Overview hub page categorizing all 26 feature pages
- Promote Messaging Platforms from User Guide subcategory to top-level
section, add platform comparison matrix (13 platforms x 7 features)
- Create new Integrations section with hub page, grouping MCP, ACP,
API Server, Honcho, Provider Routing, Fallback Providers
- Extract AI provider content (626 lines) from configuration.md into
dedicated integrations/providers.md — configuration.md drops from
1803 to 1178 lines
- Subcategorize Developer Guide into Architecture, Extending, Internals
- Rename "User Guide" to "Using Hermes" for top-level items
Orphan fixes (7 pages now reachable via sidebar):
- build-a-hermes-plugin.md added to Guides
- sms.md added to Messaging Platforms
- context-references.md added to Features > Core
- plugins.md added to Features > Core
- git-worktrees.md added to Using Hermes
- checkpoints-and-rollback.md added to Using Hermes
- checkpoints.md (30-line stub) deleted, superseded by
checkpoints-and-rollback.md (203 lines)
New files:
- integrations/index.md — Integrations hub page
- integrations/providers.md — AI provider setup (extracted)
- user-guide/features/overview.md — Features hub page
Broken link fixes:
- quickstart.md, faq.md: update context-length-detection anchors
- configuration.md: update checkpoints link
- overview.md: fix checkpoint link path
Docusaurus build verified clean (zero broken links/anchors).
Documents the Telegram webhook mode from #3880:
- New 'Webhook Mode' section in telegram.md with polling vs webhook
comparison, config table, Fly.io deployment example, troubleshooting
- Add TELEGRAM_WEBHOOK_URL/PORT/SECRET to environment-variables.md
- Add Telegram section to .env.example (existing + webhook vars)
Co-authored-by: raulbcs <raulbcs@users.noreply.github.com>
* feat(telegram): add webhook mode as alternative to polling
When TELEGRAM_WEBHOOK_URL is set, the adapter starts an HTTP webhook
server (via python-telegram-bot's start_webhook()) instead of long
polling. This enables cloud platforms like Fly.io and Railway to
auto-wake suspended machines on inbound HTTP traffic.
Polling remains the default — no behavior change unless the env var
is set.
Env vars:
TELEGRAM_WEBHOOK_URL Public HTTPS URL for Telegram to push to
TELEGRAM_WEBHOOK_PORT Local listen port (default 8443)
TELEGRAM_WEBHOOK_SECRET Secret token for update verification
Cherry-picked and adapted from PR #2022 by SHL0MS. Preserved all
current main enhancements (network error recovery, polling conflict
detection, DM topics setup).
Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com>
* fix: send_document call in background task delivery + vision download timeout
Two fixes salvaged from PR #2269 by amethystani:
1. gateway/run.py: adapter.send_file() → adapter.send_document()
send_file() doesn't exist on BasePlatformAdapter. Background task
media files were silently never delivered (AttributeError swallowed
by except Exception: pass).
2. tools/vision_tools.py: configurable image download timeout via
HERMES_VISION_DOWNLOAD_TIMEOUT env var (default 30s), plus guard
against raise None when max_retries=0.
The third fix in #2269 (opencode-go auth config) was already resolved
on main.
Co-authored-by: amethystani <amethystani@users.noreply.github.com>
* docs: expand terminal backends section + fix feishu MDX build error
---------
Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com>
Co-authored-by: amethystani <amethystani@users.noreply.github.com>
* feat(telegram): add webhook mode as alternative to polling
When TELEGRAM_WEBHOOK_URL is set, the adapter starts an HTTP webhook
server (via python-telegram-bot's start_webhook()) instead of long
polling. This enables cloud platforms like Fly.io and Railway to
auto-wake suspended machines on inbound HTTP traffic.
Polling remains the default — no behavior change unless the env var
is set.
Env vars:
TELEGRAM_WEBHOOK_URL Public HTTPS URL for Telegram to push to
TELEGRAM_WEBHOOK_PORT Local listen port (default 8443)
TELEGRAM_WEBHOOK_SECRET Secret token for update verification
Cherry-picked and adapted from PR #2022 by SHL0MS. Preserved all
current main enhancements (network error recovery, polling conflict
detection, DM topics setup).
Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com>
* fix: send_document call in background task delivery + vision download timeout
Two fixes salvaged from PR #2269 by amethystani:
1. gateway/run.py: adapter.send_file() → adapter.send_document()
send_file() doesn't exist on BasePlatformAdapter. Background task
media files were silently never delivered (AttributeError swallowed
by except Exception: pass).
2. tools/vision_tools.py: configurable image download timeout via
HERMES_VISION_DOWNLOAD_TIMEOUT env var (default 30s), plus guard
against raise None when max_retries=0.
The third fix in #2269 (opencode-go auth config) was already resolved
on main.
Co-authored-by: amethystani <amethystani@users.noreply.github.com>
---------
Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com>
Co-authored-by: amethystani <amethystani@users.noreply.github.com>
Adds Discord-style mention gating for Telegram groups:
- telegram.require_mention: gate group messages (default: false)
- telegram.mention_patterns: regex wake-word triggers
- telegram.free_response_chats: bypass gating for specific chats
When require_mention is enabled, group messages are accepted only for:
- slash commands
- replies to the bot
- @botusername mentions
- regex wake-word pattern matches
DMs remain unrestricted. @mention text is stripped before passing to
the agent. Invalid regex patterns are ignored with a warning.
Config bridges follow the existing Discord pattern (yaml → env vars).
Cherry-picked and adapted from PR #1977 by mcleay. Fixed ChatType
comparison to work without python-telegram-bot installed (uses string
matching instead of enum, consistent with other entity_type checks).
Co-authored-by: mcleay <mcleay@users.noreply.github.com>
Adds WeCom as a gateway platform adapter using the AI Bot WebSocket
gateway for real-time bidirectional communication. No public endpoint
or new pip dependencies needed (uses existing aiohttp + httpx).
Features:
- WebSocket persistent connection with auto-reconnect (exponential backoff)
- DM and group messaging with configurable access policies
- Media upload/download with AES decryption for encrypted attachments
- Markdown rendering, quote context preservation
- Proactive + passive reply message modes
- Chunked media upload pipeline (512KB chunks)
Cherry-picked from PR #1898 by EvilRan with:
- Moved to current main (PR was 300 commits behind)
- Skipped base.py regressions (reply_to additions are good but belong
in a separate PR since they affect all platforms)
- Fixed test assertions to match current base class send() signature
(reply_to=None kwarg now explicit)
- All 16 integration points added surgically to current main
- No new pip dependencies (aiohttp + httpx already installed)
Fixes#1898
Co-authored-by: EvilRan <EvilRan@users.noreply.github.com>
The existing docs were two lines. The migration script handles 35
categories of data across persona, memory, skills, messaging platforms,
model providers, MCP servers, agent config, and more.
New docs cover:
- All CLI options (--dry-run, --preset, --overwrite, --migrate-secrets,
--source, --workspace-target, --skill-conflict, --yes)
- 27 directly-imported categories with source → destination mapping
- 7 archived categories with manual recreation guidance
- Security notes on API key allowlisting
- Usage examples for common migration scenarios
Adds Feishu (ByteDance's enterprise messaging platform) as a gateway
platform adapter with full feature parity: WebSocket + webhook transports,
message batching, dedup, rate limiting, rich post/card content parsing,
media handling (images/audio/files/video), group @mention gating,
reaction routing, and interactive card button support.
Cherry-picked from PR #1793 by penwyp with:
- Moved to current main (PR was 458 commits behind)
- Fixed _send_with_retry shadowing BasePlatformAdapter method (renamed to
_feishu_send_with_retry to avoid signature mismatch crash)
- Fixed import structure: aiohttp/websockets imported independently of
lark_oapi so they remain available when SDK is missing
- Fixed get_hermes_home import (hermes_constants, not hermes_cli.config)
- Added skip decorators for tests requiring lark_oapi SDK
- All 16 integration points added surgically to current main
New dependency: lark-oapi>=1.5.3,<2 (optional, pip install hermes-agent[feishu])
Fixes#1788
Co-authored-by: penwyp <penwyp@users.noreply.github.com>
hermes mcp serve starts a stdio MCP server that lets any MCP client
(Claude Code, Cursor, Codex, etc.) interact with Hermes conversations.
Matches OpenClaw's 9-tool channel bridge surface:
Tools exposed:
- conversations_list: list active sessions across all platforms
- conversation_get: details on one conversation
- messages_read: read message history
- attachments_fetch: extract non-text content from messages
- events_poll: poll for new events since a cursor
- events_wait: long-poll / block until next event (near-real-time)
- messages_send: send to any platform via send_message_tool
- channels_list: browse available messaging targets
- permissions_list_open: list pending approval requests
- permissions_respond: allow/deny approvals
Architecture:
- EventBridge: background thread polls SessionDB for new messages,
maintains in-memory event queue with waiter support
- Reads sessions.json + SessionDB directly (no gateway dep for reads)
- Reuses send_message_tool for sending (same platform adapters)
- FastMCP server with stdio transport
- Zero new dependencies (uses existing mcp>=1.2.0 optional dep)
Files:
- mcp_serve.py: MCP server + EventBridge (~600 lines)
- hermes_cli/main.py: added serve sub-parser to hermes mcp
- hermes_cli/mcp_config.py: route serve action to run_mcp_server
- tests/test_mcp_serve.py: 53 tests
- docs: updated MCP page + CLI commands reference
The docs incorrectly showed aliases as 'hermes-work' when the actual
implementation creates 'work' (profile name directly, no prefix).
Rewrote the user guide to lead with the alias pattern:
hermes profile create coder → coder chat, coder setup, etc.
Also clarified that the banner shows 'Profile: coder' and the prompt
shows 'coder ❯' when a non-default profile is active.
Fixed alias paths in command reference (hermes-work → work).
The gateway now ships with a built-in boot-md hook that checks for
~/.hermes/BOOT.md on every startup. If the file exists, the agent
executes its instructions in a background thread. No installation
or configuration needed — just create the file.
No BOOT.md = zero overhead (the hook silently returns).
Implementation:
- gateway/builtin_hooks/boot_md.py: handler with boot prompt,
background thread, [SILENT] suppression, error handling
- gateway/hooks.py: _register_builtin_hooks() called at the start
of discover_and_load() to wire in built-in hooks
- Docs updated: hooks page documents BOOT.md as a built-in feature
* fix(discord): clean up deferred "thinking..." after slash commands complete
After a slash command is deferred (interaction.response.defer), the
"thinking..." indicator persisted indefinitely because the code used
followup.send() which creates a separate message instead of replacing
or removing the deferred response.
Fix: use edit_original_response() to replace "thinking..." with the
confirmation text when provided, or delete_original_response() to
remove it when there is no confirmation. Also consolidated /reasoning
and /voice handlers to use _run_simple_slash instead of duplicating
the defer+dispatch pattern.
Fixes#3595.
* docs: update skills catalog — add red-teaming category and all 16 optional skills
The skills catalog was missing:
- red-teaming category with the godmode jailbreaking skill
- The entire optional skills section (16 skills across 10 categories)
Added both with descriptions sourced from each SKILL.md frontmatter.
Verified against the actual skills/ and optional-skills/ directories.
Add skills.external_dirs config option — a list of additional directories
to scan for skills alongside ~/.hermes/skills/. External dirs are read-only:
skill creation/editing always writes to the local dir. Local skills take
precedence when names collide.
This lets users share skills across tools/agents without copying them into
Hermes's own directory (e.g. ~/.agents/skills, /shared/team-skills).
Changes:
- agent/skill_utils.py: add get_external_skills_dirs() and get_all_skills_dirs()
- agent/prompt_builder.py: scan external dirs in build_skills_system_prompt()
- tools/skills_tool.py: _find_all_skills() and skill_view() search external dirs;
security check recognizes configured external dirs as trusted
- agent/skill_commands.py: /skill slash commands discover external skills
- hermes_cli/config.py: add skills.external_dirs to DEFAULT_CONFIG
- cli-config.yaml.example: document the option
- tests/agent/test_external_skills.py: 11 tests covering discovery, precedence,
deduplication, and skill_view for external skills
Requested by community member primco.
Three docs pages updated:
- security.md: New 'Credential File Passthrough' section, updated
sandbox filter table to include Docker/Modal rows, added info box
about Docker env_passthrough merge
- creating-skills.md: New 'Credential File Requirements' section
with frontmatter examples and guidance on when to use env vars
vs credential files
- environment-variables.md: Updated TERMINAL_DOCKER_FORWARD_ENV
description to note auto-passthrough from skills
parallel-cli is a paid third-party vendor skill that requires
PARALLEL_API_KEY, but it was shipped in the default skills/ directory
with no env-var gate. This caused it to appear in every user's system
prompt even when they have no Parallel account or API key.
Move it to optional-skills/ so it is only visible through the Skills
Hub and must be explicitly installed. Also remove it from the default
skills catalog docs.
Adds a complete Docker packaging for Hermes Agent:
- Dockerfile based on debian:13.4 with all deps
- Entrypoint that bootstraps .env, config.yaml, SOUL.md on first run
- CI workflow to build, test, and push to DockerHub
- Documentation for interactive, gateway, and upgrade workflows
Closes#850, #913.
Changes vs original PR:
- Removed pre-created legacy cache/platform dirs from entrypoint
(image_cache, audio_cache, pairing, whatsapp/session) — these are
now created on demand by the application using the consolidated
layout from get_hermes_dir()
- Moved docs from docs/docker.md to website/docs/user-guide/docker.md
and added to Docusaurus sidebar
Co-authored-by: benbarclay <benbarclay@users.noreply.github.com>
Adds MATTERMOST_REQUIRE_MENTION and MATTERMOST_FREE_RESPONSE_CHANNELS
env vars, matching Discord's existing mention gating pattern.
- MATTERMOST_REQUIRE_MENTION=false: respond to all channel messages
- MATTERMOST_FREE_RESPONSE_CHANNELS=id1,id2: specific channels where
bot responds without @mention even when require_mention is true
- DMs always respond regardless of mention settings
- @mention is now stripped from message text (clean agent input)
7 new tests for mention gating, free-response channels, DM bypass,
and mention stripping. Updated existing test for mention stripping.
Docs: updated mattermost.md with Mention Behavior section,
environment-variables.md with new vars, config.py with metadata.
Users intuitively write model: { model: my-model } instead of
model: { default: my-model } and it silently falls back to the
hardcoded default. Now both spellings work across all three config
consumers: runtime_provider, CLI, and gateway.
Co-authored-by: ygd58 <ygd58@users.noreply.github.com>
Adds 'hermes webhook' CLI subcommand and a skill — zero new model tools.
CLI commands (require webhook platform to be enabled):
hermes webhook subscribe <name> [--events, --prompt, --deliver, ...]
hermes webhook list
hermes webhook remove <name>
hermes webhook test <name>
All commands gate on webhook platform being enabled in config. If not
configured, prints setup instructions (gateway setup wizard, manual
config.yaml, or env vars).
The agent uses these via terminal tool, guided by the webhook-subscriptions
skill which documents setup, common patterns (GitHub, Stripe, CI/CD,
monitoring), prompt template syntax, security, and troubleshooting.
Adapter enhancement: webhook.py hot-reloads dynamic subscriptions from
~/.hermes/webhook_subscriptions.json on each incoming request (mtime-gated).
Static config.yaml routes always take precedence.
Docs: updated webhooks.md with Dynamic Subscriptions section, added
hermes webhook to cli-commands.md reference.
No new model tools. No toolset changes.
24 new tests for CLI CRUD, persistence, enabled-gate, and adapter
dynamic route loading.
Salvage of PR #2173 (hanai) and PR #3432 (timknip).
Injects PATH, VIRTUAL_ENV, and HERMES_HOME into the macOS launchd plist so gateway subprocesses find user-installed tools (node, ffmpeg, etc.). Matches systemd unit parity with venv/bin, node_modules/.bin, and resolved node dir in PATH. Includes 7 new tests and docs updates across 4 pages.
Co-Authored-By: Han <ihanai1991@gmail.com>
Co-Authored-By: timknip <timknip@users.noreply.github.com>
Without enabling the Messages Tab in App Home settings, users see
"Sending messages to this app has been turned off" when trying to DM
the bot — even with all correct scopes and event subscriptions.
Add Step 5 (Enable the Messages Tab) between Event Subscriptions and
Install App, with a danger admonition. Also add troubleshooting entry
for this specific error message. Renumber subsequent steps (6→7→8→9).
Co-authored-by: Alberto Leal <mail4alberto@gmail.com>
The documentation incorrectly instructed users to set Public Bot to OFF,
but this prevents using the Discord-provided invite link (recommended method),
causing the error: 'Private application cannot have a default authorization link'.
Changes:
- Changed Step 2: Public Bot now set to ON (required for Installation tab method)
- Added info callout explaining the Private Bot alternative (use Manual URL)
- Added note in Step 5 Option A clarifying the Public Bot requirement
Fixes Discord bot setup flow for new users following the recommended path.
Co-authored-by: Docs Fix <docs-fix@example.com>
Drop the swe-rex dependency for Modal terminal backend and use the
Modal SDK directly (Sandbox.create + Sandbox.exec). This fixes:
- AsyncUsageWarning from synchronous App.lookup() in async context
- DeprecationError from unencrypted_ports / .url on unencrypted tunnels
(deprecated 2026-03-05)
The new implementation:
- Uses modal.App.lookup.aio() for async-safe app creation
- Uses Sandbox.create.aio() with 'sleep infinity' entrypoint
- Uses Sandbox.exec.aio() for direct command execution (no HTTP server
or tunnel needed)
- Keeps all existing features: persistent filesystem snapshots,
configurable resources (CPU/memory/disk), sudo support, interrupt
handling, _AsyncWorker for event loop safety
Consistent with the Docker backend precedent (PR #2804) where we
removed mini-swe-agent in favor of direct docker run.
Files changed:
- tools/environments/modal.py - core rewrite
- tools/terminal_tool.py - health check: modal instead of swerex
- hermes_cli/setup.py - install modal instead of swe-rex[modal]
- pyproject.toml - modal extra: modal>=1.0.0 instead of swe-rex[modal]
- scripts/kill_modal.sh - grep for hermes-agent instead of swe-rex
- tests/ - updated for new implementation
- environments/README.md - updated patches section
- website/docs - updated install command
The plugin system defined six lifecycle hooks but only pre_tool_call and
post_tool_call were invoked. This activates the remaining four so that
external plugins (e.g. memory systems) can hook into the conversation
loop without touching core code.
Hook semantics:
- on_session_start: fires once when a new session is created
- pre_llm_call: fires once per turn before the tool-calling loop;
plugins can return {"context": "..."} to inject into the ephemeral
system prompt (not cached, not persisted)
- post_llm_call: fires once per turn after the loop completes, with
user_message and assistant_response for sync/storage
- on_session_end: fires at the end of every run_conversation call
invoke_hook() now returns a list of non-None callback return values,
enabling pre_llm_call context injection while remaining backward
compatible (existing hooks that return None are unaffected).
Salvaged from PR #2823.
Co-authored-by: Nicolò Boschi <boschi1997@gmail.com>
Salvage of PR #1747 (original PR #1171 by @davanstrien) onto current main.
Registers Hugging Face Inference Providers (router.huggingface.co/v1) as a named provider:
- hermes chat --provider huggingface (or --provider hf)
- 18 curated open models via hermes model picker
- HF_TOKEN in ~/.hermes/.env
- OpenAI-compatible endpoint with automatic failover (Groq, Together, SambaNova, etc.)
Files: auth.py, models.py, main.py, setup.py, config.py, model_metadata.py, .env.example, 5 docs pages, 17 new tests.
Co-authored-by: Daniel van Strien <davanstrien@gmail.com>
- add managed modal and gateway-backed tool integrations\n- improve CLI setup, auth, and configuration for subscriber flows\n- expand tests and docs for managed tool support
* feat: config-gated /verbose command for messaging gateway
Add gateway_config_gate field to CommandDef, allowing cli_only commands
to be conditionally available in the gateway based on a config value.
- CommandDef gains gateway_config_gate: str | None — a config dotpath
that, when truthy, overrides cli_only for gateway surfaces
- /verbose uses gateway_config_gate='display.tool_progress_command'
- Default is off (cli_only behavior preserved)
- When enabled, /verbose cycles tool_progress mode (off/new/all/verbose)
in the gateway, saving to config.yaml — same cycle as the CLI
- Gateway helpers (help, telegram menus, slack mapping) dynamically
check config to include/exclude config-gated commands
- GATEWAY_KNOWN_COMMANDS always includes config-gated commands so
the gateway recognizes them and can respond appropriately
- Handles YAML 1.1 bool coercion (bare 'off' parses as False)
- 8 new tests for the config gate mechanism + gateway handler
* docs: document gateway_config_gate and /verbose messaging support
- AGENTS.md: add gateway_config_gate to CommandDef fields
- slash-commands.md: note /verbose can be enabled for messaging, update Notes
- configuration.md: add tool_progress_command to display section + usage note
- cli.md: cross-link to config docs for messaging enablement
- messaging/index.md: show tool_progress_command in config snippet
- plugins.md: add gateway_config_gate to register_command parameter table
Salvages PR #3005 by web3blind. Cherry-picked onto current main with functional skill binding and docs added.
- DM topic creation via createForumTopic (Bot API 9.4, Feb 2026)
- Config-driven topics with thread_id persistence across restarts
- Session isolation via existing build_session_key thread_id support
- auto_skill field on MessageEvent for topic-skill bindings
- Gateway auto-loads bound skill on new sessions (same as /skill commands)
- Docs: full Private Chat Topics section in Telegram messaging guide
- 20 tests (17 original + 3 for auto_skill)
Closes#2598
Co-authored-by: web3blind <web3blind@users.noreply.github.com>
* feat: nix flake, uv2nix build, dev shell and home manager
* fixed nix run, updated docs for setup
* feat(nix): NixOS module with persistent container mode, managed guards, checks
- Replace homeModules.nix with nixosModules.nix (two deployment modes)
- Mode A (native): hardened systemd service with ProtectSystem=strict
- Mode B (container): persistent Ubuntu container with /nix/store bind-mount,
identity-hash-based recreation, GC root protection, symlink-based updates
- Add HERMES_MANAGED guards blocking CLI config mutation (config set, setup,
gateway install/uninstall) when running under NixOS module
- Add nix/checks.nix with build-time verification (binary, CLI, managed guard)
- Remove container.nix (no Nix-built OCI image; pulls ubuntu:24.04 at runtime)
- Simplify packages.nix (drop fetchFromGitHub submodules, PYTHONPATH wrappers)
- Rewrite docs/nixos-setup.md with full options reference, container
architecture, secrets management, and troubleshooting guide
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Update config.py
* feat(nix): add CI workflow and enhanced build checks
- GitHub Actions workflow for nix flake check + build on linux/macOS
- Entry point sync check to catch pyproject.toml drift
- Expanded managed-guard check to cover config edit
- Wrap hermes-acp binary in Nix package
- Fix Path type mismatch in is_managed()
* Update MCP server package name; bundled skills support
* fix reading .env. instead have container user a common mounted .env file
* feat(nix): container entrypoint with privilege drop and sudo provisioning
Container was running as non-root via --user, which broke apt/pip installs
and caused crashes when $HOME didn't exist. Replace --user with a Nix-built
entrypoint script that provisions the hermes user, sudo (NOPASSWD), and
/home/hermes inside the container on first boot, then drops privileges via
setpriv. Writable layer persists so setup only runs once.
Also expands MCP server options to support HTTP transport and sampling.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix group and user creation in container mode
* feat(nix): persistent /home/hermes and MESSAGING_CWD in container mode
Container mode now bind-mounts ${stateDir}/home to /home/hermes so the
agent's home directory survives container recreation. Previously it lived
in the writable layer and was lost on image/volume/options changes.
Also passes MESSAGING_CWD to the container so the agent finds its
workspace and documents, matching native mode behavior.
Other changes:
- Extract containerDataDir/containerHomeDir bindings (no more magic strings)
- Fix entrypoint chown to run unconditionally (volume mounts always exist)
- Add schema field to container identity hash for auto-recreation
- Add idempotency test (Scenario G) to config-roundtrip check
* docs: add Nix & NixOS setup guide to docs site
Add comprehensive Nix documentation to the Docusaurus site at
website/docs/getting-started/nix-setup.md, covering nix run/profile
install, NixOS module (native + container modes), declarative settings,
secrets management, MCP servers, managed mode, container architecture,
dev shell, flake checks, and full options reference.
- Register nix-setup in sidebar after installation page
- Add Nix callout tip to installation.md linking to new guide
- Add canonical version pointer in docs/nixos-setup.md
* docs: remove docs/nixos-setup.md, consolidate into website docs
Backfill missing details (restart/restartSec in full example,
gateway.pid, 0750 permissions, docker inspect commands) into
the canonical website/docs/getting-started/nix-setup.md and
delete the old standalone file.
* fix(nix): add compression.protect_last_n and target_ratio to config-keys.json
New keys were added to DEFAULT_CONFIG on main, causing the
config-drift check to fail in CI.
* fix(nix): skip checks on aarch64-darwin (onnxruntime wheel missing)
The full Python venv includes onnxruntime (via faster-whisper/STT)
which lacks a compatible uv2nix wheel on aarch64-darwin. Gate all
checks behind stdenv.hostPlatform.isLinux. The package and devShell
still evaluate on macOS.
* fix(nix): skip flake check and build on macOS CI
onnxruntime (transitive dep via faster-whisper) lacks a compatible
uv2nix wheel on aarch64-darwin. Run full checks and build on Linux
only; macOS CI verifies the flake evaluates without building.
* fix(nix): preserve container writable layer across nixos-rebuild
The container identity hash included the entrypoint's Nix store path,
which changes on every nixpkgs update (due to runtimeShell/stdenv
input-addressing). This caused false-positive identity mismatches,
triggering container recreation and losing the persistent writable layer.
- Use stable symlink (current-entrypoint) like current-package already does
- Remove entrypoint from identity hash (only image/volumes/options matter)
- Add GC root for entrypoint so nix-collect-garbage doesn't break it
- Remove global HERMES_HOME env var from addToSystemPackages (conflicted
with interactive CLI use, service already sets its own)
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The hooks page only documented gateway event hooks (HOOK.yaml system).
The plugins page listed plugin hooks (pre_tool_call, etc.) that weren't
referenced from the hooks page, which was confusing.
Changes:
- hooks.md: Add overview table showing both hook systems
- hooks.md: Add Plugin Hooks section with available hooks, callback
signatures, and example
- hooks.md: Add missing session:end gateway event (emitted but undocumented)
- hooks.md: Mark pre_llm_call, post_llm_call, on_session_start,
on_session_end as planned (defined in VALID_HOOKS but not yet invoked)
- hooks.md: Update info box to cross-reference plugin hooks
- hooks.md: Fix heading hierarchy (gateway content as subsections)
- plugins.md: Add cross-reference to hooks page for full details
- plugins.md: Mark planned hooks as (planned)
zsh interprets square brackets as glob patterns, so
`pip install hermes-agent[voice]` fails with 'no matches found'.
Quote all pip install commands with extras across 5 docs pages (12 instances).
Reported by OFumik0OP.
* docs: update all docs for /model command overhaul and custom provider support
Documents the full /model command overhaul across 6 files:
AGENTS.md:
- Add model_switch.py to project structure tree
configuration.md:
- Rewrite General Setup with 3 config methods (interactive, config.yaml, env vars)
- Add new 'Switching Models with /model' section documenting all syntax variants
- Add 'Named Custom Providers' section with config.yaml examples and
custom:name:model triple syntax
slash-commands.md:
- Update /model descriptions in both CLI and messaging tables with
full syntax examples (provider:model, custom:model, custom:name:model,
bare custom auto-detect)
cli-commands.md:
- Add /model slash command subsection under hermes model with syntax table
- Add custom endpoint config to hermes model use cases
faq.md:
- Add config.yaml example for offline/local model setup
- Note that provider: custom is a first-class provider
- Document /model custom auto-detect
provider-runtime.md:
- Add model_switch.py to implementation file list
- Update provider families to show Custom as first-class with named variants
* docs: fix api-server response storage description — SQLite, not in-memory
The ResponseStore class uses SQLite persistence (with in-memory
fallback), not pure in-memory storage. Responses survive gateway
restarts.
New documentation for features that existed in code but had no docs:
New page:
- context-references.md: Full docs for @-syntax inline context
injection (@file:, @folder:, @diff, @staged, @git:, @url:) with
line ranges, CLI autocomplete, size limits, sensitive path blocking,
and error handling
configuration.md additions:
- Environment variable substitution: ${VAR_NAME} syntax in config.yaml
with expansion, fallback, and multi-reference support
- Gateway streaming: Progressive token delivery on messaging platforms
via message editing (StreamingConfig: enabled, transport, edit_interval,
buffer_threshold, cursor) with platform support matrix
- Web search backends: Three providers (Firecrawl, Parallel, Tavily)
with web.backend config key, capability matrix, auto-detection from
API keys, self-hosted Firecrawl, and Parallel search modes
security.md additions:
- SSRF protection: Always-on URL validation blocking private networks,
loopback, link-local, CGNAT, cloud metadata hostnames, with
fail-closed DNS and redirect chain re-validation
- Tirith pre-exec security scanning: Content-level command scanning
for homograph URLs, pipe-to-interpreter, terminal injection with
auto-install, SHA-256/cosign verification, config options, and
fail-open/fail-closed modes
sessions.md addition:
- Auto-generated session titles: Background LLM-powered title
generation after first exchange
creating-skills.md additions:
- Conditional skill activation: requires_toolsets, requires_tools,
fallback_for_toolsets, fallback_for_tools frontmatter fields with
matching logic and use cases
- Environment variable requirements: required_environment_variables
frontmatter for automatic env passthrough to sandboxed execution,
plus terminal.env_passthrough user config
* feat: env var passthrough for skills and user config
Skills that declare required_environment_variables now have those vars
passed through to sandboxed execution environments (execute_code and
terminal). Previously, execute_code stripped all vars containing KEY,
TOKEN, SECRET, etc. and the terminal blocklist removed Hermes
infrastructure vars — both blocked skill-declared env vars.
Two passthrough sources:
1. Skill-scoped (automatic): when a skill is loaded via skill_view and
declares required_environment_variables, vars that are present in
the environment are registered in a session-scoped passthrough set.
2. Config-based (manual): terminal.env_passthrough in config.yaml lets
users explicitly allowlist vars for non-skill use cases.
Changes:
- New module: tools/env_passthrough.py — shared passthrough registry
- hermes_cli/config.py: add terminal.env_passthrough to DEFAULT_CONFIG
- tools/skills_tool.py: register available skill env vars on load
- tools/code_execution_tool.py: check passthrough before filtering
- tools/environments/local.py: check passthrough in _sanitize_subprocess_env
and _make_run_env
- 19 new tests covering all layers
* docs: add environment variable passthrough documentation
Document the env var passthrough feature across four docs pages:
- security.md: new 'Environment Variable Passthrough' section with
full explanation, comparison table, and security considerations
- code-execution.md: update security section, add passthrough subsection,
fix comparison table
- creating-skills.md: add tip about automatic sandbox passthrough
- skills.md: add note about passthrough after secure setup docs
Live-tested: launched interactive CLI, loaded a skill with
required_environment_variables, verified TEST_SKILL_SECRET_KEY was
accessible inside execute_code sandbox (value: passthrough-test-value-42).
Complete cleanup after dropping the mini-swe-agent submodule (PR #2804):
- Remove MSWEA_SILENT_STARTUP and MSWEA_GLOBAL_CONFIG_DIR env var
settings from cli.py, run_agent.py, hermes_cli/main.py, doctor.py
- Remove mini-swe-agent health check from hermes doctor
- Remove 'minisweagent' from logger suppression lists
- Remove litellm/typer/platformdirs from requirements.txt
- Remove mini-swe-agent install steps from install.ps1 (Windows)
- Remove mini-swe-agent install steps from website docs
- Update all stale comments/docstrings referencing mini-swe-agent
in terminal_tool.py, tools/__init__.py, code_execution_tool.py,
environments/README.md, environments/agent_loop.py
- Remove mini_swe_runner from pyproject.toml py-modules
(still exists as standalone script for RL training use)
- Shrink test_minisweagent_path.py to empty stub
The orphaned mini-swe-agent/ directory on disk needs manual removal:
rm -rf mini-swe-agent/
Documents the full /model command overhaul across 6 files:
AGENTS.md:
- Add model_switch.py to project structure tree
configuration.md:
- Rewrite General Setup with 3 config methods (interactive, config.yaml, env vars)
- Add new 'Switching Models with /model' section documenting all syntax variants
- Add 'Named Custom Providers' section with config.yaml examples and
custom:name:model triple syntax
slash-commands.md:
- Update /model descriptions in both CLI and messaging tables with
full syntax examples (provider:model, custom:model, custom:name:model,
bare custom auto-detect)
cli-commands.md:
- Add /model slash command subsection under hermes model with syntax table
- Add custom endpoint config to hermes model use cases
faq.md:
- Add config.yaml example for offline/local model setup
- Note that provider: custom is a first-class provider
- Document /model custom auto-detect
provider-runtime.md:
- Add model_switch.py to implementation file list
- Update provider families to show Custom as first-class with named variants
Reads auxiliary.vision.timeout from config.yaml (default: 30s) and
passes it to async_call_llm. Useful for slow local vision models
that need more than 30 seconds.
Setting is in config.yaml (not .env) since it's not a secret:
auxiliary:
vision:
timeout: 120
Based on PR #2306.
Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
CI enforces ascii-guard linting on docs. Replaced ASCII box diagrams
with Mermaid flowcharts (open-webui architecture) and numbered lists
(CLI layout). Added diagram linting note to website README.
Based on PR #2364 by aydnOktay (closed — README had broken formatting).
Based on PR #1749 by @erosika (reimplemented on current main).
Extracts three protected methods from run() so wrapper CLIs can extend
the TUI without overriding the entire method:
- _get_extra_tui_widgets(): inject widgets between spacer and status bar
- _register_extra_tui_keybindings(kb, input_area): add keybindings
- _build_tui_layout_children(**widgets): full control over ordering
Default implementations reproduce existing layout exactly. The inline
HSplit in run() now delegates to _build_tui_layout_children().
5 tests covering defaults, widget insertion position, and keybinding
registration.
Users reported that the bot fails to resolve usernames without the
Server Members privileged intent enabled. Updated the setup docs
to mark it as Required instead of Optional.
Feedback from Blangs [MADD].
- quickstart.md: mention context length prompt for custom endpoints,
link to configuration docs, add Ollama to provider table
- faq.md: rewrite local models section with hermes model flow and
context length prompt example, add Ollama num_ctx tip, expand
context-length-exceeded troubleshooting with detection override
options and config.yaml examples
Co-authored-by: Test <test@test.com>
Replace the fragile hardcoded context length system with a multi-source
resolution chain that correctly identifies context windows per provider.
Key changes:
- New agent/models_dev.py: Fetches and caches the models.dev registry
(3800+ models across 100+ providers with per-provider context windows).
In-memory cache (1hr TTL) + disk cache for cold starts.
- Rewritten get_model_context_length() resolution chain:
0. Config override (model.context_length)
1. Custom providers per-model context_length
2. Persistent disk cache
3. Endpoint /models (local servers)
4. Anthropic /v1/models API (max_input_tokens, API-key only)
5. OpenRouter live API (existing, unchanged)
6. Nous suffix-match via OpenRouter (dot/dash normalization)
7. models.dev registry lookup (provider-aware)
8. Thin hardcoded defaults (broad family patterns)
9. 128K fallback (was 2M)
- Provider-aware context: same model now correctly resolves to different
context windows per provider (e.g. claude-opus-4.6: 1M on Anthropic,
128K on GitHub Copilot). Provider name flows through ContextCompressor.
- DEFAULT_CONTEXT_LENGTHS shrunk from 80+ entries to ~16 broad patterns.
models.dev replaces the per-model hardcoding.
- CONTEXT_PROBE_TIERS changed from [2M, 1M, 512K, 200K, 128K, 64K, 32K]
to [128K, 64K, 32K, 16K, 8K]. Unknown models no longer start at 2M.
- hermes model: prompts for context_length when configuring custom
endpoints. Supports shorthand (32k, 128K). Saved to custom_providers
per-model config.
- custom_providers schema extended with optional models dict for
per-model context_length (backward compatible).
- Nous Portal: suffix-matches bare IDs (claude-opus-4-6) against
OpenRouter's prefixed IDs (anthropic/claude-opus-4.6) with dot/dash
normalization. Handles all 15 current Nous models.
- Anthropic direct: queries /v1/models for max_input_tokens. Only works
with regular API keys (sk-ant-api*), not OAuth tokens. Falls through
to models.dev for OAuth users.
Tests: 5574 passed (18 new tests for models_dev + updated probe tiers)
Docs: Updated configuration.md context length section, AGENTS.md
Co-authored-by: Test <test@test.com>
The install script creates venv/ but several docs referenced .venv/,
causing agents to fail with 'No such file or directory' when following
AGENTS.md instructions.
Fixes#2066
Previously, Tab only handled dropdown completions. Users seeing gray
ghost text from history-based suggestions had no way to accept them
with Tab - they had to use Right arrow or Ctrl+E.
Now Tab follows priority:
1. Completion menu open → accept selected completion
2. Ghost text suggestion available → accept auto-suggestion
3. Otherwise → start completion menu
This matches user intuition that Tab should 'complete what I see.'
Allow users to configure a custom base_url for the OpenAI TTS provider
in ~/.hermes/config.yaml under tts.openai.base_url. Defaults to the
official OpenAI endpoint. Enables use of self-hosted or OpenAI-compatible
TTS services (e.g. http://localhost:8000/v1).
Also adds a TTS configuration example block to cli-config.yaml.example.
* fix: detect context length for custom model endpoints via fuzzy matching + config override
Custom model endpoints (non-OpenRouter, non-known-provider) were silently
falling back to 2M tokens when the model name didn't exactly match what the
endpoint's /v1/models reported. This happened because:
1. Endpoint metadata lookup used exact match only — model name mismatches
(e.g. 'qwen3.5:9b' vs 'Qwen3.5-9B-Q4_K_M.gguf') caused a miss
2. Single-model servers (common for local inference) required exact name
match even though only one model was loaded
3. No user escape hatch to manually set context length
Changes:
- Add fuzzy matching for endpoint model metadata: single-model servers
use the only available model regardless of name; multi-model servers
try substring matching in both directions
- Add model.context_length config override (highest priority) so users
can explicitly set their model's context length in config.yaml
- Log an informative message when falling back to 2M probe, telling
users about the config override option
- Thread config_context_length through ContextCompressor and AIAgent init
Tests: 6 new tests covering fuzzy match, single-model fallback, config
override (including zero/None edge cases).
* fix: auto-detect local model name and context length for local servers
Cherry-picked from PR #2043 by sudoingX.
- Auto-detect model name from local server's /v1/models when only one
model is loaded (no manual model name config needed)
- Add n_ctx_train and n_ctx to context length detection keys for llama.cpp
- Query llama.cpp /props endpoint for actual allocated context (not just
training context from GGUF metadata)
- Strip .gguf suffix from display in banner and status bar
- _auto_detect_local_model() in runtime_provider.py for CLI init
Co-authored-by: sudo <sudoingx@users.noreply.github.com>
* fix: revert accidental summary_target_tokens change + add docs for context_length config
- Revert summary_target_tokens from 2500 back to 500 (accidental change
during patching)
- Add 'Context Length Detection' section to Custom & Self-Hosted docs
explaining model.context_length config override
---------
Co-authored-by: Test <test@test.com>
Co-authored-by: sudo <sudoingx@users.noreply.github.com>
Update all SOUL.md documentation to reflect that it now occupies
slot #1 in the system prompt, replacing the hardcoded default identity.
Updated pages:
- user-guide/features/personality.md — SOUL.md is primary identity, not just a layer
- developer-guide/prompt-assembly.md — updated prompt layer order, context files list
- guides/use-soul-with-hermes.md — SOUL.md replaces built-in identity
- user-guide/configuration.md — updated context files table and directory tree
Co-authored-by: Test <test@test.com>
Add unauthorized_dm_behavior config (pair|ignore) with global default
and per-platform override. WhatsApp can silently drop unknown DMs
instead of sending pairing codes.
Adapted config bridging to work with gw_data dict (pre-construction)
rather than config object. Dropped implementation plan document.
Co-authored-by: Frederico Ribeiro <fr@tecompanytea.com>
The standard install already includes MCP via .[all]. For users who
need to add it separately, the correct command is:
cd ~/.hermes/hermes-agent && uv pip install -e ".[mcp]"
The venv is created by uv, so bare 'pip' isn't available. All four
occurrences across 3 docs pages updated.
MiniMax: Add M2.7 and M2.7-highspeed as new defaults across provider
model lists, auxiliary client, metadata, setup wizard, RL training tool,
fallback tests, and docs. Retain M2.5/M2.1 as alternatives.
OpenRouter: Add grok-4.20-beta, nemotron-3-super-120b-a12b:free,
trinity-large-preview:free, glm-5-turbo, and hunter-alpha to the
model catalog.
MiniMax changes based on PR #1882 by @octo-patch (applied manually
due to stale conflicts in refactored pricing module).
Add first-class GitHub Copilot and Copilot ACP provider support across
model selection, runtime provider resolution, CLI sessions, delegated
subagents, cron jobs, and the Telegram gateway.
This also normalizes Copilot model catalogs and API modes, introduces a
Copilot ACP OpenAI-compatible shim, and fixes service-mode auth by
resolving Homebrew-installed gh binaries under launchd.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
MDX v2+ interprets curly braces in regular markdown as JSX
expressions. The headings 'GET /v1/responses/{id}' and
'DELETE /v1/responses/{id}' caused a ReferenceError during
Docusaurus static site generation because 'id' is not a
defined JavaScript variable. Escaped with backslashes.
Co-authored-by: Test <test@test.com>
* feat: OpenAI-compatible API server platform adapter
Salvaged from PR #956, updated for current main.
Adds an HTTP API server as a gateway platform adapter that exposes
hermes-agent via the OpenAI Chat Completions and Responses APIs.
Any OpenAI-compatible frontend (Open WebUI, LobeChat, LibreChat,
AnythingLLM, NextChat, ChatBox, etc.) can connect by pointing at
http://localhost:8642/v1.
Endpoints:
- POST /v1/chat/completions — stateless Chat Completions API
- POST /v1/responses — stateful Responses API with chaining
- GET /v1/responses/{id} — retrieve stored response
- DELETE /v1/responses/{id} — delete stored response
- GET /v1/models — list hermes-agent as available model
- GET /health — health check
Features:
- Real SSE streaming via stream_delta_callback (uses main's streaming)
- In-memory LRU response store for Responses API conversation chaining
- Named conversations via 'conversation' parameter
- Bearer token auth (optional, via API_SERVER_KEY)
- CORS support for browser-based frontends
- System prompt layering (frontend system messages on top of core)
- Real token usage tracking in responses
Integration points:
- Platform.API_SERVER in gateway/config.py
- _create_adapter() branch in gateway/run.py
- API_SERVER_* env vars in hermes_cli/config.py
- Env var overrides in gateway/config.py _apply_env_overrides()
Changes vs original PR #956:
- Removed streaming infrastructure (already on main via stream_consumer.py)
- Removed Telegram reply_to_mode (separate feature, not included)
- Updated _resolve_model() -> _resolve_gateway_model()
- Updated stream_callback -> stream_delta_callback
- Updated connect()/disconnect() to use _mark_connected()/_mark_disconnected()
- Adapted to current Platform enum (includes MATTERMOST, MATRIX, DINGTALK)
Tests: 72 new tests, all passing
Docs: API server guide, Open WebUI integration guide, env var reference
* feat(whatsapp): make reply prefix configurable via config.yaml
Reworked from PR #1764 (ifrederico) to use config.yaml instead of .env.
The WhatsApp bridge prepends a header to every outgoing message.
This was hardcoded to '⚕ *Hermes Agent*'. Users can now customize
or disable it via config.yaml:
whatsapp:
reply_prefix: '' # disable header
reply_prefix: '🤖 *My Bot*\n───\n' # custom prefix
How it works:
- load_gateway_config() reads whatsapp.reply_prefix from config.yaml
and stores it in PlatformConfig.extra['reply_prefix']
- WhatsAppAdapter reads it from config.extra at init
- When spawning bridge.js, the adapter passes it as
WHATSAPP_REPLY_PREFIX in the subprocess environment
- bridge.js handles undefined (default), empty (no header),
or custom values with \\n escape support
- Self-chat echo suppression uses the configured prefix
Also fixes _config_version: was 9 but ENV_VARS_BY_VERSION had a
key 10 (TAVILY_API_KEY), so existing users at v9 would never be
prompted for Tavily. Bumped to 10 to close the gap. Added a
regression test to prevent this from happening again.
Credit: ifrederico (PR #1764) for the bridge.js implementation
and the config version gap discovery.
---------
Co-authored-by: Test <test@test.com>
- Add summary_base_url config option to compression block for custom
OpenAI-compatible endpoints (e.g. zai, DeepSeek, Ollama)
- Remove compression env var bridges from cli.py and gateway/run.py
(CONTEXT_COMPRESSION_* env vars no longer set from config)
- Switch run_agent.py to read compression config directly from
config.yaml instead of env vars
- Fix backwards-compat block in _resolve_task_provider_model to also
fire when auxiliary.compression.provider is 'auto' (DEFAULT_CONFIG
sets this, which was silently preventing the compression section's
summary_* keys from being read)
- Add test for summary_base_url config-to-client flow
- Update docs to show compression as config.yaml-only
Closes#1591
Based on PR #1702 by @uzaylisak
* feat(web): add Parallel as alternative web search/extract backend
Adds Parallel (parallel.ai) as a drop-in alternative to Firecrawl for
web_search and web_extract tools using the official parallel-web SDK.
- Backend selection via WEB_SEARCH_BACKEND env var (auto/parallel/firecrawl)
- Auto mode prefers Firecrawl when both keys present; Parallel when sole backend
- web_crawl remains Firecrawl-only with clear error when unavailable
- Lazy SDK imports, interrupt support, singleton clients
- 16 new unit tests for backend selection and client config
Co-authored-by: s-jag <s-jag@users.noreply.github.com>
* fix: add PARALLEL_API_KEY to config registry and fix web_crawl policy tests
Follow-up for Parallel backend integration:
- Add PARALLEL_API_KEY to OPTIONAL_ENV_VARS (hermes doctor, env blocklist)
- Add to set_config_value api_keys list (hermes config set)
- Add to doctor keys display
- Fix 2 web_crawl policy tests that didn't set FIRECRAWL_API_KEY
(needed now that web_crawl has a Firecrawl availability guard)
* refactor: explicit backend selection via hermes tools, not auto-detect
Replace the auto-detect backend selection with explicit user choice:
- hermes tools saves WEB_SEARCH_BACKEND to .env when user picks a provider
- _get_backend() reads the explicit choice first
- Fallback only for manual/legacy config (uses whichever key is present)
- _is_provider_active() shows [active] for the selected web backend
- Updated tests, docs, and .env.example to remove 'auto' mode language
* refactor: use config.yaml for web backend, not env var
Match the TTS/browser pattern — web.backend is stored in config.yaml
(set by hermes tools), not as a WEB_SEARCH_BACKEND env var.
- _load_web_config() reads web: section from config.yaml
- _get_backend() reads web.backend from config, falls back to key detection
- _configure_provider() saves to config dict (saved to config.yaml)
- _is_provider_active() reads from config dict
- Removed WEB_SEARCH_BACKEND from .env.example, set_config_value, docs
- Updated all tests to mock _load_web_config instead of env vars
---------
Co-authored-by: s-jag <s-jag@users.noreply.github.com>
Full Docusaurus docs following the Discord guide structure:
Mattermost (277 lines):
- Step-by-step: enable bot accounts, create bot, get token, add to channels
- All env vars documented with examples
- Reply mode (thread/off), home channel, troubleshooting
Matrix (354 lines):
- Step-by-step: create bot account, get access token (Element or API)
- Dual auth (token + password), E2EE section with libolm install
- Thread support, DM detection, home room, troubleshooting
- Works with any homeserver (Synapse, Conduit, Dendrite, matrix.org)
* feat(gateway): add DingTalk platform adapter
Add DingTalk as a messaging platform using the dingtalk-stream SDK
for real-time message reception via Stream Mode (no webhook needed).
Replies are sent via session webhook using markdown format.
Features:
- Stream Mode connection (long-lived WebSocket, no public URL needed)
- Text and rich text message support
- DM and group chat support
- Message deduplication with 5-minute window
- Auto-reconnection with exponential backoff
- Session webhook caching for reply routing
Configuration:
export DINGTALK_CLIENT_ID=your-app-key
export DINGTALK_CLIENT_SECRET=your-app-secret
# or in config.yaml:
platforms:
dingtalk:
enabled: true
extra:
client_id: your-app-key
client_secret: your-app-secret
Files:
- gateway/platforms/dingtalk.py (340 lines) — adapter implementation
- gateway/config.py — add DINGTALK to Platform enum
- gateway/run.py — add DingTalk to _create_adapter
- hermes_cli/config.py — add env vars to _EXTRA_ENV_KEYS
- hermes_cli/tools_config.py — add dingtalk to PLATFORMS
- tests/gateway/test_dingtalk.py — 21 tests
* docs: add Alibaba Cloud and DingTalk to setup wizard and docs
Wire Alibaba Cloud (DashScope) into hermes setup and hermes model
provider selection flows. Add DingTalk env vars to documentation.
Changes:
- setup.py: Add Alibaba Cloud as provider choice (index 11) with
DASHSCOPE_API_KEY prompt and model studio link
- main.py: Add alibaba to provider_labels, providers list, and
model flow dispatch
- environment-variables.md: Add DASHSCOPE_API_KEY, DINGTALK_CLIENT_ID,
DINGTALK_CLIENT_SECRET, and alibaba to HERMES_INFERENCE_PROVIDER
Add Kilo Gateway (kilo.ai) as an API-key provider with OpenAI-compatible
endpoint at https://api.kilo.ai/api/gateway. Supports 500+ models from
Anthropic, OpenAI, Google, xAI, Mistral, MiniMax via a single API key.
- Register kilocode in PROVIDER_REGISTRY with aliases (kilo, kilo-code,
kilo-gateway) and KILOCODE_API_KEY / KILOCODE_BASE_URL env vars
- Add to model catalog, CLI provider menu, setup wizard, doctor checks
- Add google/gemini-3-flash-preview as default aux model
- 12 new tests covering registration, aliases, credential resolution,
runtime config
- Documentation updates (env vars, config, fallback providers)
- Fix setup test index shift from provider insertion
Inspired by PR #1473 by @amanning3390.
Co-authored-by: amanning3390 <amanning3390@users.noreply.github.com>
Docker terminal sessions are secret-dark by default. This adds
terminal.docker_forward_env as an explicit allowlist for env vars
that may be forwarded into Docker containers.
Values resolve from the current shell first, then fall back to
~/.hermes/.env. Only variables the user explicitly lists are
forwarded — nothing is auto-exposed.
Cherry-picked from PR #1449 by @teknium1, conflict-resolved onto
current main.
Fixes#1436
Supersedes #1439
* feat: add Vercel AI Gateway as a first-class provider
Adds AI Gateway (ai-gateway.vercel.sh) as a new inference provider
with AI_GATEWAY_API_KEY authentication, live model discovery, and
reasoning support via extra_body.reasoning.
Based on PR #1492 by jerilynzheng.
* feat: add AI Gateway to setup wizard, doctor, and fallback providers
* test: add AI Gateway to api_key_providers test suite
* feat: add AI Gateway to hermes model CLI and model metadata
Wire AI Gateway into the interactive model selection menu and add
context lengths for AI Gateway model IDs in model_metadata.py.
* feat: use claude-haiku-4.5 as AI Gateway auxiliary model
* revert: use gemini-3-flash as AI Gateway auxiliary model
* fix: move AI Gateway below established providers in selection order
---------
Co-authored-by: jerilynzheng <jerilynzheng@users.noreply.github.com>
Co-authored-by: jerilynzheng <zheng.jerilyn@gmail.com>
* refactor: centralize slash command registry
Replace 7+ scattered command definition sites with a single
CommandDef registry in hermes_cli/commands.py. All downstream
consumers now derive from this registry:
- CLI process_command() resolves aliases via resolve_command()
- Gateway _known_commands uses GATEWAY_KNOWN_COMMANDS frozenset
- Gateway help text generated by gateway_help_lines()
- Telegram BotCommands generated by telegram_bot_commands()
- Slack subcommand map generated by slack_subcommand_map()
Adding a command or alias is now a one-line change to
COMMAND_REGISTRY instead of touching 6+ files.
Bugfixes included:
- Telegram now registers /rollback, /background (were missing)
- Slack now has /voice, /update, /reload-mcp (were missing)
- Gateway duplicate 'reasoning' dispatch (dead code) removed
- Gateway help text can no longer drift from CLI help
Backwards-compatible: COMMANDS and COMMANDS_BY_CATEGORY dicts are
rebuilt from the registry, so existing imports work unchanged.
* docs: update developer docs for centralized command registry
Update AGENTS.md with full 'Slash Command Registry' and 'Adding a
Slash Command' sections covering CommandDef fields, registry helpers,
and the one-line alias workflow.
Also update:
- CONTRIBUTING.md: commands.py description
- website/docs/reference/slash-commands.md: reference central registry
- docs/plans/centralize-command-registry.md: mark COMPLETED
- plans/checkpoint-rollback.md: reference new pattern
- hermes-agent-dev skill: architecture table
* chore: remove stale plan docs
* feat: add optional smart model routing
Add a conservative cheap-vs-strong routing option that can send very short/simple turns to a cheaper model across providers while keeping the primary model for complex work. Wire it through CLI, gateway, and cron, and document the config.yaml workflow.
* fix(gateway): remove recursive ExecStop from systemd units, extend TimeoutStopSec to 60s
* fix(gateway): avoid recursive ExecStop in user systemd unit
* fix: extend ExecStop removal and TimeoutStopSec=60 to system unit
The cherry-picked PR #1448 fix only covered the user systemd unit.
The system unit had the same TimeoutStopSec=15 and could benefit
from the same 60s timeout for clean shutdown. Also adds a regression
test for the system unit.
---------
Co-authored-by: Ninja <ninja@local>
* feat(skills): add blender-mcp optional skill for 3D modeling
Control a running Blender instance from Hermes via socket connection
to the blender-mcp addon (port 9876). Supports creating 3D objects,
materials, animations, and running arbitrary bpy code.
Placed in optional-skills/ since it requires Blender 4.3+ desktop
with a third-party addon manually started each session.
* feat(acp): support slash commands in ACP adapter (#1532)
Adds /help, /model, /tools, /context, /reset, /compact, /version
to the ACP adapter (VS Code, Zed, JetBrains). Commands are handled
directly in the server without instantiating the TUI — each command
queries agent/session state and returns plain text.
Unrecognized /commands fall through to the LLM as normal messages.
/model uses detect_provider_for_model() for auto-detection when
switching models, matching the CLI and gateway behavior.
Fixes#1402
* fix(logging): improve error logging in session search tool (#1533)
* fix(gateway): restart on retryable startup failures (#1517)
* feat(email): add skip_attachments option via config.yaml
* feat(email): add skip_attachments option via config.yaml
Adds a config.yaml-driven option to skip email attachments in the
gateway email adapter. Useful for malware protection and bandwidth
savings.
Configure in config.yaml:
platforms:
email:
skip_attachments: true
Based on PR #1521 by @an420eth, changed from env var to config.yaml
(via PlatformConfig.extra) to match the project's config-first pattern.
* docs: document skip_attachments option for email adapter
* fix(telegram): retry on transient TLS failures during connect and send
Add exponential-backoff retry (3 attempts) around initialize() to
handle transient TLS resets during gateway startup. Also catches
TimedOut and OSError in addition to NetworkError.
Add exponential-backoff retry (3 attempts) around send_message() for
NetworkError during message delivery, wrapping the existing Markdown
fallback logic.
Both imports are guarded with try/except ImportError for test
environments where telegram is mocked.
Based on PR #1527 by cmd8. Closes#1526.
* feat: permissive block_anchor thresholds and unicode normalization (#1539)
Salvaged from PR #1528 by an420eth. Closes#517.
Improves _strategy_block_anchor in fuzzy_match.py:
- Add unicode normalization (smart quotes, em/en-dashes, ellipsis,
non-breaking spaces → ASCII) so LLM-produced unicode artifacts
don't break anchor line matching
- Lower thresholds: 0.10 for unique matches (was 0.70), 0.30 for
multiple candidates — if first/last lines match exactly, the
block is almost certainly correct
- Use original (non-normalized) content for offset calculation to
preserve correct character positions
Tested: 3 new scenarios fixed (em-dash anchors, non-breaking space
anchors, very-low-similarity unique matches), zero regressions on
all 9 existing fuzzy match tests.
Co-authored-by: an420eth <an420eth@users.noreply.github.com>
* feat(cli): add file path autocomplete in the input prompt (#1545)
When typing a path-like token (./ ../ ~/ / or containing /),
the CLI now shows filesystem completions in the dropdown menu.
Directories show a trailing slash and 'dir' label; files show
their size. Completions are case-insensitive and capped at 30
entries.
Triggered by tokens like:
edit ./src/ma → shows ./src/main.py, ./src/manifest.json, ...
check ~/doc → shows ~/docs/, ~/documents/, ...
read /etc/hos → shows /etc/hosts, /etc/hostname, ...
open tools/reg → shows tools/registry.py
Slash command autocomplete (/help, /model, etc.) is unaffected —
it still triggers when the input starts with /.
Inspired by OpenCode PR #145 (file path completion menu).
Implementation:
- hermes_cli/commands.py: _extract_path_word() detects path-like
tokens, _path_completions() yields filesystem Completions with
size labels, get_completions() routes to paths vs slash commands
- tests/hermes_cli/test_path_completion.py: 26 tests covering
path extraction, prefix filtering, directory markers, home
expansion, case-insensitivity, integration with slash commands
* feat(privacy): redact PII from LLM context when privacy.redact_pii is enabled
Add privacy.redact_pii config option (boolean, default false). When
enabled, the gateway redacts personally identifiable information from
the system prompt before sending it to the LLM provider:
- Phone numbers (user IDs on WhatsApp/Signal) → hashed to user_<sha256>
- User IDs → hashed to user_<sha256>
- Chat IDs → numeric portion hashed, platform prefix preserved
- Home channel IDs → hashed
- Names/usernames → NOT affected (user-chosen, publicly visible)
Hashes are deterministic (same user → same hash) so the model can
still distinguish users in group chats. Routing and delivery use
the original values internally — redaction only affects LLM context.
Inspired by OpenClaw PR #47959.
* fix(privacy): skip PII redaction on Discord/Slack (mentions need real IDs)
Discord uses <@user_id> for mentions and Slack uses <@U12345> — the LLM
needs the real ID to tag users. Redaction now only applies to WhatsApp,
Signal, and Telegram where IDs are pure routing metadata.
Add 4 platform-specific tests covering Discord, WhatsApp, Signal, Slack.
* feat: smart approvals + /stop command (inspired by OpenAI Codex)
* feat: smart approvals — LLM-based risk assessment for dangerous commands
Adds a 'smart' approval mode that uses the auxiliary LLM to assess
whether a flagged command is genuinely dangerous or a false positive,
auto-approving low-risk commands without prompting the user.
Inspired by OpenAI Codex's Smart Approvals guardian subagent
(openai/codex#13860).
Config (config.yaml):
approvals:
mode: manual # manual (default), smart, off
Modes:
- manual — current behavior, always prompt the user
- smart — aux LLM evaluates risk: APPROVE (auto-allow), DENY (block),
or ESCALATE (fall through to manual prompt)
- off — skip all approval prompts (equivalent to --yolo)
When smart mode auto-approves, the pattern gets session-level approval
so subsequent uses of the same pattern don't trigger another LLM call.
When it denies, the command is blocked without user prompt. When
uncertain, it escalates to the normal manual approval flow.
The LLM prompt is carefully scoped: it sees only the command text and
the flagged reason, assesses actual risk vs false positive, and returns
a single-word verdict.
* feat: make smart approval model configurable via config.yaml
Adds auxiliary.approval section to config.yaml with the same
provider/model/base_url/api_key pattern as other aux tasks (vision,
web_extract, compression, etc.).
Config:
auxiliary:
approval:
provider: auto
model: '' # fast/cheap model recommended
base_url: ''
api_key: ''
Bridged to env vars in both CLI and gateway paths so the aux client
picks them up automatically.
* feat: add /stop command to kill all background processes
Adds a /stop slash command that kills all running background processes
at once. Currently users have to process(list) then process(kill) for
each one individually.
Inspired by OpenAI Codex's separation of interrupt (Ctrl+C stops current
turn) from /stop (cleans up background processes). See openai/codex#14602.
Ctrl+C continues to only interrupt the active agent turn — background
dev servers, watchers, etc. are preserved. /stop is the explicit way
to clean them all up.
* feat: first-class plugin architecture + hide status bar cost by default (#1544)
The persistent status bar now shows context %, token counts, and
duration but NOT $ cost by default. Cost display is opt-in via:
display:
show_cost: true
in config.yaml, or: hermes config set display.show_cost true
The /usage command still shows full cost breakdown since the user
explicitly asked for it — this only affects the always-visible bar.
Status bar without cost:
⚕ claude-sonnet-4 │ 12K/200K │ 6% │ 15m
Status bar with show_cost: true:
⚕ claude-sonnet-4 │ 12K/200K │ 6% │ $0.06 │ 15m
* feat: improve memory prioritization + aggressive skill updates (inspired by OpenAI Codex)
* feat: improve memory prioritization — user preferences over procedural knowledge
Inspired by OpenAI Codex's memory prompt improvements (openai/codex#14493)
which focus memory writes on user preferences and recurring patterns
rather than procedural task details.
Key insight: 'Optimize for reducing future user steering — the most
valuable memory prevents the user from having to repeat themselves.'
Changes:
- MEMORY_GUIDANCE (prompt_builder.py): added prioritization hierarchy
and the core principle about reducing user steering
- MEMORY_SCHEMA (memory_tool.py): reordered WHEN TO SAVE list to put
corrections first, added explicit PRIORITY guidance
- Memory nudge (run_agent.py): now asks specifically about preferences,
corrections, and workflow patterns instead of generic 'anything'
- Memory flush (run_agent.py): now instructs to prioritize user
preferences and corrections over task-specific details
* feat: more aggressive skill creation and update prompting
Press harder on skill updates — the agent should proactively patch
skills when it encounters issues during use, not wait to be asked.
Changes:
- SKILLS_GUIDANCE: 'consider saving' → 'save'; added explicit instruction
to patch skills immediately when found outdated/wrong
- Skills header: added instruction to update loaded skills before finishing
if they had missing steps or wrong commands
- Skill nudge: more assertive ('save the approach' not 'consider saving'),
now also prompts for updating existing skills used in the task
- Skill nudge interval: lowered default from 15 to 10 iterations
- skill_manage schema: added 'patch it immediately' to update triggers
* feat: first-class plugin architecture (#1555)
Plugin system for extending Hermes with custom tools, hooks, and
integrations — no source code changes required.
Core system (hermes_cli/plugins.py):
- Plugin discovery from ~/.hermes/plugins/, .hermes/plugins/, and
pip entry_points (hermes_agent.plugins group)
- PluginContext with register_tool() and register_hook()
- 6 lifecycle hooks: pre/post tool_call, pre/post llm_call,
on_session_start/end
- Namespace package handling for relative imports in plugins
- Graceful error isolation — broken plugins never crash the agent
Integration (model_tools.py):
- Plugin discovery runs after built-in + MCP tools
- Plugin tools bypass toolset filter via get_plugin_tool_names()
- Pre/post tool call hooks fire in handle_function_call()
CLI:
- /plugins command shows loaded plugins, tool counts, status
- Added to COMMANDS dict for autocomplete
Docs:
- Getting started guide (build-a-hermes-plugin.md) — full tutorial
building a calculator plugin step by step
- Reference page (features/plugins.md) — quick overview + tables
- Covers: file structure, schemas, handlers, hooks, data files,
bundled skills, env var gating, pip distribution, common mistakes
Tests: 16 tests covering discovery, loading, hooks, tool visibility.
* fix: hermes update causes dual gateways on macOS (launchd)
Three bugs worked together to create the dual-gateway problem:
1. cmd_update only checked systemd for gateway restart, completely
ignoring launchd on macOS. After killing the PID it would print
'Restart it with: hermes gateway run' even when launchd was about
to auto-respawn the process.
2. launchd's KeepAlive.SuccessfulExit=false respawns the gateway
after SIGTERM (non-zero exit), so the user's manual restart
created a second instance.
3. The launchd plist lacked --replace (systemd had it), so the
respawned gateway didn't kill stale instances on startup.
Fixes:
- Add --replace to launchd ProgramArguments (matches systemd)
- Add launchd detection to cmd_update's auto-restart logic
- Print 'auto-restart via launchd' instead of manual restart hint
* fix: add launchd plist auto-refresh + explicit restart in cmd_update
Two integration issues with the initial fix:
1. Existing macOS users with old plist (no --replace) would never
get the fix until manual uninstall/reinstall. Added
refresh_launchd_plist_if_needed() — mirrors the existing
refresh_systemd_unit_if_needed(). Called from launchd_start(),
launchd_restart(), and cmd_update.
2. cmd_update relied on KeepAlive respawn after SIGTERM rather than
explicit launchctl stop/start. This caused races: launchd would
respawn the old process before the PID file was cleaned up.
Now does explicit stop+start (matching how systemd gets an
explicit systemctl restart), with plist refresh first so the
new --replace flag is picked up.
---------
Co-authored-by: Ninja <ninja@local>
Co-authored-by: alireza78a <alireza78a@users.noreply.github.com>
Co-authored-by: Oktay Aydin <113846926+aydnOktay@users.noreply.github.com>
Co-authored-by: JP Lew <polydegen@protonmail.com>
Co-authored-by: an420eth <an420eth@users.noreply.github.com>
Discord uses <@user_id> for mentions and Slack uses <@U12345> — the LLM
needs the real ID to tag users. Redaction now only applies to WhatsApp,
Signal, and Telegram where IDs are pure routing metadata.
Add 4 platform-specific tests covering Discord, WhatsApp, Signal, Slack.
Add privacy.redact_pii config option (boolean, default false). When
enabled, the gateway redacts personally identifiable information from
the system prompt before sending it to the LLM provider:
- Phone numbers (user IDs on WhatsApp/Signal) → hashed to user_<sha256>
- User IDs → hashed to user_<sha256>
- Chat IDs → numeric portion hashed, platform prefix preserved
- Home channel IDs → hashed
- Names/usernames → NOT affected (user-chosen, publicly visible)
Hashes are deterministic (same user → same hash) so the model can
still distinguish users in group chats. Routing and delivery use
the original values internally — redaction only affects LLM context.
Inspired by OpenClaw PR #47959.
* feat(email): add skip_attachments option via config.yaml
Adds a config.yaml-driven option to skip email attachments in the
gateway email adapter. Useful for malware protection and bandwidth
savings.
Configure in config.yaml:
platforms:
email:
skip_attachments: true
Based on PR #1521 by @an420eth, changed from env var to config.yaml
(via PlatformConfig.extra) to match the project's config-first pattern.
* docs: document skip_attachments option for email adapter
Keep Docker sandboxes isolated by default. Add an explicit terminal.docker_mount_cwd_to_workspace opt-in, thread it through terminal/file environment creation, and document the security tradeoff and config.yaml workflow clearly.
Fixes#1445 — When using Docker backend, the user's current working
directory is now automatically bind-mounted to /workspace inside the
container. This allows users to run `cd my-project && hermes` and have
their project files accessible to the agent without manual volume config.
Changes:
- Add host_cwd and auto_mount_cwd parameters to DockerEnvironment
- Capture original host CWD in _get_env_config() before container fallback
- Pass host_cwd through _create_environment() to Docker backend
- Add TERMINAL_DOCKER_NO_AUTO_MOUNT env var to disable if needed
- Skip auto-mount when /workspace is already explicitly mounted
- Add tests for auto-mount behavior
- Add documentation for the new feature
The auto-mount is skipped when:
1. TERMINAL_DOCKER_NO_AUTO_MOUNT=true is set
2. User configured docker_volumes with :/workspace
3. persistent_filesystem=true (persistent sandbox mode)
This makes the Docker backend behave more intuitively — the agent
operates on the user's actual project directory by default.
- Add Status Bar section to user-guide/cli.md with layout example,
element descriptions, responsive width behavior, and color-coded
context threshold table
- Update /usage description in slash-commands reference to mention
cost breakdown and session duration