test(docker): fix svstat 'want up' assertion in profile-gateway lifecycle test

After the supervise-perms fix lands, the s6 lifecycle actually works
for the hermes user — hermes -p <profile> gateway start now genuinely
brings the supervised gateway up rather than silently no-op'ing on
EACCES. That exposes a latent bug in this test's assertion: it
expected 'want up' to appear literally in s6-svstat output, but
s6-svstat elides redundancies — when the slot is currently up AND
s6 wants it up, the output is just 'up (pid N pgid N) X seconds';
the explicit 'want up' token only appears when current ≠ wanted
(e.g. 'down (exitcode 1) … , want up' on a crash-loop).

Add a small helper _svstat_wants_up() that reads the want-state
correctly across both spellings:
  * 'up …'                       → wanted up (unless explicit 'want down')
  * 'down …, want up'            → wanted up explicitly
  * 'down …'                     → wanted down
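
The three cases above can be sketched as a pure function over one status line, with no container involved. This is a minimal illustration, not the committed helper: the name `wants_up` and the sample strings are hypothetical, modeled on the shapes quoted in this message rather than captured from a real s6-svstat run.

```python
def wants_up(state: str) -> bool:
    """Classify one svstat-style status line by its want-state.

    Hypothetical stand-alone variant of the helper described above;
    it takes the status text directly instead of querying a container.
    """
    tokens = state.split()
    if not tokens:
        # No output at all: treat as "not wanted up".
        return False
    if tokens[0] == "up":
        # Currently up implies wanted up unless "want down" is explicit.
        return "want down" not in state
    # Currently down: "want up" appears only when explicitly set.
    return "want up" in state


# Illustrative status lines (assumed shapes, not real captured output).
samples = {
    "up (pid 42 pgid 42) 3 seconds": True,
    "up (pid 42 pgid 42) 3 seconds, want down": False,
    "down (exitcode 1) 0 seconds, want up": True,
    "down (exitcode 0) 5 seconds": False,
    "": False,
}

for line, expected in samples.items():
    assert wants_up(line) is expected, line
```

Keeping the classification separate from the `docker exec` call would let the want-state rules be checked without building the image, though the commit itself keeps both in one helper.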

Both stop and start assertions now use the helper. Also rewords
the module docstring to acknowledge that the supervised process
may succeed OR crash-loop depending on environment, but the want-
state contract holds either way.

(cherry picked from commit 02c933aedc8500e5672aed12475a9ba0534bd77a)
Ben 2026-05-25 11:21:47 +10:00
parent 7d54288d82
commit c524b8a4dc

@@ -8,11 +8,14 @@ with the pre-Phase-4 informational message.
 These tests were marked ``xfail(strict=True)`` through Phase 03 and
 flip to plain ``test_`` once Phase 4 lands (now).
 
-NB: The harness profile created here has no model/auth configured,
-so the gateway process itself will exit with code 1 on every start
-attempt (s6 will keep restarting it). We assert against s6's
-``want up`` / ``want down`` state which reflects the lifecycle
-command's intent, not the supervised process's health.
+NB: The harness profile has no model/auth configured. Depending on
+how the gateway run script handles missing config, the supervised
+process may either spin up successfully (and svstat reports ``up``)
+or exit fast and get throttled by s6 (and svstat reports ``down ,
+want up``). Both states are valid "user asked for gateway up" results
+what we assert is the *want* intent the lifecycle command set, NOT
+the supervised process's health. ``s6-svc -u`` records ``want up`` in
+the supervise/status file regardless of the run-script outcome.
 
 Every ``docker exec`` here runs as the unprivileged ``hermes`` user
 (via :func:`docker_exec_sh` in conftest); see the conftest module
@@ -42,6 +45,27 @@ def _svstat(container: str) -> str:
     return r.stdout if r.returncode == 0 else ""
 
 
+def _svstat_wants_up(container: str) -> bool:
+    """Read the slot's want-state from s6-svstat output.
+
+    s6-svstat formats the output to elide redundancies when the
+    service is currently up AND s6 wants it up, the literal token
+    ``want up`` doesn't appear (it's implicit from the leading ``up``).
+    When the service is down but s6 wants it back up, ``, want up``
+    appears explicitly. So a comprehensive "is the want-intent set to
+    up" check has to accept both spellings.
+    """
+    state = _svstat(container)
+    if not state:
+        return False
+    head = state.split()[0] if state.split() else ""
+    if head == "up":
+        # Currently up implies wanted-up unless ``want down`` is set.
+        return "want down" not in state
+    # Currently down — ``want up`` only shows up when explicitly set.
+    return "want up" in state
+
+
 def test_profile_create_then_gateway_start(
     built_image: str, container_name: str,
 ) -> None:
@@ -66,17 +90,23 @@ def test_profile_create_then_gateway_start(
     # After start, s6's intent is "up" — even if the supervised gateway
     # process spin-fails (no model/auth in the test profile), the
-    # supervision-state contract holds.
+    # supervision-state contract holds. See ``_svstat_wants_up`` for
+    # why we accept both ``up …`` (currently up) and ``down …, want
+    # up`` (down but s6 wants up).
     time.sleep(2)
-    state = _svstat(container_name)
-    assert "want up" in state, f"want up not in svstat: {state!r}"
+    assert _svstat_wants_up(container_name), (
+        f"slot want-state is not up after gateway start: "
+        f"{_svstat(container_name)!r}"
+    )
 
     r = _sh(container_name, f"hermes -p {PROFILE} gateway stop", timeout=30)
     assert r.returncode == 0
     time.sleep(2)
-    state = _svstat(container_name)
-    assert "want up" not in state, f"want up still in svstat: {state!r}"
+    assert not _svstat_wants_up(container_name), (
+        f"slot want-state still up after gateway stop: "
+        f"{_svstat(container_name)!r}"
+    )
 
 
 def test_profile_delete_stops_gateway(