Files
rose-ash/plans
giles bf8d0bf245
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 1m7s
fed-prims: diagnose fed-sx-m2 Blockers #4 — not a mutex bug, hand back to m2
Investigated the http-listen "handler-mutex deadlock" per
plans/agent-briefings/fed-prims-mutex-fix.md. Reproduced deterministically
(single kernel-route request returns empty reply while a non-kernel route
returns 200; also reproduced with a 3-line minimal echo gen_server).

Root cause is in the Erlang substrate, not the OCaml mutex: native
http-listen runs each handler on a fresh Thread.create outside any Erlang
scheduler step, so gen_server:call -> receive (which raises er-suspend-marker
expecting an enclosing er-sched-step-alive! guard + er-sched-run-all! pump)
can never complete.

Pattern A is inapplicable: the failure reproduces on a single request with
zero contention, so it is not a mutex-contention deadlock; the mutex is in
fact required and must stay. Sx_runtime.sx_call is fully synchronous and no
OCaml symbol reaches the SX-level scheduler, so there is no OCaml-only fix.
The correct fix is Pattern B done entirely in er-bif-http-listen
(lib/erlang/runtime.sx) — spawn the handler as an er-process and
er-sched-run-all! to completion — which is m2 / loops/erlang scope.

Doc-only: full diagnosis + concrete patch sketch added to the Blockers and
Progress log of plans/fed-sx-host-primitives.md. No bin/sx_server.ml change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 14:43:54 +00:00
..