rose-ash/plans
giles 1d771aedea
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 40s
fed-sx-m2: Pattern B from fed-prims diagnosis fails on reproducer
loops/fed-prims commit bf8d0bf2 (merged as 94f6ab9f) diagnosed
Blockers #4 as Erlang-substrate scope and sketched a Pattern B fix
purely in er-bif-http-listen: wrap the handler call in er-spawn-fun
+ er-sched-run-all! and read the spawned process's :exit-result.

Tried it on lib/erlang/runtime.sx — does not work. Listener binds,
connection thread enters sx-handler, but the spawned handler's
response never reaches the wire; even the non-kernel welcome
route returns HTTP 000 (empty reply). Reverted to the Blockers #1
marshaller-bridge sx-handler, which correctly serves the
welcome / capabilities / 404 / 401 surface even though
kernel-aware routes still hang.

Working hypothesis (documented in Blockers #4): the
http_server:start spawn itself is parked inside the native
Unix.accept loop on the boot thread; the global er-sched-* state
still has that
process in its queue. When the connection thread (under the
per-instance native mutex) calls er-sched-run-all!, it re-enters
the SAME global scheduler — the boot thread's er-sched-step! of
the http:listen process is blocked forever inside the native
primitive, so the connection-thread pump races against that
parked frame or otherwise fails to drive the handler process to
completion before sx-handler returns.
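
A toy model of that hypothesis, in Python rather than SX, may make
the failure mode easier to picture. Everything here is hypothetical:
the Sched class, the stepping flag, and sx_handler are illustrative
stand-ins, not the real er-sched-* code, and the re-entry guard is
only one assumed way the connection-thread pump could fail.

```python
from collections import deque


class Sched:
    """Toy cooperative scheduler: a process is a generator, one step is one next()."""

    def __init__(self):
        self.queue = deque()
        self.results = {}      # pid -> exit result, once the process finishes
        self.stepping = False  # True while some thread is inside a step

    def spawn(self, pid, gen):
        self.queue.append((pid, gen))

    def run_all(self):
        # ASSUMPTION: a pump that finds a step already in flight gives up.
        if self.stepping:
            return
        while self.queue:
            pid, gen = self.queue.popleft()
            self.stepping = True
            try:
                next(gen)
                self.queue.append((pid, gen))   # not finished, requeue
            except StopIteration as done:
                self.results[pid] = done.value  # finished, record exit result
            finally:
                self.stepping = False


def handler():
    yield                  # one scheduling point before replying
    return "200 welcome"


def sx_handler(sched, pid="h1"):
    # Pattern B shape: spawn the handler, pump, read the exit result.
    sched.spawn(pid, handler())
    sched.run_all()
    return sched.results.get(pid)


shared = Sched()
shared.stepping = True     # boot thread parked inside the native accept
empty_reply = sx_handler(shared)   # pump never drives the handler
good_reply = sx_handler(Sched())   # private scheduler completes it
```

In this sketch the shared scheduler yields no exit result, which is
the analogue of the empty reply, while a private scheduler does.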

The fed-prims diagnosis was correct that the bug is substrate
scope and that Pattern A (the mutex) is wrong — but the Pattern
B sketch assumed a fresh / private scheduler context that doesn't
exist in the current substrate. Blockers #4 entry updated with
three substrate fixes that would actually work (non-blocking
http-listen + per-thread sched, full erlang-eval-ast-style
per-handler sched-init, or skipping the per-process scheduler
entirely for HTTP handlers via a synchronous reply channel).
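
The third option can be sketched roughly as follows, again in Python
and under stated assumptions: serve_sync, welcome, and the
(status, body) reply shape are hypothetical names for illustration,
not part of the substrate. The handler is called directly on the
connection thread and hands its reply back through a one-slot
channel, so no scheduler pump is involved.

```python
import queue


def serve_sync(handler, request, timeout=1.0):
    """Run handler on the calling thread and wait for its single reply."""
    reply = queue.Queue(maxsize=1)   # one-slot synchronous reply channel
    handler(request, reply.put)      # handler calls send(...) exactly once
    try:
        return reply.get(timeout=timeout)
    except queue.Empty:
        # Handler returned without replying; fail loudly instead of hanging.
        return (500, "handler produced no reply")


def welcome(request, send):
    # Hypothetical non-kernel route.
    send((200, "welcome " + request))


def silent(request, send):
    # Hypothetical handler that never replies.
    return None
```

A call such as serve_sync(welcome, "/") returns the handler's reply
without touching any process queue; the timeout bounds the wait if a
handler never replies.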

m2 stays at 11/12 steps done; Step 12 remains gated. Loop pacing
dialled back down: the substrate work belongs to loops/erlang or a
follow-on fed-prims tick with a more careful design pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-07 15:21:18 +00:00