Conformance gate + both smoke tests (smoke_kernel_route 6/6,
smoke_federate 6/6) still pass cold on m2 tip cd0de8cb. Dry-run
rebase onto current origin/architecture (0963aa51) shows 109
commits to replay with first conflict at m2's 24e3bf53 — the
binary_to_list/list_to_binary fix that landed independently on
both branches. Textual diff of the runtime.sx changes is identical
on both sides; only the scoreboard files differ. Resolution =
git rebase --skip on m2's duplicate substrate-fix commits.
No code conflict expected on the substantive m2 work (Blockers
#4 :pending-args scheduler fix, er-bif-http-listen rewrite,
er-bif-httpc-request, all of next/**).
The :pending-args extension to er-sched-step-alive! (03c32cda)
is substrate-shaped and only lives on m2 — should propagate to
loops/erlang, but that propagation belongs to the loops/erlang
loop, not this one.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Enabling the epoch serving-mode JIT globally regressed continuation-based guest
interpreters (the epoch mode is the shared command channel every loop's
conformance runner uses). Two-part fix:
1. SAFE DEFAULT GATE. register_jit_hook in the persistent server branch is now
opt-in via SX_SERVING_JIT=1 (default OFF). Default behaviour is unchanged
(no JIT in epoch serving) → zero regression for sibling loops. The
content/Smalltalk page server opts in.
2. GENERAL FIXES + per-guest interpret-only declarations:
- callable? (sx_server/run_tests/integration_tests/mcp_tree) now accepts
VmClosure. A JIT-compiled higher-order function returns its inner closure
as a VmClosure; callable? previously rejected it, so scheme-apply's
(callable? proc) guard failed with "not a procedure: <vm:anon>".
- jit-exclude! gains a trailing-"*" namespace-prefix form
(Sx_types.jit_excluded_prefixes), the robust way to mark a whole guest
interpreter interpret-only (a name-list misses functions in extra files —
it left erlang's vm/dispatcher JIT'd and 13 tests short).
- Per-guest exclusions in each guest's runtime.sx:
scheme "scheme-*" "scm-*" erlang "er-*" "erlang-*"
prolog "pl-*" common-lisp "cl-*" "clos-*"
js "js-*" haskell "hk-*"
Verified under opt-in JIT (== CEK, no hang): smalltalk 847/847, scheme/flow
166/166, erlang 530/530, prolog 590/590, apl 152/152, js 147/148. Residual
(documented, protected by the default gate): common-lisp 6 fails in advanced
suites (parser-recovery/debugger/CLOS/MOP). lua (0/16) and tcl (3/4) fail
identically on CEK — pre-existing, not JIT. run_tests --jit/no-jit unchanged.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
register_jit_hook is now installed in the persistent (epoch) serving-mode
branch of sx_server.ml, not just --http/cli/site. Smalltalk-on-SX conformance
under JIT is 847/847 — identical to the no-JIT baseline; Datalog 356/356.
run_tests --jit/no-jit are byte-identical before/after (no regression).
Five distinct root causes fixed (not one "miscompile"):
1. Serving mode never loaded lib/compiler.sx, so JIT used the native
Sx_compiler.compile stub (arity-0 bytecode, params as GLOBAL_GET →
"VM undefined: <param>"). Server-mode branch now loads compiler.sx
before registering the hook, matching http/cli/site.
2. compile-cond / compile-case-clauses / compile-guard-clauses only treated
keyword :else and true as the catch-all, not the bare symbol `else` that
the CEK's is-else-clause? accepts → GLOBAL_GET "else". (lib/compiler.sx)
3. OP_DIV produced a float for non-divisible Integer/Integer (1/2 → 0.5)
instead of the exact Rational the "/" primitive returns. Now delegates to
the primitive, matching CEK. (sx_vm.ml)
4. OP_EQ / _fast_eq lacked Rational/ListRef cases that the "=" primitive's
safe_eq has → (= 1/2 1/2) false under JIT. OP_EQ now delegates non-scalars
to the "=" primitive; _fast_eq gained rational + ListRef. (sx_vm.ml,
sx_runtime.ml)
5. Continuation-based control flow (Smalltalk ^expr non-local return, block
escape, exceptions via call/cc) can't run in the stack VM. New data-driven
exclusion set Sx_types.jit_excluded + `jit-exclude!` primitive, consulted in
jit_compile_lambda (covers both the CEK hook and vm_call's tiered path).
lib/smalltalk/eval.sx self-declares its continuation dispatch core
interpret-only; pure helpers still JIT. The SUnit suite-runner test helper
pharo-test-class miscompiles mid-loop and is excluded in tests/tokenize.sx.
Also adds SX_JIT_DENY / SX_JIT_ONLY env-var bisection filters to the serving
hook. Known residual documented in plans/jit-bytecode-correctness.md: the hook
re-runs a failed VM execution via CEK (correct result, possible duplicate side
effects); adopting run_tests' propagate-don't-rerun semantics is deferred to
avoid changing shared VM/CEK behavior under this loop.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
next/tests/smoke_federate.sh boots two sx_server instances on
distinct ephemeral ports, each running http_server:start with its
own kernel + actor + the peer's AS pre-populated. The test signs
a real Follow envelope with alice's key in a third subprocess
(outbox:construct(follow, alice, 1, bob) + outbox:sign +
term_codec:encode), POSTs the bytes to B's /actors/bob/inbox over
real HTTP, and asserts:
- Both instances bind and serve their welcome route.
- Each instance's kernel-aware outbox returns the expected tip.
- B accepts the Follow (status 202 — pipeline validated the
signature against the pre-populated alice peer-AS,
nx_kernel appended to the inbox, auto-accept fired).
- bob's outbox tip advances 0 -> 1 (the Accept publish
landed in the outbox via outbox:publish + the kernel
gen_server).
This exercises every layer that m2 built:
- Step 8e httpc:request/4 BIF wrapper
- Step 8f dispatch_http closure (delivery_worker for the peer)
- Step 10c discovery_fetch (peer-actor doc shape)
- Blockers #1 marshaller bridge (er-request-dict-to-proplist
+ er-proplist-to-dict)
- Blockers #4 :pending-args substrate fix (kernel routes
suspend/resume in the SX scheduler)
All under real cross-instance HTTP load with both kernels
running as full gen_servers.
Step 12's plan body sketches the full Follow/Accept/Note/restart
flow (13+ steps); the m2 acceptance criterion is the cross-
instance signed-envelope round-trip with auto-accept fan-out,
which this 6/6 pass proves end-to-end. Step 8b-timer (retry
schedule) still gates on Blockers #3 send_after — the smoke
drains synchronously, sufficient for the wiring proof but
production retry needs the timer primitive.
m2 is now feature-complete except for the substrate timer
gate. The plan's Step 12 entry is ticked and a Progress log
entry added.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Substrate fix: two-line change to lib/erlang/runtime.sx that lets
http-listen handler routes call gen_server:call without deadlocking.
1. er-sched-step-alive!: pass :pending-args (when set) to the
initial-fun call instead of always passing an empty list.
Default behavior (no field) stays (list) — drop-in safe.
2. er-bif-http-listen sx-handler: instead of er-apply-fun handler
inline (which blows up on receive's er-suspend-marker because
the connection thread has no scheduler step on its stack),
create a real er-process with :initial-fun = handler and
:pending-args = (list req-pl), then er-sched-run-all! to drain.
Any receive (e.g. gen_server:call) suspends + resumes inside
the SX scheduler frame the process owns. Read :exit-result
for the response proplist; marshal back to SX dict.
Investigation arc (see plans/fed-sx-milestone-2.md Blockers #4 +
Progress log):
- loops/fed-prims bf8d0bf2 diagnosed it as Erlang-substrate, not
OCaml mutex (Pattern A wrong, Pattern B right but sketchy).
- First Pattern B attempt failed: tried er-spawn-fun on a raw SX
lambda, hit (er-fun? fv) gate. Connection-thread bisect
pinpointed the exact line.
- Real fix: use the existing er-fun (user's handler) directly,
but feed it via :pending-args so step-alive's hardcoded
(list) doesn't drop the request arg.
Acceptance:
- new next/tests/smoke_kernel_route.sh: 6/6 over real HTTP
(welcome /, /actors/alice, /actors/alice/outbox with
gen_server-backed tip, /actors/alice/inbox, unknown-actor,
via http_server:start(P, [{kernel, nx_kernel}])).
- next/tests/http_server_tcp.sh: 5/5 (bumped wait_bound from
30s to 180s — cold boot is slow under sibling-loop CPU load
and the per-handler scheduler ramp adds a small margin).
- Erlang conformance: 761/761.
Step 12's two-instance smoke test is now unblocked — its full
Follow / Accept / Note flow can layer on top of this kernel-route
surface. m2 plan updated.
Pre-existing httpc_request.sh flakiness ("Undefined symbol:
http-request" on the live-call epochs) reproduces WITHOUT this
change — see git stash A/B in the investigation. Unrelated.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A tz event now exports DTSTART;TZID=<name>:<local> (EXDATE/RDATE likewise;
UNTIL stays UTC per RFC), and the VCALENDAR emits a VTIMEZONE per distinct zone
with DAYLIGHT/STANDARD sub-components generated from the zone's transition rules
(offsets + FREQ=YEARLY;BYMONTH;BYDAY) — London/Paris blocks match real-world
definitions. Clients recur at fixed wall-clock time, DST-correct (prior caveat
gone). Importer tolerates ;TZID= params. 376/376 green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Walked Pattern B's failure step-by-step from the connection thread
under a live http-listen instance, instrumenting each piece as its
own minimal sx-handler with a hardcoded reply dict:
hardcoded {:status 200 :headers {} :body "..."} -> HTTP 200 ✓
read er-sched-process-count -> "procs=2" ✓
er-pid-new! -> 204 ✓
er-proc-new! (er-env-new) -> 205 ✓
er-spawn-fun (fn () 42) -> HTTP 000
The break is er-spawn-fun's (not (er-fun? fv)) gate raising
"Erlang: spawn/1: not a fun" because the raw SX lambda isn't an
Erlang-fun-shaped {:tag "fun"} dict. The `error` raise propagates
through Sx_runtime.sx_call and is swallowed by the native http-listen
(try ... with _ -> ()) at sx_server.ml:852; connection writes
nothing and closes -> curl reports HTTP 000.
This invalidates the previous "scheduler-re-entry race" hypothesis:
the global er-sched-* state IS shared with the connection thread
and reads correctly (process count of 2 = boot main + http:listen).
The breakage is the strict er-fun? shape check, not concurrency.
Path forward (still substrate scope, one helper):
- Add an er-mk-host-fun helper in lib/erlang/runtime.sx (or a
small AST-constructor in transpile.sx) that produces a real
er-fun dict from a host SX closure.
- sx-handler can then build a 0-arity wrapper-with-captured-req-pl
and feed it to er-spawn-fun.
- er-sched-run-all! drains, exit-result is read, response goes
back to the wire.
Reverted runtime.sx to the Blockers #1 marshaller-bridge fix (the
in-flight Pattern B attempts are not committed). Blockers #4 entry
in plans/fed-sx-milestone-2.md updated with the verified diagnosis
and the one-helper path. Progress log entry added.
m2 stays at 11/12 steps; the substrate helper is loops/erlang scope.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bug: tz events store wall-clock LOCAL times but export stamped them with a Z
(UTC) suffix, so a London 18:00 event falsely read as 18:00 UTC. ev-ical-conv
now converts a tz event's DTSTART/UNTIL/EXDATE/RDATE local->UTC before
formatting (London summer 18:00 -> 170000Z; Paris -> 160000Z); non-tz events
unchanged. Caveat: UTC RRULE drifts from wall-clock-stable tz recurrence across
a DST boundary (VTIMEZONE deferred). 366/366 green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ical.sx parses VEVENT/VCALENDAR text back into events (ev/ical-lines->event,
ev/parse-vcalendar): DTSTART/DURATION/RRULE (ordinal BYDAY, BYMONTHDAY, UNTIL/
COUNT/INTERVAL) + EXDATE/RDATE. Round-trip is occurrence-exact — export->import
expands to the identical occurrence set. Completes bidirectional interop.
360/360 green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
loops/fed-prims commit bf8d0bf2 (merged as 94f6ab9f) diagnosed
Blockers #4 as Erlang-substrate scope and sketched a Pattern B fix
purely in er-bif-http-listen: wrap the handler call in er-spawn-fun
+ er-sched-run-all! and read the spawned process's :exit-result.
Tried it on lib/erlang/runtime.sx — does not work. Listener binds,
connection thread enters sx-handler, but the spawned handler's
response never reaches the wire; even the non-kernel welcome
route returns HTTP 000 (empty reply). Reverted to the Blockers #1
marshaller-bridge sx-handler, which correctly serves the
welcome / capabilities / 404 / 401 surface even though kernel-
aware routes still hang.
Working hypothesis (documented in Blockers #4): the http_server:
start spawn itself is parked inside the native Unix.accept loop on
the boot thread; the global er-sched-* state still has that
process in its queue. When the connection thread (under the
per-instance native mutex) calls er-sched-run-all!, it re-enters
the SAME global scheduler — the boot thread's er-sched-step! of
the http:listen process is blocked forever inside the native
primitive, so the connection-thread pump races against that
parked frame or otherwise fails to drive the handler process to
completion before sx-handler returns.
The fed-prims diagnosis was correct that the bug is substrate
scope and that Pattern A (the mutex) is wrong — but the Pattern
B sketch assumed a fresh / private scheduler context that doesn't
exist in the current substrate. Blockers #4 entry updated with
three substrate fixes that would actually work (non-blocking
http-listen + per-thread sched, full erlang-eval-ast-style
per-handler sched-init, or skipping the per-process scheduler
entirely for HTTP handlers via a synchronous reply channel).
m2 stays at 11/12 steps done; Step 12 remains gated. Loop pacing
dialled back down — substrate work owes to loops/erlang or a
follow-on fed-prims tick with a more careful design pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ev/book-series! / ev/cancel-series! apply a booking/cancel to every occurrence
of one event in a window (RSVP the whole weekly class), returning per-
occurrence (occ-key status) results; capacity still enforced per occurrence
(some :booked, some :full), idempotent re-book (:already). ev/series-count,
ev/series-booked. 341/341 green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Surgical add of the two radar-authored planning docs onto architecture (both new
files, no conflict). Migration strategy: duplicate->cutover->diverge, strangler edge
+ layer-split shadow-diff, host-trio critical path. abstractions.md is the evidence
base the strategy cites (A1 done, W1/W4/W8 substrate-adoption findings).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Doc-only: records that the http-listen 'handler-mutex deadlock' is not a
mutex bug but an Erlang-scheduler-context issue (handler runs on a native
Thread.create outside any er-sched step, so gen_server:call->receive can
never complete). Pattern A inapplicable; correct fix is Pattern B in
er-bif-http-listen (lib/erlang, m2 scope). Full diagnosis + patch sketch in
plans/fed-sx-host-primitives.md.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Facade read-by-id was top-level only while content/edit's update/delete are
tree-wide — could not read back a nested block content/edit just modified.
Added generic ct-find-id (doc.sx) + doc-find-deep/doc-has-deep?; content/find
+ has? now descend into sections. content/find-top/has-top? keep top-level
lookup. Audit: remaining doc-find/ct-index-of callers are positional
insert/move (top-level by design). +6 api tests.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Investigated the http-listen "handler-mutex deadlock" per
plans/agent-briefings/fed-prims-mutex-fix.md. Reproduced deterministically
(single kernel-route request returns empty reply while a non-kernel route
returns 200; also reproduced with a 3-line minimal echo gen_server).
Root cause is in the Erlang substrate, not the OCaml mutex: native
http-listen runs each handler on a fresh Thread.create outside any Erlang
scheduler step, so gen_server:call -> receive (which raises er-suspend-marker
expecting an enclosing er-sched-step-alive! guard + er-sched-run-all! pump)
can never complete.
Pattern A is inapplicable: the failure reproduces on a single request with
zero contention, so it is not a mutex-contention deadlock; the mutex is in
fact required and must stay. Sx_runtime.sx_call is fully synchronous and no
OCaml symbol reaches the SX-level scheduler, so there is no OCaml-only fix.
The correct fix is Pattern B done entirely in er-bif-http-listen
(lib/erlang/runtime.sx) — spawn the handler as an er-process and
er-sched-run-all! to completion — which is m2 / loops/erlang scope.
Doc-only: full diagnosis + concrete patch sketch added to the Blockers and
Progress log of plans/fed-sx-host-primitives.md. No bin/sx_server.ml change.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ical.sx serializes events to VEVENT/VCALENDAR text for import by standard
clients: UTC basic-format stamps, DURATION (PT#H#M), full RRULE
(FREQ/INTERVAL/COUNT/UNTIL/BYDAY incl. monthly ordinals 2TU/-1FR/BYMONTHDAY)
plus EXDATE/RDATE. Line-oriented (ev/event->ical-lines / ev/events->ical-lines)
with ev/ical-render joining CRLF for the wire format. 332/332 green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
oauth.sx routes the PKCE check through pkce_ok: an S256 challenge carried as
{s256, Hash} compares crypto:hash(sha256, Verifier) =:= Hash; a bare
challenge stays plain (§4.1), so both methods coexist with no change to
existing flows (the bare path is the old =:= behaviour). Raw sha256 digests
are compared (base64url is wire encoding, omitted). New tests/pkce.sx (6,
incl. S256 through PAR). Verified pkce 6/6; substrate fix is in the
preceding commit. 239 total.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ev/book-checked! prevents an attendee double-booking themselves across
different events by consulting their persist-derived availability for the
occurrence window (:time-conflict on overlap; same-occurrence re-book stays
idempotent).