From 4f766ea4f14751646a98c3606deb55ba7bf5a069 Mon Sep 17 00:00:00 2001 From: giles Date: Fri, 3 Jul 2026 21:28:41 +0000 Subject: [PATCH] plans: SX review master remediation plan + evidence MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Consolidates the three-lane review (core K01-K110, hosts J*/C*/JS*/P*/S*, conformance F1-F15) into plans/sx-review/: - PLAN.md — 15 workstreams, phased execution, full per-finding coverage ledger (every ~213 finding-instances mapped to a workstream + status) - RULINGS.md — 40 draft normative rulings (Phase-0 gate) - core.md / hosts.md / conformance.md — the lane evidence files dc7aa709 quick-wins batch marked DONE in the ledger; K01 (guard re-raise hang), S1 (live HTTP crash), K03 (shift-k), and W14 (test gate) flagged as the highest-value open work. Co-Authored-By: Claude Fable 5 --- plans/sx-review/PLAN.md | 343 ++++++++++ plans/sx-review/README.md | 21 + plans/sx-review/RULINGS.md | 396 ++++++++++++ plans/sx-review/conformance.md | 188 ++++++ plans/sx-review/core.md | 717 +++++++++++++++++++++ plans/sx-review/hosts.md | 1103 ++++++++++++++++++++++++++++++++ 6 files changed, 2768 insertions(+) create mode 100644 plans/sx-review/PLAN.md create mode 100644 plans/sx-review/README.md create mode 100644 plans/sx-review/RULINGS.md create mode 100644 plans/sx-review/conformance.md create mode 100644 plans/sx-review/core.md create mode 100644 plans/sx-review/hosts.md diff --git a/plans/sx-review/PLAN.md b/plans/sx-review/PLAN.md new file mode 100644 index 00000000..4681232d --- /dev/null +++ b/plans/sx-review/PLAN.md @@ -0,0 +1,343 @@ +# SX Review — Master Remediation Plan + +Consolidates every finding from the three parallel review sessions (2026-07-03): +- `core.md` — language core / spec semantics (K01–K110) +- `hosts.md` — per-host implementations + FFI (J*, C*, JS*, P*, S*, PY) +- `conformance.md` — cross-host agreement + test adequacy (F1–F15, conf-S1–S5) +- `RULINGS.md` — 40 draft normative 
rulings (R1–R40) that gate the ambiguity fixes + +**How to read this.** Findings are grouped into workstreams (W1–W15). Each workstream lists the +finding IDs it resolves, the approach, what ratified ruling(s) it needs, and status. The full +per-ID coverage ledger is at the bottom — every finding maps to a workstream + status, so nothing +is silently dropped. `[DONE]` = landed in commit dc7aa709 (quick-wins batch). `[GATE]` = blocked on +a Phase-0 decision. `[dup→Kxx]` = same defect found by another lane, fixed once. + +**Prime directive from the review:** the verification infrastructure currently cannot tell you +whether a fix works (runner envs diverge from production, the WASM kernel never runs the corpus, +the JS gate is structurally red, one test passed *because of* the bug it tested). So Phase 1 Track A +(gate repair) comes before the bulk of the semantic work — otherwise fixes land blind. + +--- + +## Phase 0 — Decisions (BLOCKING; maintainer; no code) + +Nothing in Phases 2+ that changes observable semantics should merge before the relevant ruling is +ratified. These three decisions unblock ~40 findings. + +### D1. Host lineup +Evidence: the JS-transpiled bundle is hollow (C0a: define-library files → 0 bytes) and its gate is +red (C0b: 2490/5086 fail); nothing serves it. The standalone Python host cannot load (C30/PY). +Production = OCaml native + WASM kernel (one OCaml library) + the load-bearing Python parser/bridge +in `shared/sx/`. +**Recommendation:** declare the kernel family the only evaluator targets; retire `hosts/javascript` ++ `hosts/python` standalone; shrink `shared/sx/parser.py` to a wire-subset with a parity suite. +→ Ratifying this **closes W13 entirely** (C0a/C0b/JS1–JS8 become "delete") and simplifies W6/W7. + +### D2. Ratify RULINGS.md (R1–R40) +Each ruling is one normative answer + one mechanical fix. 
Ratify in a pass; four need a +pre-ratification usage sweep because they're high-churn: **R17** (arity: kill nil-fill), **R9** +(cond flat-only), **R31** (append! errors on derived lists), **R15a** (HO swap only when +unambiguous). See RULINGS.md for the per-ruling recommendation. + +### D3. Define the merge gate +Recommendation: (a) native `run_tests` green with hs-upstream skip-listed; (b) same corpus on the +WASM kernel; (c) cross-kernel differential battery output-identical; (d) CEK-vs-forced-JIT +differential when JIT is on; (e) `sx_ref.ml` regen + diff. This is W14's definition of done. + +--- + +## Phase 1 — Trustworthy verification + stop the bleeding + +### W14. Test gate & conformance infrastructure *(do FIRST — everything else verifies against it)* +Findings: C0b, C9, C21, C22, C23, C3, C4, C5, C6, C7, F2, F3, F4, F5, F6, F7, F8, F9, F10, F11, +F12, K19 (harness/runtime primitive drift, partial from batch), K104 (harness log-before-mock). +Approach: +1. **Unify runner env with production env** — delete or productionize every runner-only binding: + `values`/`call-with-values` (F7, K42), the JS runner's fake sha3/equal?/apply/env-set! shims + (JS5, F7). Rule: if the spec needs it, it's a kernel primitive; if not, the test can't have it. +2. **WASM corpus runner** in CI (F2) — promote conformance's `run_wasm.js` prototype. +3. **MCP harness honesty** (K19): `mcp_tree.ml` drops its parallel primitive table and links real + `sx_primitives` (batch aligned 8 entries as a stopgap); make `sx_harness_eval` fresh per call. +4. **Harness fixes**: log IO before invoking the mock (C22/K104); real perform/suspend mode (C21); + adapter-dom render-output tests (C23). +5. **Epoch-loop protocol fuzz suite** (C3/C4/C5/C6/C7) + skip-list hs-upstream (F10) + empty suite + label (C9). +6. **Test-debt ledger**: pin every confirmed finding with a failing test FIRST — the three lane + files are a ready-made corpus of minimal reprs. 
**Batch gap to close: dc7aa709's fixes have no + pinning tests** (except crit-2, now non-vacuous). Add tests for K09, K11, K18, K20, K39, K49, + C1/C1b, S4 before further evaluator work. +Gate: none (this IS the gate). Status: OPEN — highest priority. + +### W1. Condition system & delimited continuations *(the kernel criticals)* +Findings: K01 (guard/handler re-raise hang — CRITICAL), K03 (shift-k nested cek-run double-exec — +CRITICAL), K10 (dynamic-wind re-entry + sibling winder corruption), K12 (`->` non-HO steps in +nested CEK), K36 (guard multi-expr clause body — inherits cond fix W5), K41 (host errors uncatchable +by guard), K57 (strict errors uncatchable), K106 (SUSPECTED: expand-macro/let-values/qq nested-eval +boundaries), S10 (VM inline-IO in HO callbacks can't suspend). K02 [DONE]. +Root cause (shared by K03/K12/K106/S10): evaluation crosses a **nested `cek-run`/`trampoline +(eval-expr)` boundary** the outer continuation can't see. One architectural fix — invoke +continuations and evaluate these sub-expressions via CEK frames, not nested runs — resolves the +cluster. K01 is separate: run handlers with the OUTER handler set (unwound kont must EXCLUDE the +matched frame); make guard clause bodies evaluate after the escape (the no-match auto-reraise path +already does this — make it the only path). K10: common-ancestor before/after algorithm + winders +stored per-continuation, not one global length-keyed stack. K41/K57: raise host/primitive/strict +errors as structured catchable conditions (needs R7). +Gate: R6 (handler installation), R7 (what's catchable), R8 (raise-continuable). Status: OPEN — +**K01 is the single highest-value fix left** (DoS-able hang, server + browser). + +### W3. 
HTTP-mode concurrency & serving safety *(production robustness — lib/host is LIVE)* +Findings: S1 (multi-Domain render race — LIVE CRASH), S2 (per-request globals read by queued +workers), S3 (`expand-components?` bind/remove on shared env), S5 (cache key ignores cookies/query), +S11 (URL evaluated as SX — any env binding invokable), S12 (island hydration reuses no SSR DOM), +S13 (SSR/client purity, no dev-mode check), K30 (emit!/emitted cross-request — shared with W2). S4 +[DONE], J1/J2/J3 mitigated by the batch JIT gate. +Approach: serialize or isolate rendering (S1: lock `_stream_mutex` or per-Domain env/cache); +per-request state carried with the request not process-global (S2/S3/K30); include query in cache +key + cookie policy (S5); whitelist URL-routable bindings to a `page:` prefix (S11); hydration cursor ++ dev-mode purity check (S12/S13). Pairs with W2 (per-flow scope stacks). +Gate: none for S1/S4/S5 (safety). Status: OPEN — S1 is a live crash. + +--- + +## Phase 2 — Correctness families (each = ruling + fixes + conformance rows) + +### W2. Environment & scope integrity +Findings: K04 (caller frame leaks into interpreted lambda + JIT disagrees), K05 (letrec injects into +foreign closures — global contamination), K06 (named-let leaks loop name), K07 (~60 unshadowable +names; = J8 VM-honors-CEK-doesn't), K30 (emit! cross-request — shared W3), K31 (provide leak on +raise/shift), K32 (provide! ambient global), K33 (set! unbound creates + JIT/interp split brain), +K40 (scope :value dead + dead frame type), K107 (SUSPECTED env_merge depth-100 flip). +Approach: fresh frames for letrec/named-let (K05/K06); drop the top-frame copy in env_merge +(K04/K107); reserved-words error for dispatch names, aligning VM+CEK (K07/J8); unwind-safe + +invocation-scoped dynamic state — one mechanism for provide/emit!/batch (K30/K31/K32); set!-unbound +per R1 + kill the JIT/interp global split (K33); remove dead scope :value/frame (K40). +Gate: R1 (set!), R2 (reserved names). 
Status: OPEN. + +### W4. Higher-order forms & threading +Findings: K13 (2-arg reduce returns coll), K14 (reduce init-swap), K15 (data-first drops extra +args), K43 (O(n²) map/filter), K44 (HO names not first-class), K45 (cryptic uncatchable HO errors), +K46 (multi-coll rejects strings/vectors), K47 (thread lambda literal), K78 (component in HO → zeros), +K79 (dead `|>`), K80 (keyword getters in HO/->), K81 (zero-arg HO silent ()), J7 (VM data-first +deopt — shared W11). +Approach: implement R15 sub-rulings (swap only when one arg callable + error otherwise; reduce +arities; drop-extra→error; multi-coll seq-to-list parity; HO first-class; zero-arg→error); fix O(n²) +via reversed-cons accumulation; delete dead `|>`. +Gate: R13 (threading), R15 (HO forms). Status: OPEN. + +### W5. Special forms & macros +Findings: K08 (cond dual-grammar — silent side-effect drops), K34 (qq depth), K35 (qq dict +traversal), K37 (&key misbind on fn/defmacro), K38 (splice non-list/malformed), K70 (case else any +position), K71 (case dialect + punning), K72 (letrec parallel + ref-before-init), K76 (defmacro +unhygienic vs "hygiene" test name), K77 (match guard clauses silently structural), K42 (values — +special forms now registered [DONE-partial]; `values` primitive still runner-only). K09/K11/K39 [DONE]. +Approach: cond flat-only + explicit begin (R9); qq depth tracking + dict traversal + splice arity +errors (R12); &key one binding path for fn/defmacro/component (R5/K37); case final-else + evaluated- +datum doc + clause-syntax error (R10); letrec* + ref-before-init error (R4); match guards implemented +or error (R14); make `values` a real kernel primitive (finish K42). +Gate: R4, R5, R9, R10, R12, R14. Status: OPEN (K09/K11/K39 done). + +### W6. 
Parser, serializer, canonical form & CIDs +Findings: K21 (canonical.sx runner-only helpers), K22 (serializer dict-key escaping + CID fixed- +point), K23 (four divergent ident/number classifiers), K24 (`1e`→nil), K25 (guest rationals throw), +K63 (`#;` before `)`), K64 (`=` no Char arm — shared W7), K65 (`#\a` mcp crash), K66 (multibyte char +literals), K67 (`\uXXXX` validation), K68 (unknown-escape divergence), K69 (`#name` reader macro +unimpl on OCaml), K100 (parse error locations), K101 (dict literal edges), K102 (`#|` raw string), +K103 (`:`/`::` keyword edges), K108 (SUSPECTED cross-host CID nondeterminism), C25 (Py↔OCaml escape +corruption), C26 (Py unicode symbols), C27 (Py dict order — shared W7/P9). +Approach: ONE normative ident/number classifier bound by every surface (R32); \u validation + +unknown-escape error + datum-comment fix (R27/R33); native `#name` reader macro registry; +canonical path = native CBOR/CID normative, spec/canonical.sx tested mirror or deleted, property- +test `parse(serialize(x))=x` and canonical fixed-point cross-kernel (R34/R35). CID determinism +(K108/K35-in-canonical) is sx-pub-critical. +Gate: R27, R32, R33, R34, R35. Status: OPEN. + +### W7. Numbers, equality, strings, collection primitives +Findings: K17 (append! 
silent no-op), K52 (byte-based strings), K53 (spec/runtime primitive drift), +K54 (div-by-zero inconsistency), K55 (`/` doc), K56 (sort no comparator), K64 (char equality), K85 +(binary `=`, exactness conflation), K86 (rounding/inexact->exact/sqrt), K87 (float/nil rendering), +K88 (nil/empty tolerance), K89 (keys reverse order — GATED, see R29 note: breaks render tests), K90 +(keyword-name on evaluated kw), K91 (string->number), K92 (apply doesn't spread), P1 (lossy float +wire), P2 (sort mixed int/float), P3 (into needs bridge), P4 (int63 vs float64), P5 (= not deep on +JS dicts, missing eq?/eqv?), P6 (string units), P7 (JS coercion cluster — GATE D1), P8 (nil/list +strictness), P10 (NaN/Inf wire tokens), P11 (upcase/round), P12 (zip-pairs). K18/K20 [DONE]. +Approach: append! errors on non-mutable lists + deprecate (R31); codepoint string semantics (R25); +implement eq?/eqv?, add `=` Char arm, n-ary comparisons (R19); exact-±2^53 + overflow-promote (R21); +shortest-round-trip float printing + inf/nan wire tokens (R23); div-by-zero catchable (R22); apply +spreads (R16); sort comparator + numeric compare (R30/P2); into native, contains?-on-dict [done], +merge-skip-nil, zip-pairs sliding window (R30/P8/P12); reconcile spec/runtime primitive lists (K53). +Gate: R16, R19, R21, R22, R23, R25, R29, R30, R31. Status: OPEN (K18/K20 done). + +### W8. Render pipeline +Findings: K16 (infinite recursion no depth guard), K48 (attr-name injection — XSS class), K50 (aser +list kwargs), K51 (dom/html attr parity = C19), K82 (bool-attr truthiness footgun), K83 (dead +is-render-expr? / html: tags), K84 (script/style escaping), K87 (float render — shared W7), C19 (=K51), +C20 (CSRF cross-origin), S14 (deep nested-list flatten html vs aser), S9 (SPA boosted-nav fragility). +K49 [DONE]. 
Approach: depth limit + cycle guard (K16); attr-name validation (R36/K48); quote aser +list kwargs (R38/K50); align 4 adapters on bool-attr contract (R36/C19/K51); script/style raw-text +error-on-breakout (R37/K84); wire or delete is-render-expr? (R37/K83); depth-2 aser/html parity test +(S14). CSRF cross-origin (C20) + SPA manifest staleness (S9, overlaps W11 stale-bundle). +Gate: R36, R37, R38. Status: OPEN (K49 done). + +### W9. Strict typing +Findings: K26 (HO callbacks bypass), K27 (apply bypasses), K58 (unknown type names match all), K59 +(keyword type dead / components untypeable), K60 (component &key misalign), K93 (name-keyed, evaded), +K94 (set-prim-param-types! no validation), K95 (too-few args skip checks), K96 (`(:as type)` unenforced), +K97 (paper cuts). Approach (R20): move checks to continue-with-call/vm_call chokepoints (covers HO, +apply, components, => receivers); validate type names at declaration; real "component" branch, remove +dead "keyword" (R18); `(:as type)` as the declaration channel; merge+validate set-prim-param-types!; +strict errors catchable (R7, shared W1). Return types explicitly out of scope. +Gate: R7, R18, R20. Status: OPEN. + +### W10. Signals & coroutines +Findings: K28 (dispose-computed no-op), K29 (batch wedge on exception), K61 (identity-not-equality +change detection), K62 (diamond glitch), K98 (batch unusable on server / coroutines inert), K99 +(effect cleanup double-invoke), K109 (SUSPECTED coroutine non-yield wedge), K110 (SUSPECTED VM no +strict — shared W9/W11). Approach (R39): `=`-based change detection (needs W7 R19); unwind-safe batch +(shared W2 mechanism); two-phase/topological notify for glitch-freedom; fix dispose-computed + effect +cleanup; make batch/coroutines work outside run_tests (bind batch-begin!/end! + cek hooks in real +envs, or fold into kernel). Zero test coverage today — add suites. +Gate: R39. Status: OPEN. + +### W12. 
Python bridge & boundary *(load-bearing in production)* +Findings: C24 (boundary validation dead — [DONE-partial]: now warns; full revival needs tier-1 +declarations recreated + zero-violation proof since SX_BOUNDARY_STRICT=1 is live), C28 (two SxExpr +classes double-quote), C29 (reader-macro auto-resolve broken), C30 (standalone Python host dead — +GATE D1: delete), C31 (14/33 test files broken + 5 live failures), S-bridge (coroutine-cancel +desync, no timeouts, dead _restart), S-bridge2 (numeric-result-as-epoch ambiguity), K42 (values — +shared W5). C25/C26/C27 live in W6 (parser). Approach: finish C24 (recreate declarations, prove +clean, re-enable); single SxExpr class (C28); fix OcamlSync.start→_ensure (C29); bridge timeouts + +working _restart (S-bridge); robust (ok N V) parse (S-bridge2); fix/retire broken tests (C31). +Gate: D1 (C30). Status: OPEN (C24 partial). + +### W11. JIT correctness (serving-JIT re-enable preconditions) +Findings: J1 (`->` miscompile), J2 (fallback re-runs whole call — double side effects), J3 (macro +args eager), J4 (VM component kwargs misparse), J5 (specialized opcodes freeze redefs), J6 (compiler- +used prim redef poisons), J7 (data-first deopt — shared W4), J10 (stale Sx_compiler stub), J11 (JIT +debug paths diverge), K33 (set! split brain — shared W2), K19 (harness drift — shared W14), C10 +(browser compiler one fix behind), C11 (stale module-manifest.sx), C12 (dead SOURCE_MAP paths), C14 +(stale dist/ bundle). J12 = positive (perform/resume fixed). Currently MITIGATED (JIT gated OFF in +both epoch and — post-batch — HTTP mode). Approach: fix compile-thread-step (J1); fallback-before- +side-effects or compile-time reject of fallback-prone forms (J2); macro-aware compile (J3); keyword +tagging in constant pool (J4); redefinition invalidation (J5/J6); one browser-compiler sync pipeline ++ single bundle dir (C10/C11/C12/C14). Do NOT re-enable serving-JIT until the CEK-vs-JIT differential +(W14) is green. +Gate: W14 differential. 
Status: DEFERRED (mitigated; only unblock if serving-JIT is wanted). + +### W13. JS host *(GATE D1 — likely "delete")* +Findings: C0a (hollow bundle), C0b (2490 fail gate), JS1 (define-record-type/makeRtd), JS2 (host- +callback type tag), JS3 (arithmetic drops args), JS4 (`.` symbol), JS5 (runner shims), JS6 (str nil), +JS7 (no qq emission), JS8 (stale metadata). If D1 retires the JS bundle: delete `hosts/javascript`, +remove from `sx-build-all.sh`/CI, keep only the WASM kernel path. If kept: this is a ~2500-test +revival project. Gate: D1. Status: BLOCKED on D1. + +--- + +## Phase 3 — Hygiene & docs + +### W15. Hygiene & documentation +Findings: C8 (triplicated hosts/ocaml/hosts/ tree), C13 (test_platform.js stale path), C15 (tracked +stale wasm blob), C16 (orphaned hosts/native), C17 (sx-platform-2.js + 23 dead .sxbc.json), C18 +(spa-debug.js + root clutter), C2 (r7rs string->number radix shadow), F14 (doc drift — batch fixed +canonical-ref + island rules; suite counts + case-syntax + primitives-header still stale), F15 +(sha3 stub / test.sx dead filename), F13 (regen reproducibility — [DONE] as batch side effect). +K105/K73 [DONE]. Approach: delete dead trees/blobs/files; fix r7rs shadow (C2); finish CLAUDE.md +(suite counts, case syntax); regen-diff CI check (F13 → make it a gate in W14). +Gate: D1 (some deletions). Status: OPEN (K105/K73/F13 done). + +--- + +## Suggested execution shape (maps to the loop workflow) + +Four loops, mostly independent after Phase 0: +1. **loops/sx-gate** (W14 + W15 hygiene) — the enabler. Start FIRST. Pins tests for the dc7aa709 + batch, builds the WASM corpus runner + differential battery, unifies runner env, cleans dead code. +2. **loops/sx-kernel** (W1 + W2 + W5) — condition system, scope integrity, special forms. Single + owner (touches evaluator.sx + regen). TDD off W14's pinned tests. K01 first. +3. 
**loops/sx-runtime** (W3 HTTP safety + W12 Python bridge) — production robustness; can run + parallel to kernel since it's mostly host OCaml + Python, not spec. +4. **loops/sx-families** (W4, W6, W7, W8, W9, W10) — one family at a time, each gated by its rulings + + the new batteries. W6/W7 pay the sx-pub CID debt. +W11 (JIT) and W13 (JS) are decision-gated and sit out until D1 + a green differential exist. + +**Sequencing rule:** no semantic fix merges before (a) its pinning test exists, (b) the relevant +ruling is ratified, (c) native + WASM both run it. D1/D2/D3 are the only hard blockers. + +--- + +## Coverage ledger — every finding accounted for + +Status key: DONE (dc7aa709) · OPEN · PARTIAL · DEFERRED · GATE(Dn) · dup→(primary). Workstream in []. + +### Core (K01–K110) +- K01 [W1] OPEN — guard/handler re-raise hang (CRITICAL, highest value) +- K02 [W1] DONE — signal-return frame key +- K03 [W1] OPEN — shift-k nested cek-run (CRITICAL) +- K04 [W2] OPEN · K05 [W2] OPEN · K06 [W2] OPEN · K07 [W2] OPEN (=J8) +- K08 [W5] OPEN — cond dual grammar +- K09 [W5] DONE · K10 [W1] OPEN · K11 [W5] DONE +- K12 [W1] OPEN (=W4 threading) · K13 [W4] OPEN · K14 [W4] OPEN · K15 [W4] OPEN +- K16 [W8] OPEN · K17 [W7] OPEN — append! · K18 [W7] DONE · K19 [W14] PARTIAL · K20 [W7] DONE +- K21 [W6] OPEN · K22 [W6] OPEN · K23 [W6] OPEN · K24 [W6] OPEN · K25 [W6] OPEN +- K26 [W9] OPEN · K27 [W9] OPEN · K28 [W10] OPEN · K29 [W10] OPEN +- K30 [W2/W3] OPEN — emit! cross-request (=S2 dir) +- K31 [W2] OPEN · K32 [W2] OPEN · K33 [W2/W11] OPEN — set! 
split brain +- K34 [W5] OPEN · K35 [W5/W6] OPEN · K36 [W1/W5] OPEN · K37 [W5] OPEN · K38 [W5] OPEN +- K39 [W5] DONE · K40 [W2] OPEN · K41 [W1] OPEN · K42 [W5/W12] PARTIAL (forms registered; `values` prim runner-only) +- K43 [W4] OPEN · K44 [W4] OPEN · K45 [W4] OPEN · K46 [W4] OPEN · K47 [W4] OPEN +- K48 [W8] OPEN · K49 [W8] DONE · K50 [W8] OPEN · K51 [W8] OPEN (=C19) +- K52 [W7] OPEN · K53 [W7] OPEN · K54 [W7] OPEN · K55 [W7] OPEN · K56 [W7] OPEN +- K57 [W1/W9] OPEN · K58 [W9] OPEN · K59 [W9] OPEN · K60 [W9] OPEN +- K61 [W10] OPEN · K62 [W10] OPEN · K63 [W6] OPEN · K64 [W6/W7] OPEN — char `=` +- K65 [W6] OPEN · K66 [W6] OPEN · K67 [W6] OPEN · K68 [W6] OPEN · K69 [W6] OPEN +- K70 [W5] OPEN · K71 [W5] OPEN · K72 [W5] OPEN · K73 [W15] DONE +- K74 [W2] OPEN (component &key false→nil; R5) · K75 [W2] OPEN (trailing kw; R5) +- K76 [W5] OPEN · K77 [W5] OPEN · K78 [W4] OPEN · K79 [W4] OPEN · K80 [W4] OPEN · K81 [W4] OPEN +- K82 [W8] OPEN · K83 [W8] OPEN · K84 [W8] OPEN · K85 [W7] OPEN · K86 [W7] OPEN · K87 [W7/W8] OPEN +- K88 [W7] OPEN · K89 [W7] OPEN — keys order, GATED R29 (breaks render tests, see RULINGS note) +- K90 [W7] OPEN · K91 [W7] OPEN · K92 [W7] OPEN — apply spread +- K93 [W9] OPEN · K94 [W9] OPEN · K95 [W9] OPEN · K96 [W9] OPEN · K97 [W9] OPEN +- K98 [W10] OPEN · K99 [W10] OPEN · K100 [W6] OPEN · K101 [W6] OPEN · K102 [W6] OPEN · K103 [W6] OPEN +- K104 [W14] OPEN · K105 [W15] DONE +- K106 [W1] OPEN (SUSPECTED nested-eval boundaries) · K107 [W2] OPEN (SUSPECTED) +- K108 [W6] OPEN (SUSPECTED CID nondeterminism) · K109 [W10] OPEN (SUSPECTED) · K110 [W9/W11] OPEN (SUSPECTED) + +### Hosts — JIT (J1–J12) +- J1 [W11] DEFERRED (mitigated: JIT gated off) · J2 [W11] DEFERRED · J3 [W11] DEFERRED +- J4 [W11] DEFERRED · J5 [W11] DEFERRED · J6 [W11] DEFERRED · J7 [W11/W4] DEFERRED +- J8 [W2] OPEN dup→K07 · J9 [W11/W14] DEFERRED · J10 [W11] DEFERRED · J11 [W11] DEFERRED +- J12 POSITIVE (no action — perform/resume verified fixed) + +### Hosts — kernel/protocol/build (C*) +- 
C0a [W13] GATE(D1) · C0b [W13/W14] GATE(D1) · C1 [W3] DONE · C1b [W3] DONE +- C2 [W15] OPEN · C3 [W14] OPEN · C4 [W14] OPEN · C5 [W14] OPEN · C6 [W14] OPEN · C7 [W14] OPEN +- C8 [W15] OPEN · C9 [W14] OPEN · C10 [W11] DEFERRED · C11 [W11] DEFERRED · C12 [W11/W15] OPEN +- C13 [W15] OPEN · C14 [W11/W15] OPEN · C15 [W15] OPEN · C16 [W15] OPEN · C17 [W15] OPEN · C18 [W15] OPEN +- C19 [W8] OPEN dup→K51 · C20 [W8] OPEN · C21 [W14] OPEN · C22 [W14] OPEN · C23 [W14] OPEN +- C24 [W12] PARTIAL · C25 [W6] OPEN · C26 [W6] OPEN · C27 [W6/W7] OPEN dup→P9 +- C28 [W12] OPEN · C29 [W12] OPEN · C30 [W12] GATE(D1) · C31 [W12] OPEN + +### Hosts — JS host (JS1–JS8) +- JS1–JS8 [W13] all GATE(D1) — delete if JS retired, else ~2500-test revival + +### Hosts — cross-host parity (P1–P12, PY) +- P1 [W7] OPEN · P2 [W7] OPEN · P3 [W7] OPEN · P4 [W7] OPEN · P5 [W7] OPEN · P6 [W7] OPEN +- P7 [W7] GATE(D1) · P8 [W7] OPEN · P9 [W6/W7] OPEN (=C27) · P10 [W7] OPEN · P11 [W7] OPEN · P12 [W7] OPEN +- PY [W13] GATE(D1) dup→C30 + +### Hosts — HTTP/suspected (S1–S14, S-bridge*) +- S1 [W3] OPEN (LIVE CRASH) · S2 [W3/W2] OPEN · S3 [W3] OPEN · S4 [W3] DONE · S5 [W3] OPEN +- S6 [W14] OPEN · S7 [W14/W1] OPEN (unify eval/IO paths) · S8 [W13/W8] OPEN (browser env prims) +- S9 [W8/W11] OPEN · S10 [W1] OPEN · S11 [W3] OPEN · S12 [W3] OPEN · S13 [W3] OPEN · S14 [W8] OPEN +- S-bridge [W12] OPEN · S-bridge2 [W12] OPEN + +### Conformance (F1–F15, conf-S1–S5) +- F1 [W7] OPEN dup→K18/P4 (WASM int wrap) · F2 [W14] OPEN · F3 [W7/W6] OPEN (apply + dict order) · F4 [W13/W14] GATE(D1) +- F5 [W14] OPEN (host-neutral corpus) · F6 [W14] OPEN (directories one-host-gated) · F7 [W14] OPEN dup→K42 +- F8 [W14] OPEN (differential battery) · F9 [W7/W14] OPEN (primitive parity) dup→K53 · F10 [W14] OPEN (skip hs) +- F11 [W12] OPEN dup→C24 · F12 [W6] OPEN dup→C25/26/27 · F13 [W15] DONE · F14 [W15] PARTIAL · F15 [W15] OPEN +- conf-S1 [W14] OPEN (native-vs-WASM web-stack diff) · conf-S2 [W14] OPEN (hyperscript unverifiable) +- conf-S3 
[W11] OPEN (import path browser vs test) · conf-S4 [W14] OPEN (float golden precision) · conf-S5 [W11] OPEN (JS build-flag ADT divergence) + +### Tally +~213 finding-instances. DONE: 13 (dc7aa709). PARTIAL: 4 (K19, K42, C24, F14). DEFERRED: 12 (W11 JIT). +GATE(D1): ~16 (JS host + Python standalone). OPEN: the rest, distributed across W1–W12/W14/W15. diff --git a/plans/sx-review/README.md b/plans/sx-review/README.md new file mode 100644 index 00000000..dd9422b3 --- /dev/null +++ b/plans/sx-review/README.md @@ -0,0 +1,21 @@ +# SX Review — 2026-07-03 + +Findings from three parallel review sessions of the SX language/runtime, plus the master +remediation plan. + +| File | What | +|------|------| +| **PLAN.md** | Master remediation plan: 15 workstreams (W1–W15), execution order, and a full per-finding coverage ledger. Start here. | +| **RULINGS.md** | 40 draft normative rulings (R1–R40). Phase-0 gate — ratify before the semantics fixes. | +| core.md | Language core / spec semantics lane (K01–K110). | +| hosts.md | Per-host implementations + FFI lane (J*, C*, JS*, P*, S*, PY). | +| conformance.md | Cross-host agreement + test adequacy lane (F1–F15, S1–S5). | + +**Status:** the quick-wins batch (commit dc7aa709) landed 13 fixes + 4 partials; suite at baseline +5762p/274f (fail set byte-identical). Everything else is OPEN/GATE/DEFERRED per PLAN.md's ledger. + +**Highest-value open items:** K01 (guard/handler re-raise hang — DoS-able, server+browser), +S1 (live HTTP crash under load), K03 (shift-k double-execution), and W14 (test gate — the enabler +that makes all other fixes verifiable). + +**Blocking decisions (maintainer):** D1 host lineup, D2 ratify rulings, D3 gate definition. diff --git a/plans/sx-review/RULINGS.md b/plans/sx-review/RULINGS.md new file mode 100644 index 00000000..9f123d0f --- /dev/null +++ b/plans/sx-review/RULINGS.md @@ -0,0 +1,396 @@ +# SX RULINGS — normative decisions on every ambiguity surfaced by the 2026-07-03 review + +DRAFT for ratification. 
Each ruling: STATUS `PROPOSED` → flip to `RATIFIED` / `REJECTED` / +`AMENDED: `. Once ratified, this file moves to `spec/RULINGS.md` and becomes the +authority the conformance batteries pin against. Evidence citations: core.md finding names, +hosts.md J/C/JS/P/S codes, conformance.md F codes. + +**Default posture used for recommendations** (override per-ruling as you see fit): +1. Prefer an ERROR over any silent behavior (silent drop/no-op/misparse caused the worst findings). +2. Prefer R7RS/standard semantics where churn is low; prefer current-behavior-plus-documentation + where churn is high and behavior is defensible. +3. Every ruling lands with conformance rows that run on BOTH production kernels (native + WASM). + +**Companion decisions (not language rulings, restated for context):** +- D1 host lineup — recommended: kernel family (native OCaml + WASM) are the only evaluator + targets; hosts/javascript and hosts/python standalone retired; shared/sx/parser.py shrunk to a + wire-subset with a parity suite. Rulings below marked [D1] simplify to kernel-only if ratified. +- D3 gate — recommended: native corpus green (hs-upstream skip-listed) + same corpus on WASM + + cross-kernel differential battery + CEK-vs-JIT differential (when JIT on) + sx_ref.ml regen diff. + +--- + +## A. Bindings & scope + +### R1. `set!` on an unbound name +- Current: silently creates a root binding (tested intent, test-scope.sx:196) — but BOTH spec docs + say error (eval-rules.sx:112, special-forms.sx:141), and under JIT it writes a different global + table than the interpreter (split brain). +- RECOMMENDATION: **ERROR** ("set!: is not bound — use define"). Typo'd set! is a bug-hider; + the docs already promise this. Flip test-scope.sx:196; sweep the corpus for reliance (expected + small — the idiom is define-then-set!). Either way the JIT/interpreter split MUST die. +- Churn: low-medium. Findings: core set!-unbound; hosts J-globals split. STATUS: PROPOSED + +### R2. 
The ~60 special-form/HO names (`map`, `filter`, `bind`, `match`, `do`, `case`, `->`, …) +- Current: `define`/`let`/`defmacro` of these names is silently accepted but ignored in call + position (CEK); the VM honors them (J8) — worst of both worlds. +- RECOMMENDATION: **reserved words** — `define`/`let`/`set!`/`defmacro` of any dispatch-table name + is a load-time ERROR. Publish the list in spec. Align the VM. (Full lexical honoring is more + Schemely but taxes every list-head dispatch and rescues little real code.) +- Churn: low (error surfaces existing dead definitions). Findings: core unshadowable-names; J8. STATUS: PROPOSED + +### R3. `let` semantics +- Current: sequential (`let*`), body = implicit begin, on BOTH engines (tested intent). CLAUDE.md + island rules claim the opposite (describes a dead evaluator). +- RECOMMENDATION: **ratify current behavior**: `let` ≡ `let*`; body sequences. Fix CLAUDE.md. + Document (or forbid) the observed letrec-ish quirk that binding-init lambdas capture the shared + frame (`(let ((f (fn () a)) (a 5)) (f))` → 5). +- Churn: zero (docs only). Findings: core let-docs; hosts handoff let-sequential. STATUS: PROPOSED + +### R4. `letrec` +- Current: parallel (all inits evaluated, then bound); read-before-init yields nil silently; PLUS + two outright bugs (names injected into foreign lambdas' closures = global contamination; + named-let loop name leaks into and clobbers the enclosing frame). +- RECOMMENDATION: **letrec\* semantics** (sequential init) with ERROR on read-before-init + (pre-bind to an "uninitialized" sentinel that faults on read). Named-let binds its loop name in + a fresh frame, invisible after the form. The closure-injection and frame-leak are bugs to fix + regardless of ruling. +- Churn: low. Findings: core letrec-parallel/-injection/named-let. STATUS: PROPOSED + +### R5. 
Component `&key` conventions +- Current: `:flag false` is coerced to nil (indistinguishable from omitted); trailing keyword with + no value silently binds nil; `&key` on plain fn/defmacro silently misbinds. +- RECOMMENDATION: `false` is a legal &key value (bind via has-key, not `(or …)`); trailing keyword + without a value = ERROR; `&key` in fn/defmacro either implemented identically to components or + ERROR at definition (recommend: implement — one binding path for all three). +- Churn: low. Findings: core &key-false / trailing-kw / defmacro-&key. STATUS: PROPOSED + +## B. Errors & conditions + +### R6. Handler installation semantics +- Current: a handler runs with ITSELF still installed → any raise/error inside a guard clause or + handler-bind handler loops forever (crit 1; WASM-verified). +- RECOMMENDATION: **R7RS/CL semantics** — handlers run with the OUTER handler set; guard clause + bodies evaluate after the escape (the no-match auto-reraise path already does this correctly — + make it the only path). +- Churn: zero for correct code (only un-hangs broken cases). Findings: crit 1, guard family. STATUS: PROPOSED + +### R7. What is catchable +- Current: only guest `(raise …)` reaches guard; host primitive errors, undefined-symbol, arity + errors, and strict type errors all blow through every handler. +- RECOMMENDATION: **everything is a condition.** Host/primitive/strict/undefined-symbol errors are + raised as structured condition dicts ({:type :message :op …}) through the same channel guard + sees. Reserve a non-catchable class only for kernel panics. +- Churn: low-medium (code that "relied" on uncatchability is unlikely). Findings: core + host-errors-uncatchable, strict-uncatchable; enables sane server error pages. STATUS: PROPOSED + +### R8. `raise-continuable` / `signal-condition` +- RECOMMENDATION: ratify R7RS: handler's value returns to the signal site (the current + whole-program-result behavior is crit 2's frame-key bug, not a semantic choice). 
STATUS: PROPOSED + +## C. Special forms + +### R9. `cond` grammar — kill the dual-mode heuristic +- Current: flat pairs documented; undocumented Scheme clause mode auto-detected iff every arg is a + 2-element list → silent side-effect drops, mode flips, wrong values (core cond-ambiguity). +- RECOMMENDATION: **flat pairs only**: `(cond t1 r1 t2 r2 … :else d)`. Multi-expression results + use explicit `(do …)`. Support arrow as a flat triple `t => receiver`. A clause-shaped arg list + as a test position is just evaluated — no mode detection ever. Migrate the cond-arrow suite + (test-r7rs.sx:135-145) and any clause-mode usage (sweep needed). +- Churn: medium (sweep + migrate clause-mode call sites). Findings: core cond-ambiguity, + guard-multi-expr (inherits). STATUS: PROPOSED + +### R10. `case` +- RECOMMENDATION: ratify the flat evaluated-datums form and document it (vals ARE evaluated, + first-match, structural `=`); `:else`/`else` legal ONLY in final position (else ERROR); Scheme + datum-list clause syntax → clear parse-time ERROR ("use flat pairs"). Keyword/string punning + follows R21 and gets documented. +- Churn: low. Findings: core case-else-position / case-dialect. STATUS: PROPOSED + +### R11. `do` +- Current: `do` is a begin-alias EXCEPT when its first form's head is a list — then it's a Scheme + do-loop → IIFE misparse. +- RECOMMENDATION: **`do` = begin alias, always.** Scheme do-loop moves to a distinct name + (`do-loop`) or is dropped (named let covers it). Kills the heuristic. +- Churn: low (sweep for real do-loop usage; expected rare). Findings: core do-IIFE. STATUS: PROPOSED + +### R12. Quasiquote +- RECOMMENDATION, four sub-rulings: + a. `unquote-splicing` becomes an alias of `splice-unquote` (one-line; kills the silent + zero-splice trap; rename the misleadingly-named tests). + b. Implement standard **depth tracking** (nested quasiquote raises quote depth; `,,x` works). 
+ Hosts agree current shallow behavior is consistent-but-nonstandard — fix at spec level. + c. Quasiquote **traverses dict literals** (`{:k ,v}` works). + d. Splicing a non-list and malformed splice arity → ERROR. +- Churn: low (b is the only subtle one). Findings: core qq-longhand/-depth/-dicts/-splice-nonlist. STATUS: PROPOSED + +### R13. Threading `->` / `->>` +- RECOMMENDATION: (a) steps evaluate in CEK frames (bug: guard/IO broken through threading); + (b) a lambda literal as a step = expand-time ERROR; (c) keyword step sugar: `(-> x :k :j)` ≡ + `(-> x (get :k) (get :j))` — cheap, expected, kills the `Not callable: nil` trap; (d) remove the + dead `|>` dispatch branch (parser rejects `|` anyway); (e) fix reduce-seeding via R15. +- Churn: low. Findings: core threading-nested-CEK/-lambda-literal/|>-dead/keywords-as-getters. STATUS: PROPOSED + +### R14. `match` +- RECOMMENDATION: `(pattern (when cond))` guard clauses either implemented or ERROR — never + silently read as a structural pattern (current). Recommend: implement (small, high value). + Document let-match as dict-destructuring-only with a clear error for list patterns. +- Churn: low. Findings: core match-guards. STATUS: PROPOSED + +## D. Calling convention + +### R15. Higher-order forms +- RECOMMENDATION, six sub-rulings: + a. Arg-order swap happens ONLY when exactly one argument is callable (components count as + callable); both-callable or neither → ERROR "map: cannot determine function/collection". + b. `(reduce f coll)` (2-arg) = Clojure-style fold (first element as init, empty coll → error + unless f has identity? keep simple: empty → ERROR); `(reduce init f coll)` and threaded + `(-> init (reduce f coll))` work via the one-callable rule in (a). + c. Data-first with extra args = ERROR (today silently dropped). + d. Multi-collection map coerces every collection with seq-to-list (strings/vectors), zips to + shortest (already); map over a dict iterates `(k v)` pairs. + e. 
HO names are first-class: `map` etc. in value position resolve to real closures so + `(define f map)` / `(apply map …)` work. + f. Zero/one-arg HO calls = arity ERROR (today silently `()`). + Also fix the O(n²) accumulation (implementation, not semantics). +- Churn: medium — (a) changes behavior for ambiguous calls, sweep needed. Findings: core reduce-2arg / + reduce-swap / swap-drops-args / HO-not-first-class / ho-cryptic-errors / multi-coll / zero-arg; + J7 (VM parity). STATUS: PROPOSED + +### R16. `apply` +- Current: native never spreads; WASM spreads 2-arg; test runner has a third behavior (three-way + divergence, F-3 + core corrected finding). +- RECOMMENDATION: **R7RS**: `(apply f a b … rest-list)` spreads, leading args prepended. All + surfaces align; strict checks fire through apply (R20). +- Churn: low (today it mostly errors). STATUS: PROPOSED + +### R17. Arity checking (too-few args) +- Current: missing params silently nil-fill (this is load-bearing: 1-arg `(assert x)` works only + via nil-fill); too-many errors. +- RECOMMENDATION: **ERROR on too-few** as well, with `&optional`/`&key`/`&rest` as the explicit + mechanisms. Sweep required (harness `assert`, any nil-fill reliance). If the sweep turns up + heavy reliance, fallback position: keep nil-fill but document it loudly and make strict mode + error. Primary recommendation stands: error. +- Churn: **high** — flagged as the riskiest ruling; do the sweep before ratifying. Findings: core + strict-too-few / harness-assert nit. STATUS: PROPOSED + +## E. Keywords, equality, types + +### R18 (=R21 referenced above). Keywords +- RECOMMENDATION: ratify current model — keywords self-evaluate to their string name; keyword-ness + exists only in unevaluated AST. Consequences made explicit: `(keyword-name :k)` needs a quote; + `"keyword"` is REMOVED from the strict type system; case/dict punning documented. NOT callable + (R13c covers the getter idiom). +- Churn: zero (docs + removing a dead type branch). 
STATUS: PROPOSED + +### R19. Equality +- RECOMMENDATION (low-churn variant, chosen deliberately over full R7RS split): + a. `=` stays deep structural equality (alias equal?) — ubiquitous in the corpus; add the missing + **Char arm** (today `(= #\a #\a)` → false) and any other missing type arms; document that + `(= 1 1.0)` → true (numeric value equality inside =). + b. Add real `eqv?` (identity + exact numeric/char equality) and `eq?` (alias identical?) as + kernel primitives — they are spec-declared today but implemented NOWHERE. + c. Comparisons `< > <= >=` become n-ary chained (R7RS); `=` stays 2+-ary deep. + d. If content-addressing ever needs exactness-distinguishing equality, that's `eqv?`, not `=`. +- Churn: low. Findings: core eq?/eqv?-missing, =-binary, char-equality; P5. STATUS: PROPOSED + +### R20. Strict typing +- RECOMMENDATION: (a) checks move to the continue-with-call/vm_call chokepoints → HO callbacks, + apply, components, => receivers all covered; (b) unknown type name at declaration = ERROR; + (c) `"component"` becomes a real type branch; `"keyword"` removed (R18); (d) `(:as type)` param + annotations become the declaration channel (deprecate the name-keyed global dict, which is + trivially evaded and inherited by shadowers); (e) strict errors are catchable conditions (R7); + (f) set-prim-param-types! merges and validates; (g) return types: explicitly out of scope now. +- Churn: low-medium. Findings: core strict-* family (8 findings). STATUS: PROPOSED + +## F. Numbers + +### R21. Integer model & overflow +- Current: native = int63 with overflow-promote-to-float on + and * but silent WRAP on expt; + WASM = 32-bit silent wrap (F-1 — production browsers!); JS bundle = float64. +- RECOMMENDATION: spec defines SX integers as **exact within ±2^53** (the portable range); + arithmetic that exceeds the host's exact range **promotes to float** (never wraps) — `expt` + included. 
WASM must be fixed to match (js_of_ocaml int64/boxed or explicit overflow checks) — + hosts lane feasibility-checks the mechanism; silent 32-bit wrap is a bug under any ruling. + Values beyond 2^53 must not be trusted exact across the wire. +- Churn: low at spec level; WASM fix is real hosts work. Findings: F-1, P4, core expt. STATUS: PROPOSED + +### R22. Division & zero +- RECOMMENDATION: integer `/`, `mod`, `quotient`, `remainder` by zero = catchable SX condition + (today: raw OCaml Division_by_zero for mod/quotient, silent `inf` for /); float ops keep IEEE + (inf/nan). `/` doc fixed: returns int when exact, float otherwise (current behavior ratified). +- Churn: low. Findings: core div-by-zero, /-doc. STATUS: PROPOSED + +### R23. Float text & wire +- RECOMMENDATION: **shortest-round-trip printing everywhere** (native `%g` 6-sig-digit printing is + a wire-corruption bug — P1); `inf`/`-inf`/`nan` are THE wire tokens on all hosts (P10); `round` + stays half-away-from-zero, documented (R7RS banker's rejected: churn without benefit); + `inexact->exact` rounding behavior kept + documented; `str 1.0` → keep `"1"` but canonical/wire + serializers must preserve the float/int distinction (`1.0` serializes as `1.0`). +- Churn: low. Findings: P1, P10, core round/float-rendering; canonical CID determinism. STATUS: PROPOSED + +### R24. Rationals +- RECOMMENDATION: `string->number` parses `"1/2"`; `(/ 1 3)` stays float (rationals remain opt-in + via make-rational) — documented; radix arg restored by fixing the r7rs.sx shadow (C2). +- Churn: low. STATUS: PROPOSED + +## G. Strings + +### R25. Unit semantics +- Current: native counts UTF-8 bytes (substring can split codepoints → invalid UTF-8); JS counts + UTF-16 units; constructors are codepoint-aware. Project style mandates UTF-8 text everywhere. +- RECOMMENDATION: **codepoint semantics** for length/substring/index/ref at the spec level; kernel + implements UTF-8-aware ops. 
Accept the perf cost (or add byte-* variants for hot paths later). +- Churn: medium (kernel work + any code relying on byte counts). Findings: core UTF-8 family, P6. STATUS: PROPOSED + +### R26. Case mapping +- RECOMMENDATION: kernel `upcase`/`downcase`/`upper`/`lower` are **ASCII-only, documented** (full + Unicode case tables deferred; JS's full-Unicode behavior dies with D1). Aliases exist on all + surfaces (P11). +- Churn: zero. STATUS: PROPOSED + +### R27. `split` and escapes +- RECOMMENDATION: `split` = literal substring separator, keeps empties, empty separator → chars + (ratifies native; pin with the multi-char test that history shows is needed). String escape + table is normative: `\n \t \r \\ \" \uXXXX(validated: 4 hex digits, scalar value, else ERROR)`; + **unknown escape = parse ERROR** (kills the native-keeps-backslash vs guest-drops-it silent + divergence, C25 direction fight). +- Churn: low. Findings: core split note, \u family, unknown-escape divergence; C25. STATUS: PROPOSED + +## H. Collections, nil, dicts + +### R28. nil vs empty list +- Current: distinct values in the reader/serializer; `(cons 1 nil)` → `(1)` on native (nil-as- + empty in constructors); read ops inconsistent (`first nil` → nil but `reverse nil` → error). +- RECOMMENDATION: nil and `()` remain **distinct values**; collection READ ops uniformly + **nil-pun** (treat nil as empty: first/rest/nth/last/reverse/len/empty? all accept nil); + constructors keep nil-as-empty seeding (cons/append onto nil). `nil?` ≠ `empty?` preserved. +- Churn: low (only un-errors cases). Findings: core nil-tolerance; P7/P8 arms. STATUS: PROPOSED + +### R29. Dict ordering +- RECOMMENDATION: **insertion order preserved** — iteration, keys/vals, and serialization (OCaml + Hashtbl replaced with an insertion-indexed structure; keys-reversed bug dies). CANONICAL form + always sorts keys independently (already true in the CBOR/CID layer). Duplicate literal keys: + last-wins, documented. 
+- EMPIRICAL NOTE (quick-wins batch, 2026-07-03): an interim sorted-keys change broke 4 render + tests — attr emission order flows through dict_keys and the tests PIN source-order attributes + (`width` before `height` etc.). So the current reverse-ish order is load-bearing for render; + any change here must land together with the render-attr ordering contract. Reverted; do not + change keys order except via this ruling. +- Churn: medium (kernel dict rework) but pays across wire/golden/cache findings C27/P9/core-keys. STATUS: PROPOSED + +### R30. Small-primitive contract fixes (spec already says; hosts violate) +- RECOMMENDATION: ratify the spec text and fix: `contains?` on dicts = key check; `merge` skips + nil; `into` native on the kernel; `sort` takes an optional comparator, compares int/float + numerically, stable; `get` returns a STORED nil (default only when key absent); `zip-pairs` = + sliding window per spec (kernel currently chunks); `(max)`/`(min)` zero-arg = ERROR. +- Churn: low each. Findings: core contains?/sort/keys; P2/P3/P8/P12; JS `get` arm. STATUS: PROPOSED + +### R31. `append!` and mutation +- Current: silently no-ops on ANY derived list (map/filter/rest/reverse output) — worst silent- + data-loss finding in the primitives sweep. +- RECOMMENDATION: `append!` **ERRORS on non-mutable lists** immediately (honest), and is + deprecated in favor of persistent `append` + a real mutable vector/buffer for accumulator + idioms. Sweep the corpus (it's a known accumulator idiom in loops). +- Churn: medium (idiom sweep). Findings: core append!. STATUS: PROPOSED + +## I. Parser & wire + +### R32. One token grammar +- RECOMMENDATION: publish the normative ident/number classifier in spec/parser.sx and make every + surface bind THE SAME table (today: four divergent tables → same source, different ASTs). 
+ Specific token rulings: maximal-munch then classify (`1+`, `a,b` are symbols — ratifies native); + hex/binary/octal `#x/#o/#b`-style and `0x10` accepted, documented; `inf`/`nan`/`-inf` are number + literals (reserved, not idents); `1e` and other malformed numbers = parse ERROR (never nil); + unicode identifiers **allowed** (UTF-8 letters — the docs mandate UTF-8 text; native reader + extends its charset); `$`/`|` NOT ident chars; `.` IS a valid symbol (ratifies native; JS4 dies + with D1); `#t`/`#f` = boolean literals on all surfaces. +- Churn: medium (native reader charset + guest table sync). Findings: core parser-divergence + family; C1b (unicode symbol kills server — fixed by charset + C1 try-wrap); JS4. STATUS: PROPOSED + +### R33. Reader extensibility & comments +- RECOMMENDATION: implement the `#name` reader-macro registry on the kernel (spec documents it; + only JS has it today) — small, and sx-pub extensibility wants it. `#;` datum comment valid + before `)` and at EOF (standard). `#|…|` stays a RAW STRING (documented loudly as not-a-block- + comment); no block comments. +- Churn: low. Findings: core reader-macro/datum-comment/raw-string. STATUS: PROPOSED + +### R34. Dict literals & serializer round-trip +- RECOMMENDATION: dict literal keys must be keyword/string/symbol — anything else is a parse ERROR + on every parser (guest currently stringifies `{1 2}` silently); odd form count gets a "dict + needs key-value pairs" error. Serializer: dict keys escaped/round-trippable (today unparseable + output for non-ident keys — also a CID hazard); chars serialize by codepoint (`#\é` readable + back once R25 lands); PROPERTY TEST: `parse(serialize(x)) = x` for the full value lattice, run + on both kernels. +- Churn: low. Findings: core serializer-dict-keys / multibyte-chars / dict-edges. STATUS: PROPOSED + +### R35. Canonical form & CIDs (sx-pub-critical) +- RECOMMENDATION: the **native CBOR/CID path is normative** (key-sorted, verified native==WASM, + F-3). 
The canonical TEXT form is defined as: sorted keys, shortest-round-trip floats with + preserved int/float distinction, fully-escaped strings, and is a fixed point + (canonical(parse(canonical(x))) = canonical(x)) — property-tested cross-kernel. spec/canonical.sx + either becomes a tested mirror of the native path (fix its runner-only helpers) or is deleted; + two silently-diverging implementations is the one unacceptable state. +- Churn: low-medium. Findings: core canonical family; F-3; P9/C27 (via R29). STATUS: PROPOSED + +### R40. Primitive naming & small-default unification (answers the hosts handoff list) +- RECOMMENDATION: one canonical name registry in spec/primitives.sx; per-host aliases die (with + D1 most of these resolve to "make it native on the kernel"): `json-encode`/`json-parse` are + KERNEL primitives (not IO-bridge helpers — today unavailable sandboxed); `regex-*` is the + canonical family name; `parse`/`sx-parse` — `sx-parse` canonical, `parse` alias documented; + 1-arg `(range n)` = 0..n-1 (ratifies native); `parse-int`/`string->number` on failure → nil + (ratifies native, never 0); `format` and the stdlib move for real (the primitives.sx header + claims a stdlib migration that never happened — make the header true or revert it) and + spec/stdlib.sx loads in production (today `format` is unresolved on the server). +- Churn: low. Findings: F-9 naming splits, P7 arms, core spec-drift / stdlib-header. STATUS: PROPOSED + +## J. Render contracts + +### R36. Attribute contract (all four adapters) +- RECOMMENDATION: one contract, HTML-mode's as base: boolean-registry attrs — false/nil omit, + anything else emits bare name (SX truthiness, documented footgun stands); non-boolean attrs — + value stringified INCLUDING `"true"`/`"false"` (DOM adapter aligns — C19/core, found by both + lanes); attribute NAMES validated `[A-Za-z_:][A-Za-z0-9_:.-]*` else ERROR (kills spread-dict + injection); nil attr value omits the attribute. +- Churn: low. 
Findings: core attr-name-injection / bool-footguns / dom-html-parity; C19. STATUS: PROPOSED + +### R37. Raw-text elements & voids +- RECOMMENDATION: `` breakout (good) but breaks real inline code; `raw!` is the workaround. +- Coverage: only script attrs tested, never content + +### [low] [CONFIRMED] Comparison/equality strictly binary; = is deep structural equality conflating exactness +- Repro: `(< 1 2 3)`/`(= 1)` → unstructured arity error (matches spec, deviates from Scheme); `(= {:a 1} {:a 1})` → true; `(= 1 1.0)` → true (dedup-key hazard). + +### [low] [CONFIRMED] Rounding half-away-from-zero, not banker's; inexact->exact rounds; (sqrt -1) → nan +- Repro: `(round 2.5)` → 3 (R7RS: 2); `(inexact->exact 1.5)` → 2 (locked by test-numeric-tower.sx:115 — intended but R7RS-divergent); `(sqrt -1)` → nan silently. + +### [low] [CONFIRMED] Float/nil rendering inconsistencies across str/format/render +- Repro: `(str 1.0)` → `"1"` (float/int distinction lost — also `(div 1.0)` renders `1`); `(str nil)` → `""` but `(format "~a" nil)` → `"()"`; `(format "~d" 3.7)` → `"3"` (silent truncation). + +### [low] [CONFIRMED] Inconsistent nil/empty tolerance across list ops +- Repro: `(first nil)` → nil, `(rest nil)` → `()`, `(nth (list 1 2) 5)` → nil silently — but `(last nil)`, `(reverse nil)`, `(nth nil 0)` all raise. + +### [low] [CONFIRMED] keys returns strings in reverse insertion order +- Repro: `(keys {:a 1 :b 2 :c 3})` → `("c" "b" "a")`. Determinism footgun for serialization/content-addressing. + +### [low] [CONFIRMED] keyword-name unusable on evaluated keywords +- Repro: `(keyword-name :kw)` → error (`:kw` self-evaluates to `"kw"`); only `(keyword-name ':kw)` works. + +### [low] [CONFIRMED] string->number: no rational/whitespace parsing +- Repro: `"1/2"` → nil (despite make-rational), `" 5 "` → nil, `"1e3"` → 1000, garbage → nil (good). 
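The `string->number` finding above, read together with R24 (parse `"1/2"`) and R40 (failure returns nil, never 0), can be modelled outside SX. The Python below is a hypothetical sketch, not the kernel code: `string_to_number` and its regexes are illustrative names, `None` stands in for nil, and two behaviours are assumptions the plan does not rule on (whitespace-padded input stays a failure, and a zero denominator is a failure rather than an error).

```python
import re
from fractions import Fraction

# Hypothetical token shapes for illustration; the normative classifier is
# whatever spec/parser.sx ends up publishing (R32).
_INT = re.compile(r"[+-]?[0-9]+")
_RATIONAL = re.compile(r"([+-]?[0-9]+)/([0-9]+)")
_FLOAT = re.compile(r"[+-]?(?:[0-9]+\.[0-9]*|\.[0-9]+|[0-9]+)(?:[eE][+-]?[0-9]+)?")


def string_to_number(text):
    """Return an int, Fraction, or float for well-formed input, else None."""
    if _INT.fullmatch(text):
        return int(text)
    rational = _RATIONAL.fullmatch(text)
    if rational:
        denominator = int(rational.group(2))
        if denominator == 0:
            return None  # assumption: treated as a parse failure
        return Fraction(int(rational.group(1)), denominator)
    if _FLOAT.fullmatch(text):
        # Note: this model yields a float for "1e3"; the finding shows SX printing 1000.
        return float(text)
    return None  # garbage and whitespace-padded input never become 0
```

Using `fullmatch` on the untrimmed string is what keeps `" 5 "` a failure; a ruling that accepts padding would only need a `strip()` before matching.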
+ +### [medium] [CONFIRMED — CORRECTED after cross-lane check] `apply` does not spread AT ALL on the native production surface +- Location: continue-with-call native-call path / apply primitive +- What: originally reported as "leading-args form missing, two-arg form works" — WRONG. Re-verified on fresh sx_server: `(apply + (list 1 2))` → `Unhandled exception: "Expected number, got list: "`. The list is passed as a single argument, never spread — `(apply str (list 1 2 3))` → `"(1 2 3)"` (str of the list itself). The earlier "works" observation came from a test-runner/harness environment with its own apply. Conformance lane F-3 independently found this AND that the WASM kernel spreads the 2-arg form (→ 6) while native errors — the same kernel family disagrees with itself on apply. +- Repro: `(apply + (list 1 2))` → error; `(apply + (list 1 2 3))` → error; `(apply str (list 1 2 3))` → `"(1 2 3)"` (fresh sx_server, verified 2026-07-03). +- Coverage: not covered on the production surface (runner env has a different apply — see the values/call-with-values finding for the same pattern). + +### [low] [CONFIRMED] Strict checks are name-keyed at the call site — trivially evaded, and shadowers inherit checks +- Repro: `(let ((zz hh)) (zz "a"))` → unchecked; computed heads `((mk) "bad")` → unchecked; conversely a user fn shadowing a typed name gets the declared checks applied to it. First-class function flow is entirely unchecked. +- Coverage: not covered. + +### [low] [CONFIRMED] set-prim-param-types! replaces wholesale; no validation; malformed specs fail cryptically and uncatchably +- What: second call wipes all earlier declarations (no merge); nonexistent prim names accepted silently; `{"positional" "oops"}` errors at call time with "Expected list, got string" (uncatchable, doesn't name the spec as culprit); `{"name" "not-a-dict"}` silently checks nothing; declaring types for HO-form names never fires (HO dispatch intercepts before the arg frame). 
+- Coverage: only the nil-reset path tested. + +### [low] [CONFIRMED] Too-few args never error and their declared types are silently skipped +- What: user lambdas nil-fill missing params (`(f2 1)` → `(1 nil)` with b typed number, no error); strict-check-args guards `idx < len(args)` so unsupplied params skip checking. Too-many args DO error. `foreign-check-args` has the mirror asymmetry (extra args unchecked; code-level). +- Coverage: not covered. + +### [low] [CONFIRMED] `(:as type)` parameter annotations are never enforced — even in strict mode +- Location: eval-rules.sx documents `(:as type)` in the lambda rule; spec/signals.sx uses them pervasively (`(s :as signal)`) +- Repro: `(define tf (fn ((x :as number)) x))` `(tf "not-a-number")` → returns the string, strict on or off. The natural per-param channel is decorative; strict mode reads only the global name-keyed dict. +- Coverage: not covered. + +### [low] [CONFIRMED] Strict-machinery paper cuts +- Return types unsupported anywhere (params only). Rest-arg errors index from 0 within the rest section ("rest arg 0" is overall arg 2). `set-strict!` is one global OCaml ref — not per-env, not captured by continuations; toggling mid-program retroactively affects existing lambdas. Dead shadowed duplicates `_strict_ref`/`_prim_param_types_ref` at sx_ref.ml:18-19 (transpiler cruft, no desync). Host surface inconsistency: sx_server binds set-strict!/set-prim-param-types! but not value-matches-type?; the harness binds none. Positive: error message quality is good (names function, param, expected, actual, value). + +### [low] [CONFIRMED] `batch` unusable on the server host; coroutines module inert outside the test runner +- What: `batch` calls `(batch-begin!)` on non-client hosts; `batch-begin!`/`batch-end!` are bound only in run_tests.ml:564 — on sx_server `(batch ...)` → `Undefined symbol: batch-begin!` (which, per the wedge finding, also leaves `*batch-depth*` stuck). 
Separately, spec/coroutines.sx lacks the trailing `(import (sx coroutines))` re-export that signals.sx/harness.sx have — loading it binds nothing globally; tests work only via explicit import + run_tests-only cek-* hooks. +- Coverage: not covered. + +### [low] [CONFIRMED] `effect` stale cleanup double-invocation +- Location: spec/signals.sx `effect`/`run-effect` — cleanup-fn invoked at each re-run start but never cleared; only overwritten when a run returns a new callable +- Repro: effect returns cleanup only when v=0: after two resets, cleanup-calls = 2. Expected 1. +- Coverage: not covered. + +### [low] [CONFIRMED] Guest parse errors carry no source locations; native has line/col on only 2 of ~8 error types +- Location: spec/parser.sx (all error sites location-free); sx_parser.ml (locations only for "Unexpected end of input"/"Unexpected char"; unterminated string/list/dict etc. location-free) +- Repro: `(sx-parse "(a (b)")` → just `"Unterminated list"`. Also test-source-locations.sx tests a parser-combinator library, NOT spec/parser.sx, and its cols are 0-based vs native 1-based. +- Coverage: no reader-location tests exist. + +### [low] [CONFIRMED] Dict literal edges: odd form count → misleading error; duplicate keys silently last-win +- Repro: `{:a}` → `Unexpected character: }` (no mention of pairing); `{:a 1 :a 3}` → `{:a 3}` silently (both parsers). +- Coverage: not covered. + +### [low] [CONFIRMED] `#|...|` is a raw string to the first `|`, not a block comment; `#|a|#` leaves a dangling `#` +- Repro: `(sx-parse "#|hello world|")` → `("hello world")`. Documented, but a Scheme-expectation trap with no test for the `|#` suffix case. + +### [low] [CONFIRMED] Keyword edge tokens: `:` parses as keyword with empty name; `::a` is a keyword named ":a" +- Coverage: numeric-suffix/consecutive keywords tested; `:`/`::` not. 
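The `effect` stale-cleanup finding above comes down to one missing step: the stored cleanup has to be cleared before it is invoked, so that a run which returns no new cleanup cannot leave the old one in place for the next run. A minimal Python model of that fix follows. It assumes nothing about spec/signals.sx beyond what the finding states, and `Effect`, `run`, and the counters are hypothetical names.

```python
class Effect:
    """Toy model of an effect whose body may return a cleanup callable."""

    def __init__(self, body):
        self.body = body
        self.cleanup = None

    def run(self):
        # Take the stored cleanup and clear the slot first, so it can fire
        # at most once even if this run returns nothing.
        cleanup, self.cleanup = self.cleanup, None
        if callable(cleanup):
            cleanup()
        result = self.body()
        if callable(result):
            self.cleanup = result


# Repro shaped like the finding: the body returns a cleanup only when v == 0.
state = {"v": 0, "cleanup_calls": 0}


def count_cleanup():
    state["cleanup_calls"] += 1


def body():
    return count_cleanup if state["v"] == 0 else None


effect = Effect(body)
effect.run()        # v = 0: a cleanup is registered
state["v"] = 1
effect.run()        # first reset: the cleanup runs once and is cleared
state["v"] = 2
effect.run()        # second reset: nothing left to run
```

With the slot cleared up front, the two resets leave `cleanup_calls` at 1, which is the expected value quoted in the finding.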
+ +### [low] [CONFIRMED] Harness contract nits +- A throwing mock leaves no IO-log entry (append happens after the mock returns) — failed calls invisible to assert-io-called. `(assert cond)` one-arg form works only via the evaluator-wide nil-fill of missing params. + +### [low] [CONFIRMED] CLAUDE.md points at a deleted canonical spec (`shared/sx/ref/*.sx`) +- What: CLAUDE.md instructs reading `shared/sx/ref/eval.sx`/`parser.sx`/`primitives.sx`/`render.sx` as "authoritative SX semantics"; the directory contains only `BOUNDARY.md` + Python cache. Live spec is `spec/*.sx`. Together with the island-authoring-rules drift (let/body semantics above), the project docs actively mislead on core semantics. + +--- + +## SUSPECTED findings (reasoning only, not reproduced) + +### [medium] [SUSPECTED] More nested-eval boundaries: `expand-macro`, `sf-let-values`, `sf-define-values`, `qq-expand` unquotes all evaluate via `(trampoline (eval-expr ...))` instead of CEK frames +- Location: spec/evaluator.sx, expand-macro (1548-1580), sf-let-values (1411-1417), sf-define-values (1443-1445), qq-expand unquote eval +- Reasoning: same structural pattern as the three CONFIRMED nested-run bugs (shift-k invoke, threading, signal) — continuation capture, `perform`/IO suspension, or raise-to-outer-handler inside a macro body, let-values initializer, or unquote crosses a nested trampoline the outer kont cannot see. let-values untestable at runtime (`values` missing — see medium finding); macro-expansion capture is expansion-time and rare. +- Coverage: not covered. + +### [low] [SUSPECTED] env_merge is_descendant depth cap (>100) silently flips scoping semantics +- Location: hosts/ocaml/lib/sx_types.ml:394 (`if depth > 100 then false`) +- Reasoning: call-site env chains deeper than 100 frames false-negative the descendant check, activating the caller-frame-copy branch (the dynamic-scoping leak above) in code that was previously purely lexical. 
Rare (needs ~100 nested closure/let layers), silent flip. Code-read only. +- Coverage: not covered. + +### [medium] [SUSPECTED] Canonical serialization is not cross-host deterministic — CIDs can differ between OCaml and JS +- Location: spec/canonical.sx (`canonical-number` uses host `str`; string case uses host `escape-string`) +- Reasoning + partial confirmation: OCaml `(canonical-serialize 1e-7)` → `"1e-07"` (verified live) while JS `String(1e-7)` → `"1e-7"` (code-read) — different canonical text → different sha3 CIDs for the same value. Also: sx_server escapes `\r` (sx_server.ml:1275), JS platform does not (platform.py:2628); integers beyond 2^53 exact on OCaml, unrepresentable in JS. Full cross-host CID comparison not run. +- Coverage: test-canonical.sx never canonicalises exponent-form floats, CR strings, or big ints. (Dict-key sorting IS implemented and idempotence holds for tested classes.) + +### [medium] [SUSPECTED] Coroutine performing a non-yield effect is permanently wedged +- Location: spec/coroutines.sx, `coroutine-handle-result` — for a suspension with op ≠ "coroutine-yield" it does `(perform request)`: forwards outward but **discards both the answer and the coroutine's suspension**; state stays "running" and `coroutine-resume` has no "running" branch → "unexpected state: running" +- Reasoning: code-level; not reproducible outside run_tests (needs cek-step-loop/cek-resume hooks bound only in run_tests.ml:951-955). Correct forwarding would cek-resume the suspension with the outer answer in a loop. +- Coverage: test-coroutines.sx (27 tests) has zero `perform` usage. + +### [low] [SUSPECTED] VM/JIT execution path has no strict checking +- Location: sx_vm.ml — zero callers of `strict_check_args` (repo-wide grep: only sx_ref.ml) +- Reasoning: any call executed as compiled bytecode bypasses checks. Could not confirm live — lazy JIT never engaged in CLI probes (bytecode-inspect after 300 calls: "no compiled bytecode"). +- Coverage: not covered. 
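For the canonical-serialization finding above, the float hazard can be demonstrated on any host. The Python sketch below is illustrative only: `canonical_float` is a hypothetical helper, and its exponent spelling (no zero padding, no plus sign) is one possible convention, not something R23 or R35 specify. It shows the two points the plan relies on: 6-significant-digit `%g` printing does not round-trip, and a shortest-round-trip printer still needs its exponent format pinned before two hosts emit identical text.

```python
def canonical_float(x: float) -> str:
    """Shortest round-trip text with a normalised exponent (hypothetical convention)."""
    text = repr(x)  # CPython's repr is the shortest string that round-trips
    if "e" not in text:
        return text  # plain decimals such as '1.0', plus 'inf', '-inf', 'nan'
    mantissa, exponent = text.split("e")
    # int() drops the '+' sign and the zero padding: '-07' becomes -7, '+22' becomes 22.
    return f"{mantissa}e{int(exponent)}"


lossy = "%g" % 0.1234567891               # 6 significant digits, like the native printer in P1
exact = canonical_float(0.1234567891)     # full shortest round-trip text
padded = repr(1e-7)                       # '1e-07' in CPython
pinned = canonical_float(1e-7)            # '1e-7' under the convention above
```

Hashing `padded` and `pinned` would give different CIDs for the same value, which is why the exponent format has to be part of the canonical text definition and covered by the cross-kernel property test.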
+ +--- + +## Checked, NOT reproducible (negative results correcting project memory) + +- **"Short helper names (name/dyad) hang the runtime"**: does NOT reproduce — `(define name …)`/`(define dyad …)` work. The `guard` case is the unshadowable-name finding (error, not hang). +- **"split is char-class not substring"**: harness/guest-worktree only. Real sx_server `(split "a--b" "--")` → `("a" "b")` substring, keeps empties. Multi-char delimiter untested in spec/tests — worth a pinning test. +- **"let is parallel / bodies evaluate only last expr / effects need inner let"** (CLAUDE.md island rules): all false for the spec evaluator — let is sequential, bodies are implicit begin (tested intent). Likely describes the separate OCaml SSR island path → doc fix + cross-lane check. + +## Clean areas verified + +**CEK core**: TCO through all special forms (named-let 200k, mutual 100k, non-tail 100k heap-safe); +call/cc escape/multi-shot/independence; shift/reset delimiting + multi-shot composable k; shift +without reset → clean error; escape from HO callbacks; multi-shot resume INTO map frames (no +accumulator leakage); raise through dynamic-wind one-shot (after exactly once, 50k-frame unwind); +`(and)`/`(or)`/`(begin)`/`(cond)`/if-no-else edge values; cond `=>`; head-position exprs; +parameterize; restart-case/invoke-restart. + +**Env/scope**: closure sharing + isolation both directions; define local-in-lambda vs top-level +redefine; set! write-through 1-2 levels; `(let ((x x)) x)` → outer; letrec mutual recursion +(lambda case); emit!/emitted ordering/extent/nesting/TCO-survival/no-leak (correct but ZERO test +coverage — gap worth closing given the scope/provide bugs); provide/context/peek normal-flow +nesting (well covered); component &rest/kwarg interleaving; component set! does not write back +to caller; primitive shadowing works for genuine primitives. 
+ +**HO forms**: HoSetupFrame stages both args exactly once, left-to-right, both orders; map over +list-of-functions picks sane reading; guest raise mid-map caught cleanly; some/every?/filter/ +for-each/map-indexed semantics sane (0/"" truthy — internally consistent); no double-eval in +threading (quoted-value splice protects data both paths); as->; ->> normalizes via swap; nested +map-in-map; reduce 100k in 0.3s; multi-map zips to shortest (covered). + +**Special forms**: when/begin/do sequencing; and/or/if falsiness fully consistent (only false/nil +falsy); short-circuit verified; defmacro recursive expansion, &rest + `,@` templates, ~name heads +in qq; guard happy paths incl. R7RS auto-reraise; ->/set! interplay; eval-rules.sx accurate except +set!-error claim, cond clause mode, case evaluated-vals; `unless` intentionally userland. + +**Render**: text + attr-value escaping correct; raw!/SxExpr single-escape guarantee (no double- +escape); registered void elements self-close, drop children silently; boolean-attr registry (23) +correct for true/false/nil; numbers/booleans/nil as children; aser wire semantics (components +unexpanded, control flow evaluated, string/dict args round-trip incl. quotes/unicode); recursive- +with-base-case components; fragment/nil/string/number component returns; &rest spliced flat. + +**Primitives**: quotient/remainder/modulo signs R7RS-correct; substring clamping; replace; +trim/index-of/starts-with?/ends-with?; assoc/dissoc/merge/has-key?; range/flatten/chunk-every; +rationals (normalization, contagion, zero-denominator errors); vectors/sets/ports/chars/string- +buffers basics; dict-set! vs assoc; truthiness consistent; format directives; max/min zero-arg +errors clean. Not probed (dedicated green suites): zip-pairs, bitwise, bytevectors, regexp. + +**Parser/serializer**: basic escapes correct + exact round-trips (quotes/backslashes/newlines/ +multibyte strings); quote sugar nesting incl. 
before `)`/EOF; 10k-deep nesting + 10k-char tokens +parse fine (heap frames, no hangs on any adversarial input — every failure errors rather than +loops); serializer round-trips for number/keyword/symbol/list/nested-dict(ident keys)/bool/nil; +nil vs () vs {} distinct; canonical-dict key sorting + idempotence (tested classes); -0.0 → "0"; +negative numbers vs `-` symbol; `5.`/`1e10`/`-1.5e-3`; comments at EOF; dotted pairs cleanly +rejected on all hosts; keyword AST round-trip. + +**Strict typing**: value-matches-type? core semantics correct (number/string/boolean/nil/list/ +dict; empty list not a dict; nullability exclusively via "type?" suffix — consistent; floats+ints +both "number"; quoted symbols; lambdas). ->/->> threading IS strict-checked (re-dispatches a real +call form). Recovery after a strict error works. Error messages high quality. + +**Signals**: effect does not re-run on unrelated signals; effect's dispose-fn unsubscribes +correctly; batch dedups multiple resets of one signal (when it works — see wedge finding). + +**Harness (spec/harness.sx)**: interceptors log args/result/op correctly; arity fan-out 0-3 + +apply; custom-platform merge over defaults; assertion messages descriptive. + +--- + +## Handoffs to other lanes + +- **HOSTS**: hosts/ocaml/bin/mcp_tree.ml maintains its own primitive table, drifted from + sx_primitives.ml (empty?/get/split/contains?/equal?/keyword-name differ — details in the + harness-divergence finding). Also: sx_harness_eval is a shared persistent image, not a fresh + sandbox; sx_read_subtree ignores `path`; sx_read_tree ignores `max_lines`. +- **HOSTS/CONFORMANCE — JIT vs interpreter divergence**: three confirmed behavior flips between + VM-compiled and interpreted paths: (1) set!-unbound writes vm.globals vs root env (split brain); + (2) env_merge caller-frame leak exists only interpreted ("VM undefined" under JIT); (3) named-let + leaked loop name reads as lambda interpreted / nil under JIT. Parity suite has no coverage. 
+- **HOSTS (Python shell)**: aser output embedded into `` (aser HTML-escapes text children, + but attr/raw paths unverified). +- **CONFORMANCE**: run_tests.ml injects bindings absent from the real runtime — `values`/ + `call-with-values` (test-values.sx), `contains-char?`/`trim-right` (canonical.sx), + `batch-begin!`/`batch-end!` (signals), cek-step-loop/cek-resume (coroutines). Whole suites are + green only in-runner; test-env vs runtime-env parity needs a systematic sweep. +- **CONFORMANCE — parser fleet**: three parser implementations (native OCaml reader, spec guest + parser over per-host primitive bindings, JS transpiled spec) with four ident/number classifier + tables that were never reconciled (details in the AST-divergence finding). Guest-parser platform + primitives (`parse-number`, `char-code`, `contains-char?`, `trim-right`, `reader-macro-get`, + `escape-string`) drift per host because each host re-binds them ad hoc. Suites only exercise + the intersection — that's why everything stays 1080/1080 green. +- **HOSTS (JS)**: JS parser silently corrupts invalid `\uXXXX` escapes (garbage string, no error) + where OCaml raises; JS `reader-macro-get` registry exists but OCaml's doesn't. +- **DOCS**: CLAUDE.md island-authoring rules describe non-spec semantics (parallel let, last-expr + bodies); CLAUDE.md canonical-reference section points at deleted files. +- **TOOLING incident log**: mid-review another session polluted the shared MCP image (`inc` + redefined to a constant, breaking guest parsing with spurious "Unterminated" errors); the parser + agent restored it. Underlines the harness-not-fresh finding — harness state is shared across + concurrent sessions. diff --git a/plans/sx-review/hosts.md b/plans/sx-review/hosts.md new file mode 100644 index 00000000..001cac9f --- /dev/null +++ b/plans/sx-review/hosts.md @@ -0,0 +1,1103 @@ +# SX Hosts Review — findings (hosts axis) + +Reviewer lane: per-target host implementations + FFI seam. 
+Targets: hosts/ocaml (kernel, VM/JIT, epoch protocol, server env, HTTP path), +hosts/javascript, hosts/python, hosts/native, WASM browser kernel, web/ adapters. + +Status: COMPLETE — all 7 verification agents done (OCaml VM/JIT, sx_server epoch/env/HTTP, +JS host, Python host, web-FFI seam, cross-host parity, native/WASM). OCaml run_tests +baseline + `--jit` differential both done. The highest-severity findings were reproduced +live and self-verified by the lead reviewer (marked "self-verified"). + +Live HTTP-path verification (follow-up): booted `sx_server --http` on free ports (3 bounded runs, +own PIDs only — never pkill) and drove concurrent load. This UPGRADED three suspected HTTP-race +items to CONFIRMED — **S1** (multi-Domain render pool crashed intermittently under concurrent load; +empty responses, no exception, OOM ruled out at ~292 MB peak RSS), **S4** (routing failures served +as HTTP 200 and cached: cold 2.02s → warm 0.0005s), and **S5** (cache key ignores cookies/query — +three different `session=` cookies returned one identical cached body). **S2/S3** stay SUSPECTED: +their code paths were verified live but the stock static-docs app has no request-varying full-page +render to exhibit a visible symptom. Logs: /tmp/sx-review/http-server{,2,3}.log. + +Finding tally: ~65 CONFIRMED (incl. S1/S4/S5 upgraded via live repro) + ~13 SUSPECTED +(incl. S2/S3) + 2 positive/informational. Full detail below. + +TOP-LINE (most severe; CONFIRMED unless noted): +- **Production serving-JIT silently miscompiles** (J1): `(-> …)` in argument position and + user-macro args (J3) yield wrong values; JIT-fallback double-applies side effects (J2). The + http server registers the JIT hook UNCONDITIONALLY (sx_server.ml:4163) — live on sx.rose-ash.com. +- **One malformed line kills the whole sx_server process** (C1) — unguarded command-channel parse; + a non-ASCII byte does the same (C1b). 
+- **JS bundle is hollow** (C0a): render/engine/router/signals/all adapters transpile to 0 bytes + because the transpiler doesn't recurse into `define-library`; JS conformance is 2490 failing (C0b). +- **Python↔OCaml boundary + parser diverge**: NUL/escape corruption (C25), boundary validation is a + permanent no-op (C24), bridge desync is unrecoverable (S-bridge). +- **Cross-host primitive parity is broken** in ~30 confirmed ways (P1–P12): OCaml lossy float wire, + mixed-number sort, `into` needs a bridge; JS lenient-coercion cluster + missing equal?/eq?. +- **HTTP serving path breaks under concurrency (live-verified on a booted server):** the + multi-Domain render pool crashed intermittently under concurrent load — a genuine data race on + unsynchronized cross-Domain Hashtbls, OOM ruled out (S1, CONFIRMED); routing failures are served + as HTTP 200 and cached indefinitely (S4, CONFIRMED); the response cache key ignores cookies/query, + so it would serve one user's response to another on any cookie/auth-varying app (S5, CONFIRMED). + Still SUSPECTED: per-request globals read by queued workers (S2) and `expand-components?` removed + on the shared env by AJAX renders (S3) — both code paths verified live, but need a request-varying + app (not the static docs site) to exhibit a visible symptom. + +Baseline (no JIT): `run_tests.exe` → **5762 passed, 274 failed**. 273 failures are `hs-*` +hyperscript suites (in-progress guest project — baseline caveat, not a host bug; note the +`can-map-an-array` / "map with block" line prints with a blank suite label due to C9 but belongs +to a hyperscript compat suite). The single genuine non-hyperscript host failure is C2 (r7rs +string->number shadow). `--jit` run: 5760p/276f, with 2 deterministic JIT-only divergences (J9). 
Logs: +/tmp/sx-review/ocaml-run_tests-baseline.log, /tmp/sx-review/ocaml-run_tests-jit.log + +Numbering note: findings carry mnemonic prefixes (J* serving-JIT, C* OCaml/kernel/protocol, +JS* JavaScript host, P* cross-host parity, S* suspected). Order within CONFIRMED is roughly +severity-desc but read the whole section — prefixes were assigned as agents reported, not in rank order. + +--- + +## CONFIRMED (most severe first) + +> **Serving-JIT correctness cluster (J1–J8).** The OCaml `http_mode` (the production +> HTTP server behind sx.rose-ash.com) registers the JIT hook **unconditionally** +> (sx_server.ml:4163) — it is NOT gated by SX_SERVING_JIT (that env var only gates the +> epoch/persistent serving mode, sx_server.ml:5011). So every named top-level lambda +> invoked during page rendering is JIT-compiled on first call, and the divergences below +> are reachable in production for any rendered lambda that hits these patterns. The default +> `run_tests --jit` uses a *different, low-hit-rate* JIT path and surfaced only 2 divergent +> tests (see J9); the serving-JIT path is where the silent-wrong-value bugs live. All J1–J8 +> reproduced live at the kernel level. +> +> **IMPORTANT reconciliation:** there is a belief on record (project memory / a comment at +> sx_server.ml:5001-5011) that "serving-JIT is OPT-IN via SX_SERVING_JIT=1, default OFF." That gate +> applies ONLY to the epoch/persistent serving mode. The actual HTTP server (`http_mode`) calls +> `register_jit_hook env` UNCONDITIONALLY at sx_server.ml:4163 with no env-var check. So anyone who +> concludes "JIT is off in production, therefore J1–J8 don't apply" is wrong — the http server that +> serves sx.rose-ash.com runs with JIT on. Confirmed by reading the http_mode entry and by the live +> boot logs (`[jit] … compile in …s` lines during every page render). + +### J1. 
`->` threading miscompiles under serving-JIT — silent wrong value + duplicated side effects +- severity: critical +- confidence: CONFIRMED (reproduced live, self-verified) +- where: lib/compiler.sx:934-951 (compile-thread-step); stale twin hosts/ocaml/lib/sx_compiler.ml:178 +- what: For `(-> v f g)` with non-empty rest-forms, compile-thread-step compiles `call-expr` + (pushing a value that's never popped) then recurses with `call-expr` re-embedded, compiling + it AGAIN. Each non-final step is evaluated once per remaining step (side effects duplicated) + and leaves a stack residue. When the `->` is argument ≥2 of a primitive call, CALL_PRIM pops + the residue instead of the sibling arg → silently wrong value. In arg position of a user call + the slot misaligns → "not callable" → self-heals via CEK fallback. +- repro: with lib/compiler.sx loaded, serving-JIT on: + `(+ 1 (-> 2 inc inc))` on CEK → `5`; JIT'd `(define (t1) (+ 1 (-> 2 inc inc)))` called → `7` + (self-verified: `(t1 ×5)` → `(7 7 7 7 7)`; same source with JIT off → 5). bytecode-inspect + shows the residue CONST/CALL_PRIM that's never popped. + +### J2. JIT-fallback re-runs the entire call on the CEK → side effects execute twice (persist/counter/emit double-apply) +- severity: high +- confidence: CONFIRMED (reproduced live, self-verified) +- where: sx_server.ml:1630-1638, 1674-1680 (catch-all `| e -> … l_compiled <- jit_failed; None`); + sx_vm.ml:461-464, 490-493 (Component/Island `with _ -> cek_call_or_suspend` re-run) +- what: Any exception during VM execution of a JIT'd lambda (data-first HO form, macro call, + raise/error, call/cc, undefined global) marks it jit-failed and re-runs the WHOLE call under + the CEK. Mutations performed before the failure point are applied twice. A code comment claims + idempotence "for the host's durable reads" — but it is NOT idempotent for writes (persist + appends, counters, emits). 
The Component/Island path catches even VmSuspended, so a component + that performed IO can have that IO re-issued. +- repro (self-verified): `(do (define fb-count 0) (define (fb) (set! fb-count (+ fb-count 1)) + (map (list 1 2) (fn (x) x))) (fb) fb-count)` → serving-JIT: `2` after ONE call; JIT off: `1`. + +### J3. JIT'd functions eagerly evaluate arguments of user-macro calls before falling back +- severity: high +- confidence: CONFIRMED (reproduced live, self-verified) +- where: lib/compiler.sx compile-list/compile-call (no macro-expansion pass); sx_vm.ml:497 (vm_call has no Macro case) +- what: The compiler treats a user-macro call as an ordinary call — compiles and evaluates ALL + arguments, then vm_call raises "not callable: " and the hook re-runs on the CEK. Code a + macro guards from evaluation executes once in the VM. +- repro (self-verified): `my-unless` macro that expands `true` → skip body; JIT'd `(um)` that + does `(my-unless true (set! hit (+ hit 1)))` → serving-JIT: `hit = 1`; JIT off: `hit = 0`. + +### J4. VM component calls misparse positional string args equal to a param name +- severity: high +- confidence: CONFIRMED (agent-reproduced) +- where: sx_vm.ml:144-156 (parse_keyword_args), fed by compile-expr lowering `:kw` to its string name +- what: The compiler erases the keyword/string distinction (keywords become their string names in + the constant pool), so parse_keyword_args treats ANY string matching a declared param as a + keyword marker and consumes the next value → silent wrong props/children, no error, no fallback. +- repro: `(defcomp ~box2 (&key title &rest children) (list "BOX" title children))`; + `(~box2 :title "T" "title" "x")` CEK → `("BOX" "T" ("title" "x"))`; JIT'd → `("BOX" "x" ())`. + +### J5. 
Specialized opcodes freeze primitive semantics at compile time — redefinition ignored by JIT'd code +- severity: medium +- confidence: CONFIRMED (agent-reproduced) +- where: lib/compiler.sx:1061-1077 (specializes + - * / = < > cons 2-arg, not len first rest 1-arg → opcodes 160-172); sx_vm.ml:804-914 +- what: CALL_PRIM resolves through vm.globals at run time (redefinitions respected), but the 12 + specialized opcodes inline the original semantics. A function JIT'd before `(define + …)` keeps + the old `+` forever while the CEK sees the new one → CEK/JIT divergence. +- repro: `(define (q) (+ 1 2))` JIT'd → 3; `(define + (fn (a b) 999))`; `(q)` → 3 (stale OP_ADD) + vs fresh CEK → 999. + +### J6. Redefining a primitive the compiler itself uses poisons subsequent JIT compilation +- severity: medium +- confidence: CONFIRMED (agent-reproduced) +- where: sx_vm.ml:732-766 (CALL_PRIM globals-first lookup) + the compiler running as VM bytecode +- what: The SX compiler's own bytecode calls +, first, nth… through CALL_PRIM → globals. After a + user redefines one, every subsequent JIT compile computes with the user's fn — observed as + corrupted pools/offsets ("CONST index 999 out of bounds"). Currently every hit errors → CEK + fallback (correct), but a redefinition returning plausible numbers could emit structurally-valid + wrong bytecode silently. + +### J7. Data-first HO forms `(map coll fn)` / `-> coll (map fn)` always fail under the VM → permanent deopt (+ J2 double-effect on first call) +- severity: medium (perf/coverage; correctness preserved by fallback, but triggers J2) +- confidence: CONFIRMED (agent-reproduced) +- where: sx_primitives.ml:1584-1637 (map/filter/reduce/some/every?/for-each accept fn-first only); CEK ho-swap-args has no VM counterpart +- what: Every JIT'd function using the documented data-first arg order (or threading a collection + into an HO form) errors on first VM call and runs on the CEK forever. 
Results correct, but never + benefits from JIT and the first call incurs the J2 double-side-effect hazard. + +### J8. Local bindings shadow HO-form names in the VM but not the CEK +- severity: medium-low +- confidence: CONFIRMED (agent-reproduced) +- where: compiler compile-call scope resolution (locals win) vs CEK HO-form dispatch (form wins) +- what: `(let ((map (fn (a b) 42))) (map 1 2))` → JIT'd: `42`; CEK inline: error "rest: 1 list arg" + (dispatches as HO map). Divergent semantics for the 7 HO-form names. + +### J9. `run_tests --jit` diverges from interpreter on 2 hyperscript scoping tests (deterministic) +- severity: medium +- confidence: CONFIRMED (self-verified, 3× each mode, deterministic) +- where: default run_tests --jit path (guard-installing lambdas interpret-only); hs-upstream-core/scoping suite +- what: `hs-upstream-core/scoping > behavior scoping is isolated from other behaviors` and + `… from the core element scope` PASS interpreter-only but FAIL under `--jit` with an empty + result (`Expected 20, got `). Deterministic across 3 runs each mode. This is the low-hit-rate + default-JIT path (distinct from serving-JIT J1–J8), so it's a second, independent JIT/CEK + divergence surface. (Baseline 5762p/274f → --jit 5760p/276f; the hs-runtime-e2e "def function" + entry fails in BOTH and is not a JIT regression.) + +### J10. 
Stale/incomplete Sx_compiler stub still bound as `compile`; JIT hook registered even if lib/compiler.sx fails to load +- severity: medium-low +- confidence: CONFIRMED (binding + staleness) / SUSPECTED (reachability) +- where: sx_server.ml:1315 (bind), comment at :5014 ("incomplete stub"); hosts/ocaml/lib/sx_compiler.ml (May-7, no POP after LOCAL_SET, no rest-arity/guard/match/scope) +- what: If lib/compiler.sx fails to load in serving-JIT mode the code warns but still registers + the JIT hook, so jit_compile_lambda would compile with the stub whose missing LOCAL_SET/POP + discipline can silently corrupt stacks (let-in-argument-position) rather than merely erroring. + +### J11. JIT debug/aux paths diverge from the real VM (vm-trace, code_from_value locals-scan, resume-error handler, threshold) +- severity: low (tooling / edge) +- confidence: CONFIRMED (by reading; J-threshold self-observed) +- where: sx_vm.ml:1436-1632 (trace_run), :230-252 (code_from_value), :990-1005 (resume handler); sx_server.ml:1640-1656 (hook ignores jit_threshold) +- what: (a) vm-trace resolves CALL_PRIM via get_primitive BEFORE globals (real run does opposite → + user redefs invisible in trace), pushes Nil for non-NativeFn callees, inline ops 160-175 handle + only Number, OP_EQ uses OCaml structural `=` → traces can show different values than execution. + (b) code_from_value's max-local scan mis-walks operands for opcodes 8/33/35/144/51 (masked by +16 + headroom). (c) resume-path error handler delivers `String msg` instead of the raised condition + value (only reachable via extension opcodes / hand-built bytecode). (d) the CEK-side hook ignores + jit_threshold=4 and compiles on the very FIRST call (why J1's t1 was already wrong on call #1); + lambdas defined inside let-wrapped library modules never JIT at all (e.g. all of lib/highlight.sx). + +### J12. 
(verification, positive) perform/resume stack-misalignment from repro_jit_resume.ml is FIXED +- confidence: CONFIRMED — all 9 repro cases (direct/multiframe/map-callback/pending_cek/reduce/ + nested-map perform) produce expected values; the "drop all-but-first in map/for-each" hazard from + the memory note does NOT reproduce at this level. Also verified matching (no divergence): closures + with set! on captured locals, &rest under/over-application, define in do/let, named let, letrec, + TCO + non-tail at 100k, quasiquote unquote/splice incl nested, dict literals, cond/case/when + multi-expr, and/or, string/arith edge prims, keyword equality, apply, guard (correctly JIT-excluded), + call/cc fallback, map over rest/filter/append lists, CEK↔VM global visibility both ways. + +### C0a. JS transpiler is blind to `define-library` — the browser/CI bundle is hollow (render/engine/router/signals/all adapters = 0 bytes) +- severity: critical +- confidence: CONFIRMED (reproduced live, self-verified — fresh build) +- where: hosts/javascript/platform.py:15-35 (extract_defines), bootstrap.py:205-226 +- what: extract_defines only extracts top-level `(define …)` / `register-special-form!`. The + spec+web files have since been wrapped in R7RS `(define-library … (begin …))` (spec/render.sx, + web/adapter-html/-dom/-sx.sx, web/engine.sx, web/router.sx, spec/signals.sx, web/deps.sx, + web/page-helpers.sx, web/orchestration.sx, web/web-signals.sx, lib/freeze.sx), so those files + now transpile to ZERO bytes. The bundle still references the missing functions (renderToHtml + in its own public API; dom-* guarded by `typeof` checks that silently skip), so rendering, + signals, router, engine are simply ABSENT. spec/content.sx no longer exists and is silently + skipped. This is the JS host's single biggest defect and reframes its status. 
+- repro (self-verified): `python3 hosts/javascript/cli.py --output /verify-js.js` then + per-section byte counts → render/adapter-html/adapter-sx/adapter-dom/engine/orchestration/deps/ + page-helpers/router/signals/freeze/content all = 6 bytes (marker only); evaluator 140KB, + signals-web 99KB, boot 14KB, web-forms 8KB, parser 12KB survive (not define-library-wrapped). + +### C0b. `node run_tests.js` fails 2490/5086 — the JS conformance/CI gate is structurally red +- severity: critical +- confidence: CONFIRMED (reproduced live, self-verified) +- where: hosts/javascript/run_tests.js; .gitea/run-ci-tests.sh (CI gates on this) +- what: `node hosts/javascript/run_tests.js` → **2596 passed, 2490 failed** (exit 1); `--full` + → 2453p/3203f. A fresh rebuild produces identical failures — structural, not stale artifact. + Same corpus on OCaml → 5762p/274f. Failure signature: 1190× `Undefined symbol: dom-set-inner-html`, + 538× hs-to-sx*, 229× `renderToHtml is not defined`, 13× `makeRtd is not defined` — the hollow + bundle (C0a) plus a runner that never learned to load the newer lib deps (parser-combinators, + gql, hyperscript) the OCaml runner loads. Memory's "957/957 standard, 1080/1080 full" is stale. + The JS host is alive as a CI/test host (sx-build-all.sh, run-tests.sh, .gitea CI) but DEAD as a + served artifact (the live shell injects wasm/sx_browser.bc.wasm.js, not scripts/sx-browser.js) — + so this drift breaks CI/conformance, not visitors. +- repro (self-verified): `timeout 400 node hosts/javascript/run_tests.js` → `Results: 2596 passed, 2490 failed`. + +### C1. 
Malformed command line crashes the whole persistent/site sx_server process +- severity: critical +- confidence: CONFIRMED (reproduced live) +- where: hosts/ocaml/bin/sx_server.ml:5048 (persistent mode) and :4950 (site_mode) +- what: The dispatch loop calls `Sx_parser.parse_all line` outside any `try`; the + enclosing handler only catches `End_of_file`, so `Sx_types.Parse_error` from one + malformed line kills the entire process (the shared command channel used by + bridges/conformance runners). site_mode has the identical unguarded parse. +- repro: `printf '(epoch 2)\n(eval "(+ 1 2"\n(epoch 3)\n(eval "99")\n' | timeout 60 hosts/ocaml/_build/default/bin/sx_server.exe` + → `Fatal error: exception Sx_types.Parse_error("Unterminated list")`; subsequent + commands never run. Same with a plain-garbage line (`not an s-expr ]]] {{{`). + +### C10. Served browser JIT compiler is one fix behind lib/compiler.sx — the HO-loop desugar parity fix never shipped to the browser +- severity: high +- confidence: CONFIRMED (drift) / SUSPECTED (user impact) +- where: shared/static/wasm/sx/compiler.sx + compiler.sxbc vs lib/compiler.sx:390-443 +- what: Commit 921db09f (2026-06-30, "jit: HO-loop desugar + --jit == CEK parity") added + map/filter/reduce/for-each lambda-inlining to lib/compiler.sx, but the copy served to + the browser (content from 59ac51a8, 2026-06-29) lacks all 54 lines; the .sxbc was + rebuilt later (1e2ff387, Jul 2) but from the stale source. The browser JIT therefore + misses the exact parity fix for the known "map/for-each drops all-but-first" bug class + fixed server-side. +- repro: `diff lib/compiler.sx shared/static/wasm/sx/compiler.sx | head` → `390,443d389`; + `grep -c "_hml" lib/compiler.sx shared/static/wasm/sx/compiler.sx` → 6 vs 0. + +### C11. 
Client reads module-manifest.sx that nothing generates anymore — stale lazy-dep graph +- severity: high +- confidence: CONFIRMED +- where: consumer hosts/ocaml/browser/sx-platform.js:575 (fetches `sx/module-manifest.sx`); + generator hosts/ocaml/browser/compile-modules.js:504 (writes `module-manifest.json` only) +- what: `loadWebStack()` drives client module loading from `module-manifest.sx`, but the + generator now only emits `.json`; no `.sx` writer exists. The served `.sx` manifest is + from 2026-04-16 while modules are from Jul 1-2. Concretely `_entry :lazy-deps` is + missing `hs-worker` and `hs-prolog`, so those plugins never lazy-load in the browser; + any future dep/export change is invisible to the client. +- repro: `grep '_entry' shared/static/wasm/sx/module-manifest.sx` → lazy-deps ends at + `hs-integration hs-htmx`; module-manifest.json includes `hs-worker`, `hs-prolog`. + +### C1b. Non-ASCII byte on the top-level command channel also crashes the sx_server process (2nd instance of C1) +- severity: critical +- confidence: CONFIRMED (reproduced live, self-verified) +- where: hosts/ocaml/lib/sx_parser.ml:38 (is_symbol_char is ASCII-only) + the unguarded + command-channel parse (C1, sx_server.ml:5048) +- what: A non-ASCII byte reaching the kernel's top-level command reader raises an uncaught + `Parse_error` that kills the subprocess — same unguarded-parse root cause as C1, but reachable + from any client that forwards a unicode-named symbol. The Python parser *accepts and serializes* + bare unicode symbols (shared/sx/parser.py:158 accepts U+0080-U+FFFF), so a Python-bridge caller + can trigger this. Inside `(eval "…")` it's a catchable error; on the command channel it's fatal. +- repro: `printf '(epoch 1)\n(eval (quote café))\n(epoch 2)\n(eval "99")\n' | timeout 30 hosts/ocaml/_build/default/bin/sx_server.exe` + → `Fatal error: exception Sx_types.Parse_error("Unexpected char: \195 …")`; epoch-2 never runs. + +### JS1. 
`define-record-type` broken on JS: `makeRtd` platform constructor missing +- severity: high +- confidence: CONFIRMED (agent-reproduced) +- where: spec/evaluator.sx:2584 calls `(make-rtd …)`; no `makeRtd` in hosts/javascript/platform.py +- what: The transpiled evaluator's record support calls makeRtd, which the JS platform never + defines. Any define-record-type throws; test-cross-lang-types.sx (18 deftests) aborts. +- repro: `(define-record-type point (make-point x y) point? (x point-x) (y point-y))` → + JS `ERROR: makeRtd is not defined`; OCaml `(point-x (make-point 3 4))` → 3. + +### JS2. `host-callback` / `host-await` never fire for SX lambdas (wrong type tag) +- severity: high +- confidence: CONFIRMED (agent-reproduced) +- where: hosts/javascript/platform.py:2332-2342 (host-callback), 2354-2361 (host-await) +- what: Both check `fn._type === "lambda"`, but Lambda instances are tagged `_lambda === true` + (platform.py:889); the check never matches, so SX lambdas wrapped as JS callbacks become + silent no-op functions. Any DOM event/promise callback registered through these does nothing. +- repro: JS `(let ((cb (host-callback (fn (x) (* x 2))))) (cb 21))` → nil; OCaml → 42. + +### JS3. JS arithmetic silently drops extra arguments (arity not checked) +- severity: high +- confidence: CONFIRMED (agent-reproduced) +- where: hosts/javascript/platform.py:1021 (-), 1045 (/), 1086-1101 (< > <= >=) +- what: Non-rational `-` returns `args[0]-args[1]` regardless of arity; `/` is strictly 2-arg; + comparisons ignore args past the 2nd. OCaml errors on wrong arity, so buggy guest code passes + silently on JS with wrong values. +- repro: JS `(- 5 1 2)` → 4 (OCaml 2); `(/ 24 2 3)` → 12 (OCaml errors); `(< 1 2 3)` → true (OCaml errors). + +### JS4. 
Parser rejects `.` as a symbol — whole hyperscript-parser test file aborts +- severity: high +- confidence: CONFIRMED (agent-reproduced) +- where: hosts/javascript/platform.py:2622 (_identStartRe lacks `.`) +- what: `.` is a valid symbol on OCaml but "Unexpected character: ." on JS. test-hyperscript-parser.sx + uses `(quote .)`, so the file fails to parse and all 93 deftests are skipped (counted as 1 failure). +- repro: JS `(str (quote .))` → ERROR; OCaml → `"."`, type-of → "symbol". (Handoff: spec must rule + whether `.` is a legal symbol.) + +### JS5. run_tests.js harness undercounts + shadows semantics +- severity: medium +- confidence: CONFIRMED (agent-reproduced) +- where: hosts/javascript/run_tests.js:457-467, 78-79, 88-96, 126 +- what: (1) Each test file is wrapped in one try/catch — the first throw abandons the rest of the + file and counts as ONE failure (~111 deftests silently skipped this run). (2) `env-bind!` and + `env-set!` are defined identically (`e[k]=v`) — no scope-chain walk for env-set!, contradicting + the platform's own envSet (platform.py:2098) and the spec bind/set distinction, so tests + exercising that semantics test the wrong thing. (3) `sha3-256` is a fake 32-bit hash padded to + 64 hex — cross-host content-address comparison is meaningless. (4) lib/harness load errors are + logged but never counted. + +### JS6. Transpiled `(str …)` doesn't skip nil (emission vs runtime disagree) +- severity: medium +- confidence: CONFIRMED (agent, emission rule read + runtime verified opposite) +- where: hosts/javascript/transpiler.sx js-emit-list str clause vs runtime platform.py:1138-1144 +- what: The transpiler emits `String(a)+String(b)` for str, so transpiled spec code concatenates + `String(NIL)` → "nil", while the CEK runtime skips nils. Identical expressions disagree between + transpiled internals and guest code. + +### JS7. 
Transpiler has no quasiquote emission; quoted dicts emit garbage +- severity: low +- confidence: CONFIRMED (code read) +- where: hosts/javascript/transpiler.sx js-emit-quote (no dict clause → `[object Object]`); no quasiquote case in js-emit-list +- what: Only a hazard if a transpiled spec file uses top-level quasiquote or quoted dict literals; + runtime quasiquote via CEK is correct on both hosts. + +### JS8. Stale JS host metadata + stale checked-in sx-full-test.js +- severity: low +- confidence: CONFIRMED (agent) +- where: bootstrap.py:2-11 docstring (refs deleted js.sx/run_js_sx.py/platform_js.py/shared/sx/ref/*), + bootstrap.py:287 (bad default output path), cli.py:17 (_PROJECT one level too high → /root on sys.path), + manifest.py (reports the hollow bundle as healthy); shared/static/scripts/sx-full-test.js (Apr 26, predates open-input-string → 35 --full failures) + +### C24. Python↔OCaml boundary validation is permanently a no-op (imports a nonexistent module) +- severity: high +- confidence: CONFIRMED (reproduced live, self-verified) +- where: shared/sx/boundary.py:34 +- what: `_load_declarations()` does `from .ref.boundary_parser import …` but + `shared/sx/ref/boundary_parser.py` does not exist (real file is hosts/python/boundary_parser.py). + The ModuleNotFoundError is swallowed by a bare `except Exception` and NOT cached, so every + validate_primitive/validate_io/validate_helper/validate_boundary_value silently returns without + checking — even under SX_BOUNDARY_STRICT=1. The whole SX-boundary enforcement subsystem is dead. + NOTE: shared/sx/ is load-bearing in production (Quart services render SX via the bridge), so this + is not dead-host code. +- repro: `SX_BOUNDARY_STRICT=1 python3 -c "from shared.sx import boundary; boundary.validate_primitive('definitely-not-a-primitive-xyz')"` + → `pure: 0` and `NO ERROR RAISED`. + +### C25. 
Parser string-escape divergence corrupts data across the Python↔OCaml boundary (\0, \x, unknown escapes) +- severity: high +- confidence: CONFIRMED (reproduced live, self-verified) +- where: shared/sx/parser.py:108 (_ESCAPE_MAP) / :471 (serialize) vs hosts/ocaml/lib/sx_parser.ml:61-79 +- what: Python maps `\0`→NUL and drops the backslash on unknown escapes (`\x`→`x`); OCaml treats + `\0` as literal backslash+zero and preserves the backslash on unknown escapes (`\x`→`\x`). + Python serialize() emits `\0` for a NUL char, so a 3-char Python string round-trips to a + 4-char OCaml string — silent corruption on any path that serializes a NUL-bearing string and + re-parses it in the kernel. ` …))` — SyntaxError) and can't produce a working + evaluator. Nothing imports it except its own internal bootstrap→platform. The hosts/python/tests/ + dir referenced by memory/RESTRUCTURE_PLAN.md does not exist. Transpiler also warns its spec has + drifted (`eval.sx dispatches forms not in special-forms.sx: provide, scope, define-record-type …`). +- repro: `python3 hosts/python/bootstrap.py > /tmp/t.py; python3 -c "import t"` → SyntaxError at line 1564. + +### C31. shared/sx/tests/ is 14/33 files broken; test_ocaml_bridge.py has 5 live failures +- severity: medium +- confidence: CONFIRMED (agent-reproduced) +- where: shared/sx/tests/ (dangling `shared.sx.ref.sx_ref` imports); test_ocaml_bridge.py +- what: 14/33 test files fail collection (import the deleted shared.sx.ref.sx_ref). test_ocaml_bridge.py + has 5 failures: `test_parse_response_ok_number` (the `(ok N)` epoch-ambiguity below) and 4 + `test_render_*` that still use flat `(let (x v) body)` binding syntax the kernel now rejects + (real Python-test/kernel drift). + +### C2. 
lib/r7rs.sx shadows `string->number`, dropping the spec's optional radix — spec test fails on the OCaml runner +- severity: medium +- confidence: CONFIRMED (baseline test failure + root cause isolated + differential vs bare server) +- what: spec/primitives.sx:1104 declares `string->number` with `&rest (radix :as number)`, + and the native prim (hosts/ocaml/lib/sx_primitives.ml:486) accepts 1–2 args. But + lib/r7rs.sx:83 redefines it as `(fn (s) ...)` (1 param) and re-exports to the global + namespace ("backward compatibility" import at end of file). Every environment that + loads r7rs.sx (run_tests does, at run_tests.ml:3711) loses the radix arg: + `(string->number "ff" 16)` → `string->number expects 1 args, got 2`. + Note the adjacent `number->string` wrapper (r7rs.sx:76) does this correctly by + delegating to the captured native prim for the radix case — `string->number` just + wasn't given the same treatment. The shadow also changes 1-arg semantics + (parse-int/parse-float vs the native parser). +- repro: baseline log line 4179 `FAIL: > string->number: string->number expects 1 args, got 2` + (spec/tests/test-math.sx:112). Bare server WITHOUT r7rs loaded: + `printf '(epoch 1)\n(eval "(string->number \"ff\" 16)")\n' | timeout 30 hosts/ocaml/_build/default/bin/sx_server.exe` → `255` (correct). + +### C12. compile-modules.js SOURCE_MAP has 5 dead source paths that silently no-op +- severity: medium +- confidence: CONFIRMED +- where: hosts/ocaml/browser/compile-modules.js:38-49 +- what: `signals.sx→web/signals.sx`, `hypersx.sx→web/hypersx.sx`, `tw-layout/tw-type/tw.sx→web/tw-*.sx` + all point at files that no longer exist (moved to web/web-signals.sx, web/lib/hypersx.sx, + shared/sx/templates/tw-*.sx). `fs.existsSync` guards make the skips silent, so edits to + the real sources are never synced by this script — the rot vector that plausibly produced + the C10 compiler drift. 
+- repro: `for f in web/signals.sx web/hypersx.sx web/tw-layout.sx web/tw-type.sx web/tw.sx; do [ -f $f ] || echo MISSING $f; done` → all 5 MISSING. + +### C13. test_platform.js broken — references deleted web/signals.sx (browser platform test tier unrunnable) +- severity: medium +- confidence: CONFIRMED +- where: hosts/ocaml/browser/test_platform.js:65 +- what: The "full WASM + platform stack" Node test crashes at load: web/signals.sx was + renamed to web/web-signals.sx and the file list was never updated. +- repro: `timeout 60 node hosts/ocaml/browser/test_platform.js` → `Error: ENOENT … web/signals.sx`. + +### C14. hosts/ocaml/browser/dist/ is an 8-week-stale duplicate bundle that sx-build-all.sh still feeds +- severity: medium +- confidence: CONFIRMED +- where: hosts/ocaml/browser/dist/ (72 files, May 7) vs shared/static/wasm/ (101 files, Jul 1-2); + scripts/sx-build-all.sh:19-32 +- what: Runtime serving is exclusively from shared/static/wasm/ (shared/sx/helpers.py:1080, + sx_server.ml:3349/4525), but bundle.sh/compile-modules.js default to dist/ and + sx-build-all.sh still syncs into and compiles from it — a full run can compile .sxbc + from May-7-era sources for anything its sync list misses. Two half-overlapping bundle + dirs are the root cause of C10/C11. +- repro: `ls hosts/ocaml/browser/dist/sx | wc -l` → 72 vs 101; dist mtimes 2026-05-07. + +### C15. Git-tracked stale kernel artifacts in hosts/ocaml/shared/static/wasm (5MB dead blob) +- severity: medium +- confidence: CONFIRMED +- where: hosts/ocaml/shared/static/wasm/{sx_browser.bc.js (5.0MB Apr 9), sx_browser.bc.wasm.js, sx-platform.js (Apr 16)} +- what: A wrong-cwd copy of the deployed wasm dir, committed to git, months stale + (missing newer kernel exports like resetStepCount/setStepLimit); nothing references it. +- repro: `git ls-files hosts/ocaml/shared/` → 3 files. + +### C19. 
HTML vs DOM adapter disagree on non-boolean attributes with raw boolean values (SSR/hydration mismatch) +- severity: medium +- confidence: CONFIRMED (HTML side run; DOM side unambiguous in source) +- where: spec/render.sx:render-attrs (~263-283) vs web/adapter-dom.sx:render-dom-element (~356-366) +- what: For an attr NOT in BOOLEAN_ATTRS with value `true`: HTML emits `attr="true"`, + DOM emits `attr=""`. With value `false`: HTML emits `attr="false"`, DOM omits the + attribute. Any `data-*`/aria attr driven by a boolean diverges between SSR and + client render → hydration mismatch + needless sync-attrs morph churn. +- repro: `sx_harness_eval (render-to-html (quote (div :data-x false :data-y true :data-z nil)))` + → `
`; DOM branch produces `data-y=""` and drops `data-x`. + +### C20. CSRF token attached to cross-origin fetches; computed `cross-origin` flag is dead +- severity: medium +- confidence: CONFIRMED (code path) / SUSPECTED (browser-level exploitability) +- where: web/orchestration.sx:do-fetch (line 183 header injection, 207-208 cross-origin flag); + consumer web/lib/boot-helpers.sx:fetch-request (354-390) +- what: do-fetch adds `X-CSRFToken` unconditionally (no origin gate) and computes a + `cross-origin` field that fetch-request never reads (nor sets credentials/mode). + Net: CSRF token leaks to third-party endpoints; the cross-origin classification is inert. + No test exercises CSRF header injection or origin gating. + +### C21. Test harness IO model bypasses the real perform/suspend path (structural coverage gap) +- severity: medium +- confidence: CONFIRMED +- where: spec/harness.sx default-platform (line 34), make-interceptor (52), install-interceptors (55) +- what: The harness binds IO ops as plain synchronous NativeFns, never as perform-suspending + primitives — so no harness test can exercise CEK suspend/cek-resume or the VM inline-settle + path. The known HO+perform element-drop class (S10) is structurally invisible to the harness: + `(map (fn (u) (fetch u)) …)` under a mock returns all elements because fetch is a direct call. +- repro: sx_harness_eval of `(map (fn (u) (get (fetch u) "status")) (list "a" "b" "c"))` with + fetch mock → `(200 200 200)`, all logged — real serving path is the one at risk. + +### C22. Harness interceptor logs IO call only after the mock returns — throwing mocks invisible +- severity: low +- confidence: CONFIRMED +- where: spec/harness.sx:make-interceptor (line 52) +- what: `append!` to the IO log happens after mock-fn returns; a raising mock leaves no + log entry, so assert-io-called/count falsely report "never invoked" on error-path tests. + (Positive side-finding: host IO errors DO surface as catchable SX errors.) 
+- repro: try-catch around `(fetch "http://a")` with mock raising "boom-io" → caught, but + IO log shows "(no IO calls)". + +### C23. adapter-dom test suite tests membership predicates only — zero render-output tests +- severity: low +- confidence: CONFIRMED +- where: web/tests/test-adapter-dom.sx (18 deftests) +- what: All 18 tests assert `*-is-a-render-form?`/RENDER_HTML_FORMS membership; none test + actual render-to-dom output (boolean attrs, on-*/bind/ref/key, reactive attrs, void + elements, hydration cursor). The 1512-line DOM adapter holding C19 + hydration handoff + is the thinnest-tested adapter relative to size. + +### C3. Dead stale-io-response guard in the epoch loop (13-char literal vs 14-char substring) +- severity: low +- confidence: CONFIRMED (reproduced live) +- where: hosts/ocaml/bin/sx_server.ml:5043-5046 +- what: `String.sub line 0 14 = "(io-response "` compares a 14-byte substring to a + 13-byte literal — never true, guard never fires. A stray `(io-response …)` line + falls through to dispatch and emits an extra `(error N "Unknown command …")` reply + instead of being silently discarded — an unexpected extra response for the client. +- repro: `printf '(epoch 1)\n(io-response 1 42)\n(eval "5")\n' | timeout 60 …/sx_server.exe` + → `(error 1 "Unknown command: (io-response 1 42)")` then the eval reply. + +### C4. Malformed `(epoch)` / `(epoch foo)` doesn't update the epoch — responses tagged stale +- severity: low +- confidence: CONFIRMED (reproduced live) +- where: hosts/ocaml/bin/sx_server.ml:5051-5054 +- what: Only `(epoch )` updates `current_epoch`; malformed epoch markers fall + through as unknown commands and the old epoch keeps tagging subsequent responses. + A client whose epoch line was mangled will discard every following response as + stale → hang. +- repro: `printf '(epoch)\n(eval "2")\n(epoch foo)\n(eval "3")\n' | …` → all replies + tagged epoch 0. + +### C5. 
No monotonic-epoch enforcement despite protocol comment claiming it +- severity: low +- confidence: CONFIRMED (reproduced live) +- where: hosts/ocaml/bin/sx_server.ml:5051-5054, :4952-4955 (comment at :259-262) +- what: `current_epoch := n` unconditionally; decreasing/repeated epochs accepted + silently, so client bugs/reordering mis-tag responses instead of being detected. +- repro: `printf '(epoch 9)\n(epoch 3)\n(eval "42")\n' | …` → `(ok-len 3 2)`. + +### C6. Two commands on one line → both dropped with a single error (client desync) +- severity: low +- confidence: CONFIRMED (reproduced live) +- where: hosts/ocaml/bin/sx_server.ml:5055-5056 +- what: A line parsing to >1 expr returns one `(error … "Expected single command, got 2")` + and executes neither — a pipelining client gets a response-count desync. Related: + a command with no preceding `(epoch N)` is answered under the previous epoch tag. + +### C7. vm-trace with compiler not loaded errors as "Not callable: nil" +- severity: low +- confidence: CONFIRMED (reproduced live) +- where: hosts/ocaml/bin/sx_server.ml:2193-2201 (vm-trace dispatch) +- what: If `lib/compiler.sx` isn't loaded (e.g. `load` failed because it is cwd-relative + and the server was started from hosts/ocaml), `(vm-trace "…")` fails with the opaque + `(error N "Not callable: nil")` instead of "compiler not loaded". Minor DX. +- repro: from hosts/ocaml: `printf '(epoch 1)\n(load "lib/compiler.sx")\n(epoch 2)\n(vm-trace "(+ 1 2)")\n' | timeout 60 _build/default/bin/sx_server.exe` + → `(error 1 "File error: …")` then `(error 2 "Not callable: nil")`. + +### C8. Stray triplicated source tree hosts/ocaml/hosts/ocaml/hosts/ocaml/ +- severity: low +- confidence: CONFIRMED +- what: A nested duplicate of the OCaml tree (154 KB Apr-9 copy of sx_server.ml etc.) + sits inside hosts/ocaml. Not built by dune, but a grep/edit-wrong-file trap. + +### C16. 
hosts/native is an orphaned but healthy PoC — builds green, referenced by nothing +- severity: low +- confidence: CONFIRMED +- where: hosts/native/ (own dune-workspace; lib_sx → ../../hosts/ocaml/lib symlink) +- what: SDL2/Cairo native SX renderer, single commit f0d8db9b (2026-04-09), unreferenced + by scripts/CI/docs. NOT bit-rotted: compiles against current `sx` library and its smoke + test passes (parse → 25-node render tree → layout → Cairo paint → PNG). Because it has + its own dune-workspace, the main build never compiles it, so silent breakage would go + unnoticed. Positive: it shares the evaluator by tracked symlink — no fork drift possible. +- repro: `cd hosts/native && eval $(opam env) && timeout 280 dune build ./test/test_render.exe` + → exit 0; run → `=== All OK! ===`. + +### C17. Tracked-but-dead browser files: sx-platform-2.js and 23 *.sxbc.json +- severity: low +- confidence: CONFIRMED (dead refs) / SUSPECTED (.sxbc.json fully unused) +- where: shared/static/wasm/sx-platform-2.js (Apr 16); shared/static/wasm/sx/*.sxbc.json +- what: sx-platform-2.js referenced by nothing. .sxbc.json written by mcp_tree.ml:1211 and + copied by compile-modules.js:355, but the client loads the `.sxbc` SX-text format — no + runtime reader found. All ~2.5 months stale vs their .sxbc twins. + +### C18. Untracked repo-root clutter: spa-debug.js, scripts/sx-sessions-restore.sh +- severity: low +- confidence: CONFIRMED +- what: spa-debug.js is a leftover Playwright compare of direct-load vs boosted SPA nav of + relate-picker (Jun 30 debugging); sx-sessions-restore.sh is an ops script embedding + session UUIDs + `claude --dangerously-skip-permissions` invocations. Commit, relocate, + or remove. + +### C9. run_tests spec suites print with empty suite label +- severity: low +- confidence: CONFIRMED (baseline log) +- what: spec test suites (e.g. 
test-math.sx) report as `FAIL: > string->number` with a + blank suite name in run_tests output — harness labeling defect, makes triage harder. + +--- + +## CROSS-HOST PARITY (CONFIRMED divergences — same expr, different result per host) + +Enumeration (900-name universe): spec declares 202 primitives; OCaml resolves 536, JS 668, +Python 112. Only 97 names common to all three. Spec primitives MISSING: OCaml 9 +(`eq? eqv? escape format format-date parse-datetime pluralize strip-tags sx-parse`), JS 6 +(`downcase eq? eqv? equal? json-encode upcase`), Python 108 (whole modules — see PY block). +Differential battery files in scratchpad: battery*.json, parity-js.js. + +### P1. OCaml float serialization is lossy — SX-text wire round-trip destroys precision +- severity: high +- confidence: CONFIRMED (self-verified) +- where: hosts/ocaml/lib/sx_types.ml:415-423 (format_number uses `%g` = 6 sig digits for fractional floats) +- what: All fractional floats print with 6 sig digits; JS prints shortest-round-trip. Any value + crossing the SX text wire (aser, epoch protocol, persisted SX) from OCaml silently loses precision. + Integral floats ≥1e16 use `%.17g` (lossless) — inconsistent within the same host. This is the + PRODUCTION evaluator, so it affects live wire payloads. +- repro (self-verified): OCaml `(number->string (/ 1 3))` → `"0.333333"`; `(+ 0.1 0.2)` → `"0.3"`; + `(= (parse-number (number->string (/ 1 3))) (/ 1 3))` → `false`. JS same → true / 0.3333333333333333. + +### P2. OCaml `sort` breaks on mixed int/float lists (polymorphic compare on constructor tag) +- severity: high +- confidence: CONFIRMED (self-verified) +- where: hosts/ocaml/lib/sx_primitives.ml:1290-1292 (`List.sort compare l`) +- what: OCaml polymorphic `compare` orders Integer before Number by variant tag, so numerically + mixed lists sort wrongly and silently. +- repro (self-verified): OCaml `(sort (list 1.5 10 2))` → `(2 10 1.5)`; JS → `(1.5 2 10)`. + +### P3. 
`into` is not native on the OCaml kernel — routed over the IO helper bridge, fails standalone +- severity: medium +- confidence: CONFIRMED (self-verified) +- where: OCaml resolves `into` but execution emits `(io-request N "helper" "into" …)` +- what: A spec core.dict primitive requires a live Python-side helper on OCaml; JS/Python have it + native. Any pure-OCaml context (tests, CLI, and notably the `eval`/`load-source` command paths + that don't service helper IO) can't use `into`. +- repro (self-verified): OCaml `(into {} (list (list :a 1)))` → io-request then + `error … "IO bridge: stdin closed while waiting for io-response"`; JS → `{:a 1}`. + +### P4. Exact-integer semantics diverge (OCaml int63 Integer vs JS float64) +- severity: high +- confidence: CONFIRMED (agent) +- where: hosts/ocaml/lib/sx_types.ml:47-48 (Integer distinct from Number); JS has only doubles +- what: Arithmetic differs beyond 2^53; literals above int63 silently become floats on OCaml with + different text form; exactness is host-dependent and not wire-preserved (both serialize `1.0` as + `"1"`, which OCaml re-reads as exact Integer). +- repro: `(+ 9007199254740992 1)` → OCaml 9007199254740993, JS …992; `(* 111111111 111111111)` → + OCaml 12345678987654321, JS …320; `(float? 1.0)` → OCaml true, JS false. + +### P5. JS `=` is not deep on dicts (spec: alias for equal?); `equal?`/`eq?`/`eqv?` missing from JS bundle +- severity: high +- confidence: CONFIRMED (agent + JS agent) +- where: hosts/javascript/platform.py:841 (sxEq falls through to false for plain-object dicts); + no PRIMITIVES["equal?"/"eq?"/"eqv?"] in the JS bundle (run_tests.js:126 patches them only into the test env) +- what: `=` deep-compares lists but reference-compares dicts on JS; OCaml deep-compares both. And the + spec equality trio is absent from the browser bundle — harness assertions (spec/harness.sx:82,85,88 + call equal?) and any component using equal? crash client-side. OCaml also lacks eq?/eqv? 
(has + non-spec identical? instead). +- repro: `(= {:a 1} {:a 1})` → OCaml true, JS false; `(equal? (list 1 2) (list 1 2))` → OCaml true, JS ERROR. + +### P6. String length/indexing units differ across hosts (UTF-8 bytes vs UTF-16 code units) +- severity: high +- confidence: CONFIRMED (agent) +- where: OCaml String.length (bytes) vs JS .length (UTF-16, platform.py:1386) +- what: Neither counts codepoints; any substring/index arithmetic on non-ASCII text diverges + cross-host (and both differ from user-perceived length). +- repro: `(len "héllo")` → OCaml 6, JS 5; `(len "🎉")` → OCaml 4, JS 2. + +### P7. JS lenient-coercion cluster vs OCaml strictness (silent wrong values on the client) +- severity: medium (aggregate; several individually medium/high) +- confidence: CONFIRMED (agent + JS agent) +- where: hosts/javascript/platform.py (str 1138, range 1380, len 1386, cons 1391, get 1385, parse-int 1500, split 1148, round 1058, slice 1167, max/min 1065) +- what: JS coerces where OCaml errors/strict → divergent results the client renders silently: + `(str (list 1 2))` JS "1,2" (OCaml "(1 2)"); `(str {:a 1})` JS "[object Object]" (OCaml "{:a 1}"); + `(range 3)` JS `()` (OCaml `(0 1 2)`); `(len nil)` JS 2 (OCaml 0); `(cons 1 nil)` JS `(1 nil)` + (OCaml `(1)`); `(get {:a nil} :a 42)` JS nil (OCaml 42); `(get "abc" 1)` JS "b" (OCaml nil); + `(parse-int "abc")` JS 0 (OCaml nil); `(parse-int "42px")` JS 42 (OCaml nil); `(split "abc" "")` + JS `("abc")` (OCaml `("a" "b" "c")`); `(round -2.5)` JS -2 (OCaml -3); `(slice "hello" -3)` JS "llo" + (OCaml "hello"); `(max)` JS -Infinity (OCaml errors); `(+ 1 "2")` JS "12" (OCaml 3); `(mod 10 0)` + JS NaN (OCaml Division_by_zero). + +### P8. 
nil/list-strictness family diverges, some arms violating the spec on OCaml's side +- severity: medium +- confidence: CONFIRMED (agent) +- where: OCaml sx_primitives.ml strict arms vs JS lenient (platform.py:1391-1392) +- what: `(merge nil {:a 1})` → OCaml ERROR (spec says skip nil → OCaml BUG) vs JS `{:a 1}`; + `(contains? {:a 1} :a)` → OCaml ERROR "contains?: 2 args" (spec: dict key check → OCaml BUG) vs JS true; + `(append nil (list 1))` → OCaml `(1)` vs JS ERROR; `(cons 1 2)` → OCaml ERROR vs JS `(1 2)`; + `(reverse "abc")` → OCaml ERROR vs JS "cba". + +### P9. Dict key ordering differs per host (Hashtbl vs insertion order) — SX text wire not canonical +- severity: medium +- confidence: CONFIRMED (agent; also self-verified in the Python block C27) +- where: OCaml dicts are Hashtbl (arbitrary bucket order); JS objects preserve insertion; Python preserves insertion +- what: keys/vals iteration order and serialized dict text differ per host — anything comparing or + hashing serialized SX text cross-host diverges. CBOR/CID layer is safe (sx_cbor.ml:67 sorts keys). +- repro: `(keys {:z 1 :a 2 :m 3})` → OCaml `("m" "z" "a")`, JS `("z" "a" "m")`. + +### P10. NaN/Infinity are not portable wire tokens; JS can't re-read its own output +- severity: medium +- confidence: CONFIRMED (agent) +- where: sx_types.ml:416-418 prints nan/inf/-inf (OCaml re-reads all forms) vs JS prints Infinity/NaN (resolves neither) +- repro: `(/ 1 0)` → OCaml inf, JS Infinity; eval `Infinity` → OCaml inf, JS ERROR undefined symbol. + +### P11. `upper`/`lower` Unicode behavior differs; `upcase`/`downcase` aliases missing on JS; `round` half-mode differs +- severity: medium +- confidence: CONFIRMED (agent) +- where: sx_primitives.ml:890-897 (uppercase_ascii) vs platform.py:1145 (full Unicode); JS lacks upcase/downcase; round half-mode +- repro: `(upper "héllo")` → OCaml "HéLLO", JS "HÉLLO"; `(upcase "x")` → OCaml "X", JS ERROR; + `(round -2.5)` → OCaml -3, JS -2. + +### P12. 
OCaml `zip-pairs` chunks pairs; spec + lib/stdlib.sx define a sliding window (kernel diverges from its own spec) +- severity: medium +- confidence: CONFIRMED (JS agent handoff) +- where: OCaml native zip-pairs vs spec/primitives.sx:663 + lib/stdlib.sx:207 (JS matches spec) +- what: OCaml native `zip-pairs` returns `((1 2)(3 4))` but the spec + stdlib define sliding window + `((1 2)(2 3)(3 4))`. Handoff to core, but the OCaml *host* is the one diverging from the spec. + +### PY (Python host). Standalone hosts/python/ cannot evaluate the current spec at all +- severity: critical (as a supported target) — see C30 for the standalone-host summary +- confidence: CONFIRMED (agent) +- where: hosts/python/bootstrap.py:119 (_mangle), generated lines 1564/3482/3498; spec `match` form +- what: Four independent breaks in the Python bootstrapper: (1) _mangle RENAMES lacks `*winders*` + → SyntaxError; (2) no rule for `->` in names → `def string_>symbol` invalid; (3) the `match` + special form is emitted as an eager function call → undefined `match`, and even shimmed it + evaluates all branches eagerly (core dispatch unrunnable); (4) references missing natives + make-char/aser-call/aser-fragment. Python resolves only 112 of the 900-name universe and lacks + 8 primitive modules (math/bitwise/bytevectors/vectors/regexp/sets/rational/hash-table = 108 spec + primitives). This overlaps/extends C30. NOTE: shared/sx/ Python (parser+bridge) IS load-bearing + in production; the standalone evaluator host is what's dead. +- repro: `python3 hosts/python/bootstrap.py > /tmp/x.py && python3 -c "import x"` → SyntaxError line 1564; + after mangle-patch, `(+ 1 2)` → `name 'match' is not defined`. + +--- + +### (informational) ok-len framing is byte-accurate and desync-resistant +- confidence: CONFIRMED — length prefixes are byte counts; newlines/framing patterns in + payloads are escaped; multibyte UTF-8 counted correctly. (Positive result.) 
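A minimal Python sketch of why the byte-counted length prefix is the property that matters, given the unit mismatch recorded in P6 (OCaml counts UTF-8 bytes, JS counts UTF-16 code units). The `(ok-len EPOCH NBYTES)` header shape is taken from the `(ok-len 3 2)` reply quoted under C5; the layout assumed here (header line, payload, trailing newline) and the names `frame`, `read_frame`, and `utf16_units` are illustrative assumptions, not the server's code. Payload escaping is not modelled: the reader relies on the byte count alone.

```python
import io


def frame(epoch, payload):
    """Encode a reply whose header carries the payload's BYTE length (assumed layout)."""
    body = payload.encode("utf-8")
    header = "(ok-len {} {})\n".format(epoch, len(body)).encode("ascii")
    return header + body + b"\n"


def read_frame(stream):
    """Read one framed reply from a binary stream, consuming exactly NBYTES bytes."""
    header = stream.readline().decode("ascii").strip()
    tag, epoch, nbytes = header.strip("()").split()
    if tag != "ok-len":
        raise ValueError("unexpected header: " + header)
    body = stream.read(int(nbytes))
    stream.read(1)  # discard the trailing newline after the payload
    return int(epoch), body.decode("utf-8")


def utf16_units(text):
    """Length in UTF-16 code units, i.e. what a JS `.length` would report."""
    return len(text.encode("utf-16-le")) // 2


if __name__ == "__main__":
    # Two replies back to back; the second contains a newline and non-ASCII text.
    wire = io.BytesIO(frame(3, "42") + frame(4, "héllo\n🎉"))
    print(read_frame(wire))
    print(read_frame(wire))
    # Same string, three different "lengths" depending on the unit.
    for s in ("héllo", "🎉"):
        print(s, len(s), len(s.encode("utf-8")), utf16_units(s))
```

A reader that counted characters or UTF-16 units instead of bytes would stop short inside the second payload and desynchronise on the next header, which is the failure mode the byte-accurate prefix rules out.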
+ +### (informational) WASM browser kernel boots green; kernel is single-source with server +- confidence: CONFIRMED — test_boot.sh → `WASM boot: OK`; test_wasm.sh → 29/29; + test_kernel.js → 24/24; deployed kernel artifacts byte-identical to _build outputs. + Browser links the identical `sx` OCaml library as sx_server (no evaluator fork); + only sx_browser.ml FFI glue is browser-specific. (Positive result.) + +--- + +## SUSPECTED (reasoning only) — NOTE: S1/S4/S5 UPGRADED to CONFIRMED via live HTTP repro + +> Follow-up live testing (bounded `sx_server --http` on free ports, 3 runs) upgraded three of the +> HTTP-race items and clarified two more: +> - **S1 CONFIRMED** — observed a live intermittent crash (empty responses, no exception, OOM ruled +> out at ~292 MB peak) under concurrent multi-Domain load. +> - **S4 CONFIRMED** — routing-failure page served as HTTP 200 and cached (cold 2.02s → warm 0.0005s). +> - **S5 CONFIRMED** — cache key ignores cookies (3 different `session=` → identical cached body) and query. +> - **S2/S3 remain SUSPECTED** — mechanism verified live in the code path, but no visible symptom is +> triggerable in the stock static-docs app (no full-page route renders request args / no +> expansion-sensitive component to observe). They become observable on request-varying apps. +> Their entries below carry updated per-finding confidence lines. + +### S1. 
HTTP mode: shared mutable state raced across parallel render-worker Domains — LIVE CRASH OBSERVED +- severity: high +- confidence: **CONFIRMED (live crash reproduced once; race is real by construction)** — was SUSPECTED +- where: hosts/ocaml/bin/sx_server.ml:4717 (`Domain.spawn` × `max 4 (recommended_domain_count)`), + :4312/:4677 (`Hashtbl.replace response_cache` from workers + main), :4654 (main reads it), + :2666-2670 (`env.bindings` bind+remove), :345 (`_stream_mutex` created but NEVER locked) +- what: `http_mode` spawns 4 real `Domain.spawn` workers rendering full-page requests on the + single shared env, concurrently with the accept-domain doing AJAX renders. There is no lock + around rendering. Workers + main mutate process-global Stdlib.Hashtbls (`response_cache`, + `env.bindings`) concurrently; OCaml 5 Stdlib.Hashtbl is NOT domain-safe (memory corruption on + rehash → segfault, no catchable exception). The comment at :4204 claiming cache writes happen + "only during single-threaded startup" is false (workers write at runtime); the :343-345 comment + admits "concurrent CEK evaluations corrupt shared state". +- **LIVE REPRO (self-verified).** Booted `sx_server --http ` on a free port (3 runs). Under a + burst of ~300 concurrent AJAX + ~150 concurrent full-page requests to distinct uncached paths + (all 4 worker Domains + main thread writing `response_cache`/`env.bindings` simultaneously), + **run 2 CRASHED**: the process vanished mid-render — no OCaml exception logged (the try/with at + :4315 would have printed `[render] Error`; none present), no "Binary changed" restart, and a + concurrent poll of a known-good cached page began returning EMPTY bodies (md5 of "") for 59/60 + reads as the server died. **OOM ruled out**: run 3 under an even heavier burst survived with peak + RSS ~292 MB (baseline ~200 MB) — memory never approached exhaustion. 
The crash is INTERMITTENT + (run 1 clean, run 2 crashed, run 3 survived) — the classic signature of a data race, not a + deterministic fault. Net: the production HTTP server can crash under concurrent load; I couldn't + capture a core dump (apport core_pattern, restricted) to pin the exact fault address, so the + "Hashtbl-rehash-segfault" mechanism is inferred from the crash profile + the unsynchronized-access + code, not a stack trace. Logs: /tmp/sx-review/http-server{,2,3}.log. + +### S2. HTTP mode: per-request state is process-global → cross-request contamination +- severity: high +- confidence: SUSPECTED (mechanism confirmed live; visible symptom not triggerable in the stock docs app) +- where: sx_server.ml:4335-4354 (main domain sets `_req_method/_req_query/_req_headers/_req_body` + + `Hashtbl.clear/replace _request_cookies`), :4703 (full-page render QUEUED to a worker with only + `(fd, path, headers)` — NOT the request context), consumed by request-* primitives at :3772-3829 +- what: I confirmed live that full-page misses are **queued** to worker Domains carrying only the + path, while the worker's render reads the process-global `_req_query`/`_req_body`/`_request_cookies` + that the main loop overwrites on the very next request. So a queued render can read a *subsequent* + request's query/body/cookies (plus a plain main-writes-while-worker-reads data race). I could NOT + produce a visible crossed-value symptom because the sx-docs app has no full-page route that renders + `(request-arg …)`/`get-cookie`/body into its output — the docs pages are static. The contamination + is therefore structurally present and live-reachable but only becomes observable for an app whose + full-page renders vary by request args (e.g. any of the real Quart-replacement domains). Kept + SUSPECTED honestly: mechanism verified, symptom not exhibited on the stock app. + +### S3. 
`expand-components?` global binding removed mid-flight by AJAX render +- severity: medium +- confidence: SUSPECTED (bind/remove verified live in code path; visible symptom not isolated) +- where: sx_server.ml:4157 (bound globally = true for ALL renders), :2666/:2670 (AJAX branch binds + then `Hashtbl.remove env.bindings "expand-components?"`) +- what: http_mode binds `expand-components?` once globally so full-page worker renders expand + components. The AJAX branch of http_render_page binds and then REMOVES it on the SAME shared env — + so after any AJAX request the global binding is gone, and a concurrent (or subsequent) full-page + aser render can observe it vanished (components silently not expanded), plus the remove races + worker lookups. Confirmed the bind/remove-on-shared-env code path executes during live AJAX + requests (the log shows "[sx-http] … (SX) aser=" for every AJAX hit); did not isolate a + visibly-unexpanded component in output because the docs pages' aser output is SSR-expanded through + a different path. Mechanism confirmed; symptom not pinned. + +### S4. Response cache stores soft error pages (routing failures cached as HTTP 200) — LIVE CONFIRMED +- severity: medium +- confidence: **CONFIRMED (self-verified live)** — was SUSPECTED +- where: sx_server.ml:2635-2658 (`error_page_ast` returned as `Some body`), cached at :4312/:4677 +- what: A routing failure renders `error_page_ast` returned as a normal `Some body` and served with + HTTP **200**; the worker/ajax paths cache any `Some body`. So the not-found/error page is cached + and served from cache to every subsequent visitor until restart. (Hard 500s from raised exceptions + at :4315 are NOT cached; soft error pages ARE.) The "transient cross-service failure" sub-case + (a fetch error rendered into the page, then cached) is reasoned from the same code path but needs + the Python bridge to trigger, so it stays inferred. 
+- repro (self-verified): `GET /sx/` → HTTP 200 containing the "404"/not-found + page; a 2nd GET of the same path returned byte-identical bytes served from cache — cold render 2.02s + vs warm 0.0005s (~4000×), and the server log shows only ONE render line for that path. + +### S5. Response cache keying ignores cookies/query/auth (except one hardcoded cookie) — LIVE CONFIRMED +- severity: medium +- confidence: **CONFIRMED (self-verified live)** — was SUSPECTED +- where: sx_server.ml:4296/:4651-4654 (`cache_key` = path, +ajax:/htmx: prefix), :4649 (only + `sx-home-stepper` bypasses the cache), :4331-4332 (query stripped before keying) +- what: The cache key is the path only; cookies (except the hardcoded `sx-home-stepper`) and the + query string are absent from it. Any page whose output varied by cookie/session/auth or query would + be cached under a shared key and served across users. Benign for the current public docs (pages + don't vary by cookie), but a real footgun for the cookie/auth-dependent Quart-replacement domains. +- repro (self-verified): `GET /sx/(geography.(hypermedia))` with three different `Cookie:` headers + (different `session=` values) → three byte-identical responses from one cache entry; same path with + `?u=1` vs `?u=2` → byte-identical (query stripped from key). + +### S8. Browser env missing spec platform constructors + a small primitive tail — latent break for shared modules +- severity: medium +- where: sx_browser.ml:626+ (91 server-only binds vs 22 browser-only; lists in scratchpad + server-binds.txt / browser-binds.txt) +- what: `make-lambda/make-component/make-macro/make-thunk`, `now`, `json-encode`, + `parse-safe`, `pretty-print`, `into` are bound only in sx_server's env, absent + client-side. Currently harmless (spec/evaluator.sx isn't shipped to the browser; the + only consumer of the latter group is web/io.sx, not bundled) — but any shared .sx + module that starts using them silently breaks in the browser only. + +### S9. 
Aser/SPA boosted-nav component expansion still fragile — hinges on boot-manifest eager-load fed by stale artifacts +- severity: medium +- where: web/adapter-sx.sx; sx-platform.js:649-660 (page-manifest boot array); + lib/host/sx/relate-picker.sx; bundle.sh:74-77 +- what: The April blocker was fixed (ac65666f) and Jun-29 commits added server-embedded + `data-sx-manifest` eager-loading, but the untracked Jun-30 spa-debug.js shows boosted-nav + expansion of the picker was still being debugged the next day — and both the stale served + compiler (C10) and stale module-manifest.sx (C11) sit directly on this code path. + +### S10. VM inline-resolve of IO inside HO-primitive callbacks cannot suspend/resume (async-IO correctness cliff) +- severity: high +- confidence: SUSPECTED (documented + code-confirmed; not reproducible through the harness — see C21) +- where: hosts/ocaml/lib/sx_vm.ml:call_closure_reuse (~349-375); `_cek_io_resolver` installed at sx_server.ml:4166 +- what: When `perform` fires inside a JIT/VM HO-primitive callback (map/filter/reduce/for-each), + the native OCaml loop sits between perform and resume, so it can't unwind-and-resume; it + resolves IO inline via a local `settle` loop. The in-code comment states the alternative + "would drop the remaining elements — corrupting the stack so the next CALL_PRIM sees wrong + args." This works only because durable-KV reads are synchronous; a genuinely async IO op + inside an HO callback on the serving path is not correctly supported. This is the same + hazard class as the memory note about serving JIT dropping all-but-first in map/for-each. + +### S11. 
Server page routing evaluates the URL as SX — any env-bound function is URL-invokable +- severity: medium +- confidence: SUSPECTED +- where: web/request-handler.sx (sx-url-to-expr, sx-auto-quote, sx-eval-page/call-page); + driven by sx_server.ml:http_render_page:2620 (sx-handle-request) +- what: sx-url-to-expr strips `/sx/`, splits on `.`, joins with spaces; call-page parses and + invokes it. sx-auto-quote only quotes *unbound* symbols, so a bound symbol as list head is + called with attacker-supplied (recursively evaluated) args. A path with parens (a legal + path char) can reach `(env-get env "some-fn")` + apply for any render-env binding. cek-try + swallows errors (→ nil/error page) limiting impact, but this is an SSR eval surface worth + whitelisting to `page:`-prefixed bindings. Couldn't confirm whether the OCaml front door + pre-sanitizes the path. + +### S12. Island hydration replaceChildren-then-render → hydration cursor reuses no SSR DOM +- severity: low +- confidence: SUSPECTED +- where: web/boot.sx:hydrate-island (~333-350) +- what: hydrate-island calls `(host-call el "replaceChildren")` (empties the element) then + renders the body in `sx-hydrating` mode. But the hydration matcher reads childNodes.item(idx) + on the just-emptied element, so it reuses nothing — effectively clear-and-fresh-render. If + intended, the sx-hydrating scope push is dead work; if not, element-rooted islands never + reuse SSR DOM. Not runnable through the DOM path to confirm. + +### S13. SSR/client parity relies on island body being pure over serialized kwargs (no dev-mode check) +- severity: low +- confidence: SUSPECTED +- where: web/adapter-html.sx:render-html-island + serialize-island-state (608) vs web/boot.sx:hydrate-island +- what: Only kwargs are serialized into data-sx-state; a body reading request-scoped/ + non-deterministic IO (now, request-arg, random) or scope/context values not in kwargs + produces SSR HTML differing from the client render → hydration mismatch. 
Inherent to the + design, but there's no guard or dev-mode mismatch check and no test for a non-pure body. + +### S14. Deep (≥2-level) nested-list children flatten differently between HTML and aser +- severity: low +- confidence: SUSPECTED +- where: spec/render.sx render-list path (recursive) vs web/adapter-sx.sx:aser-call/aser-fragment (one-level flatten) +- what: HTML recursively flattens arbitrarily nested child lists to text; aser flattens one + level and serializes deeper lists structurally. Single-level map output agrees (tested), + but 2+-level raw nesting could serialize differently between modes. No test covers depth ≥ 2. + +### S-bridge. Python↔OCaml bridge desync on coroutine cancellation is unrecoverable (dead _restart, no timeouts) +- severity: high +- confidence: SUSPECTED (bridge-side reasoning; kernel-side stale-message mechanism self-verified) +- where: shared/sx/ocaml_bridge.py:99 (_restart, 0 callers), :613 (_read_until_ok), :139 (async with self._lock) +- what: render/eval/aser hold self._lock while blocked in _read_until_ok awaiting a kernel io-response. + If the awaiting coroutine is cancelled (client disconnect / timeout), the `async with` releases the + lock while the kernel is still blocked in read_io_response for the OLD epoch. The next coroutine + acquires the lock and sends `(epoch N+1)\n(command)`; the kernel — still inside read_io_response — + consumes both lines as "stale messages" and keeps blocking, so the new command never runs and the + new coroutine hangs on _readline forever. No asyncio.wait_for wraps bridge calls, and _restart() + pipe-recovery is never invoked, so the bridge can't self-heal — it stays wedged until the process + is killed. Under a shared multi-Domain HTTP server (S1/S2) this is a real liveness hazard. 
+- repro (self-verified, kernel-side mechanism): `printf '(epoch 1)\n(eval "(helper \"w\" 1)")\n(epoch 2)\n(eval "(+ 1 2)")\n' | timeout 20 hosts/ocaml/_build/default/bin/sx_server.exe` + → `(io-request 1 "helper" "w" nil)`, then `[io] discarding stale message (…)` ×2 eating the epoch-2 + command, then the epoch-2 eval never runs. + +### S-bridge2. `_parse_response` treats a numeric result value as an epoch (ambiguity) +- severity: low +- confidence: CONFIRMED (agent) +- where: shared/sx/ocaml_bridge.py:915, shared/sx/ocaml_sync.py:115 +- what: `(ok EPOCH VALUE)` parsing strips a leading number as the epoch, but the guard + `line[4:-1].isdigit()` classifies `(ok 3)` as epoch-only-no-value and returns `('ok', None)` instead + of `('ok','3')`. Latent (kernel always emits `(ok EPOCH VALUE)`), but the digit-sniffing epoch strip + is fragile for any value that begins with a digit. test_ocaml_bridge.py has 5 live failures, including this one. + +### S6. io_request-based primitives under `eval` can consume subsequent command lines +- severity: low +- where: sx_server.ml:617-669 (`query`/`action`/`helper` → blocking `read_io_response`), + `eval` dispatch at :1816 +- what: `query`/`action`/`helper` block reading the next stdin line as an IO response; + under a client that doesn't speak the IO sub-protocol, the next queued command is + consumed → desync/hang. + +### S7. Multiple eval entry points with divergent IO semantics for the same source +- severity: low +- where: `eval` (:1816-1849, import+persist inline), `load-source` (:1772-1783, no + IO handling), `load` (:1755, full `eval_expr_io`), http render (:2575, import→nil) +- what: The same source behaves differently depending on the wrapping command — + the known "unify eval/JIT paths" TODO surfacing as behavioral drift. + +--- + +## Notes / baseline caveats + +- 272/274 baseline run_tests failures are the in-progress hyperscript (`hs-*`) suites — + guest-language project, not host defects. 
Undefined symbols: `hs-is-set?` (16), + `eval-hs-error` (18), `hs-ref-eq` (4), `host-hs-normalize-exc` (2), plus DOM-mock + assertion failures. +- run_tests JIT is opt-in (`--jit`, run_tests.ml:4035). Baseline ran interpreter-only + (`[jit] calls=3045835 hit=0 miss=0 skip=3045835`). A `--jit` differential run is in + progress to diff pass/fail sets — results pending. +- `sx_scope.ml` does not exist on this branch (only in sibling worktrees); memory/docs + referencing it are stale for `architecture`. Scope prims live in sx_primitives.ml + as process-globals (see S1/handoffs). + +## Handoffs to other lanes + +- core: served browser JIT compiler missing 921db09f's HO-loop desugar (see C10) — the + browser-side `--jit == CEK` parity fix needs a re-sync + recompile pipeline fix. +- conformance: module-manifest format split (C11) — regenerate `.sx` or point + sx-platform.js at `.json`; until then hs-worker/hs-prolog never lazy-load client-side. +- conformance: hosts/native has a working test nothing runs; test_platform.js needs + `web/signals.sx` → `web/web-signals.sx` to restore the browser platform test tier. +- hygiene: unify dist/ vs shared/static/wasm/ behind one sync script; delete the ignored + hosts/ocaml/hosts/ nest, tracked stale hosts/ocaml/shared/static/wasm blobs, + sx-platform-2.js, and (if reader-less) *.sxbc.json. +- core (spec/render.sx parity): render-attrs renders non-BOOLEAN_ATTRS boolean-valued attrs + as `attr="true"`/`attr="false"` while adapter-dom emits `attr=""`/omits — pick one contract + and align all four adapters; add conformance cases (see C19). +- conformance (spec/harness.sx): (1) log the IO call before invoking the mock so throwing + mocks are recorded (C22); (2) add a harness mode that routes IO through real + perform/cek-resume so the HO+perform element-drop cliff (S10) becomes testable (C21). 
+- host (sx_vm.ml): add a targeted conformance test performing IO inside a JIT-compiled HO + callback over a function-produced list — the inline-settle path is a correctness cliff + for any future genuinely-async seam IO op (S10). +- core: `_scope_stacks` / `_request_cookies` / `_req_*` being process-global means + scope/provide/emit! and request primitives are not isolable per concurrent flow; + spec scope semantics assume single-flow evaluation. Needs a per-flow story if the + OCaml HTTP server stays multi-Domain. +- core: eval vs load-source vs load vs http-render resolve `import`/durable IO + differently — converge on one IO-resolution policy. +- core/eval: `scope`/`emit!` appears to LEAK across invocations — `(scope (emit! :k 1) + (emit! :k 2) (len (emitted :k)))` returns 2, 4, 6… cumulatively across epoch-server calls, + with JIT disabled too (so not a VM bug — looks like a scope-stack state leak in scope-in-frames + or the epoch env's scope stack). Flagged by the VM/JIT agent. +- core/docs: `let` is SEQUENTIAL (let*) in BOTH CEK and VM (`(let ((x 2) (y x)) y)` → 2 with + outer x=1) — contradicts the documented "let is parallel" authoring rule. Engines agree; docs don't. +- core/spec: nested quasiquote evaluates inner unquotes at depth 2 on both OCaml and JS + (```` `(a `(b ,(c))) ```` errors) — nonstandard but consistent; needs a spec ruling. +- core/spec: `=` deep-on-dicts, `eq?`/`eqv?`/`equal?`, `upcase`/`downcase`, `json-encode`, + 1-arg `range`, empty-sep `split`, `parse-int` failure, `get` stored-nil-vs-default, `(max)`, + `round` half-negative, string-length units, exactness-across-wire — all need normative rulings; + then each host aligned. OCaml wire float printing (`%g`) should be shortest-round-trip; OCaml + `sort` numeric-compare Integer/Number; OCaml `into`/`contains?`-on-dict/`merge`-skip-nil made + native+conformant; OCaml `zip-pairs` reconciled with its own spec. 
+- conformance: NO differential CEK-vs-JIT parity suite exists — all four silent-wrong-value JIT + bugs (J1 `->`, J3 macro-arg, J4 component-kwargs, J5 stale opcodes) pass the existing tests. + A harness that runs each conformance expr both interpreted and through a forced-JIT wrapper + would catch them. Also add a cross-host differential battery (scratchpad battery*.json) — the + spec corpus currently runs green only against OCaml; the JS bundle has 2490 failures and the + Python standalone host can't load. +- conformance: add a protocol-fuzz suite for the epoch loop (malformed line must not + crash; `(epoch)`/`(epoch foo)`; stray `(io-response …)`; two-exprs-per-line). +- conformance: no suite catches the r7rs `string->number` radix shadow — the spec + test fails only in the OCaml runner env; JS/Python runner r7rs load status differs. + +## Recommended triage / next steps + +This is a reviewer's suggested reading order, NOT a verdict — severity/confidence tags on each +finding are the ranking signal; a maintainer decides what blocks a release. No new claims here; every +item points back to the evidence above. + +**Confirmed-severe clusters, in suggested blocker order:** + +1. **Production serving-JIT silently miscompiles (J1–J8).** Highest concern because it produces + *wrong values with no error* on the live sx.rose-ash.com HTTP path (JIT hook is unconditional at + sx_server.ml:4163 — see the reconciliation note; the "opt-in/default-off" belief does not apply to + http_mode). J1 (`->` in arg position) and J3 (macro args) return wrong results; J2 double-applies + side effects on fallback (a data-integrity risk for persist/counter writes). Blast radius depends + on whether rendered lambdas hit those patterns — a fleet audit of served pages, or disabling the + http_mode JIT hook until fixed, is the pragmatic call. Fix lives in lib/compiler.sx (compile-thread-step + at 934-951 for J1) + macro handling in the compiler/VM. + +2. 
**Process-killing parse crash (C1 / C1b).** One malformed or non-ASCII line on the command channel + kills the whole sx_server process — it's the shared channel for the Python bridge and conformance + runners. Small, localized fix (wrap the command-channel parse at sx_server.ml:5048/:4950 in a + handler that catches Parse_error). Cheap to fix, high blast radius; good early win. + +3. **HTTP serving path breaks under concurrency (S1 / S4 / S5).** S1 crashed the live server + intermittently (data race on unsynchronized cross-Domain Hashtbls, OOM ruled out) — needs either a + lock around cache/env mutation or per-Domain state. S4 (error pages cached as 200) and S5 (cache + key ignores cookies/query) are cheap keying fixes but S5 becomes a cross-user data-leak on any + cookie/auth-varying domain, so fix it before the Quart-replacement domains go live on this server. + +4. **Hollow JS bundle + red JS CI (C0a / C0b).** Breaks conformance/CI, not visitors (the live site + serves the wasm kernel). Fix is a define-library-aware `extract_defines` (hosts/javascript/platform.py:15) + plus loading the newer lib deps + make-rtd/dom prims — or, if the JS host is being retired, stop CI + gating on it. A product/ownership decision as much as a code fix. + +**Not-yet-symptomatic, needs a real app to exercise:** +- S2 (queued workers read process-global request context) and S3 (`expand-components?` removed on the + shared env) are mechanism-confirmed but only exhibit a visible symptom on a full-page route that + varies by request args/cookies. Re-run the S1-style concurrent-load repro against one of the real + Quart-replacement domains (not the static docs site) to turn these CONFIRMED or clear them. 
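
Of the cluster-3 fixes above, the S5 keying change is small enough to sketch. The following is an illustration in Python, not the sx_server.ml code: `cache_key`, `vary_cookies`, and `prefix` are hypothetical names, the choice of `session` as the varying cookie is taken only from the S5 repro, and the real fix would live in the OCaml cache path. It assumes the key should fold in the query string plus only those cookies a page's output can vary on.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit


def cache_key(url, cookies, vary_cookies=("session",), prefix=""):
    """Build a response-cache key from path, query, and output-varying cookies.

    Hypothetical helper: `vary_cookies` names the cookies whose values change
    the rendered output; `prefix` stands in for the ajax:/htmx: prefix.
    """
    parts = urlsplit(url)
    # Sort query pairs so ?a=1&b=2 and ?b=2&a=1 share one cache entry.
    query = urlencode(sorted(parse_qsl(parts.query, keep_blank_values=True)))
    # Only cookies that affect output enter the key, so unrelated cookies
    # do not fragment the cache.
    varying = sorted((name, cookies[name]) for name in vary_cookies if name in cookies)
    # Hash the cookie part so session values never appear in keys or logs.
    digest = hashlib.sha256(repr(varying).encode("utf-8")).hexdigest()[:16]
    return f"{prefix}{parts.path}?{query}#{digest}"
```

With this shape, `?u=1` and `?u=2` land in different entries, and so do two requests carrying different `session=` values, which are the two cases the S5 repro showed collapsing into one entry.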
+ +**Cross-cutting, lower urgency:** +- Cross-host parity (P1–P12) and the Python boundary/bridge issues (C24–C31, S-bridge) matter most if + the JS/Python hosts are still targets; if OCaml is the sole production evaluator, prioritize the + OCaml-side ones that affect the wire (P1 lossy float, P9 dict order, P3 `into`). +- Hygiene items (C8, C14–C18) are safe cleanups, no behavior change. + +**Handoffs above** route the core/conformance-owned items (scope leak, let-sequential docs, the missing +CEK-vs-JIT differential suite) to the sibling review lanes — not this lane's to fix.
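
For the S-bridge liveness hazard listed among the cross-cutting items, the missing timeout-plus-recovery wrapper might look roughly like this. It is a minimal sketch under stated assumptions: `GuardedBridge`, `send`, and `restart` are hypothetical stand-ins, not the ocaml_bridge.py API, and the evidence does not show what the real `_restart` does or takes. The idea is only that a timed-out or cancelled call must recover the pipe before the lock passes to the next caller.

```python
import asyncio


class GuardedBridge:
    """Wrap a bridge round trip with a timeout and recovery on abandonment.

    `send` and `restart` are hypothetical async callables standing in for the
    bridge's request/response exchange and its pipe recovery.
    """

    def __init__(self, send, restart, timeout=5.0):
        self._send = send
        self._restart = restart
        self._timeout = timeout
        self._lock = asyncio.Lock()

    async def call(self, command):
        async with self._lock:
            try:
                return await asyncio.wait_for(self._send(command), self._timeout)
            except (asyncio.TimeoutError, asyncio.CancelledError):
                # The kernel may still be blocked on the abandoned request,
                # so the pipe cannot be trusted. Recover it while the lock is
                # still held; shield keeps a second cancellation from
                # interrupting the recovery itself.
                await asyncio.shield(self._restart())
                raise
```

Holding the lock across the restart is the design point: the next coroutine can never send `(epoch N+1)` into a kernel that is still waiting on epoch N.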