plans: SX review master remediation plan + evidence

Consolidates the three-lane review (core K01-K110, hosts J*/C*/JS*/P*/S*, conformance F1-F15) into plans/sx-review/:

- PLAN.md: 15 workstreams, phased execution, full per-finding coverage ledger (all ~213 finding-instances mapped to a workstream + status)
- RULINGS.md: 40 draft normative rulings (Phase-0 gate)
- core.md / hosts.md / conformance.md: the lane evidence files

dc7aa709 quick-wins batch marked DONE in the ledger; K01 (guard re-raise hang), S1 (live HTTP crash), K03 (shift-k), and W14 (test gate) flagged as the highest-value open work.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-03 21:28:41 +00:00
parent 72a3989fed
commit 4f766ea4f1
6 changed files with 2768 additions and 0 deletions
--- a/plans/sx-review/PLAN.md
+++ b/plans/sx-review/PLAN.md
@@ -0,0 +1,343 @@
+# SX Review — Master Remediation Plan
+
+Consolidates every finding from the three parallel review sessions (2026-07-03):
+- `core.md` — language core / spec semantics (K01–K110)
+- `hosts.md` — per-host implementations + FFI (J*, C*, JS*, P*, S*, PY)
+- `conformance.md` — cross-host agreement + test adequacy (F1–F15, conf-S1–S5)
+- `RULINGS.md` — 40 draft normative rulings (R1–R40) that gate the ambiguity fixes
+
+**How to read this.** Findings are grouped into workstreams (W1–W15). Each workstream lists the
+finding IDs it resolves, the approach, what ratified ruling(s) it needs, and status. The full
+per-ID coverage ledger is at the bottom — every finding maps to a workstream + status, so nothing
+is silently dropped. `[DONE]` = landed in commit dc7aa709 (quick-wins batch). `[GATE]` = blocked on
+a Phase-0 decision. `[dup→Kxx]` = same defect found by another lane, fixed once.
+
+**Prime directive from the review:** the verification infrastructure currently cannot tell you
+whether a fix works (runner envs diverge from production, the WASM kernel never runs the corpus,
+the JS gate is structurally red, one test passed *because of* the bug it tested). So Phase 1 Track A
+(gate repair) comes before the bulk of the semantic work — otherwise fixes land blind.
+
+---
+
+## Phase 0 — Decisions (BLOCKING; maintainer; no code)
+
+Nothing in Phases 2+ that changes observable semantics should merge before the relevant ruling is
+ratified. These three decisions unblock ~40 findings.
+
+### D1. Host lineup
+Evidence: the JS-transpiled bundle is hollow (C0a: define-library files → 0 bytes) and its gate is
+red (C0b: 2490/5086 fail); nothing serves it. The standalone Python host cannot load (C30/PY).
+Production = OCaml native + WASM kernel (one OCaml library) + the load-bearing Python parser/bridge
+in `shared/sx/`.
+**Recommendation:** declare the kernel family the only evaluator targets; retire `hosts/javascript` +
+`hosts/python` standalone; shrink `shared/sx/parser.py` to a wire-subset with a parity suite.
+→ Ratifying this **closes W13 entirely** (C0a/C0b/JS1–JS8 become "delete") and simplifies W6/W7.
+
+### D2. Ratify RULINGS.md (R1–R40)
+Each ruling is one normative answer + one mechanical fix. Ratify in a pass; four need a
+pre-ratification usage sweep because they're high-churn: **R17** (arity: kill nil-fill), **R9**
+(cond flat-only), **R31** (append! errors on derived lists), **R15a** (HO swap only when
+unambiguous). See RULINGS.md for the per-ruling recommendation.
+
+### D3. Define the merge gate
+Recommendation: (a) native `run_tests` green with hs-upstream skip-listed; (b) same corpus on the
+WASM kernel; (c) cross-kernel differential battery output-identical; (d) CEK-vs-forced-JIT
+differential when JIT is on; (e) `sx_ref.ml` regen + diff. This is W14's definition of done.
+
+---
+
+## Phase 1 — Trustworthy verification + stop the bleeding
+
+### W14. Test gate & conformance infrastructure  *(do FIRST — everything else verifies against it)*
+Findings: C0b, C9, C21, C22, C23, C3, C4, C5, C6, C7, F2, F3, F4, F5, F6, F7, F8, F9, F10, F11,
+F12, K19 (harness/runtime primitive drift, partial from batch), K104 (harness log-before-mock).
+Approach:
+1. **Unify runner env with production env** — delete or productionize every runner-only binding:
+   `values`/`call-with-values` (F7, K42), the JS runner's fake sha3/equal?/apply/env-set! shims
+   (JS5, F7). Rule: if the spec needs it, it's a kernel primitive; if not, the test can't have it.
+2. **WASM corpus runner** in CI (F2) — promote conformance's `run_wasm.js` prototype.
+3. **MCP harness honesty** (K19): `mcp_tree.ml` drops its parallel primitive table and links real
+   `sx_primitives` (batch aligned 8 entries as a stopgap); make `sx_harness_eval` fresh per call.
+4. **Harness fixes**: log IO before invoking the mock (C22/K104); real perform/suspend mode (C21);
+   adapter-dom render-output tests (C23).
+5. **Epoch-loop protocol fuzz suite** (C3/C4/C5/C6/C7) + skip-list hs-upstream (F10) + empty suite
+   label (C9).
+6. **Test-debt ledger**: pin every confirmed finding with a failing test FIRST — the three lane
+   files are a ready-made corpus of minimal repros. **Batch gap to close: dc7aa709's fixes have no
+   pinning tests** (except crit-2, now non-vacuous). Add tests for K09, K11, K18, K20, K39, K49,
+   C1/C1b, S4 before further evaluator work.
+Gate: none (this IS the gate). Status: OPEN — highest priority.
+
+### W1. Condition system & delimited continuations  *(the kernel criticals)*
+Findings: K01 (guard/handler re-raise hang — CRITICAL), K03 (shift-k nested cek-run double-exec —
+CRITICAL), K10 (dynamic-wind re-entry + sibling winder corruption), K12 (`->` non-HO steps in
+nested CEK), K36 (guard multi-expr clause body — inherits cond fix W5), K41 (host errors uncatchable
+by guard), K57 (strict errors uncatchable), K106 (SUSPECTED: expand-macro/let-values/qq nested-eval
+boundaries), S10 (VM inline-IO in HO callbacks can't suspend). K02 [DONE].
+Root cause (shared by K03/K12/K106/S10): evaluation crosses a **nested `cek-run`/`trampoline
+(eval-expr)` boundary** the outer continuation can't see. One architectural fix — invoke
+continuations and evaluate these sub-expressions via CEK frames, not nested runs — resolves the
+cluster. K01 is separate: run handlers with the OUTER handler set (unwound kont must EXCLUDE the
+matched frame); make guard clause bodies evaluate after the escape (the no-match auto-reraise path
+already does this — make it the only path). K10: common-ancestor before/after algorithm + winders
+stored per-continuation, not one global length-keyed stack. K41/K57: raise host/primitive/strict
+errors as structured catchable conditions (needs R7).
+Gate: R6 (handler installation), R7 (what's catchable), R8 (raise-continuable). Status: OPEN —
+**K01 is the single highest-value fix left** (DoS-able hang, server + browser).
+
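The common-ancestor before/after algorithm proposed for K10 can be sketched as follows. This is a minimal illustration in Python, not the kernel's code: `Winder`, `chain`, and `rewind` are hypothetical names, and it assumes each continuation captures a pointer to its innermost winder frame instead of an index into one global stack.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(eq=False)
class Winder:
    """One dynamic-wind frame; `parent` links toward the outermost frame."""
    before: Callable[[], None]
    after: Callable[[], None]
    parent: Optional["Winder"] = None


def chain(w):
    """Return the frames from `w` (innermost) out to the root."""
    out = []
    while w is not None:
        out.append(w)
        w = w.parent
    return out


def rewind(current, target):
    """Run the thunks needed to move from winder `current` to winder `target`."""
    src, dst = chain(current), chain(target)
    # Drop the shared outer frames: they are neither exited nor entered.
    while src and dst and src[-1] is dst[-1]:
        src.pop()
        dst.pop()
    for w in src:            # exit innermost-first
        w.after()
    for w in reversed(dst):  # enter outermost-first
        w.before()
    return target
```

Because every continuation carries its own chain, invoking a continuation captured in a sibling extent only touches the frames that differ, so sibling winders cannot corrupt each other.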
+### W3. HTTP-mode concurrency & serving safety  *(production robustness — lib/host is LIVE)*
+Findings: S1 (multi-Domain render race — LIVE CRASH), S2 (per-request globals read by queued
+workers), S3 (`expand-components?` bind/remove on shared env), S5 (cache key ignores cookies/query),
+S11 (URL evaluated as SX — any env binding invokable), S12 (island hydration reuses no SSR DOM),
+S13 (SSR/client purity, no dev-mode check), K30 (emit!/emitted cross-request — shared with W2). S4
+[DONE], J1/J2/J3 mitigated by the batch JIT gate.
+Approach: serialize or isolate rendering (S1: lock `_stream_mutex` or per-Domain env/cache);
+per-request state carried with the request not process-global (S2/S3/K30); include query in cache
+key + cookie policy (S5); whitelist URL-routable bindings to a `page:` prefix (S11); hydration cursor +
+dev-mode purity check (S12/S13). Pairs with W2 (per-flow scope stacks).
+Gate: none for S1/S4/S5 (safety). Status: OPEN — S1 is a live crash.
+
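One possible shape for the S5 fix (query in the cache key plus a cookie policy) is sketched below. The function name `cache_key` and the `vary_cookies` allow-list are assumptions made for illustration; the plan does not specify the actual policy or where the key is computed.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode


def cache_key(path, query, cookies, vary_cookies=()):
    """Build a cache key from the path, the normalized query, and allow-listed cookies."""
    # Sort query pairs so ?a=1&b=2 and ?b=2&a=1 share one cache entry.
    pairs = sorted(parse_qsl(query, keep_blank_values=True))
    # Only cookies named by the policy take part; others cannot fragment the cache.
    varied = sorted((k, cookies[k]) for k in vary_cookies if k in cookies)
    raw = "\n".join([path, urlencode(pairs), urlencode(varied)])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()
```

With this shape, two requests that differ only in a tracking cookie share an entry, while requests that differ in a policy-listed cookie (a session id, say) do not.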
+---
+
+## Phase 2 — Correctness families (each = ruling + fixes + conformance rows)
+
+### W2. Environment & scope integrity
+Findings: K04 (caller frame leaks into interpreted lambda + JIT disagrees), K05 (letrec injects into
+foreign closures — global contamination), K06 (named-let leaks loop name), K07 (~60 unshadowable
+names; = J8 VM-honors-CEK-doesn't), K30 (emit! cross-request — shared W3), K31 (provide leak on
+raise/shift), K32 (provide! ambient global), K33 (set! unbound creates + JIT/interp split brain),
+K40 (scope :value dead + dead frame type), K107 (SUSPECTED env_merge depth-100 flip).
+Approach: fresh frames for letrec/named-let (K05/K06); drop the top-frame copy in env_merge
+(K04/K107); reserved-words error for dispatch names, aligning VM+CEK (K07/J8); unwind-safe +
+invocation-scoped dynamic state — one mechanism for provide/emit!/batch (K30/K31/K32); set!-unbound
+per R1 + kill the JIT/interp global split (K33); remove dead scope :value/frame (K40).
+Gate: R1 (set!), R2 (reserved names). Status: OPEN.
+
+### W4. Higher-order forms & threading
+Findings: K13 (2-arg reduce returns coll), K14 (reduce init-swap), K15 (data-first drops extra
+args), K43 (O(n²) map/filter), K44 (HO names not first-class), K45 (cryptic uncatchable HO errors),
+K46 (multi-coll rejects strings/vectors), K47 (thread lambda literal), K78 (component in HO → zeros),
+K79 (dead `|>`), K80 (keyword getters in HO/->), K81 (zero-arg HO silent ()), J7 (VM data-first
+deopt — shared W11).
+Approach: implement R15 sub-rulings (swap only when one arg callable + error otherwise; reduce
+arities; drop-extra→error; multi-coll seq-to-list parity; HO first-class; zero-arg→error); fix O(n²)
+via reversed-cons accumulation; delete dead `|>`.
+Gate: R13 (threading), R15 (HO forms). Status: OPEN.
+
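The reversed-cons accumulation named in the W4 approach is the standard cure for quadratic list building: prepend each result in constant time, then reverse once. The sketch below models cons cells as Python tuples purely for illustration; the real kernel's list representation is not shown in this plan, and all names here are hypothetical.

```python
NIL = None  # empty list


def cons(head, tail):
    """Prepend `head` to the immutable list `tail` in O(1)."""
    return (head, tail)


def from_iter(items):
    """Build a cons list from a Python iterable, preserving order."""
    lst = NIL
    for x in reversed(list(items)):
        lst = cons(x, lst)
    return lst


def to_list(lst):
    """Convert a cons list back to a Python list."""
    out = []
    while lst is not NIL:
        out.append(lst[0])
        lst = lst[1]
    return out


def reverse(lst):
    """Reverse a cons list in one O(n) pass."""
    acc = NIL
    while lst is not NIL:
        acc = cons(lst[0], acc)
        lst = lst[1]
    return acc


def map_list(f, lst):
    """Map `f` over a cons list in O(n): prepend results, then reverse once."""
    acc = NIL
    while lst is not NIL:
        acc = cons(f(lst[0]), acc)  # O(1) per element, no append-to-end copy
        lst = lst[1]
    return reverse(acc)
```

Appending each result to the end of an immutable list copies the accumulated prefix every time, which is where the O(n²) cost comes from; the same prepend-then-reverse shape applies to `filter`.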
+### W5. Special forms & macros
+Findings: K08 (cond dual-grammar — silent side-effect drops), K34 (qq depth), K35 (qq dict
+traversal), K37 (&key misbind on fn/defmacro), K38 (splice non-list/malformed), K70 (case else any
+position), K71 (case dialect + punning), K72 (letrec parallel + ref-before-init), K76 (defmacro
+unhygienic vs "hygiene" test name), K77 (match guard clauses silently structural), K42 (values —
+special forms now registered [DONE-partial]; `values` primitive still runner-only). K09/K11/K39 [DONE].
+Approach: cond flat-only + explicit begin (R9); qq depth tracking + dict traversal + splice arity
+errors (R12); &key one binding path for fn/defmacro/component (R5/K37); case final-else + evaluated-
+datum doc + clause-syntax error (R10); letrec* + ref-before-init error (R4); match guards implemented
+or error (R14); make `values` a real kernel primitive (finish K42).
+Gate: R4, R5, R9, R10, R12, R14. Status: OPEN (K09/K11/K39 done).
+
+### W6. Parser, serializer, canonical form & CIDs
+Findings: K21 (canonical.sx runner-only helpers), K22 (serializer dict-key escaping + CID fixed-
+point), K23 (four divergent ident/number classifiers), K24 (`1e`→nil), K25 (guest rationals throw),
+K63 (`#;` before `)`), K64 (`=` no Char arm — shared W7), K65 (`#\a` mcp crash), K66 (multibyte char
+literals), K67 (`\uXXXX` validation), K68 (unknown-escape divergence), K69 (`#name` reader macro
+unimpl on OCaml), K100 (parse error locations), K101 (dict literal edges), K102 (`#|` raw string),
+K103 (`:`/`::` keyword edges), K108 (SUSPECTED cross-host CID nondeterminism), C25 (Py↔OCaml escape
+corruption), C26 (Py unicode symbols), C27 (Py dict order — shared W7/P9).
+Approach: ONE normative ident/number classifier bound by every surface (R32); \u validation +
+unknown-escape error + datum-comment fix (R27/R33); native `#name` reader macro registry;
+canonical path = native CBOR/CID normative, spec/canonical.sx tested mirror or deleted, property-
+test `parse(serialize(x))=x` and canonical fixed-point cross-kernel (R34/R35). CID determinism
+(K108/K35-in-canonical) is sx-pub-critical.
+Gate: R27, R32, R33, R34, R35. Status: OPEN.
+
+### W7. Numbers, equality, strings, collection primitives
+Findings: K17 (append! silent no-op), K52 (byte-based strings), K53 (spec/runtime primitive drift),
+K54 (div-by-zero inconsistency), K55 (`/` doc), K56 (sort no comparator), K64 (char equality), K85
+(binary `=`, exactness conflation), K86 (rounding/inexact->exact/sqrt), K87 (float/nil rendering),
+K88 (nil/empty tolerance), K89 (keys reverse order — GATED, see R29 note: breaks render tests), K90
+(keyword-name on evaluated kw), K91 (string->number), K92 (apply doesn't spread), P1 (lossy float
+wire), P2 (sort mixed int/float), P3 (into needs bridge), P4 (int63 vs float64), P5 (= not deep on
+JS dicts, missing eq?/eqv?), P6 (string units), P7 (JS coercion cluster — GATE D1), P8 (nil/list
+strictness), P10 (NaN/Inf wire tokens), P11 (upcase/round), P12 (zip-pairs). K18/K20 [DONE].
+Approach: append! errors on non-mutable lists + deprecate (R31); codepoint string semantics (R25);
+implement eq?/eqv?, add `=` Char arm, n-ary comparisons (R19); exact-±2^53 + overflow-promote (R21);
+shortest-round-trip float printing + inf/nan wire tokens (R23); div-by-zero catchable (R22); apply
+spreads (R16); sort comparator + numeric compare (R30/P2); into native, contains?-on-dict [done],
+merge-skip-nil, zip-pairs sliding window (R30/P8/P12); reconcile spec/runtime primitive lists (K53).
+Gate: R16, R19, R21, R22, R23, R25, R29, R30, R31. Status: OPEN (K18/K20 done).
+
+### W8. Render pipeline
+Findings: K16 (infinite recursion no depth guard), K48 (attr-name injection — XSS class), K50 (aser
+list kwargs), K51 (dom/html attr parity = C19), K82 (bool-attr truthiness footgun), K83 (dead
+is-render-expr? / html: tags), K84 (script/style escaping), K87 (float render — shared W7), C19 (=K51),
+C20 (CSRF cross-origin), S14 (deep nested-list flatten html vs aser), S9 (SPA boosted-nav fragility).
+K49 [DONE]. Approach: depth limit + cycle guard (K16); attr-name validation (R36/K48); quote aser
+list kwargs (R38/K50); align 4 adapters on bool-attr contract (R36/C19/K51); script/style raw-text
+error-on-breakout (R37/K84); wire or delete is-render-expr? (R37/K83); depth-2 aser/html parity test
+(S14). CSRF cross-origin (C20) + SPA manifest staleness (S9, overlaps W11 stale-bundle).
+Gate: R36, R37, R38. Status: OPEN (K49 done).
+
+### W9. Strict typing
+Findings: K26 (HO callbacks bypass), K27 (apply bypasses), K58 (unknown type names match all), K59
+(keyword type dead / components untypeable), K60 (component &key misalign), K93 (name-keyed, evaded),
+K94 (set-prim-param-types! no validation), K95 (too-few args skip checks), K96 (`(:as type)` unenforced),
+K97 (paper cuts). Approach (R20): move checks to continue-with-call/vm_call chokepoints (covers HO,
+apply, components, => receivers); validate type names at declaration; real "component" branch, remove
+dead "keyword" (R18); `(:as type)` as the declaration channel; merge+validate set-prim-param-types!;
+strict errors catchable (R7, shared W1). Return types explicitly out of scope.
+Gate: R7, R18, R20. Status: OPEN.
+
+### W10. Signals & coroutines
+Findings: K28 (dispose-computed no-op), K29 (batch wedge on exception), K61 (identity-not-equality
+change detection), K62 (diamond glitch), K98 (batch unusable on server / coroutines inert), K99
+(effect cleanup double-invoke), K109 (SUSPECTED coroutine non-yield wedge), K110 (SUSPECTED VM no
+strict — shared W9/W11). Approach (R39): `=`-based change detection (needs W7 R19); unwind-safe batch
+(shared W2 mechanism); two-phase/topological notify for glitch-freedom; fix dispose-computed + effect
+cleanup; make batch/coroutines work outside run_tests (bind batch-begin!/end! + cek hooks in real
+envs, or fold into kernel). Zero test coverage today — add suites.
+Gate: R39. Status: OPEN.
+
+### W12. Python bridge & boundary  *(load-bearing in production)*
+Findings: C24 (boundary validation dead — [DONE-partial]: now warns; full revival needs tier-1
+declarations recreated + zero-violation proof since SX_BOUNDARY_STRICT=1 is live), C28 (two SxExpr
+classes double-quote), C29 (reader-macro auto-resolve broken), C30 (standalone Python host dead —
+GATE D1: delete), C31 (14/33 test files broken + 5 live failures), S-bridge (coroutine-cancel
+desync, no timeouts, dead _restart), S-bridge2 (numeric-result-as-epoch ambiguity), K42 (values —
+shared W5). C25/C26/C27 live in W6 (parser). Approach: finish C24 (recreate declarations, prove
+clean, re-enable); single SxExpr class (C28); fix OcamlSync.start→_ensure (C29); bridge timeouts +
+working _restart (S-bridge); robust (ok N V) parse (S-bridge2); fix/retire broken tests (C31).
+Gate: D1 (C30). Status: OPEN (C24 partial).
+
+### W11. JIT correctness (serving-JIT re-enable preconditions)
+Findings: J1 (`->` miscompile), J2 (fallback re-runs whole call — double side effects), J3 (macro
+args eager), J4 (VM component kwargs misparse), J5 (specialized opcodes freeze redefs), J6 (compiler-
+used prim redef poisons), J7 (data-first deopt — shared W4), J10 (stale Sx_compiler stub), J11 (JIT
+debug paths diverge), K33 (set! split brain — shared W2), K19 (harness drift — shared W14), C10
+(browser compiler one fix behind), C11 (stale module-manifest.sx), C12 (dead SOURCE_MAP paths), C14
+(stale dist/ bundle). J12 = positive (perform/resume fixed). Currently MITIGATED (JIT gated OFF in
+both epoch and — post-batch — HTTP mode). Approach: fix compile-thread-step (J1); fallback-before-
+side-effects or compile-time reject of fallback-prone forms (J2); macro-aware compile (J3); keyword
+tagging in constant pool (J4); redefinition invalidation (J5/J6); one browser-compiler sync pipeline +
+single bundle dir (C10/C11/C12/C14). Do NOT re-enable serving-JIT until the CEK-vs-JIT differential
+(W14) is green.
+Gate: W14 differential. Status: DEFERRED (mitigated; only unblock if serving-JIT is wanted).
+
+### W13. JS host  *(GATE D1 — likely "delete")*
+Findings: C0a (hollow bundle), C0b (2490 fail gate), JS1 (define-record-type/makeRtd), JS2 (host-
+callback type tag), JS3 (arithmetic drops args), JS4 (`.` symbol), JS5 (runner shims), JS6 (str nil),
+JS7 (no qq emission), JS8 (stale metadata). If D1 retires the JS bundle: delete `hosts/javascript`,
+remove from `sx-build-all.sh`/CI, keep only the WASM kernel path. If kept: this is a ~2500-test
+revival project. Gate: D1. Status: BLOCKED on D1.
+
+---
+
+## Phase 3 — Hygiene & docs
+
+### W15. Hygiene & documentation
+Findings: C8 (triplicated hosts/ocaml/hosts/ tree), C13 (test_platform.js stale path), C15 (tracked
+stale wasm blob), C16 (orphaned hosts/native), C17 (sx-platform-2.js + 23 dead .sxbc.json), C18
+(spa-debug.js + root clutter), C2 (r7rs string->number radix shadow), F14 (doc drift — batch fixed
+canonical-ref + island rules; suite counts + case-syntax + primitives-header still stale), F15
+(sha3 stub / test.sx dead filename), F13 (regen reproducibility — [DONE] as batch side effect).
+K105/K73 [DONE]. Approach: delete dead trees/blobs/files; fix r7rs shadow (C2); finish CLAUDE.md
+(suite counts, case syntax); regen-diff CI check (F13 → make it a gate in W14).
+Gate: D1 (some deletions). Status: OPEN (K105/K73/F13 done).
+
+---
+
+## Suggested execution shape (maps to the loop workflow)
+
+Four loops, mostly independent after Phase 0:
+1. **loops/sx-gate** (W14 + W15 hygiene) — the enabler. Start FIRST. Pins tests for the dc7aa709
+   batch, builds the WASM corpus runner + differential battery, unifies runner env, cleans dead code.
+2. **loops/sx-kernel** (W1 + W2 + W5) — condition system, scope integrity, special forms. Single
+   owner (touches evaluator.sx + regen). TDD off W14's pinned tests. K01 first.
+3. **loops/sx-runtime** (W3 HTTP safety + W12 Python bridge) — production robustness; can run
+   parallel to kernel since it's mostly host OCaml + Python, not spec.
+4. **loops/sx-families** (W4, W6, W7, W8, W9, W10) — one family at a time, each gated by its rulings
+   + the new batteries. W6/W7 pay the sx-pub CID debt.
+W11 (JIT) and W13 (JS) are decision-gated and sit out until D1 + a green differential exist.
+
+**Sequencing rule:** no semantic fix merges before (a) its pinning test exists, (b) the relevant
+ruling is ratified, (c) native + WASM both run it. D1/D2/D3 are the only hard blockers.
+
+---
+
+## Coverage ledger — every finding accounted for
+
+Status key: DONE (dc7aa709) · OPEN · PARTIAL · DEFERRED · GATE(Dn) · dup→(primary). Workstream in [].
+
+### Core (K01–K110)
+- K01 [W1] OPEN — guard/handler re-raise hang (CRITICAL, highest value)
+- K02 [W1] DONE — signal-return frame key
+- K03 [W1] OPEN — shift-k nested cek-run (CRITICAL)
+- K04 [W2] OPEN · K05 [W2] OPEN · K06 [W2] OPEN · K07 [W2] OPEN (=J8)
+- K08 [W5] OPEN — cond dual grammar
+- K09 [W5] DONE · K10 [W1] OPEN · K11 [W5] DONE
+- K12 [W1] OPEN (=W4 threading) · K13 [W4] OPEN · K14 [W4] OPEN · K15 [W4] OPEN
+- K16 [W8] OPEN · K17 [W7] OPEN — append! · K18 [W7] DONE · K19 [W14] PARTIAL · K20 [W7] DONE
+- K21 [W6] OPEN · K22 [W6] OPEN · K23 [W6] OPEN · K24 [W6] OPEN · K25 [W6] OPEN
+- K26 [W9] OPEN · K27 [W9] OPEN · K28 [W10] OPEN · K29 [W10] OPEN
+- K30 [W2/W3] OPEN — emit! cross-request (=S2 dir)
+- K31 [W2] OPEN · K32 [W2] OPEN · K33 [W2/W11] OPEN — set! split brain
+- K34 [W5] OPEN · K35 [W5/W6] OPEN · K36 [W1/W5] OPEN · K37 [W5] OPEN · K38 [W5] OPEN
+- K39 [W5] DONE · K40 [W2] OPEN · K41 [W1] OPEN · K42 [W5/W12] PARTIAL (forms registered; `values` prim runner-only)
+- K43 [W4] OPEN · K44 [W4] OPEN · K45 [W4] OPEN · K46 [W4] OPEN · K47 [W4] OPEN
+- K48 [W8] OPEN · K49 [W8] DONE · K50 [W8] OPEN · K51 [W8] OPEN (=C19)
+- K52 [W7] OPEN · K53 [W7] OPEN · K54 [W7] OPEN · K55 [W7] OPEN · K56 [W7] OPEN
+- K57 [W1/W9] OPEN · K58 [W9] OPEN · K59 [W9] OPEN · K60 [W9] OPEN
+- K61 [W10] OPEN · K62 [W10] OPEN · K63 [W6] OPEN · K64 [W6/W7] OPEN — char `=`
+- K65 [W6] OPEN · K66 [W6] OPEN · K67 [W6] OPEN · K68 [W6] OPEN · K69 [W6] OPEN
+- K70 [W5] OPEN · K71 [W5] OPEN · K72 [W5] OPEN · K73 [W15] DONE
+- K74 [W2] OPEN (component &key false→nil; R5) · K75 [W2] OPEN (trailing kw; R5)
+- K76 [W5] OPEN · K77 [W5] OPEN · K78 [W4] OPEN · K79 [W4] OPEN · K80 [W4] OPEN · K81 [W4] OPEN
+- K82 [W8] OPEN · K83 [W8] OPEN · K84 [W8] OPEN · K85 [W7] OPEN · K86 [W7] OPEN · K87 [W7/W8] OPEN
+- K88 [W7] OPEN · K89 [W7] OPEN — keys order, GATED R29 (breaks render tests, see RULINGS note)
+- K90 [W7] OPEN · K91 [W7] OPEN · K92 [W7] OPEN — apply spread
+- K93 [W9] OPEN · K94 [W9] OPEN · K95 [W9] OPEN · K96 [W9] OPEN · K97 [W9] OPEN
+- K98 [W10] OPEN · K99 [W10] OPEN · K100 [W6] OPEN · K101 [W6] OPEN · K102 [W6] OPEN · K103 [W6] OPEN
+- K104 [W14] OPEN · K105 [W15] DONE
+- K106 [W1] OPEN (SUSPECTED nested-eval boundaries) · K107 [W2] OPEN (SUSPECTED)
+- K108 [W6] OPEN (SUSPECTED CID nondeterminism) · K109 [W10] OPEN (SUSPECTED) · K110 [W9/W11] OPEN (SUSPECTED)
+
+### Hosts — JIT (J1–J12)
+- J1 [W11] DEFERRED (mitigated: JIT gated off) · J2 [W11] DEFERRED · J3 [W11] DEFERRED
+- J4 [W11] DEFERRED · J5 [W11] DEFERRED · J6 [W11] DEFERRED · J7 [W11/W4] DEFERRED
+- J8 [W2] OPEN dup→K07 · J9 [W11/W14] DEFERRED · J10 [W11] DEFERRED · J11 [W11] DEFERRED
+- J12 POSITIVE (no action — perform/resume verified fixed)
+
+### Hosts — kernel/protocol/build (C*)
+- C0a [W13] GATE(D1) · C0b [W13/W14] GATE(D1) · C1 [W3] DONE · C1b [W3] DONE
+- C2 [W15] OPEN · C3 [W14] OPEN · C4 [W14] OPEN · C5 [W14] OPEN · C6 [W14] OPEN · C7 [W14] OPEN
+- C8 [W15] OPEN · C9 [W14] OPEN · C10 [W11] DEFERRED · C11 [W11] DEFERRED · C12 [W11/W15] OPEN
+- C13 [W15] OPEN · C14 [W11/W15] OPEN · C15 [W15] OPEN · C16 [W15] OPEN · C17 [W15] OPEN · C18 [W15] OPEN
+- C19 [W8] OPEN dup→K51 · C20 [W8] OPEN · C21 [W14] OPEN · C22 [W14] OPEN · C23 [W14] OPEN
+- C24 [W12] PARTIAL · C25 [W6] OPEN · C26 [W6] OPEN · C27 [W6/W7] OPEN dup→P9
+- C28 [W12] OPEN · C29 [W12] OPEN · C30 [W12] GATE(D1) · C31 [W12] OPEN
+
+### Hosts — JS host (JS1–JS8)
+- JS1–JS8 [W13] all GATE(D1) — delete if JS retired, else ~2500-test revival
+
+### Hosts — cross-host parity (P1–P12, PY)
+- P1 [W7] OPEN · P2 [W7] OPEN · P3 [W7] OPEN · P4 [W7] OPEN · P5 [W7] OPEN · P6 [W7] OPEN
+- P7 [W7] GATE(D1) · P8 [W7] OPEN · P9 [W6/W7] OPEN (=C27) · P10 [W7] OPEN · P11 [W7] OPEN · P12 [W7] OPEN
+- PY [W13] GATE(D1) dup→C30
+
+### Hosts — HTTP/suspected (S1–S14, S-bridge*)
+- S1 [W3] OPEN (LIVE CRASH) · S2 [W3/W2] OPEN · S3 [W3] OPEN · S4 [W3] DONE · S5 [W3] OPEN
+- S6 [W14] OPEN · S7 [W14/W1] OPEN (unify eval/IO paths) · S8 [W13/W8] OPEN (browser env prims)
+- S9 [W8/W11] OPEN · S10 [W1] OPEN · S11 [W3] OPEN · S12 [W3] OPEN · S13 [W3] OPEN · S14 [W8] OPEN
+- S-bridge [W12] OPEN · S-bridge2 [W12] OPEN
+
+### Conformance (F1–F15, conf-S1–S5)
+- F1 [W7] OPEN dup→K18/P4 (WASM int wrap) · F2 [W14] OPEN · F3 [W7/W6] OPEN (apply + dict order) · F4 [W13/W14] GATE(D1)
+- F5 [W14] OPEN (host-neutral corpus) · F6 [W14] OPEN (directories one-host-gated) · F7 [W14] OPEN dup→K42
+- F8 [W14] OPEN (differential battery) · F9 [W7/W14] OPEN (primitive parity) dup→K53 · F10 [W14] OPEN (skip hs)
+- F11 [W12] OPEN dup→C24 · F12 [W6] OPEN dup→C25/26/27 · F13 [W15] DONE · F14 [W15] PARTIAL · F15 [W15] OPEN
+- conf-S1 [W14] OPEN (native-vs-WASM web-stack diff) · conf-S2 [W14] OPEN (hyperscript unverifiable)
+- conf-S3 [W11] OPEN (import path browser vs test) · conf-S4 [W14] OPEN (float golden precision) · conf-S5 [W11] OPEN (JS build-flag ADT divergence)
+
+### Tally
+~213 finding-instances. DONE: 13 (dc7aa709). PARTIAL: 4 (K19, K42, C24, F14). DEFERRED: 12 (W11 JIT).
+GATE(D1): ~16 (JS host + Python standalone). OPEN: the rest, distributed across W1–W12/W14/W15.
--- a/plans/sx-review/README.md
+++ b/plans/sx-review/README.md
@@ -0,0 +1,21 @@
+# SX Review — 2026-07-03
+
+Findings from three parallel review sessions of the SX language/runtime, plus the master
+remediation plan.
+
+| File | What |
+|------|------|
+| **PLAN.md** | Master remediation plan: 15 workstreams (W1–W15), execution order, and a full per-finding coverage ledger. Start here. |
+| **RULINGS.md** | 40 draft normative rulings (R1–R40). Phase-0 gate — ratify before the semantics fixes. |
+| core.md | Language core / spec semantics lane (K01–K110). |
+| hosts.md | Per-host implementations + FFI lane (J*, C*, JS*, P*, S*, PY). |
+| conformance.md | Cross-host agreement + test adequacy lane (F1–F15, conf-S1–S5). |
+
+**Status:** the quick-wins batch (commit dc7aa709) landed 13 fixes + 4 partials; suite at baseline
+5762p/274f (fail set byte-identical). Everything else is OPEN/GATE/DEFERRED per PLAN.md's ledger.
+
+**Highest-value open items:** K01 (guard/handler re-raise hang — DoS-able, server+browser),
+S1 (live HTTP crash under load), K03 (shift-k double-execution), and W14 (test gate — the enabler
+that makes all other fixes verifiable).
+
+**Blocking decisions (maintainer):** D1 host lineup, D2 ratify rulings, D3 gate definition.
--- a/plans/sx-review/RULINGS.md
+++ b/plans/sx-review/RULINGS.md
@@ -0,0 +1,396 @@
+# SX RULINGS — normative decisions on every ambiguity surfaced by the 2026-07-03 review
+
+DRAFT for ratification. Each ruling: STATUS `PROPOSED` → flip to `RATIFIED` / `REJECTED` /
+`AMENDED: <text>`. Once ratified, this file moves to `spec/RULINGS.md` and becomes the
+authority the conformance batteries pin against. Evidence citations: core.md finding names,
+hosts.md J/C/JS/P/S codes, conformance.md F codes.
+
+**Default posture used for recommendations** (override per-ruling as you see fit):
+1. Prefer an ERROR over any silent behavior (silent drop/no-op/misparse caused the worst findings).
+2. Prefer R7RS/standard semantics where churn is low; prefer current-behavior-plus-documentation
+   where churn is high and behavior is defensible.
+3. Every ruling lands with conformance rows that run on BOTH production kernels (native + WASM).
+
+**Companion decisions (not language rulings, restated for context):**
+- D1 host lineup — recommended: kernel family (native OCaml + WASM) are the only evaluator
+  targets; hosts/javascript and hosts/python standalone retired; shared/sx/parser.py shrunk to a
+  wire-subset with a parity suite. Rulings below marked [D1] simplify to kernel-only if ratified.
+- D3 gate — recommended: native corpus green (hs-upstream skip-listed) + same corpus on WASM +
+  cross-kernel differential battery + CEK-vs-JIT differential (when JIT on) + sx_ref.ml regen diff.
+
+---
+
+## A. Bindings & scope
+
+### R1. `set!` on an unbound name
+- Current: silently creates a root binding (tested intent, test-scope.sx:196) — but BOTH spec docs
+  say error (eval-rules.sx:112, special-forms.sx:141), and under JIT it writes a different global
+  table than the interpreter (split brain).
+- RECOMMENDATION: **ERROR** ("set!: <name> is not bound — use define"). Typo'd set! is a bug-hider;
+  the docs already promise this. Flip test-scope.sx:196; sweep the corpus for reliance (expected
+  small — the idiom is define-then-set!). Either way the JIT/interpreter split MUST die.
+- Churn: low-medium. Findings: core set!-unbound; hosts J-globals split. STATUS: PROPOSED
+
+### R2. The ~60 special-form/HO names (`map`, `filter`, `bind`, `match`, `do`, `case`, `->`, …)
+- Current: `define`/`let`/`defmacro` of these names is silently accepted but ignored in call
+  position (CEK); the VM honors them (J8) — worst of both worlds.
+- RECOMMENDATION: **reserved words** — `define`/`let`/`set!`/`defmacro` of any dispatch-table name
+  is a load-time ERROR. Publish the list in spec. Align the VM. (Full lexical honoring is more
+  Schemely but taxes every list-head dispatch and rescues little real code.)
+- Churn: low (error surfaces existing dead definitions). Findings: core unshadowable-names; J8. STATUS: PROPOSED
+
+### R3. `let` semantics
+- Current: sequential (`let*`), body = implicit begin, on BOTH engines (tested intent). CLAUDE.md
+  island rules claim the opposite (describes a dead evaluator).
+- RECOMMENDATION: **ratify current behavior**: `let` ≡ `let*`; body sequences. Fix CLAUDE.md.
+  Document (or forbid) the observed letrec-ish quirk that binding-init lambdas capture the shared
+  frame (`(let ((f (fn () a)) (a 5)) (f))` → 5).
+- Churn: zero (docs only). Findings: core let-docs; hosts handoff let-sequential. STATUS: PROPOSED
+
+### R4. `letrec`
+- Current: parallel (all inits evaluated, then bound); read-before-init yields nil silently; PLUS
+  two outright bugs (names injected into foreign lambdas' closures = global contamination;
+  named-let loop name leaks into and clobbers the enclosing frame).
+- RECOMMENDATION: **letrec\* semantics** (sequential init) with ERROR on read-before-init
+  (pre-bind to an "uninitialized" sentinel that faults on read). Named-let binds its loop name in
+  a fresh frame, invisible after the form. The closure-injection and frame-leak are bugs to fix
+  regardless of ruling.
+- Churn: low. Findings: core letrec-parallel/-injection/named-let. STATUS: PROPOSED
+
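The R4 recommendation (sequential init, with a sentinel that faults on read) could look roughly like this. It is a toy environment model in Python, not the SX evaluator: `Env`, `UNINIT`, and `eval_letrec` are hypothetical names, and init expressions are modeled as functions that receive the new frame.

```python
class _Uninitialized:
    """Marker type for letrec names that are bound but not yet initialized."""


UNINIT = _Uninitialized()


class Env:
    def __init__(self, parent=None):
        self.vars = {}
        self.parent = parent

    def define(self, name, value):
        self.vars[name] = value

    def lookup(self, name):
        env = self
        while env is not None:
            if name in env.vars:
                value = env.vars[name]
                if value is UNINIT:
                    # Fault instead of silently yielding nil.
                    raise NameError(f"letrec: {name} read before initialization")
                return value
            env = env.parent
        raise NameError(f"{name} is not bound")


def eval_letrec(bindings, body, env):
    """bindings: list of (name, init) with init(frame) -> value; body(frame) -> value."""
    frame = Env(env)                 # fresh frame: nothing leaks into `env`
    for name, _ in bindings:
        frame.define(name, UNINIT)   # pre-bind every name to the sentinel
    for name, init in bindings:
        frame.define(name, init(frame))  # sequential (letrec*) initialization
    return body(frame)
```

Mutually recursive lambdas still work under this scheme, because they only look up their siblings when called, after every init has run.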
+### R5. Component `&key` conventions
+- Current: `:flag false` is coerced to nil (indistinguishable from omitted); trailing keyword with
+  no value silently binds nil; `&key` on plain fn/defmacro silently misbinds.
+- RECOMMENDATION: `false` is a legal &key value (bind via has-key, not `(or …)`); trailing keyword
+  without a value = ERROR; `&key` in fn/defmacro either implemented identically to components or
+  ERROR at definition (recommend: implement — one binding path for all three).
+- Churn: low. Findings: core &key-false / trailing-kw / defmacro-&key. STATUS: PROPOSED
+
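The difference between binding `&key` values via has-key and via `(or …)` is easiest to see in a small model. The sketch below is illustrative Python, not the SX binder: `bind_keys` is a hypothetical helper, keywords are modeled as strings starting with `:`, and the call's keyword arguments arrive as a flat list.

```python
def bind_keys(params, args):
    """Bind keyword args against `params` (name -> default) using presence, not truthiness."""
    if len(args) % 2 != 0:
        # A trailing keyword with no value is an error, never an implicit nil.
        raise TypeError(f"trailing keyword {args[-1]} has no value")
    supplied = {}
    for i in range(0, len(args), 2):
        supplied[args[i][1:]] = args[i + 1]  # strip the leading ':'
    # Presence check: an explicit False survives; only a missing key takes the default.
    return {
        name: supplied[name] if name in supplied else default
        for name, default in params.items()
    }


# The truthiness-based binding that R5 rules out: explicit False is lost.
lossy = False or True

explicit = bind_keys({"flag": True}, [":flag", False])
omitted = bind_keys({"flag": True}, [])
```

Here `explicit` is `{"flag": False}` and `omitted` is `{"flag": True}`, whereas the `or`-style expression collapses both cases to `True`.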
+## B. Errors & conditions
+
+### R6. Handler installation semantics
+- Current: a handler runs with ITSELF still installed → any raise/error inside a guard clause or
+  handler-bind handler loops forever (crit 1; WASM-verified).
+- RECOMMENDATION: **R7RS/CL semantics** — handlers run with the OUTER handler set; guard clause
+  bodies evaluate after the escape (the no-match auto-reraise path already does this correctly —
+  make it the only path).
+- Churn: zero for correct code (only un-hangs broken cases). Findings: crit 1, guard family. STATUS: PROPOSED
+
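The R6 rule (a handler runs with only the OUTER handlers installed) can be modeled with an explicit handler stack. This is a simplified Python sketch with hypothetical names (`raise_condition`, `rethrow`, `top`); it passes the handler stack as an argument where the evaluator would unwind the continuation.

```python
def raise_condition(cond, handlers):
    """Deliver `cond` to the innermost handler in `handlers` (innermost last).

    The handler receives the tuple of handlers OUTSIDE itself, so a raise
    from inside the handler can never be delivered back to that same handler.
    """
    if not handlers:
        raise RuntimeError(f"unhandled condition: {cond!r}")
    *outer, innermost = handlers
    return innermost(cond, tuple(outer))


def rethrow(cond, outer):
    """A handler that wraps the condition and raises again."""
    return raise_condition(("wrapped", cond), outer)


def top(cond, outer):
    """An outermost handler that absorbs whatever reaches it."""
    return ("handled", cond)


result = raise_condition("boom", (top, rethrow))
```

Each re-raise sees a strictly shorter handler stack, so delivery always terminates: `result` is `("handled", ("wrapped", "boom"))`, and with `rethrow` as the only handler the second raise surfaces as an unhandled condition instead of looping.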
+### R7. What is catchable
+- Current: only guest `(raise …)` reaches guard; host primitive errors, undefined-symbol, arity
+  errors, and strict type errors all blow through every handler.
+- RECOMMENDATION: **everything is a condition.** Host/primitive/strict/undefined-symbol errors are
+  raised as structured condition dicts ({:type :message :op …}) through the same channel guard
+  sees. Reserve a non-catchable class only for kernel panics.
+- Churn: low-medium (code relying on uncatchability is unlikely to exist). Findings: core
+  host-errors-uncatchable, strict-uncatchable; enables sane server error pages. STATUS: PROPOSED
+
+### R8. `raise-continuable` / `signal-condition`
+- RECOMMENDATION: ratify R7RS: handler's value returns to the signal site (the current
+  whole-program-result behavior is crit 2's frame-key bug, not a semantic choice). STATUS: PROPOSED
+
+## C. Special forms
+
+### R9. `cond` grammar — kill the dual-mode heuristic
+- Current: flat pairs documented; undocumented Scheme clause mode auto-detected iff every arg is a
+  2-element list → silent side-effect drops, mode flips, wrong values (core cond-ambiguity).
+- RECOMMENDATION: **flat pairs only**: `(cond t1 r1 t2 r2 … :else d)`. Multi-expression results
+  use explicit `(do …)`. Support arrow as a flat triple `t => receiver`. A clause-shaped arg list
+  in a test position is just evaluated; there is no mode detection, ever. Migrate the cond-arrow
+  suite (test-r7rs.sx:135-145) and any clause-mode usage (sweep needed).
+- Churn: medium (sweep + migrate clause-mode call sites). Findings: core cond-ambiguity,
+  guard-multi-expr (inherits). STATUS: PROPOSED
+
+### R10. `case`
+- RECOMMENDATION: ratify the flat evaluated-datums form and document it (vals ARE evaluated,
+  first-match, structural `=`); `:else`/`else` legal ONLY in final position (else ERROR); Scheme
+  datum-list clause syntax → clear parse-time ERROR ("use flat pairs"). Keyword/string punning
+  follows R21 and gets documented.
+- Churn: low. Findings: core case-else-position / case-dialect. STATUS: PROPOSED
+
+### R11. `do`
+- Current: `do` is a begin-alias EXCEPT when its first form's head is a list — then it's a Scheme
+  do-loop → IIFE misparse.
+- RECOMMENDATION: **`do` = begin alias, always.** Scheme do-loop moves to a distinct name
+  (`do-loop`) or is dropped (named let covers it). Kills the heuristic.
+- Churn: low (sweep for real do-loop usage; expected rare). Findings: core do-IIFE. STATUS: PROPOSED
+
+### R12. Quasiquote
+- RECOMMENDATION, four sub-rulings:
+  a. `unquote-splicing` becomes an alias of `splice-unquote` (one-line; kills the silent
+     zero-splice trap; rename the misleadingly-named tests).
+  b. Implement standard **depth tracking** (nested quasiquote raises quote depth; `,,x` works).
+     Hosts agree current shallow behavior is consistent-but-nonstandard — fix at spec level.
+  c. Quasiquote **traverses dict literals** (`{:k ,v}` works).
+  d. Splicing a non-list and malformed splice arity → ERROR.
+- Churn: low (b is the only subtle one). Findings: core qq-longhand/-depth/-dicts/-splice-nonlist. STATUS: PROPOSED
+
+### R13. Threading `->` / `->>`
+- RECOMMENDATION: (a) steps evaluate in CEK frames (bug: guard/IO broken through threading);
+  (b) a lambda literal as a step = expand-time ERROR; (c) keyword step sugar: `(-> x :k :j)` ≡
+  `(-> x (get :k) (get :j))` — cheap, expected, kills the `Not callable: nil` trap; (d) remove the
+  dead `|>` dispatch branch (parser rejects `|` anyway); (e) fix reduce-seeding via R15.
+- Churn: low. Findings: core threading-nested-CEK/-lambda-literal/|>-dead/keywords-as-getters. STATUS: PROPOSED
+
+### R14. `match`
+- RECOMMENDATION: `(pattern (when cond))` guard clauses either implemented or ERROR — never
+  silently read as a structural pattern (current). Recommend: implement (small, high value).
+  Document let-match as dict-destructuring-only with a clear error for list patterns.
+- Churn: low. Findings: core match-guards. STATUS: PROPOSED
+
+## D. Calling convention
+
+### R15. Higher-order forms
+- RECOMMENDATION, six sub-rulings:
+  a. Arg-order swap happens ONLY when exactly one argument is callable (components count as
+     callable); both-callable or neither → ERROR "map: cannot determine function/collection".
+  b. `(reduce f coll)` (2-arg) = Clojure-style fold: the first element is the init, and an empty
+     coll → ERROR (kept simple: no identity lookup on f); `(reduce init f coll)` and threaded
+     `(-> init (reduce f coll))` work via the one-callable rule in (a).
+  c. Data-first with extra args = ERROR (today silently dropped).
+  d. Multi-collection map coerces every collection with seq-to-list (strings/vectors), zips to
+     shortest (already); map over a dict iterates `(k v)` pairs.
+  e. HO names are first-class: `map` etc. in value position resolve to real closures so
+     `(define f map)` / `(apply map …)` work.
+  f. Zero/one-arg HO calls = arity ERROR (today silently `()`).
+  Also fix the O(n²) accumulation (implementation, not semantics).
+- Churn: medium — (a) changes behavior for ambiguous calls, sweep needed. Findings: core reduce-2arg /
+  reduce-swap / swap-drops-args / HO-not-first-class / ho-cryptic-errors / multi-coll / zero-arg;
+  J7 (VM parity). STATUS: PROPOSED
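
Reviewer sketch (not part of the commit): sub-ruling (a) amounts to a small disambiguation step before any higher-order form runs. The Python below is a hedged illustration; `split_fn_and_coll` and `sx_map` are hypothetical names, and Python's `callable` stands in for the SX notion of "callable (components count)".

```python
def split_fn_and_coll(name, a, b):
    """Return (fn, coll) when exactly one of the two arguments is callable."""
    a_callable, b_callable = callable(a), callable(b)
    if a_callable == b_callable:  # both callable, or neither
        raise TypeError(name + ": cannot determine function/collection")
    return (a, b) if a_callable else (b, a)


def sx_map(a, b):
    """Map that accepts function-first or data-first argument order."""
    fn, coll = split_fn_and_coll("map", a, b)
    return [fn(item) for item in coll]
```

Both `sx_map(f, xs)` and `sx_map(xs, f)` resolve to the same call, while the ambiguous cases fail loudly instead of silently picking an order.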
+
+### R16. `apply`
+- Current: native never spreads; WASM spreads 2-arg; test runner has a third behavior (three-way
+  divergence, F-3 + core corrected finding).
+- RECOMMENDATION: **R7RS**: `(apply f a b … rest-list)` spreads, leading args prepended. All
+  surfaces align; strict checks fire through apply (R20).
+- Churn: low (today it mostly errors). STATUS: PROPOSED
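
Reviewer sketch (not part of the commit): the R7RS shape recommended for `apply` is "leading arguments, then one final list that is spread". A minimal Python model, assuming hypothetical names `sx_apply` and `plus`:

```python
def sx_apply(f, *args):
    """R7RS-style apply: the last argument is a list whose items are spread."""
    if not args:
        raise TypeError("apply: expected a function and an argument list")
    *leading, rest = args
    if not isinstance(rest, (list, tuple)):
        raise TypeError("apply: last argument must be a list")
    return f(*leading, *rest)


def plus(*nums):
    """Variadic addition, standing in for SX `+`."""
    return sum(nums)
```

Under this rule `(apply + (list 1 2 3))` and `(apply + 1 (list 2 3))` both correspond to `plus(1, 2, 3)`, which is the behavior the conformance probes report for the legacy JS bundle.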
+
+### R17. Arity checking (too-few args)
+- Current: missing params silently nil-fill (this is load-bearing: 1-arg `(assert x)` works only
+  via nil-fill); too-many errors.
+- RECOMMENDATION: **ERROR on too-few** as well, with `&optional`/`&key`/`&rest` as the explicit
+  mechanisms. Sweep required (harness `assert`, any nil-fill reliance). If the sweep turns up
+  heavy reliance, fallback position: keep nil-fill but document it loudly and make strict mode
+  error. Primary recommendation stands: error.
+- Churn: **high** — flagged as the riskiest ruling; do the sweep before ratifying. Findings: core
+  strict-too-few / harness-assert nit. STATUS: PROPOSED
+
+## E. Keywords, equality, types
+
+### R18 (=R21 referenced above). Keywords
+- RECOMMENDATION: ratify current model — keywords self-evaluate to their string name; keyword-ness
+  exists only in unevaluated AST. Consequences made explicit: `(keyword-name :k)` needs a quote;
+  `"keyword"` is REMOVED from the strict type system; case/dict punning documented. NOT callable
+  (R13c covers the getter idiom).
+- Churn: zero (docs + removing a dead type branch). STATUS: PROPOSED
+
+### R19. Equality
+- RECOMMENDATION (low-churn variant, chosen deliberately over full R7RS split):
+  a. `=` stays deep structural equality (alias equal?) — ubiquitous in the corpus; add the missing
+     **Char arm** (today `(= #\a #\a)` → false) and any other missing type arms; document that
+     `(= 1 1.0)` → true (numeric value equality inside =).
+  b. Add real `eqv?` (identity + exact numeric/char equality) and `eq?` (alias identical?) as
+     kernel primitives — they are spec-declared today but implemented NOWHERE.
+  c. Comparisons `< > <= >=` become n-ary chained (R7RS); `=` stays 2+-ary deep.
+  d. If content-addressing ever needs exactness-distinguishing equality, that's `eqv?`, not `=`.
+- Churn: low. Findings: core eq?/eqv?-missing, =-binary, char-equality; P5. STATUS: PROPOSED
+
+### R20. Strict typing
+- RECOMMENDATION: (a) checks move to the continue-with-call/vm_call chokepoints → HO callbacks,
+  apply, components, => receivers all covered; (b) unknown type name at declaration = ERROR;
+  (c) `"component"` becomes a real type branch; `"keyword"` removed (R18); (d) `(:as type)` param
+  annotations become the declaration channel (deprecate the name-keyed global dict, which is
+  trivially evaded and inherited by shadowers); (e) strict errors are catchable conditions (R7);
+  (f) set-prim-param-types! merges and validates; (g) return types: explicitly out of scope now.
+- Churn: low-medium. Findings: core strict-* family (8 findings). STATUS: PROPOSED
+
+## F. Numbers
+
+### R21. Integer model & overflow
+- Current: native = int63 with overflow-promote-to-float on + and * but silent WRAP on expt;
+  WASM = 32-bit silent wrap (F-1 — production browsers!); JS bundle = float64.
+- RECOMMENDATION: spec defines SX integers as **exact within ±2^53** (the portable range);
+  arithmetic that exceeds the host's exact range **promotes to float** (never wraps) — `expt`
+  included. WASM must be fixed to match (js_of_ocaml int64/boxed or explicit overflow checks) —
+  hosts lane feasibility-checks the mechanism; silent 32-bit wrap is a bug under any ruling.
+  Values beyond 2^53 must not be trusted exact across the wire.
+- Churn: low at spec level; WASM fix is real hosts work. Findings: F-1, P4, core expt. STATUS: PROPOSED
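
Reviewer sketch (not part of the commit): the "promote, never wrap" rule can be stated as a post-check on each integer result. This Python model uses arbitrary-precision ints to stand in for the host; `promote`, `sx_mul`, and `sx_expt` are hypothetical names, and the real mechanism on each host is exactly what the hosts lane is asked to feasibility-check.

```python
EXACT_LIMIT = 2 ** 53  # portable exact-integer range proposed in R21


def promote(result):
    """Keep an exact int inside +/-2^53; otherwise promote to float."""
    if isinstance(result, int) and abs(result) > EXACT_LIMIT:
        return float(result)
    return result


def sx_mul(a, b):
    return promote(a * b)


def sx_expt(base, exponent):
    return promote(base ** exponent)
```

In this model `sx_mul(99999999, 99999999)` and `sx_expt(2, 62)` both come back as floats rather than wrapped integers, which is the single outcome the ruling asks every host to agree on.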
+
+### R22. Division & zero
+- RECOMMENDATION: integer `/`, `mod`, `quotient`, `remainder` by zero = catchable SX condition
+  (today: raw OCaml Division_by_zero for mod/quotient, silent `inf` for /); float ops keep IEEE
+  (inf/nan). `/` doc fixed: returns int when exact, float otherwise (current behavior ratified).
+- Churn: low. Findings: core div-by-zero, /-doc. STATUS: PROPOSED
+
+### R23. Float text & wire
+- RECOMMENDATION: **shortest-round-trip printing everywhere** (native `%g` 6-sig-digit printing is
+  a wire-corruption bug — P1); `inf`/`-inf`/`nan` are THE wire tokens on all hosts (P10); `round`
+  stays half-away-from-zero, documented (R7RS banker's rejected: churn without benefit);
+  `inexact->exact` rounding behavior kept + documented; `str 1.0` → keep `"1"` but canonical/wire
+  serializers must preserve the float/int distinction (`1.0` serializes as `1.0`).
+- Churn: low. Findings: P1, P10, core round/float-rendering; canonical CID determinism. STATUS: PROPOSED
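
Reviewer sketch (not part of the commit): the difference between `%g` printing and shortest-round-trip printing, and between half-away-from-zero and banker's rounding, is easy to reproduce outside SX. The Python below is only an analogy; `print_g`, `print_shortest`, and `round_half_away` are hypothetical helper names, and Python's `repr` is used because it produces the shortest string that round-trips.

```python
import math


def print_g(x):
    """Six-significant-digit printing, like C's %g."""
    return "%g" % x


def print_shortest(x):
    """Shortest string that parses back to exactly the same float."""
    return repr(x)


def round_half_away(x):
    """Round to nearest, ties away from zero (sketch; ignores extreme inputs)."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))
```

`print_g(0.1 + 0.2)` yields `0.3`, which parses back to a different float, so a value printed that way does not survive the wire; `print_shortest` keeps both the value and the `1.0` versus `1` distinction.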
+
+### R24. Rationals
+- RECOMMENDATION: `string->number` parses `"1/2"`; `(/ 1 3)` stays float (rationals remain opt-in
+  via make-rational) — documented; radix arg restored by fixing the r7rs.sx shadow (C2).
+- Churn: low. STATUS: PROPOSED
+
+## G. Strings
+
+### R25. Unit semantics
+- Current: native counts UTF-8 bytes (substring can split codepoints → invalid UTF-8); JS counts
+  UTF-16 units; constructors are codepoint-aware. Project style mandates UTF-8 text everywhere.
+- RECOMMENDATION: **codepoint semantics** for length/substring/index/ref at the spec level; kernel
+  implements UTF-8-aware ops. Accept the perf cost (or add byte-* variants for hot paths later).
+- Churn: medium (kernel work + any code relying on byte counts). Findings: core UTF-8 family, P6. STATUS: PROPOSED
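
Reviewer sketch (not part of the commit): the three competing length units, and the way byte-indexed substring can split a codepoint, can be shown in a few lines of Python. The helper names are hypothetical; the sample strings are the ones used in the conformance probes (`héllo`, `👍`), written as escapes to avoid encoding ambiguity.

```python
def utf8_len(s):
    """Length in UTF-8 bytes (what the native kernel counts today)."""
    return len(s.encode("utf-8"))


def utf16_len(s):
    """Length in UTF-16 code units (what the JS bundle counts)."""
    return len(s.encode("utf-16-le")) // 2


def codepoint_len(s):
    """Length in Unicode codepoints (the proposed spec-level unit)."""
    return len(s)


def byte_substring(s, start, end):
    """Byte-indexed substring; raises if the cut lands inside a codepoint."""
    return s.encode("utf-8")[start:end].decode("utf-8")


HELLO = "h\u00e9llo"      # "héllo"
THUMBS_UP = "\U0001F44D"  # "👍"
```

`byte_substring(HELLO, 0, 2)` raises `UnicodeDecodeError` because the cut falls in the middle of the two-byte `é`, which is the invalid-UTF-8 hazard the ruling removes.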
+
+### R26. Case mapping
+- RECOMMENDATION: kernel `upcase`/`downcase`/`upper`/`lower` are **ASCII-only, documented** (full
+  Unicode case tables deferred; JS's full-Unicode behavior dies with D1). Aliases exist on all
+  surfaces (P11).
+- Churn: zero. STATUS: PROPOSED
+
+### R27. `split` and escapes
+- RECOMMENDATION: `split` = literal substring separator, keeps empties, empty separator → chars
+  (ratifies native; pin with the multi-char test that history shows is needed). String escape
+  table is normative: `\n \t \r \\ \" \uXXXX(validated: 4 hex digits, scalar value, else ERROR)`;
+  **unknown escape = parse ERROR** (kills the native-keeps-backslash vs guest-drops-it silent
+  divergence, C25 direction fight).
+- Churn: low. Findings: core split note, \u family, unknown-escape divergence; C25. STATUS: PROPOSED
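
Reviewer sketch (not part of the commit): the normative escape table in R27 is small enough to state as a decoder that rejects everything it does not list. This Python version is an illustration under assumptions: `decode_escapes` is a hypothetical name, it operates on the text between the quotes, and the error messages are invented.

```python
SIMPLE_ESCAPES = {"n": "\n", "t": "\t", "r": "\r", "\\": "\\", '"': '"'}
HEX_DIGITS = set("0123456789abcdefABCDEF")


def decode_escapes(body):
    """Decode a string literal body; unknown or malformed escapes are errors."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(body):
            raise ValueError("dangling backslash")
        esc = body[i + 1]
        if esc in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[esc])
            i += 2
        elif esc == "u":
            digits = body[i + 2:i + 6]
            if len(digits) != 4 or not set(digits) <= HEX_DIGITS:
                raise ValueError("\\u needs exactly 4 hex digits")
            code = int(digits, 16)
            if 0xD800 <= code <= 0xDFFF:  # surrogates are not scalar values
                raise ValueError("\\u escape is not a Unicode scalar value")
            out.append(chr(code))
            i += 6
        else:
            raise ValueError("unknown escape: \\" + esc)
    return "".join(out)
```

Because the final branch raises, there is no room for one reader to keep the backslash while another drops it.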
+
+## H. Collections, nil, dicts
+
+### R28. nil vs empty list
+- Current: distinct values in the reader/serializer; `(cons 1 nil)` → `(1)` on native (nil-as-
+  empty in constructors); read ops inconsistent (`first nil` → nil but `reverse nil` → error).
+- RECOMMENDATION: nil and `()` remain **distinct values**; collection READ ops uniformly
+  **nil-pun** (treat nil as empty: first/rest/nth/last/reverse/len/empty? all accept nil);
+  constructors keep nil-as-empty seeding (cons/append onto nil). `nil?` ≠ `empty?` preserved.
+- Churn: low (only un-errors cases). Findings: core nil-tolerance; P7/P8 arms. STATUS: PROPOSED
+
+### R29. Dict ordering
+- RECOMMENDATION: **insertion order preserved** — iteration, keys/vals, and serialization (OCaml
+  Hashtbl replaced with an insertion-indexed structure; keys-reversed bug dies). CANONICAL form
+  always sorts keys independently (already true in the CBOR/CID layer). Duplicate literal keys:
+  last-wins, documented.
+- EMPIRICAL NOTE (quick-wins batch, 2026-07-03): an interim sorted-keys change broke 4 render
+  tests — attr emission order flows through dict_keys and the tests PIN source-order attributes
+  (`width` before `height` etc.). So the current reverse-ish order is load-bearing for render;
+  any change here must land together with the render-attr ordering contract. Reverted; do not
+  change keys order except via this ruling.
+- Churn: medium (kernel dict rework) but pays across wire/golden/cache findings C27/P9/core-keys. STATUS: PROPOSED
+
+### R30. Small-primitive contract fixes (spec already says; hosts violate)
+- RECOMMENDATION: ratify the spec text and fix: `contains?` on dicts = key check; `merge` skips
+  nil; `into` native on the kernel; `sort` takes an optional comparator, compares int/float
+  numerically, stable; `get` returns a STORED nil (default only when key absent); `zip-pairs` =
+  sliding window per spec (kernel currently chunks); `(max)`/`(min)` zero-arg = ERROR.
+- Churn: low each. Findings: core contains?/sort/keys; P2/P3/P8/P12; JS `get` arm. STATUS: PROPOSED
+
+### R31. `append!` and mutation
+- Current: silently no-ops on ANY derived list (map/filter/rest/reverse output) — worst silent-
+  data-loss finding in the primitives sweep.
+- RECOMMENDATION: `append!` **ERRORS on non-mutable lists** immediately (honest), and is
+  deprecated in favor of persistent `append` + a real mutable vector/buffer for accumulator
+  idioms. Sweep the corpus (it's a known accumulator idiom in loops).
+- Churn: medium (idiom sweep). Findings: core append!. STATUS: PROPOSED
+
+## I. Parser & wire
+
+### R32. One token grammar
+- RECOMMENDATION: publish the normative ident/number classifier in spec/parser.sx and make every
+  surface bind THE SAME table (today: four divergent tables → same source, different ASTs).
+  Specific token rulings: maximal-munch then classify (`1+`, `a,b` are symbols — ratifies native);
+  hex/binary/octal `#x/#o/#b`-style and `0x10` accepted, documented; `inf`/`nan`/`-inf` are number
+  literals (reserved, not idents); `1e` and other malformed numbers = parse ERROR (never nil);
+  unicode identifiers **allowed** (UTF-8 letters — the docs mandate UTF-8 text; native reader
+  extends its charset); `$`/`|` NOT ident chars; `.` IS a valid symbol (ratifies native; JS4 dies
+  with D1); `#t`/`#f` = boolean literals on all surfaces.
+- Churn: medium (native reader charset + guest table sync). Findings: core parser-divergence
+  family; C1b (unicode symbol kills server — fixed by charset + C1 try-wrap); JS4. STATUS: PROPOSED
+
+### R33. Reader extensibility & comments
+- RECOMMENDATION: implement the `#name` reader-macro registry on the kernel (spec documents it;
+  only JS has it today) — small, and sx-pub extensibility wants it. `#;` datum comment valid
+  before `)` and at EOF (standard). `#|…|` stays a RAW STRING (documented loudly as not-a-block-
+  comment); no block comments.
+- Churn: low. Findings: core reader-macro/datum-comment/raw-string. STATUS: PROPOSED
+
+### R34. Dict literals & serializer round-trip
+- RECOMMENDATION: dict literal keys must be keyword/string/symbol — anything else is a parse ERROR
+  on every parser (guest currently stringifies `{1 2}` silently); odd form count gets a "dict
+  needs key-value pairs" error. Serializer: dict keys escaped/round-trippable (today unparseable
+  output for non-ident keys — also a CID hazard); chars serialize by codepoint (`#\é` readable
+  back once R25 lands); PROPERTY TEST: `parse(serialize(x)) = x` for the full value lattice, run
+  on both kernels.
+- Churn: low. Findings: core serializer-dict-keys / multibyte-chars / dict-edges. STATUS: PROPOSED
+
+### R35. Canonical form & CIDs (sx-pub-critical)
+- RECOMMENDATION: the **native CBOR/CID path is normative** (key-sorted, verified native==WASM,
+  F-3). The canonical TEXT form is defined as: sorted keys, shortest-round-trip floats with
+  preserved int/float distinction, fully-escaped strings, and is a fixed point
+  (canonical(parse(canonical(x))) = canonical(x)) — property-tested cross-kernel. spec/canonical.sx
+  either becomes a tested mirror of the native path (fix its runner-only helpers) or is deleted;
+  two silently-diverging implementations is the one unacceptable state.
+- Churn: low-medium. Findings: core canonical family; F-3; P9/C27 (via R29). STATUS: PROPOSED
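
Reviewer sketch (not part of the commit): two of the canonical-text properties, sorted keys and a preserved int/float distinction, can be illustrated on a tiny value subset. This Python function is an assumption-laden toy: `canonical` is a hypothetical name, dict keys are modelled as keyword names, and strings (which need the escape rules) and the parser needed to check the fixed-point property are deliberately left out.

```python
def canonical(value):
    """Canonical text for a tiny subset: ints, floats, lists, keyword dicts."""
    if isinstance(value, dict):
        items = (":%s %s" % (key, canonical(value[key])) for key in sorted(value))
        return "{" + " ".join(items) + "}"
    if isinstance(value, list):
        return "(" + " ".join(canonical(item) for item in value) + ")"
    if isinstance(value, float):
        return repr(value)  # shortest round-trip; 1.0 stays "1.0"
    if isinstance(value, int) and not isinstance(value, bool):
        return str(value)
    raise TypeError("unsupported value in this sketch: %r" % (value,))
```

Two dicts built in different insertion orders produce the same text here, which mirrors the probe result that CIDs are order-safe even though plain iteration order is not.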
+
+### R40. Primitive naming & small-default unification (answers the hosts handoff list)
+- RECOMMENDATION: one canonical name registry in spec/primitives.sx; per-host aliases die (with
+  D1 most of these resolve to "make it native on the kernel"): `json-encode`/`json-parse` are
+  KERNEL primitives (not IO-bridge helpers — today unavailable sandboxed); `regex-*` is the
+  canonical family name; `parse`/`sx-parse` — `sx-parse` canonical, `parse` alias documented;
+  1-arg `(range n)` = 0..n-1 (ratifies native); `parse-int`/`string->number` on failure → nil
+  (ratifies native, never 0); the stdlib migration, `format` included, happens for real (the
+  primitives.sx header claims a migration that never happened: make the header true or revert
+  it), and spec/stdlib.sx loads in production (today `format` is unresolved on the server).
+- Churn: low. Findings: F-9 naming splits, P7 arms, core spec-drift / stdlib-header. STATUS: PROPOSED
+
+## J. Render contracts
+
+### R36. Attribute contract (all four adapters)
+- RECOMMENDATION: one contract, HTML-mode's as base: boolean-registry attrs — false/nil omit,
+  anything else emits bare name (SX truthiness, documented footgun stands); non-boolean attrs —
+  value stringified INCLUDING `"true"`/`"false"` (DOM adapter aligns — C19/core, found by both
+  lanes); attribute NAMES validated `[A-Za-z_:][A-Za-z0-9_:.-]*` else ERROR (kills spread-dict
+  injection); nil attr value omits the attribute.
+- Churn: low. Findings: core attr-name-injection / bool-footguns / dom-html-parity; C19. STATUS: PROPOSED
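
Reviewer sketch (not part of the commit): the R36 contract reads naturally as one function per attribute. The Python below is a hedged model: `render_attr` is a hypothetical name, `BOOLEAN_ATTRS` is a made-up three-entry subset of the real registry, and escaping the value with `html.escape` is an assumption the ruling does not spell out.

```python
import html
import re

ATTR_NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9_:.-]*")
BOOLEAN_ATTRS = {"disabled", "checked", "hidden"}  # illustrative subset only


def render_attr(name, value):
    """Render one attribute as ' name="value"', ' name', or '' (omitted)."""
    if not ATTR_NAME.fullmatch(name):
        raise ValueError("invalid attribute name: %r" % (name,))
    if value is None:  # nil omits the attribute
        return ""
    if name in BOOLEAN_ATTRS:
        # false omits; anything else (SX truthiness) emits the bare name
        return "" if value is False else " " + name
    if value is True:
        text = "true"
    elif value is False:
        text = "false"
    else:
        text = str(value)
    return ' %s="%s"' % (name, html.escape(text, quote=True))
```

The name check is what blocks spread-dict injection: a key such as `x onload="evil` fails validation before anything is emitted.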
+
+### R37. Raw-text elements & voids
+- RECOMMENDATION: `<script>`/`<style>` children are NOT entity-escaped; instead the renderer
+  ERRORS if content contains `</script`/`</style` (raw! unchanged as the explicit bypass) — HTML-
+  correct and injection-safe, and stops corrupting legitimate inline JS/CSS. Void-element registry
+  completed (area/base/embed/param/track added to HTML_TAGS); children passed to a void element =
+  ERROR (today silently dropped). Component render gets a depth limit (default 512) with a clear
+  error. `is-render-expr?` either wired in (html:/custom elements) or deleted.
+- Churn: low. Findings: core script-style / void-elements / recursive-component / is-render-expr. STATUS: PROPOSED
+
+### R38. aser wire format
+- RECOMMENDATION: list-valued keyword args serialize quoted (`:items (quote (…))`) so the wire
+  form re-evaluates to the same value; contract documented: components unexpanded, control flow
+  evaluated, all VALUES must round-trip through parse+eval. Property-test aser output re-evaluation.
+- Churn: low. Findings: core aser-list-kwargs; S14 (deep-nesting parity — add the depth-2 test). STATUS: PROPOSED
+
+## K. Signals (spec-level semantics)
+
+### R39. Reactive semantics
+- RECOMMENDATION: (a) change detection by `=` (deep equality — needs R19's Char arm etc.), not
+  physical identity — kills the every-reset-notifies behavior; (b) `batch` is unwind-safe
+  (depth decremented on any exit — today one throw wedges all reactivity forever); (c) notify is
+  glitch-free: two-phase (mark dirty → recompute in topological order) so diamonds recompute once;
+  (d) dispose-computed actually unsubscribes (bug); (e) effect cleanup cleared after invocation
+  (bug). Ratify as the documented reactive contract with tests (today: zero coverage of these).
+- Churn: low-medium (topological notify is the real work). Findings: core signals family (5). STATUS: PROPOSED
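
Reviewer sketch (not part of the commit): points (a) and (c) of R39, equality-based change detection and two-phase glitch-free notify, can be modelled with a small dependency graph. This Python class is illustrative only; `ReactiveGraph` and its methods are hypothetical names, and `graphlib.TopologicalSorter` (Python 3.9+) supplies the ordering.

```python
from graphlib import TopologicalSorter


class ReactiveGraph:
    def __init__(self):
        self.values = {}    # node name -> current value
        self.compute = {}   # computed node -> (dependency names, function)
        self.runs = {}      # computed node -> number of recomputations

    def signal(self, name, value):
        self.values[name] = value

    def computed(self, name, deps, fn):
        self.compute[name] = (deps, fn)
        self.runs[name] = 0
        self._recompute(name)

    def _recompute(self, name):
        deps, fn = self.compute[name]
        self.values[name] = fn(*(self.values[d] for d in deps))
        self.runs[name] += 1

    def reset(self, name, value):
        if self.values[name] == value:  # change detection by equality
            return
        self.values[name] = value
        # Phase 1: mark every node downstream of `name` dirty.
        dirty, frontier = set(), [name]
        while frontier:
            current = frontier.pop()
            for node, (deps, _) in self.compute.items():
                if current in deps and node not in dirty:
                    dirty.add(node)
                    frontier.append(node)
        # Phase 2: recompute dirty nodes once each, dependencies first.
        graph = {node: deps for node, (deps, _) in self.compute.items()}
        for node in TopologicalSorter(graph).static_order():
            if node in dirty:
                self._recompute(node)
```

In a diamond (`a` feeds `b` and `c`, both feed `d`), a single `reset` of `a` recomputes `d` exactly once, and resetting `a` to an equal value recomputes nothing.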
+
+---
+
+## Ratification checklist
+
+1. Flip each STATUS; AMENDED rulings get their text edited in place.
+2. Rulings needing a pre-ratification sweep: **R17 (arity, high churn)**, plus the medium-churn
+   R9 (cond clause-mode usage), R31 (append! idiom), R15a (ambiguous HO calls).
+3. On ratification: move to spec/RULINGS.md; every ruling becomes (a) a conformance battery row
+   (native + WASM), (b) a fix ticket if behavior changes, (c) a docs line in CLAUDE.md's rewrite.
+4. Rulings deliberately NOT made here (need your call, no strong recommendation):
+   - Whether rationals should ever be the default result of exact `/` (R24 keeps float).
+   - Whether to pursue full-Unicode case mapping (R26 defers).
+   - Whether `do-loop` (R11) is worth keeping at all vs deleting Scheme do entirely.
+   - JIT re-enable timeline (J1–J8 are preconditions, not rulings).
--- a/plans/sx-review/conformance.md
+++ b/plans/sx-review/conformance.md
@@ -0,0 +1,188 @@
+# SX Conformance Review — cross-host agreement + test adequacy
+
+Axis: CONFORMANCE (do the hosts agree, and would the suite catch it if not?). Sibling lanes: core semantics, host implementations.
+Date: 2026-07-03. All test runs and probes executed this session (bounded with `timeout`; no shared sx_server touched — probes used freshly-spawned bounded `sx_server.exe` instances).
+CONFIRMED = reproduced here; SUSPECTED = static reasoning, probe proposed. Severity S1 critical → S4 minor.
+Probe artifacts: `/tmp/claude-0/-root-rose-ash/9a04ba52-7bf4-476d-99ea-04f84bff1359/scratchpad/{probes,prims}/`; raw suite logs `/tmp/sx-review/*.log`.
+
+---
+
+## 0. Ground truth: what the hosts actually are (the briefed picture is stale)
+
+- **OCaml native kernel** (`hosts/ocaml`, sx_server.exe) — canonical evaluator, essentially green on the corpus.
+- **OCaml WASM kernel** (`shared/static/wasm/sx_browser.bc.wasm.js`, js_of_ocaml build of the same kernel) — **what production browsers actually run** (served at lib/host/blog.sx:1910). Served artifact verified identical (md5) to freshly built one — no artifact staleness.
+- **JS-transpiled bundle** (`hosts/javascript` → `shared/static/scripts/sx-browser.js`) — legacy bootstrapped evaluator; still built and gated by `scripts/sx-build-all.sh`; catastrophically red (below). Not referenced by any served page found.
+- **Python host** (`hosts/python`) — vestigial: test runner deleted in d735e28b; `SX_USE_OCAML=1` in every compose file; bootstrap output consumed by nothing. NOT a live evaluator target. BUT `shared/sx/parser.py` (an independent Python SX reader/serializer) **is live in production plumbing** — see F-8.
+- **hosts/native** — Cairo pixel-renderer (separate host concept), single smoke test, not a corpus runner.
+
+## Suite results (this session)
+
+| Runner | Briefed | Actual |
+|---|---|---|
+| JS standard (`node hosts/javascript/run_tests.js`) | ~747 green | **2596 pass / 2490 FAIL** (5086) |
+| JS full (`--full`) | ~870 green | **2453 pass / 3203 FAIL** (5656) |
+| Python | ~744 green | **runner deleted** (d735e28b) — zero tests |
+| OCaml (`run_tests.exe`; no `--full` flag exists) | ~1080 green | **5762 pass / 274 FAIL** |
+| OCaml WASM kernel | — | **corpus never runs on it** (F-2) |
+
+Rebuilding the JS bundle fresh from today's spec reproduces the JS failures exactly (2596/2490) → red JS is real, not a stale artifact.
+
+---
+
+# CONFIRMED findings (most severe first)
+
+## F-1 [S1, confidence high] Server and browser disagree on integer arithmetic — silently
+The WASM kernel (the artifact every production browser loads) has 32-bit int semantics (js_of_ocaml); native is 63-bit. Same expression, same "canonical kernel", different answers, no error:
+
+| Expr | native sx_server | WASM SxKernel (shipped) | legacy JS bundle |
+|---|---|---|---|
+| `(* 99999999 99999999)` | 9999999800000001 | **1674919425** (32-bit wrap) | 9999999800000000 (float64) |
+| `(+ 9007199254740992 1)` | 9007199254740993 | **0** (literal truncated mod 2^32) | 9007199254740992 |
+| `(expt 2 62)` | -4611686018427387904 (int63 wrap) | **0** | 4611686018427388000 (float) |
+| `(* 999999999999 999999999999)` | 2003762205206896641 (silent int63 wrap) | 9.99999999998e+23 (goes float) | 9.99999999998e+23 |
+
+Three hosts, three different answers on the same input. Repro: `scratchpad/probes/run_wasm.js probes.txt` vs `sx_server.exe (eval ...)`. Any SSR-computed value re-derived client-side (pagination totals, ids, hashes-by-arithmetic, seat counts) can differ. Note native has two hazards of its own: silent int63 wraparound and reduced float print precision (`(+ 0.1 0.2)` prints `0.3`).
+Fix ownership: hosts lane (int64/boxed ints or explicit overflow policy in the WASM build); THIS lane's ask: a numeric-tower differential suite that runs on every shipped artifact.
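
Reviewer sketch (not part of the commit): several cells in the table above are what plain two's-complement wraparound predicts, which can be checked with arbitrary-precision arithmetic. `wrap_signed` is a hypothetical helper, and this only models the wrap itself, not how either kernel parses literals or decides when to fall back to floats.

```python
def wrap_signed(n, bits):
    """Reduce n to a signed two's-complement integer of the given width."""
    modulus = 1 << bits
    n &= modulus - 1
    return n - modulus if n >= modulus >> 1 else n


# 32-bit wrap reproduces the WASM column for the first and third rows.
WASM_SQUARE = wrap_signed(99999999 * 99999999, 32)
WASM_EXPT = wrap_signed(2 ** 62, 32)

# 63-bit wrap reproduces the native value for (expt 2 62).
NATIVE_EXPT = wrap_signed(2 ** 62, 63)

# float64 rounding reproduces the legacy JS bundle value for the first row.
JS_SQUARE = float(99999999 * 99999999)
```

A numeric-tower differential suite could generate its expected-value columns the same way, then compare each shipped artifact against the column the spec ratifies.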
+
+## F-2 [S1, high] The shipped browser kernel never runs the corpus
+No runner feeds spec/tests through `sx_browser.bc.js`/`.wasm.js`. Its entire test surface: `test_boot.sh` (require() smoke), `test_wasm_native.js` (test-framework.sx + web/tests/test-wasm-browser.sx only), `tests/node/run-sx-tests.js` (one deftest file). Both "conformance" runners test kernels that are NOT the shipped browser artifact. F-1 and F-3 existed undetected precisely because of this. The probe harness at `scratchpad/probes/run_wasm.js` shows a corpus runner on the WASM kernel is trivially buildable (SxKernel.eval works headless in node with the test_boot.sh stub block).
+
+## F-3 [S1, high] Native and WASM disagree on `apply` and dict key order (same kernel family!)
+- `(apply + (list 1 2 3))` → native: **error** "Expected number, got list" (apply does not spread); WASM: **6** (spreads, 2-arg form only; `(apply + 1 (list 2 3))` errors "apply: function and list"). Legacy JS bundle spreads fully (both forms → 6). Three different apply semantics.
+- `(assoc {:a 1} :b 2)` → native `{:b 2 :a 1}` (new keys prepend); WASM `{:a 1 :b 2}` (insertion order); JS bundle insertion order. `merge` same. Dict iteration/serialization order differs server-vs-browser — anything ordering-dependent (rendered attr order, keys/vals iteration, golden-file comparisons) diverges.
+- Defused non-finding: `cid-from-sx` canonicalizes — CIDs of `{:a 1 :b 2}` and `(assoc {:a 1} :b 2)` are identical on native AND WASM (probe verified). Content addressing is order-safe.
+
+## F-4 [S1, high] JS host fails ~half the shared corpus; the documented build gate is red
+spec/tests discovery is byte-identical in both runners (82 files both), so results are directly comparable: OCaml ≈ green, JS 2490 FAIL / 5086. `scripts/sx-build-all.sh` (`set -euo pipefail`) runs `node hosts/javascript/run_tests.js --full` as a gate → **the documented full pipeline currently exits FAIL** (run_tests.js exits 1). Either the JS host is dead (remove from gate/docs) or alive (≈2500 tests behind). Today no automated gate enforces cross-host agreement at all.
+
+## F-5 [S1, high] The corpus is not host-neutral: OCaml runner preloads libraries + services `import`; JS runner does neither
+- OCaml `make_test_env` (run_tests.ml:3700-3806) unconditionally preloads r7rs/render/canonical/adapters, forms/engine/router/orchestration, bytecode/compiler/vm, stdlib, signals, freeze, content, **parser-combinators, graphql, graphql-exec**, dom/browser, and the full **hyperscript stack**.
+- OCaml services `(import ...)` IO-suspensions at runtime (run_tests.ml:2148-2169, 2178-2272). The JS runner calls `Sx.eval` directly — **no suspension/resume loop**, so `import` can never complete on JS.
+- Result: hundreds of "Undefined symbol: gql-tokenize / pc-* / hs-*" JS failures (js-standard.log) that are runner-environment artifacts layered on top of real gaps. Tests exercising import/suspension (test-import-bind, test-io-suspension, test-coroutines, test-modules) do not test the same thing per host.
+
+## F-6 [S1, high] Whole test directories are gated on only one host — or none
+From run_tests.js:300-448 and run_tests.ml:3959-4001:
+- **lib/tests (11 files)** — continuations, continuations-advanced, freeze, signals-advanced, stdlib, stepper, tree-tools, types, vm, vm-closures, vm-primitives — auto-run **only on JS `--full`** (OCaml runs them only if named explicitly). Their only automated gate is the runner that's red: e.g. test-continuations.sx on JS: 475 pass / 54 FAIL; algebraic-data-types: 53 FAIL. **Effectively ungated.**
+- **web/tests (13 runnable)** — adapter-html, aser, deps, engine, examples, forms, orchestration, page-helpers, relate-picker, router, signals, swap-integration, tw-layout — auto-run **only on OCaml**.
+- **6 web/tests run on NO standard runner**: test-adapter-dom, test-boot-helpers, test-cek-reactive, test-handlers (module-loaded, not run as suite), test-layout, test-wasm-browser (only via test_wasm_native.js).
+- **OCaml foundation tests** (run_tests.ml:1178+) — native-only unit tests, no cross-host equivalent (fine, but they're invisible to other hosts by design).
+- `spec/tests/test.sx` matches neither filter (`test-*.sx`) → **runs nowhere**.
+- Guest-language suites (lib/scheme/tests/records.sx, lib/haskell/tests/records.sx, …) sit outside all discovery — each loop self-tests ad hoc, no aggregate gate.
+
+## F-7 [S1, high] Features pass their tests only inside the test runner's private environment — both directions
+- **OCaml side**: `values`, `promise?`, `make-promise`, `force` are bound **only in run_tests.ml** (lines 1131-1167). Probes: `(force (delay (+ 1 2)))`, `(values ...)`, `(let-values ...)` → "Undefined symbol" on BOTH production surfaces (native sx_server epoch env AND WASM SxKernel). ~271 promises/values assertions pass against an environment that exists nowhere in production.
+- **JS side**: run_tests.js injects `equal?`, `apply`, `env-*`, `render-html`, `make-continuation`, upcase/downcase, and a **fake sha3-256 stub** (lines 80-160) into the test env. The real browser bundle lacks these (probe: `equal?` → Undefined symbol on the bundle). Coverage numbers overstate both shipped artifacts.
+
+## F-8 [S1→S3 itemized, high] Differential probes, OCaml native vs JS bundle: 130 exprs, 98 identical, real divergences
+(harness: `scratchpad/probes/`; numeric rows already in F-1)
+
+Core data semantics:
+| Expr | OCaml native | JS bundle |
+|---|---|---|
+| `(cons 1 nil)` | `(1)` (nil ≡ empty list) | `(1 nil)` (nil is an element) — **foundational** |
+| `(range 3)` | `(0 1 2)` | `()` — single-arg range returns empty on JS |
+| `(round -2.5)` | -3 (half away from zero) | -2 (JS half-up) |
+| `(str (list 1 2 3))` | `"(1 2 3)"` | `"1,2,3"` (Array.toString leak) |
+| `(str {:a 1})` | `"{:a 1}"` | `"[object Object]"` |
+| `(len "héllo")` / `(len "👍")` | 6 / 4 (bytes) | 5 / 2 (UTF-16 units) — string indexing math differs on all non-ASCII; neither counts codepoints |
+| `(upper "straße")` | "STRAßE" (ASCII-only) | "STRASSE" (Unicode) |
+| `(reverse "abc")` | error (lists only) | `"cba"` |
+| `(sort lst cmp-fn)` | error "sort: 1 list" | sorts with comparator |
+| `(max)` | error | `-Infinity` |
+| `(/ 1 0)` | `inf` | `Infinity` (print) |
+| `(+ 0.1 0.2)` | prints `0.3` | prints `0.30000000000000004` |
+
+Agreements worth recording (hosts agree, docs/spec don't): `case` uses flat pairs `(case x 1 "one" 2 "two" :else ...)` — the CLAUDE.md-documented clause-list syntax errors identically on all three kernels; `(join sep coll)` is sep-first on both; `int?` undefined on both (it's `integer?`); `string->number` radix arg unsupported on all hosts (and the corpus asserts it — a live OCaml FAIL, see F-10).
+
+## F-9 [S2, high] Primitive-set parity is broken at scale (full lists in `scratchpad/prims/report.txt`)
+- **194 OCaml-core names missing from the JS bundle**, incl. language-level: `take drop zip unique init char-at substr upcase downcase equal? identical? dict-get dict-has? dict-delete! parse parse-safe parse-float escape-string compile compile-module` + records constructors + math extras. Dynamically verified samples throw "Undefined symbol" on JS.
+- **Naming splits where both have the capability**: OCaml `regex-*` vs JS `regexp-*`; OCaml `parse` vs JS `sx-parse`; OCaml `json-encode` vs JS `json-stringify`.
+- **24 user-facing JS names missing on the OCaml kernel**: `format values force promise? make-promise call-with-values json-parse json-stringify format-date parse-datetime pluralize escape strip-tags error-message env-parent char-code-at now-ms satisfies? …` (deps-check verified). Note that the epoch-mode server doesn't load `spec/stdlib.sx`, so even `format` is unresolved in the production server env.
+- **Declared in spec/primitives.sx but implemented NOWHERE: `eq?`, `eqv?`** (202 spec-declared total; 9 missing from OCaml, 6 from JS).
+- JS bundle oddities: 6 statically-assigned PRIMITIVES keys absent at runtime (`loop raw-loop reactive-text read-char-name-loop read-map-loop scan`).
+
+## F-10 [S2, high] The OCaml suite is not green, and a permanent red band is normalized
+274 FAIL on the canonical host: 272 hs-upstream-* (fetch/socket/runtimeErrors/asExpression/...). If browser-only, they should be skip-listed like the 6 web tests — a permanently red FAIL column trains everyone to ignore failures. The 2 core failures are live: `can-map-an-array` ("map with block") and `string->number` radix (corpus asserts a feature no host implements). Also note hs-upstream pass/fail sets differ wildly between OCaml (272 fail) and JS (~1900 fail) — the suites' shared-corpus value is currently nil for hyperscript. **UPDATE (S-2 probe, F-18): ≥118 of the 272 red hs tests PASS on the shipped WASM kernel + happy-dom — the red band is mostly mock-DOM environment deficiency, not engine failure.**
+
+## F-11 [S2, high] Boundary validation silently disabled in production; stale imports broke the Python-side test files
+- `shared/sx/boundary.py:34` imports `.ref.boundary_parser` — moved to hosts/python in 7036621b → ImportError swallowed (boundary.py:44-49) → **boundary type validation is a silent no-op**.
+- `shared/sx/tests/test_bootstrapper.py:149`, `test_parity.py:733` import deleted `shared.sx.ref.*` — those tests cannot even load.
+- `shared/sx/async_eval.py` (fallback evaluator) is macro-crippled by design (raises "sx_ref.py has been removed"); `resolver.py` fully stubbed. Fallback path SX_USE_OCAML=0 is non-functional — fine if intentional, but it's still importable wiring.
+(Hand-off: hosts lane for the fix; kept here because it *is* the boundary-conformance enforcement.)
+
+## F-12 [S2, high] The live Python SX reader diverges from the OCaml reader
+`shared/sx/parser.py` runs on every production request (serialize/parse plumbing under SX_USE_OCAML=1: handlers.py:191,211; helpers.py:449). Probes vs OCaml reader:
+- Python CANNOT read: `#t`/`#f` (reader-macro error), `#\a` chars, dotted pairs `(a . b)`, `.5` — all valid on OCaml. If any OCaml-serialized value containing a char (`#\a`) crosses the Python plumbing, Python errors.
+- Dict serialization order differs (Python insertion `{:a 1 :b "two"}` vs OCaml reversed `{:b "two" :a 1}`) — hazard for any wire-text comparison/caching (CIDs themselves are safe, F-3).
+- Agreements: escapes (\n \t \" \\), floats/exponents, keywords, `[..]`-as-list, quote/quasiquote sugar.
+
+## F-13 [S3, med] The checked-in generated kernel is not reproducible from today's spec+bootstrap
+`python3 hosts/ocaml/bootstrap.py --output <scratch>` vs checked-in `hosts/ocaml/lib/sx_ref.ml`: 360 diff lines. Function inventory is identical (the delta is `let` vs `and` restructuring + block moves — an older generator produced the checked-in file); no semantic difference detected, and tree==\_build. But CI (.gitea/Dockerfile.test steps 3-4) re-bootstraps sx_ref.ml from spec and recompiles — so CI's binary is built from different generated source than the dev tree's. Reproducibility should be a checked invariant (regen + diff in CI).
+
+## F-14 [S3, high] Doc drift — CLAUDE.md describes a system that no longer exists
+- ~~"Canonical SX semantics in `shared/sx/ref/*.sx`"~~ — **FIXED 2026-07-03**: CLAUDE.md now points at `spec/` and documents the bootstrap chain + corrected island rules (sequential `let`, implicit begin). Remaining items below still stand.
+- Documents a Python host + ~744 tests — deleted.
+- Documents `case` clause-list syntax — all hosts reject it (flat-pair syntax is real, spec/tests/test-cek-advanced.sx:549).
+- Briefed suite counts (747/870/744/1080) are months stale; corpus is 5-6k assertions.
+- `spec/primitives.sx` header claims stdlib functions "moved to stdlib" but spec/stdlib.sx defines only `format` (and production doesn't load it).
+
+## F-15 [S4, med] Housekeeping
+- Stray recursive tree `hosts/ocaml/hosts/ocaml/hosts/ocaml/bin/sx_server.ml` (accidental copy) — confuses greps and could get compiled/edited by mistake.
+- `spec/tests/test.sx` dead filename (see F-6).
+- JS runner's fake `sha3-256` stub returns non-SHA3 values — any hash-shape-only assertion passes; any real-value assertion would mysteriously fail JS-only.
+
+---
+
+# Hyperscript on the shipped WASM kernel (S-2 probe, executed 2026-07-03)
+
+Method: new probe harness `scratchpad/hsprobe/run_hs_wasm.js` — loads the SHIPPED kernel (`shared/static/wasm/sx_browser.bc.js`) + shipped platform in node/happy-dom via `tests/node/sx-harness.js`, preloads the same files as the OCaml runner (test-framework, spec/harness, web/lib/dom, the lib/hyperscript stack), and runs the hs spec corpus with streamed per-test output. Per-test diff vs the native run in `scratchpad/hsprobe/compare2.py`; raw logs `/tmp/sx-review/wasm-hs-*.log`, comparisons `/tmp/sx-review/wasm-hs-compare.txt`, `wasm-hs-pure2-compare.txt`. Coverage: 1,553 tests measured (tokenizer/parser/compiler/runtime 206; behavioral 1,250 of 1,514 — one shard timed out; conformance 97 of 222 — timed out; conformance-dev/sandbox/diag/integration/htmx unmeasured). Bottom line: **the kernel itself conforms; the packaging and the test environments are where the defects are.**
+
+## F-16 [S1, high, CONFIRMED] Shipped browser hyperscript cannot call functions: `host-call-fn` is referenced but defined nowhere in the shipped stack
+- The shipped stack ships a full hs engine (`shared/static/wasm/sx/hs-{tokenizer,parser,compiler,runtime,integration,htmx,worker,prolog}.sx(+.sxbc)`), but `hs-runtime.sx(bc)`/`hs-integration.sx(bc)` reference `host-call-fn`/`host-call-fn-raising`, which only `run_tests.ml:3564` defines (the test runner's mock host bridge). The shipped platform (`sx-platform.js`) registers `host-call` / `host-new` / `host-get` etc. — NOT `host-call-fn`.
+- Reproduced: running the behavioral corpus on the shipped stack as-is → **536 pass / 978 fail, 906 of them "Undefined symbol: host-call-fn"** (wasm-hs-behavioral.log). Any hyperscript that invokes a function or method would fail the same way in a real browser.
+- The OCaml runner's own comment calls this binding "the single biggest gap: ~900 behavioral tests failed" — the fix was made in the test runner only, never in the shipped platform. Textbook case of F-7 (feature exists only in the runner's private environment).
+
+## F-17 [S1, high, CONFIRMED] Shipped `hs-runtime.sx` is missing the `(jit-exclude! "hs-*")` guard
+- `lib/hyperscript/runtime.sx` ends with `(jit-exclude! "hs-*")` + a comment explaining the hs recursion web **miscompiles under the bytecode JIT** (the parser-combinator JIT bug) and must stay CEK-interpreted.
+- `shared/static/wasm/sx/hs-runtime.sx` is byte-identical EXCEPT this guard is absent (10-line diff; the other 5 hs modules are identical copies). The browser is precisely the sxbc/JIT-heavy environment. So the tested configuration (JIT-excluded) is not the shipped configuration (JIT-eligible). The lib→wasm/sx sync that copied these files dropped exactly the safety-critical line.
+
+## F-18 [S2, high, CONFIRMED] The kernel conforms; the native runner's mock DOM is the outlier — and the "272 red" band mostly indicts the test env, not the code
+Per-test diff, shipped-kernel+happy-dom (with the host bridge mirrored) vs native mock-DOM run:
+- Pure pipeline (tokenizer/parser/compiler/runtime, 206 tests): **0 WASM-only failures**; all 51 failures shared with native; 2 native-only.
+- Behavioral (1,250 matched): **9 WASM-only failures** vs **118 native-only failures** (WASM+happy-dom passes where the mock DOM fails: fetch 22, toggle 10, on 10, make 8, repeat 6, append 6, ...). Conformance (97 matched): 0 WASM-only, 4 native-only.
+- Consequence for F-10: the permanently-red 272 hs failures on the canonical host are largely **mock-DOM environment artifacts** — the engine passes those tests on the shipped kernel with a more realistic DOM. The red band hides real information in both directions.
+- The 9 WASM-only failures (listed in wasm-hs-compare.txt): behavior scoping ×4 ("Expected 10/20, got <empty>"), as-fragment conversions ×3 (a `{:__host_handle N}` leaks where a list is expected — happy-dom NodeList boundary), init/where sequencing ×2. Candidates for bisection; none look like core-semantics bugs.
+- Also measured: **the js_of_ocaml kernel is ~1-2 orders of magnitude slower** on this corpus than native (conformance ≈24s/test vs seconds for the whole file natively) — worth knowing before making WASM corpus runs a gate.
+
+## F-19 [S3, med, CONFIRMED] hs corpus drift + inverted assert labels
+- The shared-failure bucket (~50 behavioral + parser/tokenizer suites) is corpus drift: generated tests (generate-sx-tests.py, "DO NOT EDIT") still expect the old parser AST — `(me)` implicit target and `(. obj k)` — while the current parser emits `(beingTold)` / `poss` (probed identically on native sx_server AND WASM: `(hs-compile "add .foo")` → `(add-class "foo" (beingTold))` on both). Tests were not regenerated after the parser change.
+- `assert=` (spec/harness.sx:31) is `(actual expected msg)` but the generated corpus calls it as `(assert= expected actual)` — every failure message prints Expected/got **swapped**, which materially misled diagnosis during this probe (it makes current-parser output look like the "expected" value).
+- `hypersx.sx` in the shipped boot list is NOT the hyperscript engine (it's an sx→hypersx-notation pretty-printer); the actual hs engine modules are shipped but absent from the boot module list (`loadWebStackFallback`, sx-platform.js:670) — load path for hs in production is unclear/on-demand only.
+
+---
+
+# SUSPECTED findings / gaps (probe proposed, not yet reproduced)
+
+## S-1 [S1, med] Everything F-8 lists between native and the JS bundle likely also splits native vs WASM in web-stack .sx code paths not covered by my 130 probes (strings near render, regex at the sxbc layer, signals timing). Probe: run the full probe corpus + web adapter smoke through `run_wasm.js` and diff (harness exists; only 130 exprs run so far).
+
+## S-2 [RESOLVED] Promoted to F-16/F-17/F-18 above (the hs corpus WAS run against the shipped WASM kernel; see the "Hyperscript on the shipped WASM kernel" section).
+
+## S-3 [S2, med] `import`-dependent components behave differently in production browser vs test: OCaml runner resolves imports synchronously from disk; browser resolves via fetch + sxbc bundles (compile-modules.js). No test asserts the browser-side import path resolves the same module set. Probe: compare `sx_build_manifest`/dist sxbc contents vs run_tests preload list.
+
+## S-4 [S3, med] Float printing (`0.3` vs `0.30000000000000004`) means any golden-string test containing floats validates different precision per host — some current "passes" may be precision-coincidences. Probe: grep corpus asserts for float literals with >6 significant digits.
+
+## S-5 [S3, med — partially confirmed] The two JS kernel BUILDS disagree with each other: `spec/tests/test-adt.sx` (algebraic-data-types) passes on the standard bundle but fails 53 assertions on the `--full` build (`--extensions continuations --spec-modules types`). Confirmed from js-standard.log (0 FAIL, PASSes present) vs js-full.log (53 FAIL). So build flags change language behavior within one host — the types/continuations extension modules interfere with ADT/match. Suspected mechanism unverified; probe: bisect by building with each flag alone and running test-adt.sx.
+
+---
+
+# Suite asymmetry — why the counts differ (summary answer to the brief)
+
+- Corpus exploded (hyperscript upstream ports, GQL, parser combinators, regex, records, chars, bytevectors, numeric tower, values/promises...) to ~5-6k assertions; briefed counts are months stale.
+- JS standard (5086) = spec/tests only; JS full (5656) = + lib/tests + bigger kernel build.
+- OCaml (6036) = spec/tests + web/tests + native foundation tests, but NOT lib/tests.
+- Python = 0 (deleted).
+- WASM (shipped browser artifact) = ~0 (boot smoke + 2 files).
+- **No two hosts run the same file set; only spec/tests overlaps fully — and even there the runner environments differ (F-5, F-7), so "same test passing on two hosts" does not mean "same behavior".**
+
+# Recommended conformance gate (one paragraph for the maintainer)
+Run the spec/tests corpus on: native run_tests.exe, the WASM kernel via node (harness pattern: `scratchpad/probes/run_wasm.js`), and (if kept) the JS bundle — from the SAME preload manifest, with runner shims deleted or mirrored into all three; add lib/tests to the OCaml runner; skip-list the browser-only hs suites instead of letting them fail; add a numeric/string/dict-order differential probe file (seed: `scratchpad/probes/probes.txt`, 130 exprs, 32 divergences today) that must be output-identical across kernels; regen sx_ref.ml in CI and diff against checked-in.
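
The "output-identical across kernels" requirement in the gate paragraph could be checked with a comparison step along these lines. This is a minimal sketch under stated assumptions: the `divergences` helper and the input shape (kernel name mapped to `{expr: printed result}`) are hypothetical, and how each kernel's output is collected is left out. The `(cons 1 nil)` row comes from F-8; the `(+ 1 2)` row is made up for illustration.

```python
from typing import Dict, List


def divergences(outputs: Dict[str, Dict[str, str]]) -> List[str]:
    """Return the probe expressions whose printed result is not identical on
    every kernel. An expression missing from a kernel counts as a divergence."""
    all_exprs = sorted({expr for per_kernel in outputs.values() for expr in per_kernel})
    bad = []
    for expr in all_exprs:
        results = {per_kernel.get(expr, "<missing>") for per_kernel in outputs.values()}
        if len(results) > 1:
            bad.append(expr)
    return bad


# Hypothetical captured output for two kernels.
example = {
    "native": {"(cons 1 nil)": "(1)", "(+ 1 2)": "3"},
    "js": {"(cons 1 nil)": "(1 nil)", "(+ 1 2)": "3"},
}
print(divergences(example))
```

A gate built this way fails on any non-empty list, so a known divergence has to be fixed or explicitly removed from the probe file rather than tolerated as permanent red.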
--- a/plans/sx-review/core.md
+++ b/plans/sx-review/core.md
@@ -0,0 +1,717 @@
+# SX Language Core Review — spec/ semantics
+
+Reviewer axis: LANGUAGE CORE (spec/evaluator.sx CEK machine, parser.sx, primitives.sx,
+render.sx, special-forms.sx, eval-rules.sx, stdlib.sx, signals.sx, coroutines.sx, canonical.sx).
+Note: brief mentions spec/types.sx — it no longer exists; strict-typing machinery lives in evaluator.sx.
+Status: COMPLETE — all 8 dimension sweeps merged (CEK core, env/scope, HO forms,
+special forms/macros, parser/serializer/canonical, primitives/stdlib, render modes,
+strict typing + signals/coroutines/harness).
+
+TOTALS: 104 CONFIRMED findings (3 critical, 26 high, 40 medium, 2 low-medium, 33 low)
+ 5 SUSPECTED. Every CONFIRMED item has a runtime repro (fresh sx_server.exe unless noted).
+All 3 criticals additionally re-verified on the shipped WASM browser kernel (see CROSS-LANE CHECK).
+
+THEMES a ranker should know:
+1. **Nested cek-run instead of CEK frames** is one root cause behind ≥4 findings (shift-k
+   double-execution, threading guard/IO break, + suspected macro/let-values/qq boundaries).
+2. **Handler still installed while handler runs** explains the guard/handler-bind hang family.
+3. **Name-before-env dispatch** explains the ~60 unshadowable names + HO-not-first-class family.
+4. **Global mutable stacks popped only on normal exit** explains provide/winder/batch leak family
+   (scope stack, *winders*, *batch-depth* — none unwind-safe).
+5. **Test-runner-only bindings** make whole suites (values, canonical floats, batch, coroutines)
+   green for features the shipped runtime doesn't have; one test passes *vacuously because of the
+   very bug it tests* (signal-return).
+6. **Per-host re-bound platform primitives** (parse-number, char-code, escape-string, split, get…)
+   are the drift engine behind parser AST divergence + harness/runtime divergence.
+
+CROSS-LANE CHECK (vs /tmp/sx-review/hosts.md and conformance.md, done 2026-07-03):
+- **All 3 criticals re-verified on the shipped WASM browser kernel** (js_of_ocaml build browsers
+  actually load; probe harness from the conformance lane): guard re-raise HANGS (node killed at
+  25s), signal-condition → `42` (same kont drop), shift repro → identical double-execution trace
+  `(99 ("r=escaped" "after-k" "r=99"))`. Kernel-family bugs: native server AND production browser.
+  Conformance F-2 (corpus never runs on WASM) explains why nothing caught them there.
+- **Not masked by the JIT**: hosts J2/J9 confirm guard-installing lambdas are interpret-only and
+  any raise/call-cc in JIT'd code falls back to the CEK — my criticals are the live path.
+- **Three independent double-side-effect mechanisms now on record**: my shift-k nested cek-run
+  (critical #3), hosts J1 (`->` miscompiled under serving-JIT, steps re-run), hosts J2
+  (JIT-fallback re-runs whole call). Same user-visible symptom, three distinct fixes.
+- **One of my findings corrected** (apply — see the finding below); expt int63 wrap corroborated
+  by conformance F-1 (WASM is worse: `(expt 2 62)` → 0); unshadowable-HO finding extended by hosts
+  J8 (the VM DOES honor local bindings — CEK/VM divergence within one host); render dom/html attr
+  parity independently confirmed as hosts C19; values/eq?/eqv? runner-only bindings corroborated
+  by conformance F-7/F-9.
+- **No contradiction on canonical/CIDs**: conformance's working `cid-from-sx` is a native kernel
+  primitive (verified: works with spec/canonical.sx not loaded). My canonical.sx finding concerns
+  the spec guest implementation — production CIDs bypass it. Two parallel CID implementations,
+  only the native one exercised; spec-vs-native canonical-form agreement is untested (conformance
+  F-3 checked native-vs-WASM only).
+
+Verification recipe: `sx_harness_eval` (MCP) cross-checked against fresh real-runtime processes
+`printf '(epoch 1)\n(eval "...")\n' | timeout 30 hosts/ocaml/_build/default/bin/sx_server.exe`.
+TOOLING CAVEATS found during review (also listed as handoffs): (1) the MCP harness primitive
+table diverges from the real runtime; (2) `sx_harness_eval` is NOT a fresh sandbox — state
+persists and cross-contaminates calls; (3) sx_read_subtree ignores `path`, sx_read_tree ignores
+`max_lines`. All critical/divergent probes were re-verified on fresh `sx_server.exe` processes.
+
+---
+
+## CONFIRMED findings (most severe first)
+
+### [critical] [CONFIRMED×2] Any raise/error inside a guard clause body or handler-bind handler loops forever — handler runs with its own handler frame still installed
+- Location: spec/evaluator.sx, `raise-eval` case of step-continue (~4547-4573) + `kont-unwind-to-handler` (236-259); inherited by `step-sf-guard` (~1693)
+- What: `kont-unwind-to-handler` returns `{:handler match :kont kont}` where `kont` still contains the matched handler frame; the handler is invoked with that kont. A `raise` inside the handler re-matches the same handler → infinite loop. Not just explicit re-raise: ANY error while a handler/clause body runs (`(error ...)`, a raised different value) hangs instead of propagating. CL/R7RS: handlers run with the enclosing (outer) handler set. `guard` desugars clause bodies to run INSIDE the handler-bind extent (`(__guard-k (cond ...))` — clauses evaluate before the escape), so the memory'd gotcha "`(raise e)` in a guard clause hangs" is exactly this. Contrast: the no-matching-clause auto-reraise is R7RS-correct (`(guard (outer (true outer)) (guard (e ((= e 1) "one")) (raise 2)))` → outer catches 2) because the sentinel re-raise happens after the call/cc return, OUTSIDE handler-bind — which is exactly how clause bodies should also run.
+- Repro (bounded CLI, all timeout exit 124): `(guard (e (true (raise e))) (raise 42))`; `(handler-bind (((fn (c) true) (fn (c) (raise c)))) (raise 1))`; same with `(raise "different")` and `(error "again")`.
+- Cross-check: reproduced on the shipped WASM browser kernel (hangs, killed at 25s) — affects production browsers, not just the server. Hosts lane J2/J9: guard-installing lambdas are interpret-only, so the JIT never masks this.
+- Coverage: test-r7rs.sx guard suite + test-conditions.sx cover happy paths only; no test raises from within a handler. test-cek-try-seq.sx "error in error handler propagates" passes because `cek-try` is a different mechanism.
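
The mechanism described above can be shown in a toy model. The sketch below is not the evaluator's code: the continuation is a plain Python list of frames, all names are hypothetical, and a step budget stands in for the real hang. The only thing it varies is whether the handler runs under a continuation that still contains its own frame.

```python
from typing import Callable, List, Optional, Tuple


class Reraise(Exception):
    """Raised by a toy handler to signal a raise of `value` from inside the handler."""
    def __init__(self, value):
        super().__init__(value)
        self.value = value


def handler_frame(fn: Callable) -> dict:
    return {"kind": "handler", "fn": fn}


def unwind(kont: List[dict], keep_matched: bool) -> Optional[Tuple[dict, List[dict]]]:
    """Find the innermost handler frame and the continuation its handler runs under."""
    for i in range(len(kont) - 1, -1, -1):
        if kont[i]["kind"] == "handler":
            # keep_matched=True models the reported behaviour: the matched frame stays installed.
            rest = kont[: i + 1] if keep_matched else kont[:i]
            return kont[i], rest
    return None


def deliver(value, kont: List[dict], keep_matched: bool, budget: int = 50):
    """Deliver a raised value; report a hang once the step budget is exhausted."""
    for _ in range(budget):
        found = unwind(kont, keep_matched)
        if found is None:
            return ("unhandled", value)
        frame, kont = found
        try:
            return ("handled", frame["fn"](value))
        except Reraise as again:
            value = again.value
    return ("hang", None)


def reraising(value):
    raise Reraise(value)


def catching(value):
    return ("caught", value)


print(deliver(42, [handler_frame(reraising)], keep_matched=True))
print(deliver(42, [handler_frame(catching), handler_frame(reraising)], keep_matched=False))
```

With `keep_matched=True` the re-raise keeps finding the same frame, which corresponds to the `(guard (e (true (raise e))) (raise 42))` repro. With `keep_matched=False` the same re-raise reaches the outer handler, or becomes an unhandled error when there is none.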
+
+### [critical] [CONFIRMED] signal-return frame key mismatch drops the caller's continuation — continuable signal/raise-continuable returns the handler value as the WHOLE program's result; the covering test passes vacuously
+- Location: spec/evaluator.sx, `make-signal-return-frame` (line 182, stores saved kont under `:f`) vs `signal-return` case of step-continue (~4509-4512, reads `(get frame "saved-kont")`); mirrored in hosts/ocaml/lib/sx_runtime.ml:210-231 (CekFrame get has no `"saved-kont"` mapping → Nil)
+- What: the resume kont is always nil, so after the handler returns, its value becomes the terminal value of the entire CEK run — every frame outside the signal site (arithmetic, enclosing lists, asserts) is silently discarded.
+- Repro: `(list "outer" (handler-bind (((fn (c) true) (fn (c) 42))) (+ 1 (signal-condition 5))) "end")` → `42`; expected `("outer" 43 "end")`. Same with `raise-continuable` → `42`. The shipped test expr `(handler-bind (((fn (c) true) (fn (c) (* c 10)))) (+ 1 (signal-condition 5)))` → `50` on both CLI and harness, yet the test asserting `51` PASSES under run_tests.exe — the dropped continuation includes the `assert-equal` frame itself, so the assertion never executes (vacuous pass).
+- Cross-check: reproduced byte-identically (`42`) on the shipped WASM browser kernel.
+- Coverage: test-conditions.sx "signal returns handler value to call site" — passing vacuously; the bug defeats its own test.
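
The key mismatch is easy to model outside the evaluator. In the sketch below the frame is a plain dict and the saved continuation is a list of one-argument functions; every name is hypothetical. The numbers mirror the shipped test expression, where the handler returns `(* c 10)` for `c = 5` and the call site is `(+ 1 ...)`.

```python
from typing import Callable, List, Optional


def make_signal_return_frame(saved_kont: List[Callable]) -> dict:
    # The writer stores the continuation under the key "f".
    return {"type": "signal-return", "f": saved_kont}


def saved_kont_mismatched(frame: dict) -> Optional[List[Callable]]:
    # The reader asks for a key that was never written, so it always gets None.
    return frame.get("saved-kont")


def saved_kont_matched(frame: dict) -> Optional[List[Callable]]:
    return frame.get("f")


def finish(value, kont: Optional[List[Callable]]):
    """Run the remaining frames; a missing continuation makes `value` the final result."""
    for step in kont or []:
        value = step(value)
    return value


frame = make_signal_return_frame([lambda v: 1 + v])  # the pending (+ 1 _) frame
handler_value = 5 * 10                               # what the handler returns

print(finish(handler_value, saved_kont_mismatched(frame)))
print(finish(handler_value, saved_kont_matched(frame)))
```

A round-trip check on the constructor and accessor pair (write a frame, read the continuation back, assert it is not nil) would fail on this bug without depending on an assertion that lives inside the dropped continuation.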
+
+### [critical] [CONFIRMED] Invoking a shift-captured continuation uses a nested cek-run — escaping across that boundary re-executes the outer program tail (double execution, duplicated side effects); raising inside a resumed k can't reach outer handlers
+- Location: spec/evaluator.sx, `continue-with-call`, `continuation?` branch (~4708-4716): `(let ((result (cek-run (make-cek-value arg env captured)))) (make-cek-value result env kont))`
+- What: the nested run's kont ends at the captured frames; (a) a call/cc escape invoked inside the resumed extent rewrites the kont *inside the nested run*, which then runs the rest of the program to completion, returns that as the value of `(k arg)`, and the outer run executes the program tail again; (b) handler frames in the outer kont are invisible to `kont-unwind-to-handler` inside the nested run.
+- Repro (a): `(do (define log (list)) (define r (call/cc (fn (esc) (reset (do (shift k (do (k 1) (set! log (append log (list "after-k"))) 99)) (esc "escaped") "unreached"))))) (set! log (append log (list (str "r=" r)))) (list r log))` → actual `(99 ("r=escaped" "after-k" "r=99"))` (tail executed twice); expected `("escaped" ("r=escaped"))`.
+- Repro (b): `(guard (e (true (list "caught" e))) (reset (do (shift k (k 1)) (raise "boom"))))` → `Unhandled exception: "boom"`; expected `("caught" "boom")`.
+- Cross-check: repro (a) reproduced with the identical wrong trace `(99 ("r=escaped" "after-k" "r=99"))` on the shipped WASM browser kernel. Note: hosts J1/J2 are two FURTHER independent double-side-effect mechanisms (JIT `->` miscompile; JIT-fallback re-run) — three distinct fixes needed for "side effects ran twice" reports.
+- Coverage: not covered (test-cek-advanced.sx shift/reset tests never cross the boundary with call/cc or raise).
+
+### [high] [CONFIRMED] Caller's immediate frame leaks into interpreted lambda calls — partial dynamic scoping, and the JIT disagrees
+- Location: spec/evaluator.sx, `continue-with-call` (`(local (env-merge (lambda-closure f) env))` ~4739; same in `call-lambda` ~896) + `env_merge` in hosts/ocaml/lib/sx_types.ml:390
+- What: when the call-site env is NOT a descendant of the lambda's closure env, `env_merge` copies the caller's **top frame** bindings into the lambda's local env. Free variables in the body resolve to the caller's locals — a lexical-scoping violation. Depth-1 only (a binding one frame deeper does not leak). The JIT path disagrees: a VM-compiled body raises "VM undefined" for the same program — behavior flips depending on whether the body got JIT-compiled.
+- Repro: `(do (define mg (fn () (fn () (guard (e (true e)) leakedz)))) (define gz (mg)) (let ((leakedz 66)) (gz)))` → **66** (guard forces interpretation); expected undefined-symbol error. Without guard → `"VM undefined: leaked"` (JIT). Depth-2 variant → Undefined symbol.
+- Coverage: not covered — test-scope.sx "environment-isolation" tests only the lambda→caller direction.
+
+### [high] [CONFIRMED] letrec injects its bindings into foreign lambdas' closure envs — permanent global contamination
+- Location: spec/evaluator.sx, `sf-letrec` (~1370: `(env-bind! (lambda-closure val) n (env-get local n))`)
+- What: after evaluating inits, letrec binds ALL letrec names into the closure env of every lambda **value**. `make_lambda` stores the defining env directly, so a letrec whose value is a pre-existing (e.g. top-level) lambda writes the letrec names into that lambda's closure — the **global env** — permanently.
+- Repro: `(do (define idf (fn (x) x)) (letrec ((zzq idf) (zzn 55)) nil) zzn)` → **55**; expected "Undefined symbol: zzn". (This leak also polluted the shared MCP harness image across calls during verification.)
+- Coverage: not covered — test-scope.sx "letrec-edge" only binds lambdas created inside the letrec (extra binds are no-ops there).
+
+### [high] [CONFIRMED] Named let leaks its loop name into the enclosing env frame and clobbers same-name bindings
+- Location: spec/evaluator.sx, `sf-named-let` (~1035: `(env-bind! (lambda-closure loop-fn) loop-name loop-fn)`)
+- What: `lambda-closure loop-fn` IS the enclosing env (no fresh frame), so the loop name is bound into the surrounding scope: visible after the form, and it clobbers (not shadows) an existing binding of the same name.
+- Repro: `(do (let lp ((i 0)) i) (lambda? lp))` → **true** (expected unbound). `(let ((lp2 5)) (let lp2 ((i 0)) i) lp2)` → loop lambda in interpreter, **nil** under JIT — never the expected 5.
+- Coverage: not covered — test-named-let-sx locks set!-accumulator patterns only.
+
+### [high] [CONFIRMED×3] ~60 special-form/HO names are silently unshadowable — define/let/defmacro accepted, call-position dispatch ignores them
+- Location: spec/evaluator.sx `step-eval-list` (~1801-1958) — head-name `match` runs before any env lookup; only the `_` fallthrough (custom special forms, ~1959) checks `(not (env-has? env name))`
+- What: list-head dispatch checks built-in special/HO forms BEFORE env lookup. `(define bind (fn (a b) "mine"))` succeeds (`type-of` says lambda) but `(bind 1 2)` → `1` (special form runs). `(define map ...)`, `(let ((map ...)) ...)`, `(defmacro if ...)`, `(defmacro map ...)` — all silently ignored in call position while honored in value position. Regular primitives ARE properly shadowable (`(define get ...)`, `(define inc ...)` → user def wins) — only this name set is hijacked, and custom special forms DO respect user bindings, making built-ins doubly inconsistent.
+- Unshadowable names (extracted from dispatch): `if when cond case and or let let* lambda fn define defcomp defisland defmacro defio define-foreign io begin do guard quote quasiquote -> ->> |> as-> set! letrec reset shift deref scope provide peek provide! context bind emit! emitted handler-bind restart-case signal-condition invoke-restart match let-match dynamic-wind map map-indexed filter reduce some every? for-each raise raise-continuable call/cc call-with-current-continuation perform define-library import define-record-type define-protocol implement parameterize syntax-rules define-syntax`. Collision-prone short ones: `map filter reduce some bind match peek context deref guard io do case`.
+- Repro: `(do (define map (fn (f xs) "mine")) (map (fn (x) (* x 10)) (list 1 2)))` → `(10 20)`; `(let ((map (fn (a b) 42))) (map 1 2))` → `Error: rest: 1 list arg`; `(let ((-> (fn (a b) 99))) (-> 1 2))` → `Not callable: nil`; `(do (defmacro if (a b c) 99) (if true 1 2))` → 1.
+- Coverage: not covered anywhere. Memory gotcha "bind/conj/disj shadowed" confirmed for `bind`; `conj`/`disj` aren't core primitives (guest-worktree lore).
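
The two dispatch orders contrasted in this finding can be written side by side. The sketch is a toy, not the `step-eval-list` code: `BUILTIN_FORMS` is a three-name stand-in for the full list, `env` holds user bindings only, and the function names are hypothetical. Whether built-in forms should be shadowable at all is a ruling, not something this sketch decides.

```python
BUILTIN_FORMS = {"if", "map", "bind"}  # tiny stand-in for the ~60 names


def dispatch_name_first(head: str, env: dict) -> str:
    """Order described in the finding: the built-in table wins before env is consulted."""
    if head in BUILTIN_FORMS:
        return "builtin"
    return "user" if head in env else "undefined"


def dispatch_env_aware(head: str, env: dict) -> str:
    """Order the custom-form fallthrough uses: a form applies only if env has no binding."""
    if head in BUILTIN_FORMS and head not in env:
        return "builtin"
    return "user" if head in env else "undefined"


user_env = {"map": "<user lambda>"}
print(dispatch_name_first("map", user_env))
print(dispatch_env_aware("map", user_env))
```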
+
+### [high] [CONFIRMED] cond grammar is ambiguous — an all-clauses-len-2 heuristic silently switches modes; multi-expr clause bodies are dropped or crash; flat-intent code can silently return the wrong value
+- Location: spec/evaluator.sx, `step-sf-cond` (a `scheme?` detection binding selects clause-mode vs flat-pair mode)
+- What: `cond` supports flat pairs `(cond t1 r1 ... :else d)` (the only documented syntax — eval-rules.sx:64) plus an undocumented Scheme clause mode auto-detected iff **every** arg is a 2-element list (or `(test => proc)`). All verified consequences:
+  - Single clause with multi-expr body: `(cond ((= 1 1) (set! a 1) (set! b 2)))` → nil, **neither set! runs** — silent total drop of side effects.
+  - Multi-expr body + other clauses: `(cond ((= 1 1) "a" "b") (:else "no"))` → `Not callable: true` — one len≠2 clause anywhere flips the WHOLE cond to flat mode.
+  - Silent misinterpretation: `(do (define x false) (define y true) (cond (not x) (list 1) (not y) (list 2)))` → `false` (clause-mode reads `(not x)` as test=`not`, result=`x`); flat reading gives `(1)`. Wrong answer, no error.
+  - Test-only clause `(cond (5))` → nil (Scheme: 5); poisons detection: `(cond (true "t") (5))` → `Not callable: nil`.
+  - Trailing odd flat arg silently ignored, never evaluated: `(cond (set! a 99))` leaves a unchanged.
+- Coverage: flat tested (test-eval.sx:306-312); clause mode only via cond-arrow suite (test-r7rs.sx:135-145). Ambiguity/multi-expr/test-only uncovered; clause mode entirely undocumented.
+
+### [high] [CONFIRMED] `(unquote-splicing x)` longhand silently no-splices; only `splice-unquote` is recognized
+- Location: spec/evaluator.sx, `qq-expand` (checks `(symbol-name (first item)) = "splice-unquote"` only)
+- What: `,@` sugar parses to `splice-unquote` and works; the R7RS-standard longhand `unquote-splicing` fails dispatch, is recursed into as an ordinary list, and is emitted literally — silent zero-splice. `(unquote x)` longhand works.
+- Repro: `(quasiquote (a (unquote-splicing xs)))` → `(a (unquote-splicing xs))`; `` `(a ,@xs) `` and `(splice-unquote xs)` → `(a 1 2)`.
+- Coverage: not covered — worse, test-macros.sx tests are NAMED "unquote-splicing …" (lines 43-63) while all using `,@` sugar, actively reinforcing the trap. (Confirms memory gotcha; root cause now located.)
+
+### [high] [CONFIRMED] dynamic-wind before-thunks never re-run on continuation re-entry; global length-based winder stack corrupts across sibling wind contexts (afters skipped/duplicated)
+- Location: spec/evaluator.sx, `continue-with-call` callcc-continuation branch (~4702-4707: `(do (wind-escape-to w-len) ...)`), `wind-escape-to` (261-271)
+- What: invoking a captured continuation only pops after-thunks down to the captured *length* of the global `*winders*` stack. No common-ancestor computation, no before-thunks on entry (R7RS requires before/after along the path between extents). Lengths from unrelated wind contexts collide: resuming a k captured inside wind A while inside wind B (equal depth) unwinds nothing, then A's `wind-after` frame pops B's winder.
+- Repro 1 (re-entry): capture k inside wind, escape, re-invoke → `(2 ("b" "a" "a"))`; expected `(2 ("b" "a" "b" "a"))` (before not re-run; after ran twice).
+- Repro 2 (sibling): capture in wind A, re-invoke from inside wind B → `(2 ("A-in" "A-out" "B-in" "A-out"))`; expected B-out + A-in before final A-out. B's after silently never runs (resource-leak class), A's runs twice.
+- Coverage: test-dynamic-wind.sx (8 tests): normal return, raise, one-shot escape only.
+
+### [high] [CONFIRMED×2] guard re-raise sentinel is forgeable — a body/clause legitimately returning `(list '__guard-reraise__ X)` is misinterpreted as a re-raise of X
+- Location: spec/evaluator.sx, `step-sf-guard` (~1693-1767): sentinel `(make-symbol "__guard-reraise__")`, detected by structural `=` on any 2-element list escaping the guard
+- Repro: `(guard (e (true (list (quote __guard-reraise__) 42))) (raise 1))` → `Unhandled exception: 42`; `(guard (e (true "handled")) (list (quote __guard-reraise__) 7))` → `Unhandled exception: 7` — the guard *body's* return value converted into a raise. Should be an unforgeable/gensym'd token. (Severity judged high by one reviewer, low by another — data-dependent conversion of values into raises; rank accordingly.)
+- Coverage: not covered.
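+
+The fix direction named above (an unforgeable token) can be modelled with a private marker type that is checked by type rather than by structural equality. This is a hedged Python sketch, not the `step-sf-guard` implementation; `_Reraise`, `SxRaise`, and `run_guard` are hypothetical names.
+
+```python
+class _Reraise:
+    """Private marker: only the guard machinery constructs one."""
+
+    __slots__ = ("payload",)
+
+    def __init__(self, payload):
+        self.payload = payload
+
+
+class SxRaise(Exception):
+    """Hypothetical stand-in for a guest-level raise."""
+
+    def __init__(self, payload):
+        super().__init__(payload)
+        self.payload = payload
+
+
+def run_guard(body, clauses):
+    """Run `body`; on SxRaise try each (test, handler) clause in order.
+
+    Values returned by the body or a handler are never compared
+    structurally, so user data cannot be mistaken for the marker.
+    """
+    try:
+        return body()
+    except SxRaise as exc:
+        outcome = _Reraise(exc.payload)  # default: no clause matched
+        for test, handler in clauses:
+            if test(exc.payload):
+                outcome = handler(exc.payload)
+                break
+    if isinstance(outcome, _Reraise):
+        raise SxRaise(outcome.payload)
+    return outcome
+
+
+def raise_one():
+    raise SxRaise(1)
+
+
+forged = ["__guard-reraise__", 42]
+print(run_guard(raise_one, [(lambda e: True, lambda e: forged)]))
+print(run_guard(lambda: ["__guard-reraise__", 7], [(lambda e: True, lambda e: "handled")]))
+```
+
+Both calls return the list unchanged instead of converting it into a raise, which is the behaviour the two repros expect.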
+
+### [high] [CONFIRMED] `->`/`->>` non-HO steps run in a nested CEK with empty kont — guard and IO suspension broken through threading
+- Location: spec/evaluator.sx, `thread-insert-arg`/`thread-insert-arg-last` (72–89) call `eval-expr` (4828: `cek-run` with kont `(list)`); "thread" frame handler (~4074) stays CEK-native only for `ho-form-name?` heads
+- What: a threaded non-HO step evaluates in a fresh machine that can't see outer guard frames and can't suspend. (a) `raise` inside a threaded call escapes an enclosing `guard`; (b) IO/effects inside a threaded step hard-crash instead of suspending. The HO path is CEK-native and correct — same expression works or fails depending on the step's head symbol. Same root pattern as the shift-k critical.
+- Repro: `(define boom (fn (x) (raise "T"))) (guard (e (else "caught")) (-> 1 boom))` → `Unhandled exception: "T"` (map version → caught). `(-> {:op "noop"} (perform))` → `Error: Sx_vm.VmSuspended(_,_)` (map version suspends/resumes fine).
+- Coverage: not covered
+
+### [high] [CONFIRMED] 2-arg `(reduce f coll)` silently returns the collection unchanged
+- Location: spec/evaluator.sx, `ho-setup-dispatch` "reduce" branch (~3671) + `ho-swap-args` (~3557)
+- What: fn-first 2-arg reduce makes `init` the collection and `coll` nil → returns init. Expected: fold with first element as init (Scheme/Clojure) or arity error. Asymmetrically, data-first `(reduce coll f)` DOES fold — with nil init (works only via nil-coercion in `+`/`str`).
+- Repro: `(reduce + (list 1 2 3))` → `(1 2 3)` (expected 6 or error); `(reduce (list 1 2 3) +)` → `6`.
+- Coverage: not covered (tests only use 3-arg forms)
+
+### [high] [CONFIRMED] ho-swap-args misreads `(reduce init f coll)` — breaks `(-> init (reduce f coll))`
+- Location: spec/evaluator.sx, `ho-swap-args` reduce branch: `(list b (nth evaled 2) a)`
+- What: with non-callable arg0, `(reduce init f coll)` treats arg0 as coll and arg2 as init — the threaded scalar seed becomes the "collection" → cryptic host error. The thread handler inserts the threaded value FIRST for HO forms, so any `->` reduce with a scalar seed hits this.
+- Repro: `(-> 0 (reduce + (list 1 2 3)))` → `Error: rest: 1 list arg` (expected 6); same for `(reduce 0 + (list 1 2 3))`.
+- Coverage: not covered — thread-ho suite only tests `(-> coll (reduce + 0))` (test-cek-advanced.sx:673)
+
+### [high] [CONFIRMED] Data-first ho-swap-args silently drops all args beyond the second
+- Location: spec/evaluator.sx, `ho-swap-args` non-reduce branch: `(list b a)`
+- What: when arg0 is data and arg1 is callable, everything after arg1 is discarded, so a data-first multi-collection map silently maps over only the first collection. Because lambda arity is not enforced, the result is garbage rather than an error.
+- Repro: `(map (list 1 2) (fn (x) (* x 10)) (list 3 4))` → `(10 20)`; `(map (list 1 2) (fn (x y) (+ x y)) (list 30 40))` → `(1 2)` (y → nil, `(+ 1 nil)` = 1).
+- Coverage: not covered
+
+### [high] [CONFIRMED] Infinite recursive component hangs the renderer — no depth guard
+- Location: web/adapter-html.sx `render-html-component`/`render-list-to-html`; spec/render.sx has no recursion bound
+- What: a self-referencing component with no base case (or data-driven cycle) recurses forever — one render pins the server thread indefinitely. No depth limit or cycle detection.
+- Repro: `(do (defcomp ~loop () (div (~loop))) (render-to-html '(~loop) (current-env)))` → never returns (killed at 20s). Bounded `(~nest :n 3)` renders fine.
+- Coverage: not covered (needs a depth limit + test)
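+
+A depth limit of the kind the coverage note asks for could look roughly like the following. This is a toy Python renderer over tuples, written only to show where the counter lives; `MAX_COMPONENT_DEPTH`, `RenderDepthError`, and the tuple encoding are assumptions, not part of web/adapter-html.sx.
+
+```python
+MAX_COMPONENT_DEPTH = 64  # hypothetical limit, not taken from the source
+
+
+class RenderDepthError(RuntimeError):
+    pass
+
+
+def render(node, components, depth=0):
+    """Render a tiny tuple-based tree; `components` maps names to thunks."""
+    if isinstance(node, str):
+        return node
+    tag, *children = node
+    if tag in components:
+        if depth >= MAX_COMPONENT_DEPTH:
+            raise RenderDepthError(
+                f"component nesting exceeds {MAX_COMPONENT_DEPTH} at {tag}"
+            )
+        # Only component expansion counts toward the limit.
+        return render(components[tag](), components, depth + 1)
+    inner = "".join(render(child, components, depth) for child in children)
+    return f"<{tag}>{inner}</{tag}>"
+
+
+components = {"~loop": lambda: ("div", ("~loop",))}
+try:
+    render(("~loop",), components)
+except RenderDepthError as err:
+    print("stopped:", err)
+```
+
+Counting only component expansions keeps deeply nested plain markup legal while still bounding the self-referencing case; a data-driven cycle is caught by the same counter.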
+
+### [high] [CONFIRMED] append! silently no-ops on all derived lists
+- Location: spec/primitives.sx `append!` (+ OCaml impl)
+- What: `append!` mutates only literal `(list ...)` cells. Lists produced by `map`, `filter`, `rest`, `reverse`, `append` are silently unappendable — no error, mutation lost. `append!` returns the appended *value*, masking the failure.
+- Repro: `(let ((xs (map (fn (x) x) (list 1 2)))) (append! xs 3) xs)` → `(1 2)`; literal list → `(1 2 3)`.
+- Coverage: test-primitives.sx:339 uses `append!` only on a literal-list accumulator.
+
+### [high] [CONFIRMED] expt silently wraps at 63-bit int; inconsistent with +/* which promote to float
+- Location: spec/primitives.sx `expt`
+- Repro: `(expt 2 62)` → `-4611686018427387904`; `(expt 2 100)` → `0`; but `(* 4611686018427387904 4)` → float and `(+ 9223372036854775807 1)` → float. `(expt 2.0 100)` correct.
+- Coverage: test-math.sx:66-71 — overflow not covered.
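+
+To make `expt` consistent with `+`/`*`, one option is to compute the exact result and promote to float once it leaves the native integer range. The Python sketch below models that policy only; the 63-bit bounds are an assumption inferred from the wrap observed at `(expt 2 62)`.
+
+```python
+import math
+
+INT_MAX = 2**62 - 1  # assumed 63-bit signed range
+INT_MIN = -(2**62)
+
+
+def expt(base, exponent):
+    """Integer power that promotes to float instead of wrapping."""
+    if isinstance(base, int) and isinstance(exponent, int) and exponent >= 0:
+        exact = base**exponent  # Python ints are arbitrary precision
+        if INT_MIN <= exact <= INT_MAX:
+            return exact
+        try:
+            return float(exact)
+        except OverflowError:
+            return math.inf if exact > 0 else -math.inf
+    return float(base) ** exponent
+
+
+print(expt(2, 10), expt(2, 62), expt(2, 100))
+```
+
+Here `(expt 2 10)` stays an integer while the 2^62 and 2^100 cases come back as floats. A host without big integers would need an overflow-checked multiply loop instead of the exact power used in this model.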
+
+### [high] [CONFIRMED] MCP harness primitive table diverges from real runtime — invalidates harness-based verification
+- Location: hosts/ocaml/bin/mcp_tree.ml (own primitive table, e.g. `bind "contains?"` L484, `bind "split"` L563) vs hosts/ocaml/lib/sx_primitives.ml (sx_server)
+- What: sx_harness_eval runs a *parallel implementation* of many primitives. CLAUDE.md mandates harness verification, so this drift silently produces false findings and false passes. Divergences (harness result → runtime result):
+  - `(empty? "")` / `(empty? {})`: false → **true** (test-primitives.sx:89 asserts true, so the harness contradicts a passing test)
+  - `(get {:a 1} :a 99)`: **nil even for a present key** → 1
+  - `(get {:a 1} :zz 99)`: nil → 99
+  - `(get (list 10 20) 1)`: nil → 20
+  - `(split "a--b" "--")`: char-class split → substring split
+  - `(split "abc" "")`: crash → `("a" "b" "c")`
+  - `equal?`: undefined → defined
+  - `(contains? {:a 1} :a)`: true → **error**
+  - `(keyword-name :kw)`: `""` → error
+- Coverage: nothing tests harness/runtime parity. (Cross-lane: host tooling — see handoffs — but it's the spec-mandated verification path.)
+
+### [high] [CONFIRMED] contains? does not support dicts in the real runtime, contradicting its spec doc
+- Location: spec/primitives.sx `contains?` (":doc … Dicts: key check"); sx_primitives.ml
+- Repro: `(contains? {:a 1} :a)` → `Unhandled exception: "contains?: 2 args"` (misleading arity error); lists/strings work.
+- Coverage: list membership only (run_tests.ml:1255); no dict case.
+
+### [high] [CONFIRMED] canonical.sx depends on test-runner-only helpers — content addressing fails on ANY number outside run_tests
+- Location: spec/canonical.sx, `canonical-number` (46-59) calls `contains-char?` (defined only in run_tests.ml:728 / run_tests.js:85) and `trim-right` (run_tests.js:87 only — not even OCaml run_tests). Neither exists in sx_primitives.ml, sx_server.ml, or mcp_tree.ml.
+- What: `canonical-serialize`/`content-id` on the production server errors on any number. In the OCaml test runner the trim-right branch (floats with trailing zeros) is unreachable-but-passing because tests only canonicalise integers.
+- Repro: fresh sx_server: `(load "spec/canonical.sx")` `(canonical-serialize 42)` → `Undefined symbol: contains-char?`; with a shim, `(canonical-serialize 0.1)` → `Undefined symbol: trim-right`.
+- Coverage: test-canonical.sx covers ints/dict-sorting/CIDs — never a non-`.0` float; failure mode invisible to all suites.
+
+### [high] [CONFIRMED] Serializer emits dict keys unescaped — non-identifier keys produce unparseable/wrong output; canonical form not a fixed point (CID hazard)
+- Location: spec/parser.sx `sx-serialize-dict` (emits `(str ":" key)`); spec/canonical.sx `canonical-dict` (~79, same pattern)
+- What: dict keys are strings; both serializers print `:` + raw key. Keys with spaces/parens/non-ident chars produce output that reparses differently or errors. Since `canonical-serialize` feeds sha3-256, CIDs exist for values whose canonical form violates `canonical(parse(canonical(x))) = canonical(x)`. The native reader accepts string keys `{"a b" 1}`, so such dicts are creatable from plain source.
+- Repro: dict with key `"hello world"` → `"{:hello world 2 :k 1}"` → reparse errors; `{(+ 1 2) 5}` → key `"(+ 1 2)"` → serializes `{:(+ 1 2) 5}` → garbage.
+- Coverage: "serialize dict round-trips" uses keyword-shaped keys only.
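+
+One way to restore the fixed-point property is to emit the bare `:key` form only for keys that are certain to re-read identically, and fall back to a quoted string key otherwise (the finding notes the native reader accepts `{"a b" 1}`). The Python sketch below illustrates that rule; the identifier pattern is a conservative assumption and the real SX identifier grammar may differ.
+
+```python
+import re
+
+# Assumption: a conservative "keyword-shaped" pattern, not the SX grammar.
+_KEYWORD_SHAPED = re.compile(r"[A-Za-z_][A-Za-z0-9_\-?!*]*\Z")
+
+
+def quote_string(text):
+    """Emit a double-quoted literal with backslash and quote escaped."""
+    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'
+
+
+def serialize_key(key):
+    # Bare :key only when it is guaranteed to re-read as the same key.
+    return ":" + key if _KEYWORD_SHAPED.match(key) else quote_string(key)
+
+
+def serialize_dict(d):
+    """Serialize a str->int dict with sorted keys (values kept trivial here)."""
+    parts = []
+    for key in sorted(d):
+        parts.append(serialize_key(key))
+        parts.append(str(d[key]))
+    return "{" + " ".join(parts) + "}"
+
+
+print(serialize_dict({"k": 1, "hello world": 2}))
+print(serialize_dict({"(+ 1 2)": 5}))
+```
+
+This prints `{"hello world" 2 :k 1}` and `{"(+ 1 2)" 5}`. Any change of this kind alters the canonical form for affected dicts, so existing CIDs for such values would need a migration decision.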
+
+### [high] [CONFIRMED] Same source parses to different ASTs across the four ident/number classifier variants
+- Location: hosts/ocaml/lib/sx_parser.ml:36-46 (native), hosts/ocaml/bin/sx_server.ml:1330-1348, hosts/ocaml/bin/mcp_tree.ml:391-410, hosts/javascript/platform.py:2622-2626 — four different ident-start/ident-char tables feeding the one spec grammar
+- What (all verified live):
+  - `(a,b)`: single symbol on native/mcp/JS, but `(a (unquote b))` on sx_server guest.
+  - Unicode idents: accepted by mcp guest only (forbidden by the production reader).
+  - `$x`/`|y|`: symbols on sx_server guest only.
+  - `0x10`/`0b101`/`1_000`: numbers 16/5/1000 on native (undocumented C-style acceptance) vs number 0 + symbol `x10` on guest/JS (silent token split).
+  - `inf`/`nan`/`-inf`: float literals on native (so they can't be variable names) vs symbols on guest/JS.
+  - `1+`/`1abc`: single symbols on native vs a silent split into `1` + symbol on guest/JS (`(1+ 2)` → 3-element list).
+  - `#t`/`#f`: booleans on native vs `Undefined symbol: reader-macro-get` on OCaml guest vs "Unknown reader macro" on JS.
+  - `{1 2}`: rejected on native vs silently stringified key `"1"` on guest/JS.
+- Coverage: none of these tokens appear in any test file — suites exercise only the intersection.
+
+### [high] [CONFIRMED] `1e` bare-exponent numbers silently parse to nil in the guest parser
+- Location: spec/parser.sx `read-number` → `parse-number` fallthrough (nil emitted as a value, no error)
+- Repro: `(sx-parse "1e")` → `(nil)`; JS `parseAll('1e')` → `[null]`; native reader yields `Symbol "1e"` — a third behavior. `(foo 1e)` becomes `(foo nil)` silently.
+- Coverage: only valid exponent forms tested.
+
+### [high] [CONFIRMED] Guest parser cannot produce rationals on server/tooling hosts — `1/2` throws
+- Location: spec/parser.sx `read-number` (215-231); sx_server.ml:1325 and mcp_tree.ml:384 override `parse-number` to always return float, shadowing the Integer-aware sx_primitives version; `make-rational` rejects (Number,Number)
+- Repro: fresh sx_server: `(load "spec/parser.sx")` `(sx-parse "1/2")` → `make-rational: expected 2 integers`. Works only in run_tests env. Native reader parses `1/2` fine.
+- Coverage: test-rationals.sx (62 tests) never uses `sx-parse`; test-parser.sx has zero rational tests.
+
+### [high] [CONFIRMED] Strict mode: HO-form callbacks bypass type checks entirely
+- Location: spec/evaluator.sx `step-continue` — map/filter/reduce/for-each/some/every/multi-map frames call `continue-with-call` directly; only the "arg" frame runs `strict-check-args` (enforcement site 4152-4194). Same in sx_ref.ml:1009.
+- What: with strict on and types declared for `f`, `(f "a")` errors but `(map f coll)`/`(filter f coll)`/`(reduce f init coll)`/`(for-each f coll)`/`(every? f coll)`/`(some f coll)` silently pass mistyped elements. Also unchecked: cond `=>` arrow calls, call/cc continuation invocation, exception-handler invocation, signal-subscriber cek-calls.
+- Repro: `hh` typed `(x number)`: `(hh "abc")` → type error; `(map hh (list "a" "b"))` → `("a" "b")` silently.
+- Coverage: test-strict.sx checks direct calls only.
+
+### [high] [CONFIRMED] Strict mode: `apply` bypasses type checks on the target function
+- Location: hosts/ocaml/lib/sx_primitives.ml:1534 / sx_server.ml:1240 — native prim spreads args and calls directly
+- Repro: `(apply hh (list "a"))` → `"a"` (no error); direct `(hh "a")` errors.
+- Coverage: not covered.
+
+### [high] [CONFIRMED] `dispose-computed` is a no-op — computed signals leak subscriptions after disposal
+- Location: spec/signals.sx, `dispose-computed` — `(signal-remove-sub! dep nil)` passes **nil** as the subscriber; the actual `recompute` closure is trapped in `computed`'s letrec and unreachable. The island-scope disposer registered by `computed` is therefore broken (contrast `effect`, whose dispose-fn works).
+- Repro: computed on `a2` (1 run); `(dispose-computed c2)`; `(reset! a2 5)` → runs=2, value updated. Expected: runs=1, unchanged. Subscriber leak in island teardown.
+- Coverage: no dispose-computed test exists.
+
+### [high] [CONFIRMED] Exception inside `batch` permanently wedges the reactive system
+- Location: spec/signals.sx, `batch` — increments `*batch-depth*`, runs thunk with no unwind protection; decrement skipped on throw
+- What: after any error escapes a batch thunk (even if caught outside), `*batch-depth*` stays >0 — every future `notify-subscribers` queues forever and never flushes; all reactivity dead. Related: `(import (sx signals))` copies value bindings rather than aliasing, so the top-level `*batch-depth*` reads 0 while the library-internal one is 1 (exported mutable state vars are misleading).
+- Repro: effect on `a3` (fired=1); `(guard (e (true "caught")) (batch (fn () (error "boom"))))` → caught; `(reset! a3 2)` → fired stays 1. Control test without error flushes correctly.
+- Coverage: not covered.
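+
+The missing unwind protection amounts to a try/finally around the thunk. The Python model below sketches that shape; `Reactor` and its fields are hypothetical, and whether queued notifications should be flushed or dropped after an error is a policy choice this sketch makes arbitrarily (it flushes).
+
+```python
+class Reactor:
+    """Toy model of the batch counter; names are hypothetical."""
+
+    def __init__(self):
+        self.batch_depth = 0
+        self.pending = []
+
+    def notify(self, subscriber):
+        if self.batch_depth > 0:
+            if subscriber not in self.pending:
+                self.pending.append(subscriber)  # defer until the batch ends
+        else:
+            subscriber()
+
+    def batch(self, thunk):
+        self.batch_depth += 1
+        try:
+            return thunk()
+        finally:
+            # Runs on normal return and on any exception alike.
+            self.batch_depth -= 1
+            if self.batch_depth == 0:
+                queued, self.pending = self.pending, []
+                for subscriber in queued:
+                    subscriber()
+
+
+reactor = Reactor()
+fired = []
+
+
+def failing():
+    reactor.notify(lambda: fired.append("queued"))
+    raise RuntimeError("boom")
+
+
+try:
+    reactor.batch(failing)
+except RuntimeError:
+    pass
+
+reactor.notify(lambda: fired.append("after"))
+print(reactor.batch_depth, fired)
+```
+
+This prints `0 ['queued', 'after']`: the depth returns to zero after the error, so later notifications run immediately instead of queueing forever.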
+
+### [high] [CONFIRMED — surfaced by hosts lane, verified here] emit!/emitted state accumulates across evaluator invocations — cross-request contamination on the server
+- Location: spec/evaluator.sx scope/emit frame handlers + the process-global scope stacks (hosts: sx_primitives.ml `_scope_stacks`)
+- What: `(scope (emit! :k 1) (emit! :k 2) (len (emitted :k)))` returns 2, then 4, then 6 on successive epoch-server evals — the emit accumulator for a normally-exited scope persists in process-global state and each new scope sees prior invocations' values. On the HTTP server this means one request's emitted values are visible to the next (correctness + information-leak class). Complements the provide/raise leak finding: the scope facility's global stacks are neither unwind-safe NOR invocation-scoped. (My in-eval probe showed no leak *within* one evaluation — the leak is across evaluator entries.)
+- Repro: three identical `(eval "(scope (emit! :k 1) (emit! :k 2) (len (emitted :k)))")` epochs on one fresh sx_server → `2`, `4`, `6`. JIT disabled, so not a VM bug.
+- Coverage: scope/emit!/emitted have zero tests (noted previously); cross-invocation behavior untested anywhere.
+
+### [medium] [CONFIRMED] provide's dynamic value permanently leaks on non-local exit (raise, shift)
+- Location: spec/evaluator.sx, `step-sf-provide` (:3344 `scope-push!`) + "provide" frame handler (:4293, `scope-pop!` only on normal completion); no pop during raise/guard/shift unwinding
+- What: `provide` pushes onto a global per-name stack, popped only on normal frame completion. Any non-local exit through the body skips the pop — the value stays on the global stack **forever**, and `context` prefers `scope-peek`, so all later code sees the stale value.
+- Repro: `(do (guard (e (true "caught")) (provide "kk" 42 (raise "boom"))) (context "kk"))` → **42** (expected nil). `(do (reset (provide "esc" 9 (shift k 77))) (context "esc"))` → **9**.
+- Coverage: test-unified-reactive.sx covers provide/context nesting for normal exits only.
+
+### [medium] [CONFIRMED] provide! outside any enclosing provide installs a permanent ambient global
+- Location: spec/evaluator.sx, "provide-set" frame handler (:4334-4346: pop-then-push); host `scope-pop!` on empty stack is a no-op (sx_primitives.ml:1998)
+- Repro: `(do (provide! "pk" 7) nil)` then, in a later top-level eval, `(context "pk")` → **7**.
+- Coverage: provide! tests all run inside provide scopes; bare case uncovered.
+
+### [medium] [CONFIRMED×2] set! on unbound name silently creates a binding — contradicting both spec docs — and JIT vs interpreter write different global tables (split brain)
+- Location: spec/evaluator.sx `step-sf-set!` + hosts/ocaml/lib/sx_types.ml `env_set_id` (:378 root-create fallback) vs sx_vm.ml OP_GLOBAL_SET (:606 writes `vm.globals`); contradicted docs: spec/eval-rules.sx:112 ("Error if name is not bound"), spec/special-forms.sx:141 ("must already be bound")
+- What: (a) interpreted `set!` on unbound silently creates a root binding — typo'd set! hides bugs, and directly contradicts both spec documents (test-scope.sx:196 locks the create behavior, so impl-vs-doc conflict must be resolved one way or the other). (b) inside a JIT-compiled lambda the same `set!` writes the VM's separate `vm.globals` table — visible to VM code, **invisible to interpreted code**.
+- Repro: `(set! never-defined-var 5)` → 5 (readable after). Split brain: `(do (define setter (fn () (set! q5 42))) (define reader (fn () q5)) (setter) (reader))` → **"Undefined symbol: q5"** (yet q5 reads as 42 inside setter).
+- Coverage: test-scope.sx:196 asserts creation only; visibility split uncovered.
+
+### [medium] [CONFIRMED] Quasiquote has no depth tracking — nested quasiquote evaluates inner unquotes early; `,,x` errors
+- Location: spec/evaluator.sx, `qq-expand` (no level parameter)
+- Repro: `(let ((x 7)) (quasiquote (a (quasiquote (b (unquote x))))))` → `(a (quasiquote (b 7)))` (Scheme: unquote preserved); `` `(a `(b ,,x)) `` → `Undefined symbol: unquote`.
+- Coverage: test-cek-advanced.sx:486 "nested unquote" is single-level despite its name.
+
+### [medium] [CONFIRMED] Quasiquote does not traverse dict literals — `,v` inside `{...}` stays literal
+- Location: spec/evaluator.sx, `qq-expand` (non-list templates returned as-is)
+- Repro: `(let ((v 3)) (quasiquote {:k (unquote v)}))` → `{:k (unquote v)}`. Inconsistent with dict eval rule ("values are evaluated", eval-rules.sx:40).
+- Coverage: not covered.
+
+### [medium] [CONFIRMED] guard clause bodies: multi-expr → crash; multi-expr `else` → "Undefined symbol: else"
+- Location: spec/evaluator.sx, `step-sf-guard` — clauses spliced verbatim into a generated `cond`, inheriting the cond dual-mode defect
+- Repro: `(guard (e (true 1 2)) (raise 9))` → `Not callable: nil`; `(guard (e (else 1 2 3)) (raise 9))` → `Undefined symbol: else`. R7RS requires body sequencing. `=>` receiver works.
+- Coverage: only single-expr clause bodies tested.
+
+### [medium] [CONFIRMED] defmacro/fn `&key` params silently misbind — keyword names ignored, off-by-one positional binding
+- Location: spec/evaluator.sx, macro/lambda param binding (&key pairing implemented only for components)
+- Repro: `(defmacro mk2 (&key a b) ...)`: `(mk2 :a 10 :b 20)` → a=10, b=`:b` (the keyword itself); `(mk2 :b 20 :a 10)` → a=20 despite the `:b` label. Plain `(fn (&key a b) ...)` treats `&key` as a positional param name → "expects 3 args, got 4". Accepted without error, misbehaves.
+- Coverage: not covered.
+
+### [medium] [CONFIRMED] Splicing a non-list silently wraps it; malformed splice forms pass through literally
+- Location: spec/evaluator.sx, `qq-expand`
+- Repro: `(quasiquote (a (splice-unquote 5)))` → `(a 5)` (Scheme: error); `(splice-unquote xs ys)` (arity 3) → stays literal; `(unquote a b)` silently drops b.
+- Coverage: not covered.
+
+### [medium] [CONFIRMED] `do` misparses a first form whose head is a list (IIFE) as a Scheme do-loop
+- Location: spec/evaluator.sx, step-eval-list "do" branch (~1843): dispatches to do-loop when `(list? (first (first args)))`
+- Repro: `(do ((fn (x) x) 5) 99)` → error `"first: expected list, got 5"`; expected 99.
+- Coverage: not covered.
+
+### [medium] [CONFIRMED] scope's `:value` parameter is parsed but unreadable — dead feature + dead frame type
+- Location: spec/evaluator.sx, `step-sf-scope` (:3318) / `make-scope-acc-frame` (:120); `context`/`peek` never consult scope-acc frames. Pre-CEK `sf-scope` (:1495) did `scope-push!`; the CEK rewrite dropped it. Frame type "scope" (make-scope-frame :111, handler :4279) is never pushed by any live path.
+- Repro: `(scope "v" :value 10 (list (context "v") (peek "v")))` → `(nil nil)`.
+- Coverage: scope/emit!/emitted have ZERO tests in spec/tests (doc example only, eval-rules.sx:200).
+
+### [medium] [CONFIRMED] Host-level errors are uncatchable by guard (only SX-level raise is)
+- Location: spec/evaluator.sx raise/handler machinery vs host primitive errors
+- What: errors from host primitives (`rest: 1 list arg`, `Undefined symbol`, arity errors) escape enclosing `guard` entirely; only guest `(raise ...)` unwinds to handlers. Guest code cannot write defensive wrappers around primitive misuse.
+- Repro: `(guard (e (true "caught")) (undefined-symbol-xyz))` → propagates, guard never fires.
+- Coverage: test-errors.sx/test-conditions.sx exercise guest raise only.
+
+### [medium] [CONFIRMED] `values`/`call-with-values` bound only inside the test runner — Undefined symbol on every real runtime surface; `let-values`/`define-values` unusable
+- Location: spec/evaluator.sx `values` (2093), `call-with-values` (1392), `sf-let-values` (1403), `sf-define-values` (1437); hosts/ocaml/bin/run_tests.ml:1131 (`bind "values"` — test env only)
+- Repro: `(call-with-values (fn () (values 1 2)) +)` on CLI → `Undefined symbol: call-with-values`; same expr under run_tests → PASS. test-values.sx (22 tests) overstates the shipped runtime.
+- Coverage: green only in the runner environment.
+
+### [medium] [CONFIRMED] map/filter/map-indexed are O(n²)
+- Location: spec/evaluator.sx, "map"/"filter" continue handlers (~4364, ~4397): `(append results (list value))` per element; map-indexed also recomputes `(len new-results)` each step
+- Repro: fresh sx_server: 10k → 0.58s, 20k → 2.56s, 40k → 13.6s (≈×4.7 per doubling); a 100k-element map did not finish within 120s, while `(reduce + 0 (in-range 100000))` takes 0.32s. Stack-safe; the cost is purely time.
+- Coverage: not covered (no perf tests)
+
+### [medium] [CONFIRMED] HO form names are not first-class — value position yields nil with a misleading type
+- Location: spec/evaluator.sx, symbol lookup (~1650) vs special-cased call dispatch
+- Repro: `(define f2 map) (f2 (fn (x) x) (list 1 2))` → `Not callable: nil`; yet `(type-of map)` → `"function"`.
+- Coverage: not covered
+
+### [medium] [CONFIRMED] Cryptic uncatchable errors for bad HO data: dicts, both-args-callable
+- Location: spec/evaluator.sx, `seq-to-list` `(else x)` passthrough (~3573) + `ho-setup-dispatch`
+- Repro: `(map (fn (kv) kv) {:a 1 :b 2})` → `rest: 1 list arg`; `(map (fn (x) 1) (fn (y) 2))` → same. Expected: iterate dict entries or a clear "map: cannot iterate X".
+- Coverage: not covered
+
+### [medium] [CONFIRMED] Multi-collection map rejects strings/vectors that single-collection map accepts
+- Location: spec/evaluator.sx, `ho-setup-dispatch` "map" N-coll branch skips `seq-to-list`
+- Repro: `(map + (vector 1 2) (vector 10 20))` → `first: expected list, got #(1 2)`; single-collection vector/string map works.
+- Coverage: list multi-map covered (test-r7rs.sx:110–124); strings/vectors not
+
+### [medium] [CONFIRMED] Threading a lambda literal returns a silently malformed lambda
+- Location: spec/evaluator.sx, `thread-insert-arg` — splices the value into the params position of `(fn ...)`
+- Repro: `((-> 5 (fn (y) (+ y 1))) 7)` → `Undefined symbol: y`. Should error at thread time.
+- Coverage: not covered
+
+### [medium] [CONFIRMED] Attribute names are never escaped/validated — spreading an untrusted-keyed dict injects attributes (XSS class)
+- Location: spec/render.sx, `render-attrs` (emits key raw) + `merge-spread-attrs` (copies spread-dict keys verbatim)
+- What: attribute *values* are escaped; attribute *names* are concatenated raw. Keys reach render-attrs via the spread operator, so spreading a dict built from user data yields event-handler injection.
+- Repro: `(render-attrs {"x onload=alert(1) y" "1"})` → ` x onload=alert(1) y="1"`. Values confirmed safe.
+- Coverage: not covered
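+
+A minimal sketch of name validation at the emit site, in Python for illustration only. The allow-list pattern is an assumption (the set of names SX should accept is not stated in the source), and `render_attrs` here is a stand-in, not the spec/render.sx function.
+
+```python
+import html
+import re
+
+# Assumption: a conservative allow-list for attribute names.
+_ATTR_NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9_.:\-]*\Z")
+
+
+def render_attrs(attrs):
+    """Render a dict as HTML attributes, rejecting unsafe names."""
+    out = []
+    for name, value in attrs.items():
+        if not _ATTR_NAME.match(name):
+            raise ValueError(f"invalid attribute name: {name!r}")
+        out.append(f' {name}="{html.escape(str(value), quote=True)}"')
+    return "".join(out)
+
+
+print(render_attrs({"class": "a b", "data-x": '"q"'}))
+try:
+    render_attrs({"x onload=alert(1) y": "1"})
+except ValueError as err:
+    print("rejected:", err)
+```
+
+Raising is one policy; silently dropping the offending key is another. Either way the check belongs where names are emitted, so that keys arriving through spread dicts are covered too.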
+
+### [medium] [CONFIRMED] Five void elements unrenderable — in VOID_ELEMENTS but missing from HTML_TAGS
+- Location: spec/render.sx, `VOID_ELEMENTS` vs `HTML_TAGS`
+- Repro: `area base embed param track` fall through to function-call dispatch: `(render-to-html '(base :href "x") ...)` → `Undefined symbol: base`.
+- Coverage: void suite tests br/hr/img/input/meta/link/source/col/wbr only
+
+### [medium] [CONFIRMED] aser serialises list-valued keyword args as bare unquoted lists → breaks on client re-evaluation
+- Location: web/adapter-sx.sx `aser-call`
+- Repro: `(aser '(~tags :items (list "a" "b")) env)` → `(~tags :items ("a" "b"))`; re-evaluating the wire form → `Not callable: nil`. Dicts round-trip fine; only lists break. Should emit `(quote (...))` or `(list ...)`.
+- Coverage: test-aser covers lists as children, not as kwarg values
+
+### [medium] [CONFIRMED-html / SUSPECTED-dom — independently double-confirmed] render-to-dom disagrees with render-to-html on non-boolean attrs valued true/false (hydration mismatch)
+- Location: web/adapter-dom.sx (attr cond ~357) vs spec/render.sx `render-attrs`
+- What: for attrs NOT in BOOLEAN_ATTRS, HTML mode stringifies (`data-flag="true"`, `data-off="false"`), DOM mode omits `false` and emits `true` as an empty attr. SSR HTML and hydrated DOM differ. HTML side executed; DOM side code-read (dom adapter not loadable in harness). Cross-check: hosts lane C19 found the same defect independently (same conclusion, same confidence split) — treat as confirmed pending a browser-side execution.
+- Repro: `(render-to-html '(div :data-flag true :data-off false) ...)` → `<div data-off="false" data-flag="true">`.
+- Coverage: not covered
+
+### [medium] [CONFIRMED] String primitives are byte-based; substring can produce invalid UTF-8
+- Location: `string-length`, `substring`, `upcase`/`downcase`
+- Repro: `(string-length "é")` → 2, `"👍"` → 4; `(substring "é" 0 1)` → a lone `0xC3` byte (invalid UTF-8); `(upcase "héllo")` → `"HéLLO"`. Constructors are codepoint-aware (`char-from-code 233` → `"é"`) while measurement is byte-based. Project rule "use UTF-8 chars" makes this a live hazard.
+- Coverage: no codepoint-semantics tests.
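+
+The byte/codepoint gap can be reproduced outside SX. The Python snippet below contrasts the two measurements on the same inputs as the repro; the helper names are hypothetical and only model the two semantics.
+
+```python
+def byte_length(text):
+    return len(text.encode("utf-8"))
+
+
+def codepoint_length(text):
+    return len(text)  # Python str is indexed by code point
+
+
+def byte_substring(text, start, end):
+    """Slice by byte offsets, as the finding describes; may split a character."""
+    return text.encode("utf-8")[start:end]
+
+
+def codepoint_substring(text, start, end):
+    return text[start:end]
+
+
+for sample in ("é", "👍"):
+    print(sample, byte_length(sample), codepoint_length(sample))
+
+fragment = byte_substring("é", 0, 1)
+try:
+    fragment.decode("utf-8")
+except UnicodeDecodeError:
+    print("byte slice is not valid UTF-8:", fragment)
+
+print(codepoint_substring("héllo", 0, 2))
+```
+
+The byte lengths (2 and 4) match the repro, while the codepoint length of each sample is 1. The codepoint slice of `"héllo"` yields `"hé"` intact.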
+
+### [medium] [CONFIRMED] Spec declares primitives that don't exist; runtime has primitives the spec omits
+- Location: spec/primitives.sx
+- What: `eq?` (L285), `eqv?` (L292) declared, undefined in both harness and sx_server; `into` (L722) declared — IO-bridge-only in server; `json-encode` declared plain but IO-bridge-only; `sort` exists in runtime but NOT in spec; header (L27-35) claims ~40 functions "moved to stdlib.sx" but stdlib.sx contains only `format`.
+- Repro: `(eq? 1 1)` → `Undefined symbol: eq?`; `(sort (list 3 1 2))` → `(1 2 3)`.
+- Coverage: drift untested.
+
+### [medium] [CONFIRMED] Division-by-zero inconsistency: / returns inf silently, mod/quotient leak raw OCaml exception
+- Repro: `(/ 1 0)` → `inf`; `(mod 7 0)`/`(quotient 7 0)` → unstructured host `Division_by_zero`.
+- Coverage: not covered.
+
+### [medium] [CONFIRMED] `/` doc contradicts behavior: ":returns float" but exact results snap to int
+- Repro: `(integer? (/ 6 3))` → true. `(/ 1 3)` → float, never rational despite `make-rational`.
+- Coverage: behavior covered green — the doc is wrong.
+
+### [medium] [CONFIRMED] sort takes no comparator
+- Repro: `(sort (list 3 1 2) (fn (a b) (> a b)))` → `Unhandled exception: "sort: 1 list"`. Natural ascending on numbers/strings only.
+- Coverage: not covered.
+
+### [medium] [CONFIRMED] Strict type errors are uncatchable by guard (host/spec error-channel divergence)
+- Location: sx_ref.ml `strict_check_args` (:516, raises Eval_error outside the CEK raise-eval machinery); the spec expresses it as `(error ...)` which would use the ordinary condition channel
+- What: `(guard (e (true ...)) (typed-call bad-arg))` does not catch — the type error escapes to top level, while user `(error "boom")` IS caught by the same guard. Programs cannot recover from type errors. Same channel problem as the general host-errors-uncatchable finding, but here spec and host disagree about which channel it should be.
+- Repro: `(guard (e (true (str "CAUGHT: " e))) (s1 "bad"))` → protocol-level type error; `(guard (e (true ...)) (error "boom"))` → caught.
+- Coverage: test-strict.sx asserts at the runner level; the guard channel untested.
+
+### [medium] [CONFIRMED] Strict mode: unknown type names silently match everything
+- Location: spec/evaluator.sx `value-matches-type?` — `_` fallback returns true for any unknown non-"?"-suffixed string; `set-prim-param-types!` does no validation
+- Repro: `gg` typed `(x "integer")`: `(gg "abc")` → `"abc"` (typo silently disables checking); `"frobnicate?"` matches all values.
+- Coverage: not covered.
+
+### [medium] [CONFIRMED] Strict mode: `"keyword"` type is dead; components are untypeable
+- Location: `value-matches-type?` vs eval-rules.sx keyword rule (keywords evaluate to strings)
+- What: (a) evaluated keyword args arrive as strings, so a `"keyword"`-typed param always fails on `(f :foo)` and passes plain strings via `"string"`; (b) `type-of` a component is `"component"`, which fails `"lambda"`, and `"component"` isn't a match branch — falls to the catch-all and **accepts everything**. No way to require a component.
+- Repro: `(pk :foo)` → "expected keyword … got string (foo)"; `c7` typed `"component"`: `(c7 42)` passes.
+- Coverage: no keyword/lambda/component type tests.
+
+### [medium] [CONFIRMED] Strict mode: component `&key` calls misalign with positional type specs
+- Location: strict-check-args positional indexing vs component keyword calling convention
+- Repro: `~tc` typed `(a number)`: `(~tc :n 5)` → "expected number for param a, got string (n)" — the keyword marker itself is checked as arg 0. Typing components via this machinery is impossible.
+- Coverage: not covered.
+
+### [medium] [CONFIRMED] Signals: reset!/computed change-detection is dead for numbers and strings
+- Location: spec/signals.sx `reset!`, `swap!`, `computed` — `(when (not (identical? old value)) ...)`; `identical?` is physical equality: `(identical? 5 5)` → false
+- What: setting a signal to its current value still notifies; computeds recomputing to an equal number/string still cascade — spurious re-runs throughout the reactive graph.
+- Repro: effect on `(signal 5)` (runs=1); `(reset! a7 5)` → runs=2. Expected 1.
+- Coverage: not covered.
+
+### [medium] [CONFIRMED] Signals: diamond dependency glitch — no glitch-freedom
+- Location: spec/signals.sx — notify/flush propagate depth-first synchronously; batch dedups only direct subscribers of directly-mutated signals and decrements depth before cascades
+- What: a → b,c → d: one change to `a` recomputes `d` twice; the first recompute observes new-b with stale-c (inconsistent intermediate state).
+- Repro: initial d runs=1; `(reset! a 2)` → d runs=3, final value correct.
+- Coverage: not covered.
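+
+Glitch-freedom is usually obtained by collecting every affected node first and recomputing in dependency order, each exactly once. The Python sketch below shows that on the same diamond; it assumes static dependencies known at construction, and `Node`, `height`, and `set_value` are hypothetical names rather than anything in spec/signals.sx.
+
+```python
+class Node:
+    """A source (no compute) or a computed node over `deps`."""
+
+    def __init__(self, compute=None, deps=(), value=None):
+        self.compute = compute
+        self.deps = list(deps)
+        self.subs = []
+        self.runs = 0
+        # Sources get height 0; a computed node sits above all its deps.
+        self.height = 1 + max((d.height for d in self.deps), default=-1)
+        for d in self.deps:
+            d.subs.append(self)
+        self.value = value if compute is None else self._run()
+
+    def _run(self):
+        self.runs += 1
+        return self.compute(*(d.value for d in self.deps))
+
+
+def set_value(source, value):
+    """Update a source, then recompute each affected node exactly once."""
+    source.value = value
+    dirty, stack = set(), list(source.subs)
+    while stack:
+        node = stack.pop()
+        if node not in dirty:
+            dirty.add(node)
+            stack.extend(node.subs)
+    # Height order guarantees every dep is fresh before its dependents run.
+    for node in sorted(dirty, key=lambda n: n.height):
+        node.value = node._run()
+
+
+a = Node(value=1)
+b = Node(lambda x: x + 1, [a])
+c = Node(lambda x: x * 10, [a])
+d = Node(lambda x, y: (x, y), [b, c])
+set_value(a, 2)
+print(d.runs, d.value)
+```
+
+This prints `2 (3, 20)`: one initial run plus one recompute, and `d` never observes new-b together with stale-c.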
+
+### [medium] [CONFIRMED] Datum comment `#;` cannot precede `)` or end input — all three parsers
+- Location: spec/parser.sx read-expr `#;` branch (discard-then-read-next); sx_parser.ml:167-171 same structure
+- Repro: `(sx-parse "(a #;b)")` → `Unexpected character: )`; `(sx-parse "1 2 #;3")` → `Unexpected end of input`. Standard Lisp: `(a #;b)` = `(a)`.
+- Coverage: three datum-comment tests, all mid-list.
+
+### [medium] [CONFIRMED] Char values never compare equal — `=` lacks a Char case
+- Location: hosts/ocaml/lib/sx_primitives.ml `safe_eq` (749-804): no Char,Char arm → falls to `_ -> false`
+- Repro: `(= (make-char 32) (make-char 32))` → `false`. parse(serialize(char)) ≠ char for every char; char-keyed memoization silently fails.
+- Coverage: test-chars.sx compares via char->integer/predicates; no `=`-on-chars test.
+
+### [medium] [CONFIRMED] `#\a` char literals crash the guest parser on the mcp-tree host (Int/Float primitive drift)
+- Location: mcp_tree.ml:378 (`char-code` returns float) vs sx_primitives.ml:2811 (`make-char` requires Integer)
+- Repro: mcp harness `(sx-parse "#\a")` → `make-char: expected integer codepoint`; sx_server OK. Same shadowing family as parse-number.
+- Coverage: no char-literal-via-sx-parse tests.
+
+### [medium] [CONFIRMED] Multibyte character literals broken everywhere; serialized chars ≥128 don't reparse; unknown char names silently truncate
+- Location: sx_parser.ml:153-159 (byte-level Char.code); spec/parser.sx `read-char-literal` (byte-level); serializer emits `#\` + raw char
+- Repro: native `'#\é` → `Parse_error "Unexpected char: \169"`; `(sx-serialize (make-char 233))` → `"#\é"` which no parser reads back; `#\spade` → `#\s` silently (both implementations).
+- Coverage: no non-ASCII char literals tested.
+
+### [medium] [CONFIRMED] `\uXXXX` escape: invalid input crashes raw (OCaml) or silently corrupts (JS); no astral/surrogate-pair support
+- Location: spec/parser.sx read-string `\u` branch (no hex validation, no bounds check, -1 from failed digit lookup); sx_parser.ml:70-77
+- What: valid BMP works on all three parsers (the "never use \uXXXX" project rule is style, not brokenness). Invalid hex: guest → raw `Invalid_argument` (negative codepoint); native → uncaught `Failure("int_of_string")`; JS → silently yields garbage string. Surrogates: OCaml raises, JS produces lone surrogate. Truncated `"\u41"` → guest reads past the closing quote (`Expected string, got nil`). Astral unrepresentable.
+- Coverage: zero \u tests in any suite.
+
+### [medium] [CONFIRMED] Unknown string escapes diverge: native keeps the backslash, guest/JS drop it
+- Location: sx_parser.ml:79 (`_ -> add '\\'; add esc`) vs spec/parser.sx read-string `:else esc`
+- Repro: `"a\qb"` is 4 chars through the native reader, 3 chars through guest/JS: same source file, different data depending on which parser read it. `\b`/`\f` are unsupported on both (read silently as literals); native additionally accepts undocumented `\/` and `` \` ``.
+- Coverage: only \n \t \" tested.
+
+### [medium] [CONFIRMED] `#name` extensible reader-macro dispatch is unimplemented on OCaml hosts
+- Location: spec/parser.sx:459 (`reader-macro-get`); registry exists only in hosts/javascript/platform.py:2639-2640
+- Repro: mcp harness `(sx-parse "#t")` → `Undefined symbol: reader-macro-get` (instead of the intended "Unknown reader macro" error). The sole production evaluator cannot register reader macros at all.
+- Coverage: reader-macro suite tests only `#;` `#|` `#'`.
+
+### [low-medium] [CONFIRMED] case: `:else`/`else` matches in ANY position, shadowing later valid clauses
+- Location: spec/evaluator.sx, `step-sf-case` / `is-else-clause?`
+- Repro: `(case 1 :else "e" 1 "one")` → "e".
+- Coverage: not covered.
+
+### [low-medium] [CONFIRMED] case: evaluated datums, keyword/string punning, Scheme clause syntax crashes misleadingly
+- Location: spec/evaluator.sx, `step-sf-case`; documented flat in eval-rules.sx:70 (but rule text doesn't say vals are evaluated)
+- What (verified): vals evaluated sequentially until match (side effects only for pre-match vals), scrutinee once, comparison structural `=` (lists match), duplicates first-wins, no-match+no-else → nil; keywords evaluate to strings so `(case "k" :k "kw")` matches. Scheme datum-list clauses crash: `(case "a" (("a") 1) (else 2))` → `Not callable: ("a")`. Flat form is intended (test-cek.sx:130-138); the unstated eval semantics + hostile diagnostic are the issues.
+- Coverage: happy paths only.
+
+### [low] [CONFIRMED×2] letrec is parallel (not letrec*) and reference-before-init silently yields nil
+- Location: spec/evaluator.sx, `sf-letrec` (~1366-1469: all inits evaluated before any name bound; names pre-bound nil)
+- Repro: `(letrec ((a b) (b 1)) a)` → `nil` (R7RS: error); `(letrec ((a 1) (b (+ a 1))) b)` → **1** (nil-coerced by +; letrec* would give 2). Masks initialization-order bugs.
+- Coverage: only well-formed lambda recursion tested.
+
+### [low] [CONFIRMED] Documentation contradicts implementation: let IS sequential and multi-expression bodies ARE implicit begin
+- Location: spec/evaluator.sx `step-sf-let` (:3133 — `let` and `let*` dispatch identically, shared local frame) vs CLAUDE.md "SX Island Authoring Rules" (claims parallel let, last-expr-only bodies, "reactive text needs deref computed", "effects go in inner let")
+- What: `(let ((a 1) (b a)) b)` → 1; `(let ((x 5) (x (* x 2))) x)` → 10; let/when/fn multi-expr bodies evaluate every form (side effects verified). Sequential let is explicitly tested intent (test-scope.sx:45). The CLAUDE.md gotchas describe a different evaluator (likely the OCaml SSR island path) — doc drift that misleads every SX author. Also `(let ((f (fn () a2)) (a2 5)) (f))` → 5: binding-init lambdas capture the let frame itself (letrec-like — beyond even letrec* semantics; worth documenting).
+- Coverage: sequential let tested; the doc is what's wrong.
+
+### [low] [CONFIRMED] Component &key argument `false` is coerced to nil
+- Location: spec/evaluator.sx, component branch: `(env-bind! local p (or (dict-get kwargs p) nil))`
+- Repro: `(do (defcomp ~t1 (&key flag) (if (nil? flag) "NIL" "VAL")) (~t1 :flag false))` → `"NIL"`. Components can't distinguish `:flag false` from omitted.
+- Coverage: invisible to test-defcomp.sx (only used in conditionals).
+
+### [low] [CONFIRMED] Trailing keyword argument without a value silently accepted
+- Location: spec/evaluator.sx, `parse-keyword-args` (:935)
+- Repro: `(do (defcomp ~c4 (&key a) (list a)) (~c4 :a))` → `(nil)`; expected kwarg error.
+- Coverage: not covered.
+
+### [low] [CONFIRMED] defmacro is unhygienic (classic capture) while the test suite is named "macro-hygiene"
+- Repro: ``(defmacro my-or2 (a b) `(let ((t ,a)) (if t t ,b)))``; `(let ((t 5)) (my-or2 false t))` → `false`. CL-style defmacro, judged intended (gensym available, unique, tested); but test-macros.sx "macro-hygiene" suite (line 208) tests only the leak-OUT direction, overstating the guarantee.
+
+### [low] [CONFIRMED] match has no guard clauses — Racket-style `(pattern (when cond))` silently read as a structural pattern
+- Repro: `(match 9 ((x (when (> x 5))) "big") (_ "small"))` → "small" (silent structural fail → fall through). Supported features work; non-match raises properly. `let-match` is dict-destructuring only; list patterns give a confusing "no clause matched".
+- Coverage: supported features covered; guard-clause rejection not.
+
+### [low] [CONFIRMED] Components not recognized by `ho-fn?`; map-with-component yields silent zeros
+- Location: spec/evaluator.sx, `ho-fn?` (3554) — no component check
+- Repro: `(defcomp ~c2 (x) (* x 2))`; `(map ~c2 (list 1 2 3))` → `(0 0 0)`; `(map (list 1 2 3) ~c2)` → `rest: 1 list arg`.
+- Coverage: not covered.
+
+### [low] [CONFIRMED] `|>` alias is dead code — parser rejects `|`
+- Location: spec/evaluator.sx step-eval-list `("|>" ...)` (1906); tokenizer
+- Repro: `(|> (list 1 2 3) ...)` → `Parse_error("Unexpected char: |")`. Branch unreachable.
+
+### [low] [CONFIRMED] Keywords-as-getters unsupported in HO fn position and `->` chains, with misleading errors
+- Repro: `(map :name (list {:name 1}))` → `Not callable: "name"`; `(-> {:a {:b 42}} :a :b)` → `Not callable: nil`.
+- Coverage: not covered.
+
+### [low] [CONFIRMED] Zero/one-arg HO calls return empty results silently
+- Repro: `(map)` → `()`; `(map (fn (x) x))` → `()`; `(reduce +)` → `nil`; `(-> (list 1 2 3) map)` → `()` (plausible typo silently discards data).
+- Coverage: not covered.
+
+### [low] [CONFIRMED] Boolean-attr truthiness footguns: string "false" and 0 emit the bare attribute
+- Location: spec/render.sx, `render-attrs` (SX truthiness)
+- Repro: `(input :disabled "false")` → `<input disabled />`; `(input :disabled 0)` same. Aligns with SX truthiness but surprising when values come from data.
+- Coverage: true/false booleans tested; string/number values not.
+
+### [low] [CONFIRMED] `is-render-expr?` exported but dead; `html:` tags and hyphenated custom elements error despite being "recognised"
+- Location: spec/render.sx, `is-render-expr?` — zero callers
+- Repro: `(render-to-html '(html:my-tag :foo "bar") ...)` → `Undefined symbol: html:my-tag`; `(aser '(custom-widget :foo "bar" "child") ...)` → `Undefined symbol: custom-widget`.
+- Coverage: not covered.
+
+### [low] [CONFIRMED] `<script>`/`<style>` content is HTML-escaped like text — corrupts legitimate inline JS/CSS
+- Location: web/adapter-html.sx `render-html-element`
+- Repro: `(script "if (a < b && c) { x=\"y\"; }")` → entities inside script (broken JS); `(style ".a > .b {}")` → `.a &gt; .b {}`. Blocks `</script>` breakout (good) but breaks real inline code; `raw!` is the workaround.
+- Coverage: only script attrs tested, never content.
+
+### [low] [CONFIRMED] Comparison/equality strictly binary; = is deep structural equality conflating exactness
+- Repro: `(< 1 2 3)`/`(= 1)` → unstructured arity error (matches spec, deviates from Scheme); `(= {:a 1} {:a 1})` → true; `(= 1 1.0)` → true (dedup-key hazard).
+
+### [low] [CONFIRMED] Rounding half-away-from-zero, not banker's; inexact->exact rounds; (sqrt -1) → nan
+- Repro: `(round 2.5)` → 3 (R7RS: 2); `(inexact->exact 1.5)` → 2 (locked by test-numeric-tower.sx:115 — intended but R7RS-divergent); `(sqrt -1)` → nan silently.
+
+### [low] [CONFIRMED] Float/nil rendering inconsistencies across str/format/render
+- Repro: `(str 1.0)` → `"1"` (float/int distinction lost — also `(div 1.0)` renders `1`); `(str nil)` → `""` but `(format "~a" nil)` → `"()"`; `(format "~d" 3.7)` → `"3"` (silent truncation).
+
+### [low] [CONFIRMED] Inconsistent nil/empty tolerance across list ops
+- Repro: `(first nil)` → nil, `(rest nil)` → `()`, `(nth (list 1 2) 5)` → nil silently — but `(last nil)`, `(reverse nil)`, `(nth nil 0)` all raise.
+
+### [low] [CONFIRMED] keys returns strings in reverse insertion order
+- Repro: `(keys {:a 1 :b 2 :c 3})` → `("c" "b" "a")`. Determinism footgun for serialization/content-addressing.
+
+### [low] [CONFIRMED] keyword-name unusable on evaluated keywords
+- Repro: `(keyword-name :kw)` → error (`:kw` self-evaluates to `"kw"`); only `(keyword-name ':kw)` works.
+
+### [low] [CONFIRMED] string->number: no rational/whitespace parsing
+- Repro: `"1/2"` → nil (despite make-rational), `" 5 "` → nil, `"1e3"` → 1000, garbage → nil (good).
+
+### [medium] [CONFIRMED — CORRECTED after cross-lane check] `apply` does not spread AT ALL on the native production surface
+- Location: continue-with-call native-call path / apply primitive
+- What: originally reported as "leading-args form missing, two-arg form works" — WRONG. Re-verified on fresh sx_server: `(apply + (list 1 2))` → `Unhandled exception: "Expected number, got list: "`. The list is passed as a single argument, never spread — `(apply str (list 1 2 3))` → `"(1 2 3)"` (str of the list itself). The earlier "works" observation came from a test-runner/harness environment with its own apply. Conformance lane F-3 independently found this AND that the WASM kernel spreads the 2-arg form (→ 6) while native errors — the same kernel family disagrees with itself on apply.
+- Repro: `(apply + (list 1 2))` → error; `(apply + (list 1 2 3))` → error; `(apply str (list 1 2 3))` → `"(1 2 3)"` (fresh sx_server, verified 2026-07-03).
+- Coverage: not covered on the production surface (runner env has a different apply — see the values/call-with-values finding for the same pattern).
+
+### [low] [CONFIRMED] Strict checks are name-keyed at the call site — trivially evaded, and shadowers inherit checks
+- Repro: `(let ((zz hh)) (zz "a"))` → unchecked; computed heads `((mk) "bad")` → unchecked; conversely a user fn shadowing a typed name gets the declared checks applied to it. First-class function flow is entirely unchecked.
+- Coverage: not covered.
+
+### [low] [CONFIRMED] set-prim-param-types! replaces wholesale; no validation; malformed specs fail cryptically and uncatchably
+- What: second call wipes all earlier declarations (no merge); nonexistent prim names accepted silently; `{"positional" "oops"}` errors at call time with "Expected list, got string" (uncatchable, doesn't name the spec as culprit); `{"name" "not-a-dict"}` silently checks nothing; declaring types for HO-form names never fires (HO dispatch intercepts before the arg frame).
+- Coverage: only the nil-reset path tested.
+
+### [low] [CONFIRMED] Too-few args never error and their declared types are silently skipped
+- What: user lambdas nil-fill missing params (`(f2 1)` → `(1 nil)` with b typed number, no error); strict-check-args guards `idx < len(args)` so unsupplied params skip checking. Too-many args DO error. `foreign-check-args` has the mirror asymmetry (extra args unchecked; code-level).
+- Coverage: not covered.
+
+### [low] [CONFIRMED] `(:as type)` parameter annotations are never enforced — even in strict mode
+- Location: eval-rules.sx documents `(:as type)` in the lambda rule; spec/signals.sx uses them pervasively (`(s :as signal)`)
+- Repro: `(define tf (fn ((x :as number)) x))` `(tf "not-a-number")` → returns the string, strict on or off. The natural per-param channel is decorative; strict mode reads only the global name-keyed dict.
+- Coverage: not covered.
+
+### [low] [CONFIRMED] Strict-machinery paper cuts
+- Return types unsupported anywhere (params only). Rest-arg errors index from 0 within the rest section ("rest arg 0" is overall arg 2). `set-strict!` is one global OCaml ref — not per-env, not captured by continuations; toggling mid-program retroactively affects existing lambdas. Dead shadowed duplicates `_strict_ref`/`_prim_param_types_ref` at sx_ref.ml:18-19 (transpiler cruft, no desync). Host surface inconsistency: sx_server binds set-strict!/set-prim-param-types! but not value-matches-type?; the harness binds none. Positive: error message quality is good (names function, param, expected, actual, value).
+
+### [low] [CONFIRMED] `batch` unusable on the server host; coroutines module inert outside the test runner
+- What: `batch` calls `(batch-begin!)` on non-client hosts; `batch-begin!`/`batch-end!` are bound only in run_tests.ml:564 — on sx_server `(batch ...)` → `Undefined symbol: batch-begin!` (which, per the wedge finding, also leaves `*batch-depth*` stuck). Separately, spec/coroutines.sx lacks the trailing `(import (sx coroutines))` re-export that signals.sx/harness.sx have — loading it binds nothing globally; tests work only via explicit import + run_tests-only cek-* hooks.
+- Coverage: not covered.
+
+### [low] [CONFIRMED] `effect` stale cleanup double-invocation
+- Location: spec/signals.sx `effect`/`run-effect` — cleanup-fn invoked at each re-run start but never cleared; only overwritten when a run returns a new callable
+- Repro: effect returns cleanup only when v=0: after two resets, cleanup-calls = 2. Expected 1.
+- Coverage: not covered.
+
+### [low] [CONFIRMED] Guest parse errors carry no source locations; native has line/col on only 2 of ~8 error types
+- Location: spec/parser.sx (all error sites location-free); sx_parser.ml (locations only for "Unexpected end of input"/"Unexpected char"; unterminated string/list/dict etc. location-free)
+- Repro: `(sx-parse "(a (b)")` → just `"Unterminated list"`. Also test-source-locations.sx tests a parser-combinator library, NOT spec/parser.sx, and its cols are 0-based vs native 1-based.
+- Coverage: no reader-location tests exist.
+
+### [low] [CONFIRMED] Dict literal edges: odd form count → misleading error; duplicate keys silently last-win
+- Repro: `{:a}` → `Unexpected character: }` (no mention of pairing); `{:a 1 :a 3}` → `{:a 3}` silently (both parsers).
+- Coverage: not covered.
+
+### [low] [CONFIRMED] `#|...|` is a raw string to the first `|`, not a block comment; `#|a|#` leaves a dangling `#`
+- Repro: `(sx-parse "#|hello world|")` → `("hello world")`. Documented, but a Scheme-expectation trap with no test for the `|#` suffix case.
+
+### [low] [CONFIRMED] Keyword edge tokens: `:` parses as keyword with empty name; `::a` is a keyword named ":a"
+- Coverage: numeric-suffix/consecutive keywords tested; `:`/`::` not.
+
+### [low] [CONFIRMED] Harness contract nits
+- A throwing mock leaves no IO-log entry (append happens after the mock returns) — failed calls invisible to assert-io-called. `(assert cond)` one-arg form works only via the evaluator-wide nil-fill of missing params.
+
+### [low] [CONFIRMED] CLAUDE.md points at a deleted canonical spec (`shared/sx/ref/*.sx`)
+- What: CLAUDE.md instructs reading `shared/sx/ref/eval.sx`/`parser.sx`/`primitives.sx`/`render.sx` as "authoritative SX semantics"; the directory contains only `BOUNDARY.md` + Python cache. Live spec is `spec/*.sx`. Together with the island-authoring-rules drift (let/body semantics above), the project docs actively mislead on core semantics.
+
+---
+
+## SUSPECTED findings (reasoning only, not reproduced)
+
+### [medium] [SUSPECTED] More nested-eval boundaries: `expand-macro`, `sf-let-values`, `sf-define-values`, `qq-expand` unquotes all evaluate via `(trampoline (eval-expr ...))` instead of CEK frames
+- Location: spec/evaluator.sx, expand-macro (1548-1580), sf-let-values (1411-1417), sf-define-values (1443-1445), qq-expand unquote eval
+- Reasoning: same structural pattern as the three CONFIRMED nested-run bugs (shift-k invoke, threading, signal) — continuation capture, `perform`/IO suspension, or raise-to-outer-handler inside a macro body, let-values initializer, or unquote crosses a nested trampoline the outer kont cannot see. let-values untestable at runtime (`values` missing — see medium finding); macro-expansion capture is expansion-time and rare.
+- Coverage: not covered.
+
+### [low] [SUSPECTED] env_merge is_descendant depth cap (>100) silently flips scoping semantics
+- Location: hosts/ocaml/lib/sx_types.ml:394 (`if depth > 100 then false`)
+- Reasoning: call-site env chains deeper than 100 frames false-negative the descendant check, activating the caller-frame-copy branch (the dynamic-scoping leak above) in code that was previously purely lexical. Rare (needs ~100 nested closure/let layers), silent flip. Code-read only.
+- Coverage: not covered.
+
+### [medium] [SUSPECTED] Canonical serialization is not cross-host deterministic — CIDs can differ between OCaml and JS
+- Location: spec/canonical.sx (`canonical-number` uses host `str`; string case uses host `escape-string`)
+- Reasoning + partial confirmation: OCaml `(canonical-serialize 1e-7)` → `"1e-07"` (verified live) while JS `String(1e-7)` → `"1e-7"` (code-read), so different canonical text → different sha3 CIDs for the same value. Also: sx_server escapes `\r` (sx_server.ml:1275), JS platform does not (platform.py:2628); integers beyond 2^53 are exact on OCaml but not exactly representable as JS numbers. Full cross-host CID comparison not run.
+- Coverage: test-canonical.sx never canonicalises exponent-form floats, CR strings, or big ints. (Dict-key sorting IS implemented and idempotence holds for tested classes.)
+
+### [medium] [SUSPECTED] Coroutine performing a non-yield effect is permanently wedged
+- Location: spec/coroutines.sx, `coroutine-handle-result` — for a suspension with op ≠ "coroutine-yield" it does `(perform request)`: forwards outward but **discards both the answer and the coroutine's suspension**; state stays "running" and `coroutine-resume` has no "running" branch → "unexpected state: running"
+- Reasoning: code-level; not reproducible outside run_tests (needs cek-step-loop/cek-resume hooks bound only in run_tests.ml:951-955). Correct forwarding would cek-resume the suspension with the outer answer in a loop.
+- Coverage: test-coroutines.sx (27 tests) has zero `perform` usage.
+
+### [low] [SUSPECTED] VM/JIT execution path has no strict checking
+- Location: sx_vm.ml — zero callers of `strict_check_args` (repo-wide grep: only sx_ref.ml)
+- Reasoning: any call executed as compiled bytecode bypasses checks. Could not confirm live — lazy JIT never engaged in CLI probes (bytecode-inspect after 300 calls: "no compiled bytecode").
+- Coverage: not covered.
+
+---
+
+## Checked, NOT reproducible (negative results correcting project memory)
+
+- **"Short helper names (name/dyad) hang the runtime"**: does NOT reproduce — `(define name …)`/`(define dyad …)` work. The `guard` case is the unshadowable-name finding (error, not hang).
+- **"split is char-class not substring"**: harness/guest-worktree only. Real sx_server `(split "a--b" "--")` → `("a" "b")` substring, keeps empties. Multi-char delimiter untested in spec/tests — worth a pinning test.
+- **"let is parallel / bodies evaluate only last expr / effects need inner let"** (CLAUDE.md island rules): all false for the spec evaluator — let is sequential, bodies are implicit begin (tested intent). Likely describes the separate OCaml SSR island path → doc fix + cross-lane check.
+
+## Clean areas verified
+
+**CEK core**: TCO through all special forms (named-let 200k, mutual 100k, non-tail 100k heap-safe);
+call/cc escape/multi-shot/independence; shift/reset delimiting + multi-shot composable k; shift
+without reset → clean error; escape from HO callbacks; multi-shot resume INTO map frames (no
+accumulator leakage); raise through dynamic-wind one-shot (after exactly once, 50k-frame unwind);
+`(and)`/`(or)`/`(begin)`/`(cond)`/if-no-else edge values; cond `=>`; head-position exprs;
+parameterize; restart-case/invoke-restart.
+
+**Env/scope**: closure sharing + isolation both directions; define local-in-lambda vs top-level
+redefine; set! write-through 1-2 levels; `(let ((x x)) x)` → outer; letrec mutual recursion
+(lambda case); emit!/emitted ordering/extent/nesting/TCO-survival/no-leak (correct but ZERO test
+coverage — gap worth closing given the scope/provide bugs); provide/context/peek normal-flow
+nesting (well covered); component &rest/kwarg interleaving; component set! does not write back
+to caller; primitive shadowing works for genuine primitives.
+
+**HO forms**: HoSetupFrame stages both args exactly once, left-to-right, both orders; map over
+list-of-functions picks sane reading; guest raise mid-map caught cleanly; some/every?/filter/
+for-each/map-indexed semantics sane (0/"" truthy — internally consistent); no double-eval in
+threading (quoted-value splice protects data both paths); as->; ->> normalizes via swap; nested
+map-in-map; reduce 100k in 0.3s; multi-map zips to shortest (covered).
+
+**Special forms**: when/begin/do sequencing; and/or/if falsiness fully consistent (only false/nil
+falsy); short-circuit verified; defmacro recursive expansion, &rest + `,@` templates, ~name heads
+in qq; guard happy paths incl. R7RS auto-reraise; ->/set! interplay; eval-rules.sx accurate except
+set!-error claim, cond clause mode, case evaluated-vals; `unless` intentionally userland.
+
+**Render**: text + attr-value escaping correct; raw!/SxExpr single-escape guarantee (no double-
+escape); registered void elements self-close, drop children silently; boolean-attr registry (23)
+correct for true/false/nil; numbers/booleans/nil as children; aser wire semantics (components
+unexpanded, control flow evaluated, string/dict args round-trip incl. quotes/unicode); recursive-
+with-base-case components; fragment/nil/string/number component returns; &rest spliced flat.
+
+**Primitives**: quotient/remainder/modulo signs R7RS-correct; substring clamping; replace;
+trim/index-of/starts-with?/ends-with?; assoc/dissoc/merge/has-key?; range/flatten/chunk-every;
+rationals (normalization, contagion, zero-denominator errors); vectors/sets/ports/chars/string-
+buffers basics; dict-set! vs assoc; truthiness consistent; format directives; max/min zero-arg
+errors clean. Not probed (dedicated green suites): zip-pairs, bitwise, bytevectors, regexp.
+
+**Parser/serializer**: basic escapes correct + exact round-trips (quotes/backslashes/newlines/
+multibyte strings); quote sugar nesting incl. before `)`/EOF; 10k-deep nesting + 10k-char tokens
+parse fine (heap frames, no hangs on any adversarial input — every failure errors rather than
+loops); serializer round-trips for number/keyword/symbol/list/nested-dict(ident keys)/bool/nil;
+nil vs () vs {} distinct; canonical-dict key sorting + idempotence (tested classes); -0.0 → "0";
+negative numbers vs `-` symbol; `5.`/`1e10`/`-1.5e-3`; comments at EOF; dotted pairs cleanly
+rejected on all hosts; keyword AST round-trip.
+
+**Strict typing**: value-matches-type? core semantics correct (number/string/boolean/nil/list/
+dict; empty list not a dict; nullability exclusively via "type?" suffix — consistent; floats+ints
+both "number"; quoted symbols; lambdas). ->/->> threading IS strict-checked (re-dispatches a real
+call form). Recovery after a strict error works. Error messages high quality.
+
+**Signals**: effect does not re-run on unrelated signals; effect's dispose-fn unsubscribes
+correctly; batch dedups multiple resets of one signal (when it works — see wedge finding).
+
+**Harness (spec/harness.sx)**: interceptors log args/result/op correctly; arity fan-out 0-3 +
+apply; custom-platform merge over defaults; assertion messages descriptive.
+
+---
+
+## Handoffs to other lanes
+
+- **HOSTS**: hosts/ocaml/bin/mcp_tree.ml maintains its own primitive table, drifted from
+  sx_primitives.ml (empty?/get/split/contains?/equal?/keyword-name differ — details in the
+  harness-divergence finding). Also: sx_harness_eval is a shared persistent image, not a fresh
+  sandbox; sx_read_subtree ignores `path`; sx_read_tree ignores `max_lines`.
+- **HOSTS/CONFORMANCE — JIT vs interpreter divergence**: three confirmed behavior flips between
+  VM-compiled and interpreted paths: (1) set!-unbound writes vm.globals vs root env (split brain);
+  (2) env_merge caller-frame leak exists only when interpreted ("VM undefined" under JIT); (3) the
+  leaked named-let loop name is a lambda when interpreted, nil under JIT. Parity suite has no coverage.
+- **HOSTS (Python shell)**: aser output embedded into `<script>` via `json.dumps` in
+  shared/sx/helpers.py `sx_streaming_resolve_script` — `json.dumps` doesn't escape `/` or `<`;
+  check whether serialized SX can ever contain `</script>` (aser HTML-escapes text children,
+  but attr/raw paths unverified).
+- **CONFORMANCE**: run_tests.ml injects bindings absent from the real runtime — `values`/
+  `call-with-values` (test-values.sx), `contains-char?`/`trim-right` (canonical.sx),
+  `batch-begin!`/`batch-end!` (signals), cek-step-loop/cek-resume (coroutines). Whole suites are
+  green only in-runner; test-env vs runtime-env parity needs a systematic sweep.
+- **CONFORMANCE — parser fleet**: three parser implementations (native OCaml reader, spec guest
+  parser over per-host primitive bindings, JS transpiled spec) with four ident/number classifier
+  tables that were never reconciled (details in the AST-divergence finding). Guest-parser platform
+  primitives (`parse-number`, `char-code`, `contains-char?`, `trim-right`, `reader-macro-get`,
+  `escape-string`) drift per host because each host re-binds them ad hoc. Suites only exercise
+  the intersection — that's why everything stays 1080/1080 green.
+- **HOSTS (JS)**: JS parser silently corrupts invalid `\uXXXX` escapes (garbage string, no error)
+  where OCaml raises; JS `reader-macro-get` registry exists but OCaml's doesn't.
+- **DOCS**: CLAUDE.md island-authoring rules describe non-spec semantics (parallel let, last-expr
+  bodies); CLAUDE.md canonical-reference section points at deleted files.
+- **TOOLING incident log**: mid-review another session polluted the shared MCP image (`inc`
+  redefined to a constant, breaking guest parsing with spurious "Unterminated" errors); the parser
+  agent restored it. Underlines the harness-not-fresh finding — harness state is shared across
+  concurrent sessions.
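
For the **HOSTS (Python shell)** handoff above, a minimal Python sketch of the concern and one common mitigation. It relies only on standard-library `json.dumps` behavior; `script_safe_json` and the sample payload are hypothetical names for illustration, not code from `shared/sx/helpers.py`. Whether `sx_streaming_resolve_script` can actually receive such input remains unverified, as the handoff notes.

```python
import json


def script_safe_json(value):
    """Serialize `value` to JSON text that cannot close a surrounding <script> element.

    Hypothetical helper, not part of shared/sx/helpers.py.
    """
    text = json.dumps(value)
    # In JSON output, '<', '>' and '&' can only occur inside string literals,
    # so replacing them with \uXXXX escapes keeps the text valid JSON that
    # decodes to the same value.
    return (
        text.replace("<", "\\u003c")
        .replace(">", "\\u003e")
        .replace("&", "\\u0026")
    )


# Hypothetical serialized-SX payload containing a script terminator.
payload = '(div "</script><script>alert(1)</script>")'

naive = json.dumps(payload)
safe = script_safe_json(payload)

print("</script>" in naive)         # True: json.dumps leaves '<' and '/' alone
print("</script>" in safe)          # False: '<' is now \u003c
print(json.loads(safe) == payload)  # True: the value round-trips unchanged
```

The escaped form is still plain JSON, so a consumer that parses it (or a JS engine reading it as a string literal) recovers the original characters; only the raw HTML tokenizer stops seeing `</script>`.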
--- a/plans/sx-review/hosts.md
+++ b/plans/sx-review/hosts.md