# SX Language Improvements — roadmap

Language-building improvements to the SX evaluator, compiler, and standard library. Ordered by impact and prerequisite chain. Each step is one loop commit.

Branch: `architecture`. SX files via `sx-tree` MCP only. Never edit generated files.

## Current baseline (2026-05-06)

- SX core spec: 2571 passing (595 non-HS pre-existing failures — bytecode-serialize, defcomp-render, etc.)
- HyperScript behavioral: 1478/1496 (run via `node tests/hs-kernel-eval.js`)
- Active bugs: JIT combinator bug (11 HS failures), letrec+resume (browser-only)
- E38 sourceInfo: 2/4 tests passing (tokenizer missing `:end`/`:line`, some spans incomplete)

---

## Phase 1 — Bug fixes

### Step 1: Fix JIT closures-returning-closures

**What:** `parse-bind`, `many`, `seq`, and other parser combinators that return closures miscompile under JIT. The compiled closure drops intermediate stack values when the callee itself returns a closure. 11 HyperScript tests fail under JIT, pass under CEK.

**Root cause in `hosts/ocaml/lib/sx_vm.ml`:** When a JIT-compiled closure returns another closure (i.e. the callee is `VmClosure`), the frame restoration after the call incorrectly reuses the parent frame's locals slot, overwriting saved intermediate values. The `call_closure_reuse` path must snapshot `sp` before the inner call and restore it after, or bail to the non-reuse path for closures-returning-closures.

**Verify:** `node tests/hs-kernel-eval.js 2>&1 | tail -3` — should go from 3116/3127 to 3127/3127.

### Step 2: Fix letrec + perform resume (browser)

**What:** In browser JIT mode, `letrec` sibling bindings are nil after a `perform`/resume cycle. `call_closure_reuse` in `sx_browser.ml` intentionally ignores `_saved_sp`, which strips the frame locals that `sf_letrec` was waiting on.

**Fix:** In `sx_browser.ml`, the `VmSuspension` resume path must restore frame locals from the suspension snapshot before calling the continuation.
Mirror what `sx_vm.ml` does in the non-browser case.

**Verify:** Write a test in `spec/tests/` that does `(letrec ((f (fn () (perform :io nil)))) (f))` with a resume, check bindings survive. Runs under OCaml: `dune exec -- bin/run_tests.exe`.

---

## Phase 2 — Source info (E38 completion)

Design: `plans/designs/e38-sourceinfo.md`. Target: 4/4 sourceInfo tests.

The API (`hs-parse-ast`, `hs-source-for`, `hs-line-for`, `hs-node-get`, `hs-src`, `hs-src-at`, `hs-line-at`) and parser span wrapping (`hs-ast-wrap`, `hs-span-mode`) are already in the codebase. Two tests are passing; two fail because:

- Tokenizer tokens lack `:end` and `:line` (only `:pos` today).
- Some statement-level spans and `:next` field navigation are incomplete.

### Step 3: Tokenizer — add `:end` and `:line` to tokens

`lib/hyperscript/tokenizer.sx`: extend `hs-make-token` to `{:pos :end :value :type :line}`. Track a `current-line` counter (1-based, increments after `\n`). Update all ~20 emission sites. Mirror to `shared/static/wasm/sx/hs-tokenizer.sx` after edits.

**Verify:** `(hs-make-token "NUMBER" "1" 0)` returns a dict with `:end` and `:line` keys.

### Step 4: Complete parser spans + :next field

`lib/hyperscript/parser.sx`: ensure `hs-ast-wrap` populates `:next` on every command in a `CommandList` (i.e. the following sibling command). Check that statement-level productions (if, for) correctly populate `:true-branch`. Trace through the two failing tests (`get source works for expressions`, `get line works for statements`) to find the exact missing fields or off-by-one positions. Mirror to `shared/static/wasm/sx/hs-parser.sx`.

**Verify:** All 4 `hs-upstream-core/sourceInfo` tests pass.

**Outcome:** Subsumed by Step 3. Once tokens carried `:end` and `:line`, the existing parser plumbing (`link-next-cmds` for `:next`, `:true-branch` extraction in `parse-cmd`) worked end-to-end. All 4 `hs-upstream-core/sourceInfo` tests pass with no parser changes.
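
A minimal sketch of the position and line tracking described in Step 3, written in OCaml rather than SX. The `token` record, the `tokenize` function, and the single `WORD` token type are hypothetical stand-ins for illustration; the real tokenizer in `lib/hyperscript/tokenizer.sx` has many more emission sites. The sketch also assumes `:end` is an exclusive offset, which this roadmap does not specify.

```ocaml
(* Hypothetical token shape mirroring {:pos :end :value :type :line}. *)
type token = { pos : int; end_ : int; value : string; typ : string; line : int }

let is_space c = c = ' ' || c = '\n' || c = '\t'

(* Split [src] into whitespace-separated WORD tokens, tracking a 1-based
   line counter that increments after each '\n'. *)
let tokenize (src : string) : token list =
  let n = String.length src in
  (* Skip whitespace, bumping the line counter after each newline. *)
  let rec skip i line =
    if i < n && is_space src.[i] then
      skip (i + 1) (if src.[i] = '\n' then line + 1 else line)
    else (i, line)
  in
  (* Find the exclusive end offset of the word starting at [j]. *)
  let rec word_end j =
    if j < n && not (is_space src.[j]) then word_end (j + 1) else j
  in
  let rec go i line acc =
    let i, line = skip i line in
    if i >= n then List.rev acc
    else
      let j = word_end i in
      let tok =
        { pos = i; end_ = j; value = String.sub src i (j - i); typ = "WORD"; line }
      in
      go j line (tok :: acc)
  in
  go 0 1 []
```

For the input `"set x\nto 1"`, the third token (`to`) gets `pos = 6`, `end_ = 8`, and `line = 2`. Threading the line counter through the same loop that advances the offset keeps both fields consistent at every emission site.
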

---

## Phase 3 — Native ADTs (`define-type` / `match`)

Design: `plans/designs/sx-adt.md`. No existing implementation.

Impact: every language implementation (Haskell, Prolog, Lua, Common Lisp, Erlang) currently fakes sum types with `{:tag "..." :field ...}` dicts. Native ADTs remove that everywhere.

### Step 5: OCaml — AdtValue type + `define-type` + basic `match`

`hosts/ocaml/lib/sx_types.ml`:

```ocaml
type adt_value = { av_type: string; av_ctor: string; av_fields: value array }

(* ...plus a new case on the existing `value` variant: *)
| AdtValue of adt_value
```

`hosts/ocaml/lib/sx_runtime.ml` (or evaluator):

- `step-sf-define-type`: parse `(Name (Ctor1 f1 f2) (Ctor2) ...)`, register constructor NativeFns, predicates (`Ctor1?`, `Name?`), field accessors (`Ctor1-f1`) via `env-bind!`.
- `step-sf-match` + `MatchFrame`: linear scan of clauses; flat patterns only for 6a; bind pattern variables in child env; `else` clause; raise on no match.
- `type-of` returns the type name (e.g. `"Maybe"`).

Write tests in `spec/tests/test-adt.sx`: basic constructor, predicate, accessor, match, else, no-match raise.

**Verify:** `dune exec -- bin/run_tests.exe` — new test file all green.

### Step 6: JS — AdtValue + `define-type` + `match`

`hosts/javascript/platform.py`: add `AdtValue` as `{ _adt: true, _type, _ctor, _fields }`. Mirror `define-type` and `match` special forms in the JS evaluator.

Retranspile: `python3 hosts/javascript/cli.py --output shared/static/scripts/sx-browser.js`

**Verify:** `node hosts/javascript/run_tests.js` — adt tests pass on JS too.

### Step 7: Nested patterns (Phase 6b)

Both OCaml and JS `MatchFrame`: replace linear binding with recursive `matchPattern(pattern, value, env)` that:

- Recurses into constructor sub-patterns.
- Returns `{matched: bool, bindings: map}`.
- Handles wildcard `_`, literals (`42`, `"str"`, `true`, `nil`).

Extend `spec/tests/test-adt.sx` with nested pattern tests.

**Outcome:** No host-side changes needed.
The spec-level `match-pattern` function in `spec/evaluator.sx` (≈line 2835) already recurses through constructor sub-patterns via the dict-shape shim (`(get value :_adt|:_ctor|:_fields)`), handles `_` wildcards, literals, and variable bindings. Step 7 added 8 new deftests to `spec/tests/test-adt.sx` covering: nested constructor sanity, nested constructor with field binding, nested wildcard, nested literal equality, nested literal-vs-var clause fall-through, deeply nested constructors, mixed bind+wildcard, and nested ctor fail-through. Both hosts: +8 tests pass, zero regressions (OCaml 4532→4540, JS 2578→2586).

### Step 8: Exhaustiveness warnings (Phase 6c)

`_adt_registry: type_name → [ctor_names]` global populated by `define-type`. On first non-exhaustive `match` evaluation: `console.warn("[sx] match: non-exhaustive …")`. No error — warning only.

**Outcome:** `host-warn` primitive added on both hosts (OCaml `prerr_endline`, JS `console.warn`). Spec-level helpers `match-clause-is-else?`, `match-clause-ctor-name`, `match-warn-non-exhaustive`, `match-check-exhaustiveness` added in `spec/evaluator.sx` and called from `step-sf-match`. `*adt-warned*` env-bound dict used to dedupe warnings per (type, missing-set). The OCaml `step_sf_match` in `hosts/ocaml/lib/sx_ref.ml` was hand-patched (not retranspiled) because `sx_ref.ml` retranspilation drops several preamble fixes; the spec changes still flow to JS via `sx_build target="js"`. Both hosts emit identical warnings (e.g. `[sx] match: non-exhaustive — Maybe: missing Nothing`). 5 new tests added. OCaml: 4540 → 4545. JS: 2586 → 2591. Zero regressions.

---

## Phase 4 — Plugin / extension system

Design: `plans/designs/hs-plugin-system.md`.

### Step 9: Parser feature registry

`lib/hyperscript/parser.sx`: replace `parse-feat` hardcoded `cond` with a dict lookup. `(hs-register-feature! name parse-fn)` adds to the registry.
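
The registry idea in Step 9 can be sketched in OCaml as a table from feature keyword to parse function. Everything here is a hypothetical illustration: tokens are modelled as plain strings, parse results as strings, and `register_feature` / `parse_feat` only mirror the shape of `hs-register-feature!` and `parse-feat`, not their actual SX implementation.

```ocaml
(* Hypothetical registry: feature keyword -> parse function.
   Tokens and parse results are plain strings to keep the sketch small. *)
let features : (string, string list -> string) Hashtbl.t = Hashtbl.create 16

(* Counterpart of (hs-register-feature! name parse-fn):
   later registrations for the same name replace earlier ones. *)
let register_feature (name : string) (parse_fn : string list -> string) : unit =
  Hashtbl.replace features name parse_fn

(* Dispatch on the leading keyword via table lookup instead of a
   hardcoded conditional. *)
let parse_feat (tokens : string list) : string =
  match tokens with
  | [] -> failwith "parse_feat: empty input"
  | kw :: rest ->
    (match Hashtbl.find_opt features kw with
     | Some parse_fn -> parse_fn rest
     | None -> failwith ("parse_feat: unknown feature " ^ kw))

(* Usage: a plugin registers its own feature without touching parse_feat. *)
let () =
  register_feature "on" (fun rest -> "(on " ^ String.concat " " rest ^ ")")
```

With that registration in place, `parse_feat ["on"; "click"]` returns `"(on click)"`, and an unregistered keyword fails with an explicit error rather than falling through silently.
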
### Step 10: Compiler command registry + `as` converter registry

`lib/hyperscript/compiler.sx`: replace `hs-to-sx` hardcoded dispatch with dict. `(hs-register-command! name compile-fn)` and `(hs-register-converter! name convert-fn)`.

### Step 11: Migrate hs-prolog-hook + Worker plugin

`lib/hyperscript/runtime.sx`: remove `hs-prolog-hook`/`hs-set-prolog-hook!` ad-hoc slots. Create `lib/hyperscript/plugins/prolog.sx` that calls `hs-register-feature!` and `hs-register-command!`. Create `lib/hyperscript/plugins/worker.sx` replacing the E39 stub.

---

## Phase 5 — Performance

These are incremental and can interleave with other phases.

### Step 12: Frame records (CEK)

`hosts/ocaml/lib/sx_runtime.ml`: represent CEK frames as OCaml records instead of tagged variant lists. Eliminates allocation pressure from list construction per frame. Profile before/after on a tight-loop benchmark.

### Step 13: Buffer primitive for string building

Add `make-buffer`, `buffer-append!`, `buffer->string` primitives. Eliminates the `(str a b c d ...)` quadratic allocation pattern in serializers and renderers. Wire into `sx_primitives.ml` and the JS platform.

### Step 14: Inline common primitives in JIT

`hosts/ocaml/lib/sx_vm.ml`: add `OP_ADD`, `OP_SUB`, `OP_EQ`, `OP_APPEND` specialised opcodes that skip the primitive table lookup for the most common calls. Compiler emits these when operands are known numbers/lists.
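
To illustrate what Step 14's specialised opcodes buy, here is a toy stack VM in OCaml. It is a sketch under stated assumptions, not the real `sx_vm.ml` instruction set: the `op` type, the integer-only stack, the `prims` table, and `run` are all hypothetical. The only point it makes is that `OP_ADD` computes its result directly, while the generic call path pays for a table lookup by name.

```ocaml
(* Hypothetical miniature stack VM; not the real sx_vm.ml instruction set. *)
type op =
  | OP_CONST of int          (* push a constant *)
  | OP_CALL_PRIM of string   (* generic path: look the primitive up by name *)
  | OP_ADD                   (* specialised path: add without a table lookup *)

(* Primitive table used only by the generic OP_CALL_PRIM path. *)
let prims : (string, int -> int -> int) Hashtbl.t = Hashtbl.create 8
let () = Hashtbl.replace prims "+" ( + )

(* Run [code] on an empty stack and return the single remaining value. *)
let run (code : op list) : int =
  let step stack op =
    match op, stack with
    | OP_CONST n, _ -> n :: stack
    | OP_CALL_PRIM name, b :: a :: rest ->
      (* Table lookup on every call; raises Not_found for unknown names. *)
      (Hashtbl.find prims name) a b :: rest
    | OP_ADD, b :: a :: rest ->
      (* No lookup: the operation is baked into the opcode. *)
      (a + b) :: rest
    | _ -> failwith "stack underflow"
  in
  match List.fold_left step [] code with
  | [result] -> result
  | _ -> failwith "expected exactly one result"
```

Both `run [OP_CONST 2; OP_CONST 3; OP_ADD]` and `run [OP_CONST 2; OP_CONST 3; OP_CALL_PRIM "+"]` evaluate to `5`; only the second touches the primitive table. A real compiler would emit the specialised form only when it can prove the operand types, as the step notes.
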

---

## Progress log

| Step | Status | Commit |
|------|--------|--------|
| 1 — JIT combinator bug | [x] | 882a4b76 |
| 2 — letrec+resume | [x] | e80e655b |
| 3 — tokenizer :end/:line | [x] | 023bc2d8 |
| 4 — parser spans complete | [x] | b7ad5152 (subsumed by 023bc2d8) |
| 5 — OCaml AdtValue + define-type + match | [x] | 1f49242a |
| 6 — JS AdtValue + define-type + match | [x] | fc8a3916 |
| 7 — nested patterns | [x] | 0679edf5 |
| 8 — exhaustiveness warnings | [x] | 6d391119 |
| 9 — parser feature registry | [x] | PENDING |
| 10 — compiler + as converter registry | [ ] | — |
| 11 — plugin migration + worker | [ ] | — |
| 12 — frame records | [ ] | — |
| 13 — buffer primitive | [ ] | — |
| 14 — inline primitives JIT | [ ] | — |

---

## Rules

- Branch: `architecture`. Never push to `main`.
- SX files: `sx-tree` MCP tools only. `sx_validate` after every edit.
- After every `.sx` edit to `lib/hyperscript/`, mirror to `shared/static/wasm/sx/hs-.sx`.
- OCaml build: `sx_build target="ocaml"` MCP tool (never raw `dune`).
- JS build: `sx_build target="js"` MCP tool.
- One step per commit. Update progress log in this file.
- No new planning docs. No comments in SX unless non-obvious.
- Unicode in SX: raw UTF-8 only, never `\uXXXX`.