rose-ash/plans/sx-improvements.md
2026-05-06 22:27:36 +00:00

SX Language Improvements — roadmap

Language-building improvements to the SX evaluator, compiler, and standard library. Ordered by impact and prerequisite chain. Each step is one loop commit.

Branch: architecture. SX files via sx-tree MCP only. Never edit generated files.

Current baseline (2026-05-06)

  • SX core spec: 2571 passing (595 non-HS pre-existing failures — bytecode-serialize, defcomp-render, etc.)
  • HyperScript behavioral: 1478/1496 (run via node tests/hs-kernel-eval.js)
  • Active bugs: JIT combinator bug (11 HS failures), letrec+resume (browser-only)
  • E38 sourceInfo: 2/4 tests passing (tokenizer missing :end/:line, some spans incomplete)

Phase 1 — Bug fixes

Step 1: Fix JIT closures-returning-closures

What: parse-bind, many, seq, and other parser combinators that return closures miscompile under JIT. The compiled closure drops intermediate stack values when the callee itself returns a closure. 11 HyperScript tests fail under JIT, pass under CEK.

Root cause in hosts/ocaml/lib/sx_vm.ml: When a JIT-compiled closure returns another closure (i.e. the callee is VmClosure), the frame restoration after the call incorrectly reuses the parent frame's locals slot, overwriting saved intermediate values. The call_closure_reuse path must snapshot sp before the inner call and restore it after, or bail to the non-reuse path for closures-returning-closures.
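The snapshot-and-restore idea can be illustrated with a toy stack machine. This is only a sketch under stated assumptions: it is written in JavaScript rather than OCaml, and makeVm, push, pop, callReuseBuggy, callReuseFixed, and makeAdder are hypothetical names that do not appear in sx_vm.ml.

```javascript
// Toy stack machine, NOT the sx_vm.ml code: every name here is hypothetical.
function makeVm() {
  return { stack: [], sp: 0 };
}
function push(vm, v) { vm.stack[vm.sp++] = v; }
function pop(vm) { return vm.stack[--vm.sp]; }

// Faulty reuse: the callee's argument is written at the parent's frame base,
// clobbering whatever intermediate values the parent had saved there.
function callReuseBuggy(vm, frameBase, callee, arg) {
  vm.sp = frameBase;
  push(vm, arg);
  return callee(vm);
}

// Repaired reuse: snapshot sp, let the callee work above it, then restore.
function callReuseFixed(vm, callee, arg) {
  const savedSp = vm.sp;
  push(vm, arg);
  const result = callee(vm);
  vm.sp = savedSp;
  return result;
}

// A callee that itself returns a closure, like a parser combinator.
function makeAdder(vm) {
  const n = pop(vm);
  return (x) => x + n;
}
```

With two intermediates pushed before the call, callReuseFixed leaves both in place and returns a working closure, while callReuseBuggy overwrites the first slot with the callee's argument.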

Verify: node tests/hs-kernel-eval.js 2>&1 | tail -3 — should go from 3116/3127 to 3127/3127.

Step 2: Fix letrec + perform resume (browser)

What: In browser JIT mode, letrec sibling bindings are nil after a perform/resume cycle. call_closure_reuse in sx_browser.ml intentionally ignores _saved_sp, which strips the frame locals that sf_letrec was waiting on.

Fix: In sx_browser.ml, the VmSuspension resume path must restore frame locals from the suspension snapshot before calling the continuation. Mirror what sx_vm.ml does in the non-browser case.
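A minimal model of the resume path, assuming a frame is just an object with a locals array and a suspension stores a copy of those locals next to its continuation. The names suspend, resume, and savedLocals are hypothetical and are not taken from sx_browser.ml.

```javascript
// Hypothetical model: frame = { locals: [...] }.
// A suspension keeps a snapshot of the locals plus the continuation to call.
function suspend(frame, continuation) {
  return { savedLocals: frame.locals.slice(), continuation };
}

// Restore the snapshot BEFORE calling the continuation, so bindings that
// lived in frame locals (e.g. letrec siblings) are visible again.
function resume(frame, suspension, value) {
  frame.locals = suspension.savedLocals.slice();
  return suspension.continuation(frame, value);
}
```

If the locals are wiped between suspend and resume, the continuation still sees the original bindings because resume reinstalls the snapshot first.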

Verify: Write a test in spec/tests/ that evaluates (letrec ((f (fn () (perform :io nil)))) (f)), resumes it, and checks that the bindings survive. Run it under OCaml: dune exec -- bin/run_tests.exe.


Phase 2 — Source info (E38 completion)

Design: plans/designs/e38-sourceinfo.md. Target: 4/4 sourceInfo tests.

The API (hs-parse-ast, hs-source-for, hs-line-for, hs-node-get, hs-src, hs-src-at, hs-line-at) and parser span wrapping (hs-ast-wrap, hs-span-mode) are already in the codebase. Two tests are passing; two fail because:

  • Tokenizer tokens lack :end and :line (only :pos today).
  • Some statement-level spans and :next field navigation are incomplete.

Step 3: Tokenizer — add :end and :line to tokens

lib/hyperscript/tokenizer.sx: extend hs-make-token to {:pos :end :value :type :line}. Track a current-line counter (1-based, increments after \n). Update all ~20 emission sites. Mirror to shared/static/wasm/sx/hs-tokenizer.sx after edits.

Verify: (hs-make-token "NUMBER" "1" 0) returns a dict with :end and :line keys.
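The token shape and line tracking can be sketched as follows. This is a JavaScript illustration, not the tokenizer.sx code: makeToken and tokenize are hypothetical, the scanning rules (digits, identifiers, single-character operators) are heavily simplified, and treating :end as an exclusive offset is an assumption.

```javascript
// Hypothetical sketch of tokens carrying pos, end, value, type and line.
function makeToken(type, value, pos, end, line) {
  return { pos, end, value, type, line };
}

function tokenize(src) {
  const tokens = [];
  let pos = 0;
  let line = 1; // 1-based; incremented after each "\n"
  while (pos < src.length) {
    const ch = src[pos];
    if (ch === "\n") { line += 1; pos += 1; continue; }
    if (/\s/.test(ch)) { pos += 1; continue; }
    const start = pos;
    let type;
    if (/[0-9]/.test(ch)) {
      while (pos < src.length && /[0-9]/.test(src[pos])) pos += 1;
      type = "NUMBER";
    } else if (/[A-Za-z_]/.test(ch)) {
      while (pos < src.length && /[A-Za-z0-9_]/.test(src[pos])) pos += 1;
      type = "IDENT";
    } else {
      pos += 1;
      type = "OP";
    }
    // end is the offset just past the token (assumed exclusive).
    tokens.push(makeToken(type, src.slice(start, pos), start, pos, line));
  }
  return tokens;
}
```

Because every emission goes through one constructor, the line counter only has to be threaded through a single call shape, which is the same reason the plan routes all emission sites through hs-make-token.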

Step 4: Complete parser spans + :next field

lib/hyperscript/parser.sx: ensure hs-ast-wrap populates :next on every command in a CommandList (i.e. the following sibling command). Check that statement-level productions (if, for) correctly populate :true-branch. Trace through the two failing tests (get source works for expressions, get line works for statements) to find the exact missing fields or off-by-one positions.

Mirror to shared/static/wasm/sx/hs-parser.sx.
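As a rough picture of what populating :next means, the sketch below links each command in a list to its following sibling. It is a JavaScript stand-in for the SX parser: linkNextCommands is a hypothetical name, and using null for the last command's next field is an assumption.

```javascript
// Hypothetical sketch: give every command a `next` pointing at its
// following sibling; the last command gets null (assumed convention).
function linkNextCommands(commands) {
  commands.forEach((cmd, i) => {
    cmd.next = i + 1 < commands.length ? commands[i + 1] : null;
  });
  return commands;
}
```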

Verify: All 4 hs-upstream-core/sourceInfo tests pass.

Outcome: Subsumed by Step 3. Once tokens carried :end and :line, the existing parser plumbing (link-next-cmds for :next, :true-branch extraction in parse-cmd) worked end-to-end. All 4 hs-upstream-core/sourceInfo tests pass with no parser changes.


Phase 3 — Native ADTs (define-type / match)

Design: plans/designs/sx-adt.md. No existing implementation.

Impact: every language implementation (Haskell, Prolog, Lua, Common Lisp, Erlang) currently fakes sum types with {:tag "..." :field ...} dicts. Native ADTs remove that everywhere.

Step 5: OCaml — AdtValue type + define-type + basic match

hosts/ocaml/lib/sx_types.ml:

(* record describing one ADT instance *)
type adt_value = { av_type: string; av_ctor: string; av_fields: value array }

(* added as a new case of the existing value variant *)
| AdtValue of adt_value

hosts/ocaml/lib/sx_runtime.ml (or evaluator):

  • step-sf-define-type: parse (Name (Ctor1 f1 f2) (Ctor2) ...), register constructor NativeFns, predicates (Ctor1?, Name?), field accessors (Ctor1-f1) via env-bind!.
  • step-sf-match + MatchFrame: linear scan of clauses; flat patterns only in Phase 6a; bind pattern variables in child env; else clause; raise on no match.
  • type-of returns the type name (e.g. "Maybe").

Write tests in spec/tests/test-adt.sx: basic constructor, predicate, accessor, match, else, no-match raise.

Verify: dune exec -- bin/run_tests.exe — new test file all green.

Step 6: JS — AdtValue + define-type + match

hosts/javascript/platform.py: add AdtValue as { _adt: true, _type, _ctor, _fields }. Mirror define-type and match special forms in the JS evaluator. Retranspile: python3 hosts/javascript/cli.py --output shared/static/scripts/sx-browser.js

Verify: node hosts/javascript/run_tests.js — adt tests pass on JS too.
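A minimal sketch of what define-type could register on the JS side, using the { _adt, _type, _ctor, _fields } shape from this step. Everything else is an assumption: defineType is a hypothetical helper, env is modelled as a plain object, and zero-field constructors are modelled as zero-argument functions.

```javascript
// Hypothetical sketch; only the value shape comes from the plan.
// ctors is an array of [ctorName, ...fieldNames].
function defineType(env, typeName, ctors) {
  for (const [ctorName, ...fieldNames] of ctors) {
    // Constructor: checks arity and builds the tagged value.
    env[ctorName] = (...args) => {
      if (args.length !== fieldNames.length) {
        throw new Error(`${ctorName}: expected ${fieldNames.length} args, got ${args.length}`);
      }
      return { _adt: true, _type: typeName, _ctor: ctorName, _fields: args };
    };
    // Constructor predicate, e.g. Just?
    env[`${ctorName}?`] = (v) =>
      Boolean(v && v._adt) && v._type === typeName && v._ctor === ctorName;
    // Field accessors, e.g. Just-value
    fieldNames.forEach((field, i) => {
      env[`${ctorName}-${field}`] = (v) => {
        if (!env[`${ctorName}?`](v)) {
          throw new Error(`${ctorName}-${field}: not a ${ctorName}`);
        }
        return v._fields[i];
      };
    });
  }
  // Type predicate, e.g. Maybe?
  env[`${typeName}?`] = (v) => Boolean(v && v._adt) && v._type === typeName;
}
```

For example, defineType(env, "Maybe", [["Just", "value"], ["Nothing"]]) would make env["Just"], env["Just?"], env["Just-value"], env["Nothing"], env["Nothing?"], and env["Maybe?"] available.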

Step 7: Nested patterns (Phase 6b)

Both OCaml and JS MatchFrame: replace linear binding with recursive matchPattern(pattern, value, env) that:

  • Recurses into constructor sub-patterns.
  • Returns {matched: bool, bindings: map}.
  • Handles wildcard _, literals (42, "str", true, nil).
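The recursive matcher described above might look roughly like this. The pattern encoding ({kind: "wildcard" | "var" | "lit" | "ctor"}) is an assumption made for the sketch, and the env argument from the plan's signature is replaced by an accumulated bindings Map to keep the example self-contained.

```javascript
// Hypothetical sketch: patterns are objects tagged by `kind`.
// Returns { matched, bindings } where bindings is a Map of name -> value.
function matchPattern(pattern, value, bindings = new Map()) {
  switch (pattern.kind) {
    case "wildcard": // `_` matches anything and binds nothing
      return { matched: true, bindings };
    case "var": // a variable matches anything and binds it
      return { matched: true, bindings: new Map(bindings).set(pattern.name, value) };
    case "lit": // 42, "str", true, null (standing in for nil)
      return { matched: pattern.value === value, bindings };
    case "ctor": {
      if (!value || value._adt !== true || value._ctor !== pattern.name) {
        return { matched: false, bindings };
      }
      if (value._fields.length !== pattern.args.length) {
        return { matched: false, bindings };
      }
      // Recurse into each sub-pattern, threading bindings through.
      let current = bindings;
      for (let i = 0; i < pattern.args.length; i++) {
        const r = matchPattern(pattern.args[i], value._fields[i], current);
        if (!r.matched) return { matched: false, bindings };
        current = r.bindings;
      }
      return { matched: true, bindings: current };
    }
    default:
      throw new Error(`unknown pattern kind: ${pattern.kind}`);
  }
}
```

On failure the caller gets back the bindings it passed in, so a partially matched clause does not leak variables into the next clause.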

Extend spec/tests/test-adt.sx with nested pattern tests.

Step 8: Exhaustiveness warnings (Phase 6c)

Add a global _adt_registry (type_name → [ctor_names]) populated by define-type. The first time a non-exhaustive match is evaluated, emit console.warn("[sx] match: non-exhaustive …"). This is a warning only, never an error.
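One way the registry check could work, as a sketch only. adtRegistry, registerType, missingCtors, warnIfNonExhaustive, and the per-site siteId used to warn once are all hypothetical; the warning prefix is copied from the plan as written, and the missing constructors are passed as a separate argument rather than guessed into the message text.

```javascript
// Hypothetical sketch of the exhaustiveness check.
const adtRegistry = new Map(); // type name -> array of constructor names
const warnedSites = new Set(); // match sites that have already warned

function registerType(typeName, ctorNames) {
  adtRegistry.set(typeName, ctorNames);
}

// Constructors of typeName not covered by the clauses (none if `else` exists).
function missingCtors(typeName, clauseCtors, hasElse) {
  if (hasElse) return [];
  const all = adtRegistry.get(typeName) || [];
  return all.filter((c) => !clauseCtors.includes(c));
}

// Warns at most once per site; returns true when a warning was emitted.
function warnIfNonExhaustive(siteId, typeName, clauseCtors, hasElse, warn = console.warn) {
  const missing = missingCtors(typeName, clauseCtors, hasElse);
  if (missing.length === 0 || warnedSites.has(siteId)) return false;
  warnedSites.add(siteId);
  warn("[sx] match: non-exhaustive …", { type: typeName, missing });
  return true;
}
```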


Phase 4 — Plugin / extension system

Design: plans/designs/hs-plugin-system.md.

Step 9: Parser feature registry

lib/hyperscript/parser.sx: replace the hardcoded cond in parse-feat with a dict lookup. (hs-register-feature! name parse-fn) adds an entry to the registry.
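The shape of such a registry, sketched in JavaScript rather than SX. registerFeature and parseFeature are hypothetical stand-ins for hs-register-feature! and the lookup inside parse-feat, and throwing on an unknown feature name is an assumption about error handling.

```javascript
// Hypothetical sketch: feature name -> parse function.
const featureRegistry = new Map();

function registerFeature(name, parseFn) {
  featureRegistry.set(name, parseFn);
}

// Replaces a hardcoded conditional chain with a single lookup.
function parseFeature(name, parserState) {
  const parseFn = featureRegistry.get(name);
  if (!parseFn) throw new Error(`unknown feature: ${name}`);
  return parseFn(parserState);
}
```

A plugin then only needs to call the register function at load time; the core parser never has to know the feature exists.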

Step 10: Compiler command registry + as converter registry

lib/hyperscript/compiler.sx: replace the hardcoded dispatch in hs-to-sx with a dict lookup. (hs-register-command! name compile-fn) and (hs-register-converter! name convert-fn) add entries to the command and converter registries respectively.

Step 11: Migrate hs-prolog-hook + Worker plugin

lib/hyperscript/runtime.sx: remove hs-prolog-hook/hs-set-prolog-hook! ad-hoc slots. Create lib/hyperscript/plugins/prolog.sx that calls hs-register-feature! and hs-register-command!. Create lib/hyperscript/plugins/worker.sx replacing the E39 stub.


Phase 5 — Performance

These are incremental and can interleave with other phases.

Step 12: Frame records (CEK)

hosts/ocaml/lib/sx_runtime.ml: represent CEK frames as OCaml records instead of tagged variant lists. Eliminates allocation pressure from list construction per frame. Profile before/after on a tight-loop benchmark.

Step 13: Buffer primitive for string building

Add make-buffer, buffer-append!, buffer->string primitives. Eliminates the (str a b c d ...) quadratic allocation pattern in serializers and renderers. Wire into sx_primitives.ml and the JS platform.
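On the JS platform the three primitives could be as small as the sketch below. makeBuffer, bufferAppend, and bufferToString are hypothetical JS-side names for make-buffer, buffer-append!, and buffer->string, and collecting chunks in an array is one possible representation, not a decided design.

```javascript
// Hypothetical sketch: a buffer collects chunks and joins them once.
function makeBuffer() {
  return { chunks: [] };
}

// Mutates the buffer in place and returns it.
function bufferAppend(buf, s) {
  buf.chunks.push(String(s));
  return buf;
}

// A single join at the end instead of re-copying on every append.
function bufferToString(buf) {
  return buf.chunks.join("");
}
```

Each append is constant-time bookkeeping, and the full string is copied once at the end, which is the property the serializers and renderers need.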

Step 14: Inline common primitives in JIT

hosts/ocaml/lib/sx_vm.ml: add OP_ADD, OP_SUB, OP_EQ, OP_APPEND specialised opcodes that skip the primitive table lookup for the most common calls. Compiler emits these when operands are known numbers/lists.
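To make the emission rule concrete, here is a toy compiler and interpreter in JavaScript. It is not the SX bytecode format: the opcode encoding, the AST shape, and the helper names compile, run, and isKnownNumber are all assumptions, and "known number" is reduced to "numeric literal".

```javascript
// Hypothetical toy bytecode: each instruction is [op, ...operands].
const OP_CONST = "CONST", OP_LOAD = "LOAD", OP_ADD = "ADD", OP_CALL_PRIM = "CALL_PRIM";
const primitives = { "+": (a, b) => a + b }; // generic primitive table

function isKnownNumber(expr) {
  return expr.kind === "num";
}

function compile(expr, code = []) {
  if (expr.kind === "num") {
    code.push([OP_CONST, expr.value]);
  } else if (expr.kind === "var") {
    code.push([OP_LOAD, expr.name]);
  } else if (expr.kind === "call") {
    expr.args.forEach((a) => compile(a, code));
    if (expr.fn === "+" && expr.args.length === 2 && expr.args.every(isKnownNumber)) {
      code.push([OP_ADD]); // specialised opcode: no table lookup
    } else {
      code.push([OP_CALL_PRIM, expr.fn, expr.args.length]); // generic path
    }
  } else {
    throw new Error(`unknown expression kind: ${expr.kind}`);
  }
  return code;
}

function run(code, env = {}) {
  const stack = [];
  for (const [op, a, b] of code) {
    switch (op) {
      case OP_CONST: stack.push(a); break;
      case OP_LOAD: stack.push(env[a]); break;
      case OP_ADD: { const y = stack.pop(); const x = stack.pop(); stack.push(x + y); break; }
      case OP_CALL_PRIM: {
        const args = stack.splice(stack.length - b, b);
        stack.push(primitives[a](...args));
        break;
      }
      default: throw new Error(`unknown opcode: ${op}`);
    }
  }
  return stack.pop();
}
```

Both paths must produce the same result; the specialised opcode only changes how the operation is dispatched.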


Progress log

| Step | Status | Commit |
|------|--------|--------|
| 1: JIT combinator bug | [x] | 882a4b76 |
| 2: letrec+resume | [x] | e80e655b |
| 3: tokenizer :end/:line | [x] | 023bc2d8 |
| 4: parser spans complete | [x] | b7ad5152 (subsumed by 023bc2d8) |
| 5: OCaml AdtValue + define-type + match | [ ] | |
| 6: JS AdtValue + define-type + match | [ ] | |
| 7: nested patterns | [ ] | |
| 8: exhaustiveness warnings | [ ] | |
| 9: parser feature registry | [ ] | |
| 10: compiler + as converter registry | [ ] | |
| 11: plugin migration + worker | [ ] | |
| 12: frame records | [ ] | |
| 13: buffer primitive | [ ] | |
| 14: inline primitives JIT | [ ] | |

Rules

  • Branch: architecture. Never push to main.
  • SX files: sx-tree MCP tools only. sx_validate after every edit.
  • After every .sx edit to lib/hyperscript/, mirror to shared/static/wasm/sx/hs-<file>.sx.
  • OCaml build: sx_build target="ocaml" MCP tool (never raw dune).
  • JS build: sx_build target="js" MCP tool.
  • One step per commit. Update progress log in this file.
  • No new planning docs. No comments in SX unless non-obvious.
  • Unicode in SX: raw UTF-8 only, never \uXXXX.