Files
rose-ash/plans/sx-improvements.md
giles ba9ab4e65a plan: record step 6 commit hash
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 23:05:41 +00:00

215 lines
8.8 KiB
Markdown

# SX Language Improvements — roadmap
Language-building improvements to the SX evaluator, compiler, and standard library.
Ordered by impact and prerequisite chain. Each step is one loop commit.
Branch: `architecture`. SX files via `sx-tree` MCP only. Never edit generated files.
## Current baseline (2026-05-06)
- SX core spec: 2571 passing (595 non-HS pre-existing failures — bytecode-serialize, defcomp-render, etc.)
- HyperScript behavioral: 1478/1496 (run via `node tests/hs-kernel-eval.js`)
- Active bugs: JIT combinator bug (11 HS failures), letrec+resume (browser-only)
- E38 sourceInfo: 2/4 tests passing (tokenizer missing `:end`/`:line`, some spans incomplete)
---
## Phase 1 — Bug fixes
### Step 1: Fix JIT closures-returning-closures
**What:** `parse-bind`, `many`, `seq`, and other parser combinators that return closures
miscompile under JIT. The compiled closure drops intermediate stack values when the
callee itself returns a closure. 11 HyperScript tests fail under JIT, pass under CEK.
**Root cause in `hosts/ocaml/lib/sx_vm.ml`:** When a JIT-compiled closure returns
another closure (i.e. the callee is `VmClosure`), the frame restoration after the
call incorrectly reuses the parent frame's locals slot, overwriting saved intermediate
values. The `call_closure_reuse` path must snapshot `sp` before the inner call and
restore it after, or bail to the non-reuse path for closures-returning-closures.
**Verify:** `node tests/hs-kernel-eval.js 2>&1 | tail -3` — should go from 3116/3127 to 3127/3127.
### Step 2: Fix letrec + perform resume (browser)
**What:** In browser JIT mode, `letrec` sibling bindings are nil after a `perform`/resume
cycle. `call_closure_reuse` in `sx_browser.ml` intentionally ignores `_saved_sp`, which
strips the frame locals that `sf_letrec` was waiting on.
**Fix:** In `sx_browser.ml`, the `VmSuspension` resume path must restore frame locals
from the suspension snapshot before calling the continuation. Mirror what `sx_vm.ml`
does in the non-browser case.
**Verify:** Write a test in `spec/tests/` that does `(letrec ((f (fn () (perform :io nil)))) (f))` with a resume, check bindings survive. Runs under OCaml: `dune exec -- bin/run_tests.exe`.
---
## Phase 2 — Source info (E38 completion)
Design: `plans/designs/e38-sourceinfo.md`. Target: 4/4 sourceInfo tests.
The API (`hs-parse-ast`, `hs-source-for`, `hs-line-for`, `hs-node-get`, `hs-src`,
`hs-src-at`, `hs-line-at`) and parser span wrapping (`hs-ast-wrap`, `hs-span-mode`)
are already in the codebase. Two tests are passing; two fail because:
- Tokenizer tokens lack `:end` and `:line` (only `:pos` today).
- Some statement-level spans and `:next` field navigation are incomplete.
### Step 3: Tokenizer — add `:end` and `:line` to tokens
`lib/hyperscript/tokenizer.sx`: extend `hs-make-token` to `{:pos :end :value :type :line}`.
Track a `current-line` counter (1-based, increments after `\n`). Update all ~20 emission
sites. Mirror to `shared/static/wasm/sx/hs-tokenizer.sx` after edits.
**Verify:** `(hs-make-token "NUMBER" "1" 0)` returns a dict with `:end` and `:line` keys.
### Step 4: Complete parser spans + :next field
`lib/hyperscript/parser.sx`: ensure `hs-ast-wrap` populates `:next` on every command
in a `CommandList` (i.e. the following sibling command). Check that statement-level
productions (if, for) correctly populate `:true-branch`. Trace through the two failing
tests (`get source works for expressions`, `get line works for statements`) to find the
exact missing fields or off-by-one positions.
Mirror to `shared/static/wasm/sx/hs-parser.sx`.
**Verify:** All 4 `hs-upstream-core/sourceInfo` tests pass.
**Outcome:** Subsumed by Step 3. Once tokens carried `:end` and `:line`, the existing
parser plumbing (`link-next-cmds` for `:next`, `:true-branch` extraction in `parse-cmd`)
worked end-to-end. All 4 `hs-upstream-core/sourceInfo` tests pass with no parser changes.
---
## Phase 3 — Native ADTs (`define-type` / `match`)
Design: `plans/designs/sx-adt.md`. No existing implementation.
Impact: every language implementation (Haskell, Prolog, Lua, Common Lisp, Erlang)
currently fakes sum types with `{:tag "..." :field ...}` dicts. Native ADTs remove
that everywhere.
### Step 5: OCaml — AdtValue type + `define-type` + basic `match`
`hosts/ocaml/lib/sx_types.ml`:
```ocaml
type adt_value = { av_type: string; av_ctor: string; av_fields: value array }
| AdtValue of adt_value
```
`hosts/ocaml/lib/sx_runtime.ml` (or evaluator):
- `step-sf-define-type`: parse `(Name (Ctor1 f1 f2) (Ctor2) ...)`, register constructor
NativeFns, predicates (`Ctor1?`, `Name?`), field accessors (`Ctor1-f1`) via `env-bind!`.
- `step-sf-match` + `MatchFrame`: linear scan of clauses; flat patterns only for 6a;
bind pattern variables in child env; `else` clause; raise on no match.
- `type-of` returns the type name (e.g. `"Maybe"`).
Write tests in `spec/tests/test-adt.sx`: basic constructor, predicate, accessor, match,
else, no-match raise.
**Verify:** `dune exec -- bin/run_tests.exe` — new test file all green.
### Step 6: JS — AdtValue + `define-type` + `match`
`hosts/javascript/platform.py`: add `AdtValue` as `{ _adt: true, _type, _ctor, _fields }`.
Mirror `define-type` and `match` special forms in the JS evaluator.
Retranspile: `python3 hosts/javascript/cli.py --output shared/static/scripts/sx-browser.js`
**Verify:** `node hosts/javascript/run_tests.js` — adt tests pass on JS too.
### Step 7: Nested patterns (Phase 6b)
Both OCaml and JS `MatchFrame`: replace linear binding with recursive
`matchPattern(pattern, value, env)` that:
- Recurses into constructor sub-patterns.
- Returns `{matched: bool, bindings: map}`.
- Handles wildcard `_`, literals (`42`, `"str"`, `true`, `nil`).
Extend `spec/tests/test-adt.sx` with nested pattern tests.
### Step 8: Exhaustiveness warnings (Phase 6c)
`_adt_registry: type_name → [ctor_names]` global populated by `define-type`.
On first non-exhaustive `match` evaluation: `console.warn("[sx] match: non-exhaustive …")`.
No error — warning only.
---
## Phase 4 — Plugin / extension system
Design: `plans/designs/hs-plugin-system.md`.
### Step 9: Parser feature registry
`lib/hyperscript/parser.sx`: replace `parse-feat` hardcoded `cond` with a dict lookup.
`(hs-register-feature! name parse-fn)` adds to the registry.
### Step 10: Compiler command registry + `as` converter registry
`lib/hyperscript/compiler.sx`: replace `hs-to-sx` hardcoded dispatch with dict.
`(hs-register-command! name compile-fn)` and `(hs-register-converter! name convert-fn)`.
### Step 11: Migrate hs-prolog-hook + Worker plugin
`lib/hyperscript/runtime.sx`: remove `hs-prolog-hook`/`hs-set-prolog-hook!` ad-hoc
slots. Create `lib/hyperscript/plugins/prolog.sx` that calls `hs-register-feature!`
and `hs-register-command!`. Create `lib/hyperscript/plugins/worker.sx` replacing the
E39 stub.
---
## Phase 5 — Performance
These are incremental and can interleave with other phases.
### Step 12: Frame records (CEK)
`hosts/ocaml/lib/sx_runtime.ml`: represent CEK frames as OCaml records instead of
tagged variant lists. Eliminates allocation pressure from list construction per frame.
Profile before/after on a tight-loop benchmark.
### Step 13: Buffer primitive for string building
Add `make-buffer`, `buffer-append!`, `buffer->string` primitives. Eliminates the
`(str a b c d ...)` quadratic allocation pattern in serializers and renderers.
Wire into `sx_primitives.ml` and the JS platform.
### Step 14: Inline common primitives in JIT
`hosts/ocaml/lib/sx_vm.ml`: add `OP_ADD`, `OP_SUB`, `OP_EQ`, `OP_APPEND` specialised
opcodes that skip the primitive table lookup for the most common calls. Compiler emits
these when operands are known numbers/lists.
---
## Progress log
| Step | Status | Commit |
|------|--------|--------|
| 1 — JIT combinator bug | [x] | 882a4b76 |
| 2 — letrec+resume | [x] | e80e655b |
| 3 — tokenizer :end/:line | [x] | 023bc2d8 |
| 4 — parser spans complete | [x] | b7ad5152 (subsumed by 023bc2d8) |
| 5 — OCaml AdtValue + define-type + match | [x] | 1f49242a |
| 6 — JS AdtValue + define-type + match | [x] | fc8a3916 |
| 7 — nested patterns | [ ] | — |
| 8 — exhaustiveness warnings | [ ] | — |
| 9 — parser feature registry | [ ] | — |
| 10 — compiler + as converter registry | [ ] | — |
| 11 — plugin migration + worker | [ ] | — |
| 12 — frame records | [ ] | — |
| 13 — buffer primitive | [ ] | — |
| 14 — inline primitives JIT | [ ] | — |
---
## Rules
- Branch: `architecture`. Never push to `main`.
- SX files: `sx-tree` MCP tools only. `sx_validate` after every edit.
- After every `.sx` edit to `lib/hyperscript/`, mirror to `shared/static/wasm/sx/hs-<file>.sx`.
- OCaml build: `sx_build target="ocaml"` MCP tool (never raw `dune`).
- JS build: `sx_build target="js"` MCP tool.
- One step per commit. Update progress log in this file.
- No new planning docs. No comments in SX unless non-obvious.
- Unicode in SX: raw UTF-8 only, never `\uXXXX`.