Add `_hs-feature-registry` dict and `hs-register-feature!` to `lib/hyperscript/parser.sx`. Replace `parse-feat`'s hardcoded `cond` on feature names with a registry lookup; the paren-open and default-expression branches remain as fallthroughs. Each parse-fn receives a `ctx` dict (built per call by `parse-feat-ctx`) exposing parser internals (`:adv!`, `:tp-val`, `:tp-type`, `:at-end?`, `:parse-cmd-list`, `:parse-expr`) and the per-feature handlers (`:parse-on-feat` … `:parse-socket-feat`). All nine builtins (`on`, `init`, `def`, `behavior`, `live`, `when`, `worker`, `bind`, `socket`) are registered at file load time, so plugins added later via `hs-register-feature!` persist across `hs-parse` calls. Worker stub still raises identically. Mirror `shared/static/wasm/sx/hs-parser.sx` copied byte-identical. OCaml: 4545/1339, JS: 2591/2465 — both match baseline, zero regressions. First piece of plans/designs/hs-plugin-system.md (Steps 10/11 follow).
# SX Language Improvements — roadmap

Language-building improvements to the SX evaluator, compiler, and standard library.
Ordered by impact and prerequisite chain. Each step is one loop commit.

Branch: `architecture`. SX files via `sx-tree` MCP only. Never edit generated files.

## Current baseline (2026-05-06)

- SX core spec: 2571 passing (595 non-HS pre-existing failures — bytecode-serialize, defcomp-render, etc.)
- HyperScript behavioral: 1478/1496 (run via `node tests/hs-kernel-eval.js`)
- Active bugs: JIT combinator bug (11 HS failures), letrec+resume (browser-only)
- E38 sourceInfo: 2/4 tests passing (tokenizer missing `:end`/`:line`, some spans incomplete)

---

## Phase 1 — Bug fixes

### Step 1: Fix JIT closures-returning-closures

**What:** `parse-bind`, `many`, `seq`, and other parser combinators that return closures
miscompile under JIT. The compiled closure drops intermediate stack values when the
callee itself returns a closure. 11 HyperScript tests fail under JIT, pass under CEK.

**Root cause in `hosts/ocaml/lib/sx_vm.ml`:** When a JIT-compiled closure returns
another closure (i.e. the callee is `VmClosure`), the frame restoration after the
call incorrectly reuses the parent frame's locals slot, overwriting saved intermediate
values. The `call_closure_reuse` path must snapshot `sp` before the inner call and
restore it after, or bail to the non-reuse path for closures-returning-closures.

**Verify:** `node tests/hs-kernel-eval.js 2>&1 | tail -3` — should go from 3116/3127 to 3127/3127.

### Step 2: Fix letrec + perform resume (browser)

**What:** In browser JIT mode, `letrec` sibling bindings are nil after a `perform`/resume
cycle. `call_closure_reuse` in `sx_browser.ml` intentionally ignores `_saved_sp`, which
strips the frame locals that `sf_letrec` was waiting on.

**Fix:** In `sx_browser.ml`, the `VmSuspension` resume path must restore frame locals
from the suspension snapshot before calling the continuation. Mirror what `sx_vm.ml`
does in the non-browser case.

**Verify:** Write a test in `spec/tests/` that runs `(letrec ((f (fn () (perform :io nil)))) (f))` with a resume and checks that the bindings survive. Runs under OCaml: `dune exec -- bin/run_tests.exe`.

---

## Phase 2 — Source info (E38 completion)

Design: `plans/designs/e38-sourceinfo.md`. Target: 4/4 sourceInfo tests.

The API (`hs-parse-ast`, `hs-source-for`, `hs-line-for`, `hs-node-get`, `hs-src`,
`hs-src-at`, `hs-line-at`) and parser span wrapping (`hs-ast-wrap`, `hs-span-mode`)
are already in the codebase. Two tests are passing; two fail because:
- Tokenizer tokens lack `:end` and `:line` (only `:pos` today).
- Some statement-level spans and `:next` field navigation are incomplete.

### Step 3: Tokenizer — add `:end` and `:line` to tokens

`lib/hyperscript/tokenizer.sx`: extend `hs-make-token` to `{:pos :end :value :type :line}`.
Track a `current-line` counter (1-based, increments after `\n`). Update all ~20 emission
sites. Mirror to `shared/static/wasm/sx/hs-tokenizer.sx` after edits.

**Verify:** `(hs-make-token "NUMBER" "1" 0)` returns a dict with `:end` and `:line` keys.

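The token shape and line counter described in Step 3 can be illustrated with a minimal sketch. This is TypeScript rather than SX, and it is not the project's tokenizer: `Token`, `tokenize`, the whitespace-only splitting, and the exclusive `end` offset are all assumptions made for the example.

```typescript
// Hypothetical token shape mirroring {:pos :end :value :type :line}.
interface Token {
  type: string;
  value: string;
  pos: number;  // start offset (inclusive)
  end: number;  // end offset (exclusive); an assumption of this sketch
  line: number; // 1-based line on which the token starts
}

// Toy tokenizer: splits on whitespace and tracks a 1-based line counter
// that increments after each "\n".
function tokenize(src: string): Token[] {
  const tokens: Token[] = [];
  let line = 1;
  let i = 0;
  while (i < src.length) {
    const ch = src.charAt(i);
    if (ch === "\n") { line += 1; i += 1; continue; }
    if (ch === " " || ch === "\t") { i += 1; continue; }
    const start = i;
    while (i < src.length && " \t\n".indexOf(src.charAt(i)) === -1) { i += 1; }
    const value = src.slice(start, i);
    const type = /^[0-9]+$/.test(value) ? "NUMBER" : "IDENT";
    tokens.push({ type, value, pos: start, end: i, line });
  }
  return tokens;
}
```

With `:pos` and `:end` both present, the source text of a token is simply `src.slice(tok.pos, tok.end)`, which is the kind of lookup the sourceInfo tests rely on.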
### Step 4: Complete parser spans + :next field

`lib/hyperscript/parser.sx`: ensure `hs-ast-wrap` populates `:next` on every command
in a `CommandList` (i.e. the following sibling command). Check that statement-level
productions (if, for) correctly populate `:true-branch`. Trace through the two failing
tests (`get source works for expressions`, `get line works for statements`) to find the
exact missing fields or off-by-one positions.

Mirror to `shared/static/wasm/sx/hs-parser.sx`.

**Verify:** All 4 `hs-upstream-core/sourceInfo` tests pass.

**Outcome:** Subsumed by Step 3. Once tokens carried `:end` and `:line`, the existing
parser plumbing (`link-next-cmds` for `:next`, `:true-branch` extraction in `parse-cmd`)
worked end-to-end. All 4 `hs-upstream-core/sourceInfo` tests pass with no parser changes.

---

## Phase 3 — Native ADTs (`define-type` / `match`)

Design: `plans/designs/sx-adt.md`. No existing implementation.

Impact: every language implementation (Haskell, Prolog, Lua, Common Lisp, Erlang)
currently fakes sum types with `{:tag "..." :field ...}` dicts. Native ADTs remove
that everywhere.

### Step 5: OCaml — AdtValue type + `define-type` + basic `match`

`hosts/ocaml/lib/sx_types.ml`:
```ocaml
(* Payload record; it refers to [value], so declare it in the same [type ... and ...] group. *)
type adt_value = { av_type: string; av_ctor: string; av_fields: value array }
(* New constructor to add to the existing [value] variant. *)
| AdtValue of adt_value
```

`hosts/ocaml/lib/sx_runtime.ml` (or evaluator):
- `step-sf-define-type`: parse `(Name (Ctor1 f1 f2) (Ctor2) ...)`, register constructor
  NativeFns, predicates (`Ctor1?`, `Name?`), field accessors (`Ctor1-f1`) via `env-bind!`.
- `step-sf-match` + `MatchFrame`: linear scan of clauses; flat patterns only for 6a;
  bind pattern variables in child env; `else` clause; raise on no match.
- `type-of` returns the type name (e.g. `"Maybe"`).

Write tests in `spec/tests/test-adt.sx`: basic constructor, predicate, accessor, match,
else, no-match raise.

**Verify:** `dune exec -- bin/run_tests.exe` — new test file all green.

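As a rough illustration of what `define-type` registers, the sketch below binds a constructor, a `Ctor?` predicate, a `Name?` predicate, and `Ctor-field` accessors into an environment. It is TypeScript, not the OCaml host code: `defineType`, `isAdt`, the `env` record, and the arity check are assumptions of the example, and the value shape borrows the `{ _adt, _type, _ctor, _fields }` layout from Step 6.

```typescript
type Value = number | string | boolean | null | Adt;
interface Adt { _adt: true; _type: string; _ctor: string; _fields: Value[]; }
type Fn = (...args: Value[]) => Value;
type CtorSpec = [string, string[]]; // [constructor name, field names]

function isAdt(v: Value): v is Adt {
  return typeof v === "object" && v !== null && v._adt === true;
}

// Hypothetical stand-in for step-sf-define-type: binds everything into env.
function defineType(env: Record<string, Fn>, typeName: string, ctors: CtorSpec[]): void {
  env[typeName + "?"] = (v: Value): Value => isAdt(v) && v._type === typeName;
  for (const [ctor, fields] of ctors) {
    env[ctor] = (...args: Value[]): Value => {
      if (args.length !== fields.length) throw new Error(ctor + ": arity mismatch");
      return { _adt: true, _type: typeName, _ctor: ctor, _fields: args };
    };
    env[ctor + "?"] = (v: Value): Value => isAdt(v) && v._ctor === ctor;
    fields.forEach((field: string, i: number) => {
      env[ctor + "-" + field] = (v: Value): Value => {
        if (!isAdt(v) || v._ctor !== ctor) throw new Error(ctor + "-" + field + ": wrong constructor");
        return v._fields[i];
      };
    });
  }
}
```

For example, `defineType(env, "Maybe", [["Just", ["value"]], ["Nothing", []]])` would bind `Just`, `Nothing`, `Just?`, `Nothing?`, `Maybe?`, and `Just-value`.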
### Step 6: JS — AdtValue + `define-type` + `match`

`hosts/javascript/platform.py`: add `AdtValue` as `{ _adt: true, _type, _ctor, _fields }`.
Mirror `define-type` and `match` special forms in the JS evaluator.
Retranspile: `python3 hosts/javascript/cli.py --output shared/static/scripts/sx-browser.js`

**Verify:** `node hosts/javascript/run_tests.js` — adt tests pass on JS too.

### Step 7: Nested patterns (Phase 6b)

Both OCaml and JS `MatchFrame`: replace linear binding with recursive
`matchPattern(pattern, value, env)` that:
- Recurses into constructor sub-patterns.
- Returns `{matched: bool, bindings: map}`.
- Handles wildcard `_`, literals (`42`, `"str"`, `true`, `nil`).

Extend `spec/tests/test-adt.sx` with nested pattern tests.

**Outcome:** No host-side changes needed. The spec-level `match-pattern` function
in `spec/evaluator.sx` (≈line 2835) already recurses through constructor
sub-patterns via the dict-shape shim (`(get value :_adt|:_ctor|:_fields)`),
handles `_` wildcards, literals, and variable bindings. Step 7 added 8 new
deftests to `spec/tests/test-adt.sx` covering: nested constructor sanity,
nested constructor with field binding, nested wildcard, nested literal
equality, nested literal-vs-var clause fall-through, deeply nested constructors,
mixed bind+wildcard, and nested ctor fail-through. Both hosts: +8 tests pass,
zero regressions (OCaml 4532→4540, JS 2578→2586).

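The recursive matcher outlined in Step 7 might look roughly like the following. This is a TypeScript sketch under stated assumptions, not the spec-level `match-pattern`: the `Pattern` encoding with a `kind` tag is invented for the example, and bindings are threaded as a plain object rather than an SX environment.

```typescript
type Value = number | string | boolean | null | Adt;
interface Adt { _adt: true; _type: string; _ctor: string; _fields: Value[]; }

// Hypothetical pattern encoding, used only in this sketch.
type Pattern =
  | { kind: "wildcard" }
  | { kind: "var"; name: string }
  | { kind: "lit"; value: number | string | boolean | null }
  | { kind: "ctor"; ctor: string; args: Pattern[] };

interface MatchResult { matched: boolean; bindings: Record<string, Value>; }

function matchPattern(p: Pattern, v: Value, bindings: Record<string, Value> = {}): MatchResult {
  const fail: MatchResult = { matched: false, bindings: {} };
  switch (p.kind) {
    case "wildcard":
      return { matched: true, bindings };
    case "var": {
      const next: Record<string, Value> = { ...bindings };
      next[p.name] = v; // bind without mutating the caller's bindings
      return { matched: true, bindings: next };
    }
    case "lit":
      return p.value === v ? { matched: true, bindings } : fail;
    case "ctor": {
      if (typeof v !== "object" || v === null || v._ctor !== p.ctor) return fail;
      if (v._fields.length !== p.args.length) return fail;
      let acc = bindings;
      for (let i = 0; i < p.args.length; i += 1) {
        const r = matchPattern(p.args[i], v._fields[i], acc); // recurse into sub-pattern
        if (!r.matched) return fail;
        acc = r.bindings;
      }
      return { matched: true, bindings: acc };
    }
  }
}
```

Returning a fresh bindings object on each successful step means a clause that fails halfway leaves no partial bindings behind, so the next clause starts clean.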
### Step 8: Exhaustiveness warnings (Phase 6c)

`_adt_registry: type_name → [ctor_names]` global populated by `define-type`.
On first non-exhaustive `match` evaluation: `console.warn("[sx] match: non-exhaustive …")`.
No error — warning only.

**Outcome:** `host-warn` primitive added on both hosts (OCaml `prerr_endline`,
JS `console.warn`). Spec-level helpers `match-clause-is-else?`,
`match-clause-ctor-name`, `match-warn-non-exhaustive`,
`match-check-exhaustiveness` added in `spec/evaluator.sx` and
called from `step-sf-match`. `*adt-warned*` env-bound dict used to
dedupe warnings per (type, missing-set). The OCaml `step_sf_match`
in `hosts/ocaml/lib/sx_ref.ml` was hand-patched (not retranspiled)
because `sx_ref.ml` retranspilation drops several preamble fixes;
the spec changes still flow to JS via `sx_build target="js"`. Both
hosts emit identical warnings (e.g. `[sx] match: non-exhaustive — Maybe: missing Nothing`).
5 new tests added. OCaml: 4540 → 4545. JS: 2586 → 2591. Zero regressions.

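The check-and-dedupe behaviour from Step 8 can be sketched as follows. This is an illustrative TypeScript model, not the spec helpers: `adtRegistry`, `adtWarned`, and `checkExhaustiveness` are hypothetical names standing in for `_adt_registry`, `*adt-warned*`, and `match-check-exhaustiveness`, and the dedupe key format is an assumption. The message format follows the example warning quoted in the outcome.

```typescript
// Hypothetical stand-ins for _adt_registry and the *adt-warned* dedupe dict.
const adtRegistry: Record<string, string[]> = {};
const adtWarned: Record<string, boolean> = {};

// Returns the warning text the first time a (type, missing-set) pair is seen,
// or null when the match is exhaustive, has an else clause, or was already warned.
function checkExhaustiveness(
  typeName: string,
  clauseCtors: string[],
  hasElse: boolean,
  warn: (msg: string) => void
): string | null {
  const all = adtRegistry[typeName];
  if (hasElse || all === undefined) return null;
  const missing = all.filter((c: string) => clauseCtors.indexOf(c) === -1);
  if (missing.length === 0) return null;
  const key = typeName + ":" + missing.join(","); // dedupe per (type, missing-set)
  if (adtWarned[key]) return null;
  adtWarned[key] = true;
  const msg = "[sx] match: non-exhaustive — " + typeName + ": missing " + missing.join(", ");
  warn(msg); // warning only; evaluation continues
  return msg;
}
```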

---

## Phase 4 — Plugin / extension system

Design: `plans/designs/hs-plugin-system.md`.

### Step 9: Parser feature registry

`lib/hyperscript/parser.sx`: replace `parse-feat` hardcoded `cond` with a dict lookup.
`(hs-register-feature! name parse-fn)` adds to the registry.

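A registry-based dispatch of the kind Step 9 describes could be modelled like this. It is a TypeScript sketch, not the SX parser: `Ctx`, `AstNode`, `registerFeature`, and `parseFeat` are hypothetical, the context is far smaller than a real one, and consuming the keyword before calling the parse function is an assumption of the example.

```typescript
// Hypothetical, heavily reduced parser context and AST node.
interface Ctx { tokens: string[]; pos: number; }
interface AstNode { type: string; body: string[]; }
type ParseFn = (ctx: Ctx) => AstNode;

const featureRegistry: Record<string, ParseFn> = {};

function registerFeature(name: string, parseFn: ParseFn): void {
  featureRegistry[name] = parseFn;
}

// Registry lookup first; unknown keywords fall through to a default branch.
function parseFeat(ctx: Ctx): AstNode {
  const word = ctx.tokens[ctx.pos];
  // Own-property check so words like "constructor" do not hit Object.prototype.
  const parseFn = Object.prototype.hasOwnProperty.call(featureRegistry, word)
    ? featureRegistry[word]
    : undefined;
  if (parseFn !== undefined) {
    ctx.pos += 1; // consume the feature keyword (assumption of this sketch)
    return parseFn(ctx);
  }
  return { type: "expr", body: ctx.tokens.slice(ctx.pos) }; // default fallthrough
}

// A builtin registered at load time, in the same way a plugin would register later.
registerFeature("init", (ctx: Ctx): AstNode => ({ type: "init", body: ctx.tokens.slice(ctx.pos) }));
```

Because builtins and plugins go through the same `registerFeature` call, adding a feature needs no change to `parseFeat` itself.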
### Step 10: Compiler command registry + `as` converter registry

`lib/hyperscript/compiler.sx`: replace `hs-to-sx` hardcoded dispatch with dict.
`(hs-register-command! name compile-fn)` and `(hs-register-converter! name convert-fn)`.

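For the `as` converter half of Step 10, a minimal sketch of a name-to-function registry is shown below. It is TypeScript and purely illustrative: `registerConverter` and `convertAs` are hypothetical names, the two sample converters are chosen for the example, and raising on an unknown conversion name is an assumption rather than documented behaviour.

```typescript
type ConvertFn = (value: unknown) => unknown;

const converterRegistry: Record<string, ConvertFn> = {};

function registerConverter(name: string, convertFn: ConvertFn): void {
  converterRegistry[name] = convertFn;
}

// Models `<value> as <name>`: look the converter up by name and apply it.
function convertAs(value: unknown, name: string): unknown {
  if (!Object.prototype.hasOwnProperty.call(converterRegistry, name)) {
    throw new Error("unknown conversion: " + name); // assumption of this sketch
  }
  return converterRegistry[name](value);
}

// Sample converters, for illustration only.
registerConverter("Int", (v: unknown): unknown => parseInt(String(v), 10));
registerConverter("String", (v: unknown): unknown => String(v));
```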
### Step 11: Migrate hs-prolog-hook + Worker plugin

`lib/hyperscript/runtime.sx`: remove `hs-prolog-hook`/`hs-set-prolog-hook!` ad-hoc
slots. Create `lib/hyperscript/plugins/prolog.sx` that calls `hs-register-feature!`
and `hs-register-command!`. Create `lib/hyperscript/plugins/worker.sx` replacing the
E39 stub.

---

## Phase 5 — Performance

These are incremental and can interleave with other phases.

### Step 12: Frame records (CEK)

`hosts/ocaml/lib/sx_runtime.ml`: represent CEK frames as OCaml records instead of
tagged variant lists. Eliminates allocation pressure from list construction per frame.
Profile before/after on a tight-loop benchmark.

### Step 13: Buffer primitive for string building

Add `make-buffer`, `buffer-append!`, `buffer->string` primitives. Eliminates the
`(str a b c d ...)` quadratic allocation pattern in serializers and renderers.
Wire into `sx_primitives.ml` and the JS platform.

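One plausible shape for the three buffer primitives in Step 13, sketched in TypeScript: chunks are collected in an array and joined once at the end, instead of building a new string on every append. `StrBuffer`, `makeBuffer`, `bufferAppend`, `bufferToString`, and `renderList` are hypothetical names for this example; the actual host representation may well differ.

```typescript
// Hypothetical buffer: a growable list of chunks, joined once on demand.
interface StrBuffer { chunks: string[]; }

function makeBuffer(): StrBuffer {
  return { chunks: [] };
}

function bufferAppend(buf: StrBuffer, s: string): void {
  buf.chunks.push(s); // no intermediate string is built here
}

function bufferToString(buf: StrBuffer): string {
  return buf.chunks.join(""); // single pass over all chunks
}

// Example renderer that appends piecewise instead of concatenating in a loop.
function renderList(items: string[]): string {
  const buf = makeBuffer();
  bufferAppend(buf, "<ul>");
  for (const item of items) {
    bufferAppend(buf, "<li>" + item + "</li>");
  }
  bufferAppend(buf, "</ul>");
  return bufferToString(buf);
}
```

How much this saves depends on how the host implements string concatenation, so a before/after measurement on a real serializer is the safer guide.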
### Step 14: Inline common primitives in JIT

`hosts/ocaml/lib/sx_vm.ml`: add `OP_ADD`, `OP_SUB`, `OP_EQ`, `OP_APPEND` specialised
opcodes that skip the primitive table lookup for the most common calls. Compiler emits
these when operands are known numbers/lists.

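The idea behind Step 14 can be shown on a toy stack machine: the compiler emits a dedicated opcode when it knows both operands are numbers, and the interpreter handles that opcode without consulting the primitive table. This TypeScript sketch is not the OCaml VM; the `Instr` encoding, `compileCall`, `run`, and the `CONST`/`CALL_PRIM` opcodes are assumptions made for the example.

```typescript
type Instr =
  | { op: "CONST"; value: number }
  | { op: "CALL_PRIM"; name: string; argc: number }
  | { op: "OP_ADD" }
  | { op: "OP_SUB" };

// Generic path: primitives are found by name at run time.
const primitives: Record<string, (args: number[]) => number> = {
  "+": (a: number[]): number => a[0] + a[1],
  "-": (a: number[]): number => a[0] - a[1],
};

// Compiles (name lhs rhs) where both operands are numeric literals,
// i.e. the "operands are known numbers" case.
function compileCall(name: string, lhs: number, rhs: number, inline: boolean): Instr[] {
  const args: Instr[] = [{ op: "CONST", value: lhs }, { op: "CONST", value: rhs }];
  if (inline && name === "+") return args.concat([{ op: "OP_ADD" }]);
  if (inline && name === "-") return args.concat([{ op: "OP_SUB" }]);
  return args.concat([{ op: "CALL_PRIM", name, argc: 2 }]);
}

function run(code: Instr[]): number {
  const stack: number[] = [];
  for (const ins of code) {
    switch (ins.op) {
      case "CONST":
        stack.push(ins.value);
        break;
      case "OP_ADD": { // specialised: no table lookup
        const b = stack.pop() as number;
        const a = stack.pop() as number;
        stack.push(a + b);
        break;
      }
      case "OP_SUB": {
        const b = stack.pop() as number;
        const a = stack.pop() as number;
        stack.push(a - b);
        break;
      }
      case "CALL_PRIM": { // generic: table lookup on every call
        const argv = stack.splice(stack.length - ins.argc, ins.argc);
        stack.push(primitives[ins.name](argv));
        break;
      }
    }
  }
  return stack[stack.length - 1];
}
```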

---

## Progress log

| Step | Status | Commit |
|------|--------|--------|
| 1 — JIT combinator bug | [x] | 882a4b76 |
| 2 — letrec+resume | [x] | e80e655b |
| 3 — tokenizer :end/:line | [x] | 023bc2d8 |
| 4 — parser spans complete | [x] | b7ad5152 (subsumed by 023bc2d8) |
| 5 — OCaml AdtValue + define-type + match | [x] | 1f49242a |
| 6 — JS AdtValue + define-type + match | [x] | fc8a3916 |
| 7 — nested patterns | [x] | 0679edf5 |
| 8 — exhaustiveness warnings | [x] | 6d391119 |
| 9 — parser feature registry | [x] | PENDING |
| 10 — compiler + as converter registry | [ ] | — |
| 11 — plugin migration + worker | [ ] | — |
| 12 — frame records | [ ] | — |
| 13 — buffer primitive | [ ] | — |
| 14 — inline primitives JIT | [ ] | — |

---

## Rules

- Branch: `architecture`. Never push to `main`.
- SX files: `sx-tree` MCP tools only. `sx_validate` after every edit.
- After every `.sx` edit to `lib/hyperscript/`, mirror to `shared/static/wasm/sx/hs-<file>.sx`.
- OCaml build: `sx_build target="ocaml"` MCP tool (never raw `dune`).
- JS build: `sx_build target="js"` MCP tool.
- One step per commit. Update progress log in this file.
- No new planning docs. No comments in SX unless non-obvious.
- Unicode in SX: raw UTF-8 only, never `\uXXXX`.