erlang: sync fed-sx + opcode-ext plans; add Phase 9 (specialized opcodes)

2026-05-14 20:45:05 +00:00
parent 6636f9c170
commit f6a6865635
4 changed files with 4010 additions and 0 deletions
--- a/plans/sx-vm-opcode-extension.md
+++ b/plans/sx-vm-opcode-extension.md
@@ -0,0 +1,430 @@
+# SX VM Opcode Extension Mechanism
+
+Mechanism in `hosts/ocaml/evaluator/` that lets language ports register
+specialized bytecode opcodes without modifying the SX VM core. Direct
+prerequisite for **erlang-on-sx Phase 9** (the BEAM analog) and a structural
+enabler for any future language port that wants performance-critical opcodes.
+
+Reference: `plans/erlang-on-sx.md` Phase 9, `plans/fed-sx-design.md` §17.5,
+`hosts/ocaml/lib/sx_vm.ml` (current VM).
+
+Status: **design** — implementation pending. Sister workstream to the
+`loops/erlang` loop, but lives in `hosts/`, not `lib/erlang/`.
+
+---
+
+## Goal
+
+Allow language ports to register custom bytecode opcodes in the SX VM, with:
+
+- **Zero overhead for core opcodes.** Existing 37 opcodes (per `sx_vm.ml`)
+  must dispatch identically. No regression for any existing language port or
+  the core SX runtime.
+- **One additional dispatch step for extension opcodes.** Acceptable cost; the
+  win comes from avoiding the general CEK machinery.
+- **Per-extension state slot.** Erlang's process scheduler, Haskell's thunk
+  cache, etc. need somewhere to hang state alongside the VM.
+- **Compiler awareness.** The bytecode compiler (`lib/compiler.sx`) must be
+  able to emit extension opcodes by name, looked up against the registered
+  set.
+- **JIT compatibility.** Existing JIT (lazy lambda compilation) continues to
+  work for code paths using only core opcodes. Extension opcodes are
+  interpreted in v1; JITing them is a follow-up.
+
+## Non-goals
+
+- **Hot opcode reload.** Adding/replacing opcodes mid-runtime is not in
+  scope. Extensions are compile-time additions to the OCaml binary. (If
+  needed, that's a separate project.)
+- **Per-instance opcode sets.** All running instances of the SX VM share
+  the same opcode set determined at build time. Selective opcode loading
+  per instance is out of scope.
+- **Opcode hot-swap or supersession.** Once registered, opcodes are stable
+  for the lifetime of the binary.
+- **Language-port isolation at the dispatch layer.** Two language ports can
+  see each other's opcodes (they share the dispatch table). Isolation is a
+  build-time concern — don't compile in extensions you don't trust.
+
+---
+
+## Why now
+
+The Erlang-on-SX Phase 9 work needs this. Without it, Phases 9b-9g (the actual
+opcode implementations) have nowhere to plug in. The Erlang loop will hit
+this dependency as a Blocker; this design is what unblocks it.
+
+It also enables the **shared opcode pattern** discussed in
+`plans/fed-sx-design.md` §17.5: opcodes Erlang Phase 9 produces that other
+ports could plausibly use (pattern match, perform/handle, record access) get
+chiselled out to `lib/guest/vm/` when a second port has an actual second use.
+Without the extension mechanism, each port would have to fork the SX VM core
+or modify shared dispatch, and neither is acceptable.
+
+---
+
+## Architectural overview
+
+```
+                ┌──────────────────────────────────────────┐
+                │ SX VM core (hosts/ocaml/lib/sx_vm.ml)    │
+                │                                            │
+                │  ┌────────────────────────────────────┐  │
+                │  │ Bytecode dispatch loop             │  │
+                │  │                                     │  │
+                │  │ match op with                       │  │
+                │  │   | 1  (OP_CONST) -> ...           │  │
+                │  │   | 2  (OP_NIL)   -> ...           │  │
+                │  │   | ...                            │  │
+                │  │   | 127 -> ... (last core opcode)  │  │
+                │  │   | op when op >= 128 ->            │  │
+                │  │       Extensions.dispatch op vm     │  │ ◄── new
+                │  │       frame                         │  │
+                │  └────────────────────────────────────┘  │
+                │                                            │
+                │  ┌────────────────────────────────────┐  │
+                │  │ Extension registry                 │  │
+                │  │   opcode_id -> handler             │  │ ◄── new
+                │  │   opcode_name -> opcode_id         │  │
+                │  │   extension_state per extension    │  │
+                │  └────────────────────────────────────┘  │
+                └──────────────────────────────────────────┘
+                                   ▲
+                                   │ register at startup
+                ┌──────────────────┴──────────────────────┐
+                │ Extension modules                       │
+                │  hosts/ocaml/extensions/erlang.ml       │
+                │  hosts/ocaml/extensions/haskell.ml      │
+                │  hosts/ocaml/extensions/datalog.ml      │
+                │  hosts/ocaml/extensions/guest_vm.ml     │ ◄── shared opcodes
+                └─────────────────────────────────────────┘
+```
+
+### Opcode ID space partition
+
+Current SX VM uses opcode IDs in roughly the range 1-162 (per inspection of
+`sx_vm.ml`). We partition the 0-255 space:
+
+| Range | Use |
+|-------|-----|
+| 0 | reserved / NOP |
+| 1-127 | **core opcodes** — owned by the SX VM, locked schema |
+| 128-199 | **`lib/guest/vm/` shared opcodes** — chiselled-out shared opcodes |
+| 200-247 | **language-port opcodes** — registered by extensions |
+| 248-255 | reserved for future expansion / multi-byte opcodes |
+
+This gives 72 slots for shared opcodes (Phase 1-2 of `lib/guest/vm/` will
+not exhaust this; we can renegotiate if it does), 48 slots shared among all
+compiled-in language ports' specialized opcodes, and clean separation that
+makes it obvious which opcodes are stable (core), shared (guest), or
+port-specific (extension).
+
+If we need more than 256 opcodes total, multi-byte opcodes (a leading 248-255
+byte plus a second byte) extend the space without breaking the schema.
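+
+As a rough illustration of the partition, the table can be encoded as a
+classifier over opcode bytes. This is only a sketch: `tier`, `tier_of_opcode`,
+`guest_slots` and `port_slots` are hypothetical names that do not exist in
+`sx_vm.ml`.
+
+```ocaml
+(* Hypothetical helper: classify an opcode byte using the partition table. *)
+type tier = Reserved_nop | Core | Guest_shared | Port | Reserved_future
+
+let tier_of_opcode (op : int) : tier =
+  if op < 0 || op > 255 then invalid_arg "tier_of_opcode: not a byte"
+  else if op = 0 then Reserved_nop
+  else if op <= 127 then Core
+  else if op <= 199 then Guest_shared
+  else if op <= 247 then Port
+  else Reserved_future
+
+(* Slot counts implied by the inclusive ranges in the table. *)
+let guest_slots = 199 - 128 + 1
+let port_slots = 247 - 200 + 1
+```
+
+A registry could call such a classifier at registration time to reject an
+opcode ID that falls outside the tier its extension is allowed to use.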
+
+### Extension module signature
+
+```ocaml
+(* hosts/ocaml/lib/sx_vm_extension.ml *)
+
+(** A handler for an extension opcode. Reads operands from bytecode,
+    manipulates the VM stack, updates the frame's instruction pointer.
+    May raise exceptions (which propagate via the existing VM error path). *)
+type handler = vm -> frame -> unit
+
+(** State an extension carries alongside the VM. Opaque to the VM core;
+    extensions cast as needed. *)
+type extension_state = ..
+
+module type EXTENSION = sig
+  (** Stable name for this extension (e.g. "erlang", "guest_vm"). *)
+  val name : string
+
+  (** Initialize per-instance state. Called once when the VM starts and the
+      extension is loaded. *)
+  val init : unit -> extension_state
+
+  (** Opcodes this extension provides. Each is (opcode_id, opcode_name, handler).
+      opcode_id must be in the range allowed for this extension's tier
+      (128-199 for guest, 200-247 for ports). Conflicts cause startup failure. *)
+  val opcodes : extension_state -> (int * string * handler) list
+end
+```
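+
+To make the signature concrete, here is a minimal sketch of a module that
+satisfies it, modelled on the `OP_TEST_PUSH_42` opcode proposed for Phase D.
+The `vm` and `frame` records are stand-ins for the real VM types, and
+`Test_ext`, `Test_state`, the execution counter and the fully qualified opcode
+name are all assumptions made for illustration.
+
+```ocaml
+(* Stand-ins for the real VM types, which live in sx_vm.ml. *)
+type vm = { mutable stack : int list }
+type frame = { mutable ip : int }
+
+type handler = vm -> frame -> unit
+type extension_state = ..
+
+module type EXTENSION = sig
+  val name : string
+  val init : unit -> extension_state
+  val opcodes : extension_state -> (int * string * handler) list
+end
+
+(* Hypothetical state: a counter of how many times the opcode has run. *)
+type extension_state += Test_state of int ref
+
+module Test_ext : EXTENSION = struct
+  let name = "test_ext"
+  let init () = Test_state (ref 0)
+
+  let opcodes st =
+    (* Recover this extension's own state by matching on its constructor. *)
+    let count () = match st with Test_state r -> incr r | _ -> () in
+    let push_42 vm frame =
+      count ();
+      vm.stack <- 42 :: vm.stack;
+      frame.ip <- frame.ip + 1
+    in
+    [ (200, "test_ext.OP_TEST_PUSH_42", push_42) ]
+end
+```
+
+Because `opcodes` receives the state returned by `init`, handlers can close
+over it directly and need no registry lookup on the hot path.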
+
+### Registration and dispatch
+
+```ocaml
+(* hosts/ocaml/lib/sx_vm_extensions.ml *)
+
+let extensions : (module EXTENSION) list ref = ref []
+let states : (string, extension_state) Hashtbl.t = Hashtbl.create 8
+let by_id : (int, handler) Hashtbl.t = Hashtbl.create 64
+let by_name : (string, int) Hashtbl.t = Hashtbl.create 64
+
+let register (m : (module EXTENSION)) =
+  let module M = (val m) in
+  let st = M.init () in
+  Hashtbl.add states M.name st;
+  List.iter (fun (id, name, h) ->
+    if Hashtbl.mem by_id id then
+      failwith (Printf.sprintf "Opcode %d (%s) already registered" id name);
+    Hashtbl.add by_id id h;
+    Hashtbl.add by_name name id
+  ) (M.opcodes st);
+  extensions := m :: !extensions
+
+let dispatch op vm frame =
+  match Hashtbl.find_opt by_id op with
+  | Some handler -> handler vm frame
+  | None -> raise (Invalid_opcode op)
+
+let id_of_name name = Hashtbl.find_opt by_name name
+let state_of_extension name = Hashtbl.find_opt states name
+```
+
+The dispatch path adds **one hashtable lookup per extension opcode**.
+That cost is acceptable: Erlang's specialized opcodes are expected to win >100×
+over going through the general CEK machine, so the lookup overhead is
+negligible by comparison.
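+
+Since opcode IDs fit in one byte, the lookup could also be a plain array
+index instead of a hash. The following is a sketch of that alternative, not
+the design above: `table` and `register_handler` are hypothetical, `vm` and
+`frame` are stand-in types, and `Invalid_opcode` is declared locally only to
+keep the example self-contained.
+
+```ocaml
+(* Stand-ins for the real VM types. *)
+type vm = { mutable stack : int list }
+type frame = { mutable ip : int }
+type handler = vm -> frame -> unit
+
+exception Invalid_opcode of int
+
+(* One slot per possible opcode byte; None means "not registered". *)
+let table : handler option array = Array.make 256 None
+
+let register_handler (id : int) (h : handler) : unit =
+  (* Only the guest (128-199) and port (200-247) tiers accept extensions. *)
+  if id < 128 || id > 247 then
+    invalid_arg (Printf.sprintf "opcode %d is outside the extension ranges" id);
+  match table.(id) with
+  | Some _ -> failwith (Printf.sprintf "Opcode %d already registered" id)
+  | None -> table.(id) <- Some h
+
+let dispatch (op : int) (vm : vm) (frame : frame) : unit =
+  if op < 0 || op > 255 then raise (Invalid_opcode op);
+  match table.(op) with
+  | Some h -> h vm frame
+  | None -> raise (Invalid_opcode op)
+```
+
+Whether this is worth doing depends on measurement; the hashtable version
+keeps the registry code uniform with the name and state tables.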
+
+### Bytecode compiler integration
+
+The compiler (`lib/compiler.sx`) needs to know extension opcode IDs to emit
+them. New SX primitive exposed to the compiler:
+
+```sx
+(extension-opcode-id "erlang.OP_PATTERN_TUPLE_2") ; → 200, or nil if not loaded
+```
+
+When the compiler wants to emit a specialized opcode, it queries by name. If
+the extension isn't loaded, the compiler falls back to the general path
+(emit a `CALL_PRIM` or general SX `case`). This means a language port's
+optimization is opt-in per build, and missing extensions degrade to slower
+correct execution rather than failure.
+
+Naming convention: `<extension-name>.OP_<NAME>`. So `erlang.OP_PATTERN_TUPLE_2`,
+`guest_vm.OP_PERFORM`, etc.
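+
+The lookup-then-fall-back decision can be sketched as follows. The real
+compiler is written in SX (`lib/compiler.sx`); this OCaml model only shows the
+control flow, and `instr`, `emit_specialized` and the fallback primitive name
+are hypothetical.
+
+```ocaml
+(* Hypothetical instruction type, reduced to the two cases that matter here. *)
+type instr =
+  | Ext_op of int        (* a registered extension opcode, by ID *)
+  | Call_prim of string  (* the general fallback path *)
+
+(* [lookup] plays the role of extension-opcode-id:
+   Some id when the extension is loaded, None otherwise. *)
+let emit_specialized ~(lookup : string -> int option) ~(opcode : string)
+    ~(fallback_prim : string) : instr =
+  match lookup opcode with
+  | Some id -> Ext_op id
+  | None -> Call_prim fallback_prim
+```
+
+With this shape, the same source compiles on a build with or without the
+extension; only the emitted instruction differs.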
+
+### Per-extension state access
+
+Some opcodes need state beyond the VM stack (Erlang's scheduler, mailbox
+state, etc.). Extensions store state in their `init`-returned value, accessed
+via `state_of_extension`:
+
+```ocaml
+let op_spawn vm frame =
+  let st = Sx_vm_extensions.state_of_extension "erlang"
+           |> Option.get
+           |> Obj.magic in   (* extension casts to its known type *)
+  let body = pop vm in
+  let pid = Erlang_scheduler.spawn st body in
+  push vm (pid_value pid);
+  frame.ip <- frame.ip + 1
+```
+
+Shared scheduler state lives in the Erlang extension's state value. Other
+extensions don't see it.
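+
+Because `extension_state` is declared as an extensible variant (`..`), an
+extension can also recover its state by pattern matching on its own
+constructor, which avoids `Obj.magic`. A minimal sketch, assuming a
+hypothetical `scheduler` record and `Erlang_state` constructor, with a local
+`states` table standing in for the registry:
+
+```ocaml
+type extension_state = ..
+
+(* Hypothetical scheduler record standing in for the Erlang extension's state. *)
+type scheduler = { mutable next_pid : int }
+type extension_state += Erlang_state of scheduler
+
+(* Local stand-in for the registry's name -> state table. *)
+let states : (string, extension_state) Hashtbl.t = Hashtbl.create 8
+
+(* Type-checked recovery of the state: no unsafe cast involved. *)
+let erlang_scheduler () : scheduler =
+  match Hashtbl.find_opt states "erlang" with
+  | Some (Erlang_state s) -> s
+  | Some _ -> failwith "state registered under \"erlang\" has an unexpected constructor"
+  | None -> failwith "erlang extension is not loaded"
+
+let fresh_pid (s : scheduler) : int =
+  let pid = s.next_pid in
+  s.next_pid <- pid + 1;
+  pid
+```
+
+A wrong constructor then fails with a clear message at the lookup site rather
+than corrupting memory later.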
+
+---
+
+## Phase plan
+
+Five sub-phases in dependency order. Each is testable in isolation.
+
+### Phase A — Opcode ID partition + dispatch fallthrough
+
+Smallest viable change to `sx_vm.ml`:
+
+- Add the `| op when op >= 128 -> Sx_vm_extensions.dispatch op vm frame`
+  fallthrough case.
+- Document the partition in a comment at the top of the opcode list.
+
+**Tests:**
+- All existing SX VM tests pass unchanged (zero regression for core).
+- Calling `dispatch 200 ...` with no extension registered raises
+  `Invalid_opcode 200`.
+
+**Effort:** small. ~50 lines + tests.
+
+### Phase B — Extension registry module
+
+`hosts/ocaml/lib/sx_vm_extensions.ml` per the sketch above. Pure plumbing, no
+opcodes yet.
+
+**Tests:**
+- Register a test extension with one opcode; dispatch finds it.
+- Duplicate opcode-id registration fails at startup.
+- `id_of_name` and `state_of_extension` lookups work.
+
+**Effort:** small. ~150 lines + tests.
+
+### Phase C — Compiler-side opcode lookup primitive
+
+Expose `extension-opcode-id` as an SX primitive in `hosts/ocaml/lib/`. The
+compiler in `lib/compiler.sx` can call it to emit extension opcodes by name.
+
+Does not require any extension to actually exist — the primitive returns
+`nil` for unknown names, and the compiler falls back.
+
+**Tests:**
+- Primitive returns nil for unknown name.
+- After registering a test extension, primitive returns the registered ID.
+
+**Effort:** small. Single primitive registration + compiler-side use docs.
+
+### Phase D — Test extension demonstrating end-to-end flow
+
+A dummy extension at `hosts/ocaml/extensions/test_ext.ml` registering one or
+two trivial opcodes (e.g. `OP_TEST_PUSH_42`, `OP_TEST_DOUBLE_TOS`). Wired
+into the build, available when running tests.
+
+Compiler test: write SX that triggers the test compiler-extension to emit
+`OP_TEST_PUSH_42`, then verify the VM executes it correctly via
+`bytecode-inspect` and `vm-trace`.
+
+**Tests:**
+- Bytecode emission via name lookup produces the right ID.
+- Execution produces the expected stack effect.
+- `bytecode-inspect` shows the opcode by name.
+- `vm-trace` correctly reports the extension opcode.
+
+**Effort:** small. ~100 lines including build wiring.
+
+### Phase E — JIT awareness (interpreted-only for v1)
+
+The JIT (lazy lambda compilation) currently compiles based on opcode ranges.
+Extension opcodes (≥128) should fall through to interpretation, not be
+JIT-compiled in v1.
+
+- Mark extension opcodes as "interpret only" in the JIT pre-analysis.
+- A lambda containing only core opcodes JIT-compiles as before.
+- A lambda containing any extension opcode runs interpreted.
+
+JITing extension opcodes is a follow-up project; v1 keeps the JIT scope
+unchanged and just makes it correctly route mixed bytecode.
+
+**Tests:**
+- Lambda with only core opcodes: JIT-compiled, fast path.
+- Lambda with extension opcode: interpreted, correct result.
+- Mixed lambda: interpreted, correct result.
+
+**Effort:** small-medium. Requires understanding the JIT's pre-analysis
+(per `project_jit_compilation.md` memory: "Lazy JIT implemented: lambda
+bodies compiled on first VM call, cached, failures sentinel-marked").
+Extension-opcode detection becomes another reason to mark a lambda
+"interpret-only."
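+
+The routing rule itself is small. The sketch below assumes the pre-analysis
+already has a decoded list of opcode IDs (operands stripped) for a lambda
+body; `first_extension_opcode` and `jit_eligible` are hypothetical names, and
+the real JIT may apply further criteria.
+
+```ocaml
+(* First ID handled by the extension fallthrough, per the partition table. *)
+let first_extension_opcode = 128
+
+(* A lambda is JIT-eligible only if every opcode in its body is a core opcode. *)
+let jit_eligible (opcodes : int list) : bool =
+  List.for_all (fun op -> op < first_extension_opcode) opcodes
+```
+
+Scanning raw bytecode bytes instead would be wrong, since an operand byte of
+128 or more could be mistaken for an extension opcode.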
+
+---
+
+## Acceptance criteria
+
+1. **Phases A-D pass their test suites.**
+2. **Zero regression on existing SX VM tests.** All language-port test
+   suites currently passing on the architecture branch (Erlang 530+, Haskell
+   285+, Datalog 276+, Smalltalk 625+, the SX core test suite, etc.) still
+   pass.
+3. **Test extension demonstrates the flow end-to-end.** SX source compiles
+   via the compiler with a registered extension opcode, executes through the
+   VM via the dispatch fallthrough, returns correct result.
+4. **Documentation:** README in `hosts/ocaml/extensions/` explaining the
+   pattern, with a worked example (the test extension is the canonical one).
+
+After acceptance, the Erlang-on-SX Phase 9 work in `lib/erlang/vm/` can use
+this mechanism. The Erlang loop's Blocker for 9a is resolved.
+
+---
+
+## Risk and mitigation
+
+**Risk: regression in core opcode dispatch.** A misplaced `match` arm could
+break something. *Mitigation:* run every existing language-port test suite
+before merging. The cost of this verification is real — probably an hour of
+machine time — but cheaper than discovering it after the fact.
+
+**Risk: opcode ID conflicts as more extensions land.** If Erlang Phase 9
+claims IDs 200-220 and Haskell wants 215-235, we have a problem.
+*Mitigation:* maintain a registry document at
+`hosts/ocaml/extensions/README.md` listing claimed ID ranges per extension.
+Convention: each extension claims a contiguous block at first registration;
+collisions are caught at startup with a clear error.
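+
+Overlapping block claims can also be checked mechanically, for example in a
+test that mirrors the README registry. A sketch, with `claim`, `overlaps` and
+`conflicts` as hypothetical names:
+
+```ocaml
+(* A contiguous block of opcode IDs claimed by one extension (inclusive). *)
+type claim = { ext : string; first : int; last : int }
+
+(* Two inclusive ranges overlap when each starts no later than the other ends. *)
+let overlaps a b = a.first <= b.last && b.first <= a.last
+
+(* Return every pair of extensions whose claimed blocks overlap. *)
+let conflicts (claims : claim list) : (string * string) list =
+  let rec go acc = function
+    | [] -> List.rev acc
+    | c :: rest ->
+      let hits = List.filter (overlaps c) rest in
+      let acc = List.fold_left (fun acc h -> (c.ext, h.ext) :: acc) acc hits in
+      go acc rest
+  in
+  go [] claims
+```
+
+Applied to the example above (Erlang 200-220, Haskell 215-235), this reports
+the pair before either extension is linked into a binary.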
+
+**Risk: extension state types leak through `Obj.magic`.** The extension state
+is type-erased in the registry. *Mitigation:* extensions cast in their own
+opcode handlers, never expose state to other extensions or the VM core.
+First-class modules / GADTs could add more type safety; deferred unless
+this becomes a concrete pain point.
+
+**Risk: extensions become a back door for kernel mutation.** An extension
+opcode handler has full access to the VM. *Mitigation:* extensions are
+build-time additions, not runtime; they're as trusted as the rest of the
+binary. Operators audit at build time, not runtime. Same trust model as
+any other compiled-in code.
+
+**Risk: shared `lib/guest/vm/` opcodes evolve under different language
+ports' needs.** *Mitigation:* the chiselling discipline (move to guest only
+on second use) ensures the shared opcodes are tested against at least two
+ports' actual usage before being considered stable.
+
+---
+
+## Open questions
+
+To be resolved during implementation, not blocking design approval:
+
+1. **Multi-byte opcode encoding.** If we need >256 opcodes total, the
+   leading-byte 248-255 schema accommodates it. Do we need multi-byte at
+   v1? Probably not: the single-byte space leaves 72 shared and 48
+   port-specific IDs, which should be more than the first ports reasonably
+   want.
+2. **Extension ordering matters?** If two extensions register opcodes that
+   read the same VM state, ordering of registration could matter for
+   initialization. Probably not in practice; flag if it bites.
+3. **Hot-reload of extensions.** Out of scope for v1 (per non-goals). If
+   wanted later, the registry would need teardown + re-registration; the
+   `gen_server` `code_change/3` model from Erlang Phase 7 is a precedent.
+4. **Cross-extension opcode composition.** Can `guest_vm.OP_PERFORM` invoke
+   `erlang.OP_RECEIVE_SCAN`? In principle yes — handlers can do anything.
+   The interface is clean; the question is whether we want any conventions
+   to keep ergonomics tractable. Defer until composition appears in
+   practice.
+
+---
+
+## Implementation roadmap and sequencing
+
+This is a sister workstream to `loops/erlang`. Probably best as a single
+focused session (not a continuous loop — the work is bounded, ~1-2 weeks
+of focused effort, not iterative).
+
+Recommended sequencing:
+
+1. **A + B + C land together** as a single PR — they're tightly coupled and
+   easier to test as a unit. Branch: `loops/sx-vm-extensions` or similar.
+2. **D follows** in a second PR; demonstrates the end-to-end flow without
+   committing to any real language port's opcode design.
+3. **E (JIT integration)** as a third PR, once the basic mechanism is
+   battle-tested.
+4. **Extension scope check:** verify Erlang's Phase 9 sub-phases 9b-9g can
+   actually use this mechanism. If gaps surface, they're addressable
+   incrementally.
+5. **`hosts/ocaml/extensions/erlang.ml`** then becomes the *first real
+   consumer* — written by whoever takes over from the Erlang loop's stub
+   dispatcher. That's the integration moment that closes the loop.
+
+Estimated total effort: 1-2 weeks for one focused engineer with OCaml SX VM
+familiarity. Much less if the implementer already knows `sx_vm.ml`.
+
+---
+
+## Relationship to other plans
+
+- **`plans/erlang-on-sx.md` Phase 9:** unblocked by this work. Erlang loop
+  develops opcodes against a stub dispatcher in `lib/erlang/vm/`; once this
+  mechanism lands, swap stub for real registration via
+  `hosts/ocaml/extensions/erlang.ml`.
+- **`plans/fed-sx-design.md` §17.5:** documents this as a Layer-1 prerequisite.
+  The shared-opcode discipline for `lib/guest/vm/` builds on the 128-199 ID
+  range this mechanism reserves for shared opcodes.
+- **Future language ports (Haskell, Datalog, Smalltalk perf phases):** will
+  use the same mechanism. Each adds an extension module, claims an opcode
+  range, registers handlers. The `lib/guest/vm/` opcodes get
+  cross-referenced when the second port's needs justify chiselling.
+- **JIT roadmap (per `project_jit_architecture.md` memory):** extension
+  opcodes are interpreted in v1. JITing them is a logical follow-up but
+  a separate project.