# SX VM Opcode Extension Mechanism

Mechanism in `hosts/ocaml/lib/` that lets language ports register specialized bytecode opcodes without modifying the SX VM core. Direct prerequisite for **erlang-on-sx Phase 9** (the BEAM analog) and a structural enabler for any future language port that wants performance-critical opcodes.

Reference: `plans/erlang-on-sx.md` Phase 9, `plans/fed-sx-design.md` §17.5, `hosts/ocaml/lib/sx_vm.ml` (current VM).

Status: **complete** on `loops/sx-vm-extensions` (Phases A-E landed 2026-05-14 / 2026-05-15). Ready for first real consumer (`hosts/ocaml/lib/extensions/erlang.ml`, replacing the Phase 9b stub dispatcher in `lib/erlang/vm/dispatcher.sx`).

---

## Goal

Allow language ports to register custom bytecode opcodes in the SX VM, with:

- **Zero overhead for core opcodes.** Existing opcodes (current ceiling 175, see `sx_vm.ml`) must dispatch identically. No regression for any existing language port or the core SX runtime.
- **One additional dispatch step for extension opcodes.** Acceptable cost; the win comes from avoiding the general CEK machinery.
- **Per-extension state slot.** Erlang's process scheduler, Haskell's thunk cache, etc. need somewhere to hang state alongside the VM.
- **Compiler awareness.** The bytecode compiler (`lib/compiler.sx`) must be able to emit extension opcodes by name, looked up against the registered set.
- **JIT compatibility.** Existing JIT (lazy lambda compilation) continues to work for code paths using only core opcodes. Extension opcodes are interpreted in v1; JITing them is a follow-up.

## Non-goals

- **Hot opcode reload.** Adding/replacing opcodes mid-runtime is not in scope. Extensions are compile-time additions to the OCaml binary. (If needed, that's a separate project.)
- **Per-instance opcode sets.** All running instances of the SX VM share the same opcode set determined at build time. Selective opcode loading per instance is out of scope.
- **Opcode hot-swap or supersession.** Once registered, opcodes are stable for the lifetime of the binary.
- **Language-port isolation at the dispatch layer.** Two language ports can see each other's opcodes (they share the dispatch table). Isolation is a build-time concern: don't compile in extensions you don't trust.

---

## Why now

The Erlang-on-SX Phase 9 work needs this. Without it, Phase 9b-9g (the actual opcode implementations) have nowhere to plug in. The Erlang loop hit this dependency as a Blocker (`0abf05ed`); this design is what unblocks it.

It also enables the **shared opcode pattern** discussed in `plans/fed-sx-design.md` §17.5: opcodes Erlang Phase 9 produces that other ports could plausibly use (pattern match, perform/handle, record access) get chiselled out to `lib/guest/vm/` when a second port has an actual second use.

Without the extension mechanism, each port would have to fork the SX VM core or modify shared dispatch; neither is acceptable.

---

## Architectural overview

```
┌──────────────────────────────────────────┐
│  SX VM core (hosts/ocaml/lib/sx_vm.ml)   │
│                                          │
│  ┌────────────────────────────────────┐  │
│  │  Bytecode dispatch loop            │  │
│  │                                    │  │
│  │  match op with                     │  │
│  │  | 1 (OP_CONST) -> ...             │  │
│  │  | 2 (OP_NIL)   -> ...             │  │
│  │  | ...                             │  │
│  │  | 175 -> ... (last core opcode)   │  │
│  │  | op when op >= 200 ->            │  │
│  │      !extension_dispatch_ref op    │  │ ◄── new
│  │        vm frame                    │  │
│  └────────────────────────────────────┘  │
│                                          │
│  ┌────────────────────────────────────┐  │
│  │  Extension registry                │  │
│  │    opcode_id -> handler            │  │ ◄── Phase B
│  │    opcode_name -> opcode_id        │  │
│  │    extension_state per extension   │  │
│  └────────────────────────────────────┘  │
└──────────────────────────────────────────┘
                   ▲
                   │ register at startup
┌──────────────────┴──────────────────────┐
│  Extension modules                      │
│  hosts/ocaml/lib/extensions/erlang.ml   │
│  hosts/ocaml/lib/extensions/haskell.ml  │
│  hosts/ocaml/lib/extensions/datalog.ml  │
│  hosts/ocaml/lib/extensions/guest_vm.ml │ ◄── shared opcodes
└─────────────────────────────────────────┘
```

### Opcode ID space partition

Current SX VM uses opcode IDs from 1 to 175 (per inspection of `sx_vm.ml`, ceiling at OP_DEC = 175). We partition the 0-255 space:

| Range   | Use                                                               |
|---------|-------------------------------------------------------------------|
| 0       | reserved / NOP                                                    |
| 1-199   | **core opcodes**: owned by the SX VM, locked schema               |
| 200-247 | **extension opcodes**: registered by extensions (ports + shared)  |
| 248-255 | reserved for future expansion / multi-byte opcodes                |

This gives the core 24 free slots above the current 175 ceiling for future core additions, and 48 slots for extensions. Erlang Phase 9 expects to need fewer than 30 specialized opcodes, so this is comfortable headroom.

The plan originally proposed a finer split (`128-199` for `lib/guest/vm/` shared, `200-247` for ports). That distinction is preserved at the **naming level** (`guest_vm.OP_X` vs `erlang.OP_Y`) and policed by the registry (duplicate IDs fail at startup), without consuming separate ID ranges. The chiselling discipline (move an opcode to `guest_vm` when a second port uses it) operates at the source level.
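The partition arithmetic can be checked with a small standalone sketch. The `opcode_class` type and the `classify` helper are hypothetical names introduced here for illustration; they are not part of `sx_vm.ml`.

```ocaml
(* Hypothetical helper, for illustration only: classify a raw opcode
   byte against the partition table above. *)
type opcode_class =
  | Reserved_nop      (* 0 *)
  | Core              (* 1-199 *)
  | Extension         (* 200-247 *)
  | Reserved_future   (* 248-255 *)

let classify (op : int) : opcode_class =
  if op < 0 || op > 255 then
    invalid_arg "classify: opcode must fit in one byte"
  else if op = 0 then Reserved_nop
  else if op <= 199 then Core
  else if op <= 247 then Extension
  else Reserved_future

(* Headroom figures quoted in the text. *)
let core_ceiling = 175
let free_core_slots = 199 - core_ceiling   (* slots left for future core opcodes *)
let extension_slots = 247 - 200 + 1        (* size of the extension range *)
```

With these definitions, `free_core_slots` evaluates to 24 and `extension_slots` to 48, matching the figures above.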
If we need more than 256 opcodes total, multi-byte opcodes (a leading 248-255 byte plus a second byte) extend the space without breaking the schema.

### Extension module signature

```ocaml
(* hosts/ocaml/lib/sx_vm_extension.ml *)

(** A handler for an extension opcode. Reads operands from bytecode,
    manipulates the VM stack, updates the frame's instruction pointer.
    May raise exceptions (which propagate via the existing VM error path). *)
type handler = vm -> frame -> unit

(** State an extension carries alongside the VM. Opaque to the VM core;
    extensions cast as needed. *)
type extension_state = ..

module type EXTENSION = sig
  (** Stable name for this extension (e.g. "erlang", "guest_vm"). *)
  val name : string

  (** Initialize per-instance state. Called once when the VM starts and
      the extension is loaded. *)
  val init : unit -> extension_state

  (** Opcodes this extension provides. Each is
      (opcode_id, opcode_name, handler). opcode_id must be in 200-247.
      Conflicts cause startup failure. *)
  val opcodes : extension_state -> (int * string * handler) list
end
```

### Registration and dispatch

```ocaml
(* hosts/ocaml/lib/sx_vm_extensions.ml *)

let extensions : (module EXTENSION) list ref = ref []
let states : (string, extension_state) Hashtbl.t = Hashtbl.create 8
let by_id : (int, handler) Hashtbl.t = Hashtbl.create 64
let by_name : (string, int) Hashtbl.t = Hashtbl.create 64

let register (m : (module EXTENSION)) =
  let module M = (val m) in
  let st = M.init () in
  Hashtbl.add states M.name st;
  List.iter (fun (id, name, h) ->
    if Hashtbl.mem by_id id then
      failwith (Printf.sprintf "Opcode %d (%s) already registered" id name);
    Hashtbl.add by_id id h;
    Hashtbl.add by_name name id
  ) (M.opcodes st);
  extensions := m :: !extensions

let dispatch op vm frame =
  match Hashtbl.find_opt by_id op with
  | Some handler -> handler vm frame
  | None -> raise (Invalid_opcode op)

let id_of_name name = Hashtbl.find_opt by_name name

let state_of_extension name = Hashtbl.find_opt states name
```

Phase B installs this dispatcher into `Sx_vm.extension_dispatch_ref` at module init. Until then, the ref's default raises `Invalid_opcode op` for any opcode ≥ 200, which is the Phase A test condition.

The dispatch path adds **one hashtable lookup per extension opcode**. This is an acceptable cost: Erlang's specialized opcodes win >100× over going through the general CEK machine, so the overhead is negligible by comparison.

### Bytecode compiler integration

The compiler (`lib/compiler.sx`) needs to know extension opcode IDs to emit them. New SX primitive exposed to the compiler:

```sx
(extension-opcode-id "erlang.OP_PATTERN_TUPLE_2")
; → 200, or nil if not loaded
```

When the compiler wants to emit a specialized opcode, it queries by name. If the extension isn't loaded, the compiler falls back to the general path (emit a `CALL_PRIM` or general SX `case`). This means a language port's optimization is opt-in per build, and missing extensions degrade to slower correct execution rather than failure.

Naming convention: `<extension>.OP_<NAME>`. So `erlang.OP_PATTERN_TUPLE_2`, `guest_vm.OP_PERFORM`, etc.

### Per-extension state access

Some opcodes need state beyond the VM stack (Erlang's scheduler, mailbox state, etc.). Extensions store state in their `init`-returned value, accessed via `state_of_extension`:

```ocaml
let op_spawn vm frame =
  let st =
    Sx_vm_extensions.state_of_extension "erlang"
    |> Option.get
    |> Obj.magic  (* extension casts to its known type *)
  in
  let body = pop vm in
  let pid = Erlang_scheduler.spawn st body in
  push vm (pid_value pid);
  frame.ip <- frame.ip + 1
```

Shared scheduler state lives in the Erlang extension's state value. Other extensions don't see it.

---

## Phase plan

Five sub-phases in dependency order. Each is testable in isolation.

### Phase A: Opcode ID partition + dispatch fallthrough

- [x] Define `exception Invalid_opcode of int` in `sx_vm.ml`.
- [x] Add `extension_dispatch_ref : (int -> vm -> frame -> unit) ref` whose default handler raises `Invalid_opcode op`. Forward-declared in the same style as the existing `jit_compile_ref`.
- [x] Add `| op when op >= 200 -> !extension_dispatch_ref op vm frame` arm in the dispatch loop, immediately before the catch-all.
- [x] Document the partition in a comment near the top of the opcode list.

**Tests:**

- All existing OCaml VM/CEK tests pass unchanged (zero regression for core).
- Constructed bytecode using opcode 200 raises `Invalid_opcode 200` when no extension is registered.

**Effort:** small. ~50 lines + tests.

### Phase B: Extension registry module

`hosts/ocaml/lib/sx_vm_extensions.ml` per the sketch above. Pure plumbing, no opcodes yet. Phase B's module init installs the real `dispatch` into `Sx_vm.extension_dispatch_ref`, replacing Phase A's stub.

- [x] `Sx_vm_extension` interface module (handler type, EXTENSION sig).
- [x] `Sx_vm_extensions` registry module (`register`, `dispatch`, `id_of_name`, `state_of_extension`).
- [x] Wire the registry's `dispatch` into `Sx_vm.extension_dispatch_ref` at module init.

**Tests:**

- Register a test extension with one opcode; dispatch finds it.
- Duplicate opcode-id registration fails at startup.
- `id_of_name` and `state_of_extension` lookups work.

**Effort:** small. ~150 lines + tests.

### Phase C: Compiler-side opcode lookup primitive

Expose `extension-opcode-id` as an SX primitive in `hosts/ocaml/lib/`. The compiler in `lib/compiler.sx` can call it to emit extension opcodes by name. Does not require any extension to actually exist: the primitive returns `nil` for unknown names, and the compiler falls back.

- [x] Register `extension-opcode-id` in `sx_primitives.ml`.
- [x] Returns `Integer id` when registered, `Nil` otherwise.

**Tests:**

- Primitive returns nil for unknown name.
- After registering a test extension, primitive returns the registered ID.

**Effort:** small. Single primitive registration + compiler-side use docs.
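How Phases A through C fit together can be sketched in a self-contained toy model. This is not the real `Sx_vm`: the `vm` record, the int-only stack, the `step` function, and the pretend core opcode are assumptions made for illustration. The opcode names and IDs reuse the test extension's values from the progress log.

```ocaml
(* Toy model, not the real Sx_vm: the stack holds plain ints and there
   is no frame argument. *)
exception Invalid_opcode of int

type vm = { mutable stack : int list }

(* Phase A: forward ref whose default rejects every extension opcode. *)
let extension_dispatch_ref : (int -> vm -> unit) ref =
  ref (fun op _ -> raise (Invalid_opcode op))

(* Phase B: registry tables plus range and duplicate checks. *)
let by_id : (int, vm -> unit) Hashtbl.t = Hashtbl.create 16
let by_name : (string, int) Hashtbl.t = Hashtbl.create 16

let register_opcode (id, name, handler) =
  if id < 200 || id > 247 then
    failwith (Printf.sprintf "Opcode %d (%s) outside 200-247" id name);
  if Hashtbl.mem by_id id then
    failwith (Printf.sprintf "Opcode %d (%s) already registered" id name);
  Hashtbl.add by_id id handler;
  Hashtbl.add by_name name id

let dispatch op vm =
  match Hashtbl.find_opt by_id op with
  | Some handler -> handler vm
  | None -> raise (Invalid_opcode op)

(* Phase C: name lookup; None plays the role of the primitive's nil. *)
let id_of_name name = Hashtbl.find_opt by_name name

(* Stand-in for the core loop: one pretend core opcode, then the
   extension fallthrough arm. *)
let step vm op =
  match op with
  | 1 -> vm.stack <- 0 :: vm.stack
  | op when op >= 200 -> !extension_dispatch_ref op vm
  | op -> failwith (Printf.sprintf "unknown core opcode %d" op)

let push_42 vm = vm.stack <- 42 :: vm.stack

let double_tos vm =
  match vm.stack with
  | x :: rest -> vm.stack <- (2 * x) :: rest
  | [] -> failwith "stack underflow"

let run_demo () =
  let vm = { stack = [] } in
  (* With only the Phase A default installed, opcode 220 is rejected. *)
  let rejected =
    (try step vm 220; false with Invalid_opcode 220 -> true) in
  extension_dispatch_ref := dispatch;
  register_opcode (220, "test_ext.OP_TEST_PUSH_42", push_42);
  register_opcode (221, "test_ext.OP_TEST_DOUBLE_TOS", double_tos);
  (* Emit by name. A real compiler would emit the general path for an
     unknown name; this toy simply skips it. *)
  let program =
    List.filter_map id_of_name
      [ "test_ext.OP_TEST_PUSH_42";
        "test_ext.OP_TEST_DOUBLE_TOS";
        "missing.OP_NOPE" ] in
  List.iter (step vm) program;
  (rejected, vm.stack)
```

Calling `run_demo ()` once returns `(true, [84])`: the unregistered opcode is rejected first, then the two looked-up opcodes push 42 and double it. A second call would fail on duplicate registration, which is the same startup check the registry performs.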
### Phase D: Test extension demonstrating end-to-end flow

A dummy extension at `hosts/ocaml/lib/extensions/test_ext.ml` registering one or two trivial opcodes (e.g. `OP_TEST_PUSH_42`, `OP_TEST_DOUBLE_TOS`). Wired into the build, available when running tests.

Compiler test: write SX that triggers the test compiler-extension to emit `OP_TEST_PUSH_42`, then verify the VM executes it correctly via `bytecode-inspect` and `vm-trace`.

- [x] `test_ext.ml` registers two opcodes.
- [x] Wired into the build (extensions registered at startup).
- [x] Bytecode emission via name lookup produces the right ID.
- [x] `bytecode-inspect` shows the opcode by name.

**Tests:**

- Bytecode emission via name lookup produces the right ID.
- Execution produces the expected stack effect.
- `bytecode-inspect` shows the opcode by name.
- `vm-trace` correctly reports the extension opcode.

**Effort:** small. ~100 lines including build wiring.

### Phase E: JIT awareness (interpreted-only for v1)

The JIT (lazy lambda compilation) currently compiles based on opcode ranges. Extension opcodes (≥200) should fall through to interpretation, not be JIT-compiled in v1.

- [x] Mark extension opcodes as "interpret only" in the JIT pre-analysis.
- [x] Lambda containing only core opcodes JIT-compiles as before.
- [x] Lambda containing any extension opcode runs interpreted.

JITing extension opcodes is a follow-up project; v1 keeps the JIT scope unchanged and just makes it correctly route mixed bytecode.

**Tests:**

- Lambda with only core opcodes: JIT-compiled, fast path.
- Lambda with extension opcode: interpreted, correct result.
- Mixed lambda: interpreted, correct result.

**Effort:** small-medium. Requires understanding the JIT's pre-analysis (per `project_jit_compilation.md` memory: "Lazy JIT implemented: lambda bodies compiled on first VM call, cached, failures sentinel-marked"). Extension-opcode detection becomes another reason to mark a lambda "interpret-only."

---

## Acceptance criteria

1. **Phase A-D pass their test suites.**
2. **Zero regression on existing SX VM tests.** All language-port test suites currently passing on the architecture branch (Erlang 530+, Haskell 285+, Datalog 276+, Smalltalk 625+, the SX core test suite, etc.) still pass.
3. **Test extension demonstrates the flow end-to-end.** SX source compiles via the compiler with a registered extension opcode, executes through the VM via the dispatch fallthrough, returns correct result.
4. **Documentation:** README in `hosts/ocaml/lib/extensions/` explaining the pattern, with a worked example (the test extension is the canonical one).

After acceptance, the Erlang-on-SX Phase 9 work in `lib/erlang/vm/` can use this mechanism. The Erlang loop's Blocker for 9a is resolved.

---

## Risk and mitigation

**Risk: regression in core opcode dispatch.** A misplaced `match` arm could break something. *Mitigation:* run every existing language-port conformance suite before merging.

**Risk: opcode ID conflicts as more extensions land.** If Erlang Phase 9 claims IDs 200-220 and Haskell wants 215-235, we have a problem. *Mitigation:* maintain a registry document at `hosts/ocaml/lib/extensions/README.md` listing claimed ID ranges per extension. Convention: each extension claims a contiguous block at first registration; collisions caught at startup with a clear error.

**Risk: extension state types leak through `Obj.magic`.** The extension state is type-erased in the registry. *Mitigation:* extensions cast in their own opcode handlers, never expose state to other extensions or the VM core. First-class modules / GADTs could add more type safety; deferred unless this becomes a concrete pain point.

**Risk: extensions become a back door for kernel mutation.** An extension opcode handler has full access to the VM. *Mitigation:* extensions are build-time additions, not runtime; they're as trusted as the rest of the binary. Operators audit at build time, not runtime. Same trust model as any other compiled-in code.
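On the `Obj.magic` risk above: because `extension_state` is declared as an extensible variant, one possible alternative is for an extension to recover its state by pattern matching on its own constructor. The sketch below is an assumption-laden illustration, not the landed code: `erlang_state`, its `next_pid` field, the `Other_state` constructor, and both helper functions are hypothetical.

```ocaml
type extension_state = ..

(* Hypothetical Erlang-side state; the field is illustrative only. *)
type erlang_state = { mutable next_pid : int }
type extension_state += Erlang_state of erlang_state

(* Stand-in for some other extension's constructor. *)
type extension_state += Other_state

let states : (string, extension_state) Hashtbl.t = Hashtbl.create 8

(* Recover the Erlang state by matching on the constructor. A wrong
   constructor becomes an explicit failure instead of a silent cast. *)
let erlang_state_exn () : erlang_state =
  match Hashtbl.find_opt states "erlang" with
  | Some (Erlang_state st) -> st
  | Some _ -> failwith "erlang: state slot holds another extension's state"
  | None -> failwith "erlang: extension not registered"

(* Example use from a handler-like function: allocate a fresh pid. *)
let fresh_pid () =
  let st = erlang_state_exn () in
  let pid = st.next_pid in
  st.next_pid <- pid + 1;
  pid
```

The trade-off is one extra constructor match per state access in exchange for a checked failure mode.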
**Risk: shared `lib/guest/vm/` opcodes evolve under different language ports' needs.** *Mitigation:* the chiselling discipline (move to guest only on second use) ensures the shared opcodes are tested against at least two ports' actual usage before being considered stable.

---

## Open questions

To be resolved during implementation, not blocking design approval:

1. **Multi-byte opcode encoding.** If we need >256 opcodes total, the leading-byte 248-255 schema accommodates it. Do we need multi-byte at v1? Probably not: 48 extension opcodes is more than any single port should reasonably want.
2. **Does extension ordering matter?** If two extensions register opcodes that read the same VM state, ordering of registration could matter for initialization. Probably not in practice; flag if it bites.
3. **Hot-reload of extensions.** Out of scope for v1 (per non-goals). If wanted later, the registry would need teardown + re-registration; the `gen_server` `code_change/3` model from Erlang Phase 7 is a precedent.
4. **Cross-extension opcode composition.** Can `guest_vm.OP_PERFORM` invoke `erlang.OP_RECEIVE_SCAN`? In principle yes: handlers can do anything. The interface is clean; the question is whether we want any conventions to keep ergonomics tractable. Defer until composition appears in practice.

---

## Implementation roadmap and sequencing

This is a sister workstream to `loops/erlang`. Driven by Erlang Phase 9. Single bounded loop on `loops/sx-vm-extensions`, ~1-2 weeks.

Recommended sequencing (one phase per loop fire):

1. **Phase A**: dispatch fallthrough. Smallest viable change to `sx_vm.ml`.
2. **Phase B**: extension registry module.
3. **Phase C**: compiler-side opcode lookup primitive.
4. **Phase D**: test extension demonstrating end-to-end flow.
5. **Phase E**: JIT awareness (interpret-only routing).
After acceptance:

- **`hosts/ocaml/lib/extensions/erlang.ml`** becomes the *first real consumer*, written by whoever takes over from the Erlang loop's stub dispatcher in `lib/erlang/vm/dispatcher.sx`. That's the integration moment that closes the loop.

Estimated total effort: 1-2 weeks for one focused engineer with OCaml SX VM familiarity.

---

## Relationship to other plans

- **`plans/erlang-on-sx.md` Phase 9:** unblocked by this work. Erlang loop develops opcodes against a stub dispatcher in `lib/erlang/vm/`; once this mechanism lands, swap stub for real registration via `hosts/ocaml/lib/extensions/erlang.ml`.
- **`plans/fed-sx-design.md` §17.5:** documents this as Layer-1 prerequisite. The shared-opcode discipline (`lib/guest/vm/`) is designed on top of this mechanism's namespace allocation.
- **Future language ports (Haskell, Datalog, Smalltalk perf phases):** will use the same mechanism. Each adds an extension module, claims an opcode range, registers handlers. The `lib/guest/vm/` opcodes get cross-referenced when the second port's needs justify chiselling.
- **JIT roadmap (per `project_jit_architecture.md` memory):** extension opcodes are interpreted in v1. JITing them is a logical follow-up but a separate project.

---

## Progress log

Newest first.

- **2026-05-15**: Phase E done. Loop complete (acceptance criteria 1-4 all met). New `Sx_vm.bytecode_uses_extension_opcodes` walks bytecode operand-aware (CONST u16 indices, CALL_PRIM u16+u8, CLOSURE u16+dynamic upvalue descriptors) so values that happen to be ≥200 don't false-positive as extension opcodes. Wired into `jit_compile_lambda`: when the inner closure's bytecode contains any extension opcode, JIT returns None and the lambda runs interpreted via CEK (the dispatch fallthrough still routes extension opcodes through the registry; this just prevents the JIT from claiming ownership of code it can't optimise). 7 new foundation tests (`jit extension-opcode awareness` suite): pure core eligible, head/middle/post-CLOSURE detection, CONST + CALL_PRIM + CLOSURE-descriptor false-positive avoidance. +7 pass vs Phase D baseline (4833 vs 4826), 1111 pre-existing failures unchanged. Conformance suites green: erlang 530/530, haskell 285/285, datalog 276/276, prolog 590/590, smalltalk 847/847, common-lisp 487/487, apl 562/562, js 148/148, forth 632/638 (pre-existing), tcl 3/4 (pre-existing), ocaml-on-sx unit 607/607. Loop done. Hand-off: the Erlang loop's Phase 9b stub dispatcher in `lib/erlang/vm/dispatcher.sx` can now be replaced with a real `hosts/ocaml/lib/extensions/erlang.ml` consumer.
- **2026-05-15**: Phase D done. New `hosts/ocaml/lib/extensions/` subtree wired into the `sx` library via `(include_subdirs unqualified)`. `extensions/test_ext.ml` is the canonical worked example: two operand-less opcodes (`test_ext.OP_TEST_PUSH_42` = 220, `test_ext.OP_TEST_DOUBLE_TOS` = 221) carrying `TestExtState` (an invocation counter that exercises the per-extension state slot). `extensions/README.md` documents the registration pattern, opcode-ID range conventions, and naming rules. `Sx_vm.opcode_name` now consults `extension_opcode_name_ref` (forward ref) so disassembly shows extension opcodes by name instead of `UNKNOWN_n`. Registry maintains `name_of_id_table` (reverse of `by_name`) and installs the lookup at module init alongside the dispatch ref. 5 new foundation tests (`extensions/test_ext` suite): `extension-opcode-id` finds OP_TEST_PUSH_42, end-to-end bytecode runs to 84, disassemble shows opcode names, unregistered ext opcodes still fall back to UNKNOWN_n, per-extension state counter increments. +5 pass vs Phase C baseline (4826 vs 4821), 1111 pre-existing failures unchanged. Conformance suites green: erlang 530/530, haskell 285/285, datalog 276/276, prolog 590/590, smalltalk 847/847, common-lisp 487/487, apl 562/562, js 148/148, forth 632/638 (pre-existing), tcl 3/4 (pre-existing), ocaml-on-sx unit 607/607.
- **2026-05-15**: Phase C done. `extension-opcode-id` SX primitive registered from `sx_vm_extensions.ml` module init (avoids the `sx_primitives ↔ sx_vm` cycle by registering downstream of both). Accepts a string or symbol; returns `Integer id` for registered opcode names, `Nil` for unknown, so a missing extension at compile time degrades to a fallback rather than failure. 5 new foundation tests (`extension-opcode-id primitive` suite): registered lookup, unknown → nil, symbol arg, zero-arg rejection, integer-arg rejection. +5 pass vs Phase B baseline (4821 vs 4816), 1111 pre-existing failures unchanged. Conformance suites green: erlang 530/530, haskell 285/285, datalog 276/276, prolog 590/590, smalltalk 847/847, common-lisp 487/487, apl 562/562, js 148/148, forth 632/638 (pre-existing), tcl 3/4 (pre-existing), ocaml-on-sx unit 607/607.
- **2026-05-14**: Phase B done. Added `hosts/ocaml/lib/sx_vm_extension.ml` (interface: `handler` type, `extension_state` extensible variant, `EXTENSION` module type) and `sx_vm_extensions.ml` (registry: `register`, `dispatch`, `id_of_name`, `state_of_extension`, `_reset_for_tests`). `let () = install_dispatch ()` at module init replaces Phase A's stub with the real registry dispatch; Phase A behavior is preserved (empty registry still raises `Invalid_opcode` for unregistered ops). Registry rejects opcode IDs outside 200-247, duplicate IDs, duplicate names, and duplicate extension names. 9 new foundation tests (`vm-extension-registry` suite): id_of_name resolve+miss, state_of_extension resolve+miss, end-to-end VM dispatch (push 42), opcode composition (push 42 → double → 84), duplicate-id / out-of-range / duplicate-name rejection. +9 pass vs Phase A baseline (4816 vs 4807), 1111 pre-existing failures unchanged. Conformance suites green: erlang 530/530, haskell 285/285, datalog 276/276, prolog 590/590, smalltalk 847/847, common-lisp 487/487, apl 562/562, js 148/148, forth 632/638 (pre-existing), tcl 3/4 (pre-existing), ocaml-on-sx unit 607/607.
- **2026-05-14**: Phase A done. Added `Invalid_opcode of int` exception, `extension_dispatch_ref` (default raises `Invalid_opcode op`), and the `| op when op >= 200 -> !extension_dispatch_ref op vm frame` arm before the catch-all in `sx_vm.ml`. Partition comment documents 1-199 core / 200-247 extensions / 248-255 reserved (current core ceiling is OP_DEC = 175). 4 new foundation tests (3 × Invalid_opcode for opcodes 200/224/247, 1 × Eval_error for opcode 199 to pin the threshold). Foundation 64/64; full OCaml test suite +4 pass vs baseline (4807 vs 4803), 1111 pre-existing failures unchanged. Conformance suites green: erlang 530/530, haskell 285/285, datalog 276/276, prolog 590/590, smalltalk 847/847, common-lisp 305/305, apl 562/562, js 148/148, forth 632/638 (pre-existing), tcl 3/4 (pre-existing), ocaml-on-sx unit 607/607. (Lua 0/16 and ocaml-conformance baseline programs were not exercised, owing to pre-existing scoreboard state and multi-hour runtime respectively.)
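
The operand-aware walk described in the Phase E entry can be illustrated with a simplified sketch. This is not `Sx_vm.bytecode_uses_extension_opcodes` itself: the opcode number for CALL_PRIM is invented, bytecode is modelled as an `int array` of bytes, and CLOSURE's dynamic upvalue descriptors are left out. Only the operand widths for CONST (u16) and CALL_PRIM (u16 + u8) come from the log.

```ocaml
(* Hypothetical opcode numbers; operand widths follow the Phase E entry. *)
let op_const = 1        (* followed by a u16 constant index *)
let op_call_prim = 52   (* assumed number; followed by a u16 and a u8 *)

(* Number of operand bytes that follow each opcode in this toy encoding. *)
let operand_bytes op =
  if op = op_const then 2
  else if op = op_call_prim then 3
  else 0

(* Walk opcode by opcode, skipping operand bytes, so an operand value
   that happens to be >= 200 is never mistaken for an extension opcode. *)
let uses_extension_opcodes (code : int array) : bool =
  let n = Array.length code in
  let rec scan ip =
    if ip >= n then false
    else
      let op = code.(ip) in
      if op >= 200 then true
      else scan (ip + 1 + operand_bytes op)
  in
  scan 0
```

For example, `[| 1; 220; 0 |]` is a CONST whose operand bytes include 220, so the scan reports no extension opcode, while `[| 1; 0; 0; 220 |]` has a genuine opcode 220 after the CONST and is reported.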