rose-ash/plans/sx-vm-opcode-extension.md
giles 57af0f386f vm-ext: phase C — extension-opcode-id SX primitive
Registers extension-opcode-id from sx_vm_extensions.ml module init.
Lives downstream of both sx_primitives and sx_vm to avoid a build
cycle. Accepts a string or symbol; returns Integer id when the opcode
is registered, Nil otherwise.

Compilers (lib/compiler.sx) call this to emit extension opcodes by
name. Returning Nil rather than failing on unknown names lets a port's
optimization opt in per-build — missing extensions degrade to slower
correct execution.

Tests: 5 new foundation cases — registered lookup, unknown → nil,
symbol arg, zero-arg + integer-arg rejection. +5 pass vs Phase B
baseline, no regressions across 11 conformance suites.
2026-05-15 00:16:03 +00:00


SX VM Opcode Extension Mechanism

Mechanism in hosts/ocaml/lib/ that lets language ports register specialized bytecode opcodes without modifying the SX VM core. Direct prerequisite for erlang-on-sx Phase 9 (the BEAM analog) and a structural enabler for any future language port that wants performance-critical opcodes.

Reference: plans/erlang-on-sx.md Phase 9, plans/fed-sx-design.md §17.5, hosts/ocaml/lib/sx_vm.ml (current VM).

Status: in progress on loops/sx-vm-extensions.


Goal

Allow language ports to register custom bytecode opcodes in the SX VM, with:

  • Zero overhead for core opcodes. Existing opcodes (current ceiling 175, see sx_vm.ml) must dispatch identically. No regression for any existing language port or the core SX runtime.
  • One additional dispatch step for extension opcodes. Acceptable cost; the win comes from avoiding the general CEK machinery.
  • Per-extension state slot. Erlang's process scheduler, Haskell's thunk cache, etc. need somewhere to hang state alongside the VM.
  • Compiler awareness. The bytecode compiler (lib/compiler.sx) must be able to emit extension opcodes by name, looked up against the registered set.
  • JIT compatibility. Existing JIT (lazy lambda compilation) continues to work for code paths using only core opcodes. Extension opcodes are interpreted in v1; JITing them is a follow-up.

Non-goals

  • Hot opcode reload. Adding/replacing opcodes mid-runtime is not in scope. Extensions are compile-time additions to the OCaml binary. (If needed, that's a separate project.)
  • Per-instance opcode sets. All running instances of the SX VM share the same opcode set determined at build time. Selective opcode loading per instance is out of scope.
  • Opcode hot-swap or supersession. Once registered, opcodes are stable for the lifetime of the binary.
  • Language-port isolation at the dispatch layer. Two language ports can see each other's opcodes (they share the dispatch table). Isolation is a build-time concern — don't compile in extensions you don't trust.

Why now

The Erlang-on-SX Phase 9 work needs this. Without it, Phase 9b-9g (the actual opcode implementations) have nowhere to plug in. The Erlang loop hit this dependency as a Blocker (0abf05ed); this design is what unblocks it.

It also enables the shared opcode pattern discussed in plans/fed-sx-design.md §17.5: opcodes Erlang Phase 9 produces that other ports could plausibly use (pattern match, perform/handle, record access) get chiselled out to lib/guest/vm/ when a second port has an actual second use. Without the extension mechanism, each port would have to fork the SX VM core or modify shared dispatch, neither of which is acceptable.


Architectural overview

                ┌──────────────────────────────────────────┐
                │ SX VM core (hosts/ocaml/lib/sx_vm.ml)    │
                │                                          │
                │  ┌────────────────────────────────────┐  │
                │  │ Bytecode dispatch loop             │  │
                │  │                                    │  │
                │  │ match op with                      │  │
                │  │   | 1  (OP_CONST) -> ...           │  │
                │  │   | 2  (OP_NIL)   -> ...           │  │
                │  │   | ...                            │  │
                │  │   | 175 -> ... (last core opcode)  │  │
                │  │   | op when op >= 200 ->           │  │
                │  │       !extension_dispatch_ref op   │  │ ◄── new
                │  │       vm frame                     │  │
                │  └────────────────────────────────────┘  │
                │                                          │
                │  ┌────────────────────────────────────┐  │
                │  │ Extension registry                 │  │
                │  │   opcode_id -> handler             │  │ ◄── Phase B
                │  │   opcode_name -> opcode_id         │  │
                │  │   extension_state per extension    │  │
                │  └────────────────────────────────────┘  │
                └──────────────────────────────────────────┘
                                   ▲
                                   │ register at startup
                ┌──────────────────┴──────────────────────┐
                │ Extension modules                       │
                │  hosts/ocaml/lib/extensions/erlang.ml   │
                │  hosts/ocaml/lib/extensions/haskell.ml  │
                │  hosts/ocaml/lib/extensions/datalog.ml  │
                │  hosts/ocaml/lib/extensions/guest_vm.ml │ ◄── shared opcodes
                └─────────────────────────────────────────┘

Opcode ID space partition

Current SX VM uses opcode IDs from 1 to 175 (per inspection of sx_vm.ml, ceiling at OP_DEC = 175). We partition the 0-255 space:

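| Range   | Use                                                               |
|---------|-------------------------------------------------------------------|
| 0       | reserved / NOP                                                    |
| 1-199   | core opcodes: owned by the SX VM, locked schema                   |
| 200-247 | extension opcodes: registered by extensions (ports + shared)      |
| 248-255 | reserved for future expansion / multi-byte opcodes                |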

This gives the core 24 free slots above the current 175 ceiling for future core additions, and 48 slots for extensions. Erlang Phase 9 expects to need fewer than 30 specialized opcodes, so this is comfortable headroom.

The plan originally proposed a finer split (128-199 for lib/guest/vm/ shared, 200-247 for ports). That distinction is preserved at the naming level (guest_vm.OP_X vs erlang.OP_Y) and policed by the registry (duplicate IDs fail at startup), without consuming separate ID ranges. The chiselling discipline (move an opcode to guest_vm when a second port uses it) operates at the source level.

If we need more than 256 opcodes total, multi-byte opcodes (a leading 248-255 byte plus a second byte) extend the space without breaking the schema.
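
The partition can be written down as a small classifier, which also makes the headroom arithmetic explicit. This is only a sketch: `opcode_class`, `classify_opcode` and the constant names are illustrative and are not taken from sx_vm.ml.

```ocaml
(* Illustrative names only; none of these exist in the SX VM source. *)
type opcode_class =
  | Reserved_nop      (* 0 *)
  | Core              (* 1-199 *)
  | Extension         (* 200-247 *)
  | Reserved_future   (* 248-255 *)
  | Out_of_range      (* does not fit in a single byte *)

let classify_opcode (op : int) : opcode_class =
  if op < 0 || op > 255 then Out_of_range
  else if op = 0 then Reserved_nop
  else if op <= 199 then Core
  else if op <= 247 then Extension
  else Reserved_future

(* Headroom implied by the partition. *)
let current_core_ceiling = 175                      (* OP_DEC *)
let free_core_slots = 199 - current_core_ceiling    (* IDs 176..199 *)
let extension_slots = 247 - 200 + 1                 (* IDs 200..247 *)
```

With these definitions, `free_core_slots` is 24 and `extension_slots` is 48, matching the figures quoted above.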

Extension module signature

(* hosts/ocaml/lib/sx_vm_extension.ml *)

(** A handler for an extension opcode. Reads operands from bytecode,
    manipulates the VM stack, updates the frame's instruction pointer.
    May raise exceptions (which propagate via the existing VM error path). *)
type handler = vm -> frame -> unit

(** State an extension carries alongside the VM. Opaque to the VM core;
    extensions cast as needed. *)
type extension_state = ..

module type EXTENSION = sig
  (** Stable name for this extension (e.g. "erlang", "guest_vm"). *)
  val name : string

  (** Initialize per-instance state. Called once when the VM starts and the
      extension is loaded. *)
  val init : unit -> extension_state

  (** Opcodes this extension provides. Each is (opcode_id, opcode_name, handler).
      opcode_id must be in 200-247. Conflicts cause startup failure. *)
  val opcodes : extension_state -> (int * string * handler) list
end
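
To make the signature concrete, here is a sketch of a toy extension that satisfies it. The `vm` and `frame` records are stand-ins so the example compiles on its own (the real types live in the SX VM and will differ), and the IDs 200 and 201, the `Test_state` constructor and `run_demo` are assumptions chosen for illustration.

```ocaml
(* Stand-in types; the real vm and frame are defined by the SX VM. *)
type vm = { mutable stack : int list }
type frame = { mutable ip : int }

type handler = vm -> frame -> unit
type extension_state = ..

module type EXTENSION = sig
  val name : string
  val init : unit -> extension_state
  val opcodes : extension_state -> (int * string * handler) list
end

(* The extension adds its own constructor to the open state type.
   Here the state is just a counter of dispatched opcodes. *)
type extension_state += Test_state of int ref

module Test_ext : EXTENSION = struct
  let name = "test_ext"
  let init () = Test_state (ref 0)

  let count = function
    | Test_state n -> incr n
    | _ -> ()

  let push_42 st vm frame =
    count st;
    vm.stack <- 42 :: vm.stack;
    frame.ip <- frame.ip + 1

  let double_tos st vm frame =
    count st;
    (match vm.stack with
     | x :: rest -> vm.stack <- (2 * x) :: rest
     | [] -> failwith "OP_TEST_DOUBLE_TOS: empty stack");
    frame.ip <- frame.ip + 1

  (* Handlers close over the state returned by init. *)
  let opcodes st =
    [ (200, "test_ext.OP_TEST_PUSH_42", push_42 st);
      (201, "test_ext.OP_TEST_DOUBLE_TOS", double_tos st) ]
end

(* Run both handlers in order: push 42, then double the top of stack. *)
let run_demo () =
  let st = Test_ext.init () in
  let vm = { stack = [] } and frame = { ip = 0 } in
  List.iter (fun (_, _, h) -> h vm frame) (Test_ext.opcodes st);
  let dispatched = match st with Test_state n -> !n | _ -> -1 in
  (vm.stack, frame.ip, dispatched)
```

Calling `run_demo ()` leaves 84 on the stack, advances `ip` by two, and records two dispatches in the extension's state.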

Registration and dispatch

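(* hosts/ocaml/lib/sx_vm_extensions.ml *)
open Sx_vm_extension   (* handler, extension_state, EXTENSION *)

let extensions : (module EXTENSION) list ref = ref []
let states : (string, extension_state) Hashtbl.t = Hashtbl.create 8
let by_id : (int, handler) Hashtbl.t = Hashtbl.create 64
let by_name : (string, int) Hashtbl.t = Hashtbl.create 64

(* Register an extension: initialise its state, then claim each opcode.
   Any conflict raises Failure, which aborts startup. *)
let register (m : (module EXTENSION)) =
  let module M = (val m) in
  if Hashtbl.mem states M.name then
    failwith (Printf.sprintf "Extension %s already registered" M.name);
  let st = M.init () in
  Hashtbl.add states M.name st;
  List.iter (fun (id, name, h) ->
    if id < 200 || id > 247 then
      failwith (Printf.sprintf "Opcode %d (%s) outside 200-247" id name);
    if Hashtbl.mem by_id id then
      failwith (Printf.sprintf "Opcode %d (%s) already registered" id name);
    if Hashtbl.mem by_name name then
      failwith (Printf.sprintf "Opcode name %s already registered" name);
    Hashtbl.add by_id id h;
    Hashtbl.add by_name name id
  ) (M.opcodes st);
  extensions := m :: !extensions

(* Look up the handler for an extension opcode and run it.
   Invalid_opcode is the exception defined in sx_vm.ml (Phase A). *)
let dispatch op vm frame =
  match Hashtbl.find_opt by_id op with
  | Some handler -> handler vm frame
  | None -> raise (Sx_vm.Invalid_opcode op)

let id_of_name name = Hashtbl.find_opt by_name name
let state_of_extension name = Hashtbl.find_opt states name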

Phase B installs this dispatcher into Sx_vm.extension_dispatch_ref at module init. Until then, the ref's default raises Invalid_opcode op for any opcode ≥ 200, which is the Phase A test condition.

The dispatch path adds one hashtable lookup per extension opcode. Acceptable cost — and Erlang's specialized opcodes win >100× over going through the general CEK machine, so the overhead is negligible by comparison.

Bytecode compiler integration

The compiler (lib/compiler.sx) needs to know extension opcode IDs to emit them. New SX primitive exposed to the compiler:

(extension-opcode-id "erlang.OP_PATTERN_TUPLE_2") ; → 200, or nil if not loaded

When the compiler wants to emit a specialized opcode, it queries by name. If the extension isn't loaded, the compiler falls back to the general path (emit a CALL_PRIM or general SX case). This means a language port's optimization is opt-in per build, and missing extensions degrade to slower correct execution rather than failure.

Naming convention: <extension-name>.OP_<NAME>. So erlang.OP_PATTERN_TUPLE_2, guest_vm.OP_PERFORM, etc.
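
The fallback decision and the naming convention can be sketched in OCaml as follows. The real compiler is written in SX (lib/compiler.sx), so everything here is a hypothetical model: `emitted`, `emit_specialized`, `split_opcode_name` and the fallback primitive name are assumptions, and `lookup` merely stands in for extension-opcode-id.

```ocaml
(* Hypothetical model of what the compiler emits for one operation. *)
type emitted =
  | Ext_op of int          (* specialized extension opcode, by ID *)
  | Call_prim of string    (* general fallback path *)

(* [lookup] plays the role of extension-opcode-id:
   Some id when the opcode is registered, None otherwise. *)
let emit_specialized ~(lookup : string -> int option) ~opcode_name ~fallback_prim =
  match lookup opcode_name with
  | Some id -> Ext_op id
  | None -> Call_prim fallback_prim

(* Split "<extension-name>.OP_<NAME>" into its two parts.
   Returns None when the name does not follow the convention. *)
let split_opcode_name (s : string) : (string * string) option =
  match String.index_opt s '.' with
  | Some i
    when i > 0
         && i + 4 <= String.length s
         && String.sub s (i + 1) 3 = "OP_" ->
    Some (String.sub s 0 i, String.sub s (i + 1) (String.length s - i - 1))
  | _ -> None
```

For example, with a lookup that knows `erlang.OP_PATTERN_TUPLE_2` as 200 the result is `Ext_op 200`; with a lookup that knows nothing, the same call yields the `Call_prim` fallback, which is the "slower but correct" path described above.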

Per-extension state access

Some opcodes need state beyond the VM stack (Erlang's scheduler, mailbox state, etc.). Extensions store state in their init-returned value, accessed via state_of_extension:

let op_spawn vm frame =
  let st = Sx_vm_extensions.state_of_extension "erlang"
           |> Option.get
           |> Obj.magic in   (* extension casts to its known type *)
  let body = pop vm in
  let pid = Erlang_scheduler.spawn st body in
  push vm (pid_value pid);
  frame.ip <- frame.ip + 1

Shared scheduler state lives in the Erlang extension's state value. Other extensions don't see it.
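
Because extension_state is an extensible variant, an extension could also recover its state by pattern matching on its own constructor rather than casting with Obj.magic. The sketch below shows that alternative under stated assumptions: the `scheduler` record, the `Erlang_state` constructor and `spawn_pid` are hypothetical, and the local `states` table only mimics the registry.

```ocaml
type extension_state = ..

(* Hypothetical Erlang-side state; the real scheduler type is not part of this plan. *)
type scheduler = { mutable next_pid : int }
type extension_state += Erlang_state of scheduler

(* Local stand-in for the registry's state table. *)
let states : (string, extension_state) Hashtbl.t = Hashtbl.create 8
let state_of_extension name = Hashtbl.find_opt states name

(* Recover the typed state by matching on the extension's own constructor.
   No Obj.magic is needed, and a wrong constructor is reported explicitly. *)
let erlang_scheduler () : scheduler =
  match state_of_extension "erlang" with
  | Some (Erlang_state s) -> s
  | Some _ -> failwith "erlang: state registered with an unexpected constructor"
  | None -> failwith "erlang: extension not registered"

(* Toy operation that uses the state: hand out increasing pids. *)
let spawn_pid () : int =
  let s = erlang_scheduler () in
  let pid = s.next_pid in
  s.next_pid <- pid + 1;
  pid
```

The trade-off is one extra match per state access in exchange for a checked failure mode if the wrong value is ever stored under the extension's name.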


Phase plan

Five sub-phases in dependency order. Each is testable in isolation.

Phase A — Opcode ID partition + dispatch fallthrough

  • Define exception Invalid_opcode of int in sx_vm.ml.
  • Add extension_dispatch_ref : (int -> vm -> frame -> unit) ref whose default handler raises Invalid_opcode op. Forward-declared in the same style as the existing jit_compile_ref.
  • Add | op when op >= 200 -> !extension_dispatch_ref op vm frame arm in the dispatch loop, immediately before the catch-all.
  • Document the partition in a comment near the top of the opcode list.
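
The forward-declared ref and the arm ordering can be illustrated with a toy dispatcher. This is a sketch, not the code in sx_vm.ml: the `vm` and `frame` records, the single core opcode, and the `Failure` raised by the catch-all are all stand-ins.

```ocaml
(* Stand-in types; the real vm and frame are defined by the SX VM. *)
type vm = { mutable stack : int list }
type frame = { mutable ip : int }

exception Invalid_opcode of int

(* Default handler: no extensions installed, so every opcode >= 200 is invalid. *)
let extension_dispatch_ref : (int -> vm -> frame -> unit) ref =
  ref (fun op _vm _frame -> raise (Invalid_opcode op))

(* Toy single-step dispatcher with one core opcode, to show arm ordering:
   core arms first, then the extension arm, then the catch-all. *)
let step (op : int) (vm : vm) (frame : frame) : unit =
  match op with
  | 1 ->
    vm.stack <- 0 :: vm.stack;
    frame.ip <- frame.ip + 1
  | op when op >= 200 -> !extension_dispatch_ref op vm frame
  | op -> failwith (Printf.sprintf "unknown core opcode %d" op)

(* What a later phase does at module init: swap in a real dispatcher. *)
let install (dispatch : int -> vm -> frame -> unit) : unit =
  extension_dispatch_ref := dispatch
```

Because the core arms are matched before the guard, adding the extension arm does not change how any opcode below 200 is handled.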

Tests:

  • All existing OCaml VM/CEK tests pass unchanged (zero regression for core).
  • Constructed bytecode using opcode 200 raises Invalid_opcode 200 when no extension is registered.

Effort: small. ~50 lines + tests.

Phase B — Extension registry module

hosts/ocaml/lib/sx_vm_extensions.ml per the sketch above. Pure plumbing, no opcodes yet. Phase B's module init installs the real dispatch into Sx_vm.extension_dispatch_ref, replacing Phase A's stub.

  • Sx_vm_extension interface module (handler type, EXTENSION sig).
  • Sx_vm_extensions registry module (register, dispatch, id_of_name, state_of_extension).
  • Wire the registry's dispatch into Sx_vm.extension_dispatch_ref at module init.

Tests:

  • Register a test extension with one opcode; dispatch finds it.
  • Duplicate opcode-id registration fails at startup.
  • id_of_name and state_of_extension lookups work.

Effort: small. ~150 lines + tests.

Phase C — Compiler-side opcode lookup primitive

Expose extension-opcode-id as an SX primitive in hosts/ocaml/lib/. The compiler in lib/compiler.sx can call it to emit extension opcodes by name.

Does not require any extension to actually exist — the primitive returns nil for unknown names, and the compiler falls back.

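  • Register extension-opcode-id from sx_vm_extensions.ml module init. The registration lives downstream of both sx_primitives and sx_vm, which avoids a build cycle.
  • Accepts a string or symbol. Returns Integer id when registered, Nil otherwise.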
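
A minimal sketch of the primitive's argument handling follows. The `value` type is a cut-down stand-in for SX values, the local `by_name` table mimics the registry, and raising `Invalid_argument` on bad arguments is an assumption; the real primitive reports errors through the VM's own error path.

```ocaml
(* Cut-down stand-in for SX values; the real value type is much larger. *)
type value =
  | Nil
  | Integer of int
  | String of string
  | Symbol of string

(* Local stand-in for the registry's name -> id table. *)
let by_name : (string, int) Hashtbl.t = Hashtbl.create 64

(* Exactly one string or symbol argument is accepted.
   Unknown names return Nil so the compiler can fall back. *)
let extension_opcode_id (args : value list) : value =
  match args with
  | [ String name ] | [ Symbol name ] ->
    (match Hashtbl.find_opt by_name name with
     | Some id -> Integer id
     | None -> Nil)
  | _ -> invalid_arg "extension-opcode-id: expected one string or symbol"
```

Returning Nil for an unknown name, rather than raising, is what lets the same compiler source run against builds with and without a given extension.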

Tests:

  • Primitive returns nil for unknown name.
  • After registering a test extension, primitive returns the registered ID.

Effort: small. Single primitive registration + compiler-side use docs.

Phase D — Test extension demonstrating end-to-end flow

A dummy extension at hosts/ocaml/lib/extensions/test_ext.ml registering one or two trivial opcodes (e.g. OP_TEST_PUSH_42, OP_TEST_DOUBLE_TOS). Wired into the build, available when running tests.

Compiler test: write SX that triggers the test compiler-extension to emit OP_TEST_PUSH_42, then verify the VM executes it correctly via bytecode-inspect and vm-trace.

  • test_ext.ml registers two opcodes.
  • Wired into the build (extensions registered at startup).
  • Bytecode emission via name lookup produces the right ID.
  • bytecode-inspect shows the opcode by name.

Tests:

  • Bytecode emission via name lookup produces the right ID.
  • Execution produces the expected stack effect.
  • bytecode-inspect shows the opcode by name.
  • vm-trace correctly reports the extension opcode.

Effort: small. ~100 lines including build wiring.

Phase E — JIT awareness (interpreted-only for v1)

The JIT (lazy lambda compilation) currently compiles based on opcode ranges. Extension opcodes (≥200) should fall through to interpretation, not be JIT-compiled in v1.

  • Mark extension opcodes as "interpret only" in the JIT pre-analysis.
  • Lambda containing only core opcodes JIT-compiles as before.
  • Lambda containing any extension opcode runs interpreted.

JITing extension opcodes is a follow-up project; v1 keeps the JIT scope unchanged and just makes it correctly route mixed bytecode.
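
The routing rule can be sketched as a scan over a lambda body. This assumes the bytecode has already been decoded into instructions, so that operand bytes of 200 or more are not mistaken for opcodes; the `instr` record, `strategy` type and function names are illustrative and do not describe the existing JIT pre-analysis.

```ocaml
(* Hypothetical decoded instruction: an opcode plus its operand bytes. *)
type instr = { opcode : int; operands : int list }

type strategy = Jit_compile | Interpret_only

let is_extension_opcode (op : int) : bool = op >= 200

(* A body is JIT-eligible only if every instruction uses a core opcode. *)
let choose_strategy (body : instr list) : strategy =
  if List.exists (fun i -> is_extension_opcode i.opcode) body
  then Interpret_only
  else Jit_compile
```

Scanning decoded instructions rather than raw bytes matters here: a core instruction with an operand byte of 250 should still be JIT-eligible.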

Tests:

  • Lambda with only core opcodes: JIT-compiled, fast path.
  • Lambda with extension opcode: interpreted, correct result.
  • Mixed lambda: interpreted, correct result.

Effort: small-medium. Requires understanding the JIT's pre-analysis (per project_jit_compilation.md memory: "Lazy JIT implemented: lambda bodies compiled on first VM call, cached, failures sentinel-marked"). Extension-opcode detection becomes another reason to mark a lambda "interpret-only."


Acceptance criteria

  1. Phases A-E pass their test suites.
  2. Zero regression on existing SX VM tests. All language-port test suites currently passing on the architecture branch (Erlang 530+, Haskell 285+, Datalog 276+, Smalltalk 625+, the SX core test suite, etc.) still pass.
  3. Test extension demonstrates the flow end-to-end. SX source compiles via the compiler with a registered extension opcode, executes through the VM via the dispatch fallthrough, returns correct result.
  4. Documentation: README in hosts/ocaml/lib/extensions/ explaining the pattern, with a worked example (the test extension is the canonical one).

After acceptance, the Erlang-on-SX Phase 9 work in lib/erlang/vm/ can use this mechanism. The Erlang loop's Blocker for 9a is resolved.


Risk and mitigation

Risk: regression in core opcode dispatch. A misplaced match arm could break something. Mitigation: run every existing language-port conformance suite before merging.

Risk: opcode ID conflicts as more extensions land. If Erlang Phase 9 claims IDs 200-220 and Haskell wants 215-235, we have a problem. Mitigation: maintain a registry document at hosts/ocaml/lib/extensions/README.md listing claimed ID ranges per extension. Convention: each extension claims a contiguous block at first registration; collisions are caught at startup with a clear error.
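
A check over claimed blocks could catch such overlaps before any handler is registered. The sketch below is an assumption about how that might look, not part of the registry: the `claim` record and both functions are hypothetical, and the ranges in the usage note are the ones from the example above.

```ocaml
(* Hypothetical record of one extension's claimed contiguous ID block. *)
type claim = { ext : string; first : int; last : int }

(* Two inclusive ranges intersect when each starts before the other ends. *)
let overlaps (a : claim) (b : claim) : bool =
  a.first <= b.last && b.first <= a.last

(* Return every pair of claims whose ID blocks intersect, in input order. *)
let conflicts (claims : claim list) : (string * string) list =
  let rec go acc = function
    | [] -> List.rev acc
    | c :: rest ->
      let hits =
        List.filter_map
          (fun d -> if overlaps c d then Some (c.ext, d.ext) else None)
          rest
      in
      go (List.rev_append hits acc) rest
  in
  go [] claims
```

Given claims of 200-220 for "erlang" and 215-235 for "haskell", `conflicts` returns the single pair `("erlang", "haskell")`; moving the second block to 221-235 returns the empty list.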

Risk: extension state types leak through Obj.magic. The extension state is type-erased in the registry. Mitigation: extensions cast in their own opcode handlers, never expose state to other extensions or the VM core. First-class modules / GADTs could add more type safety; deferred unless this becomes a concrete pain point.

Risk: extensions become a back door for kernel mutation. An extension opcode handler has full access to the VM. Mitigation: extensions are build-time additions, not runtime; they're as trusted as the rest of the binary. Operators audit at build time, not runtime. Same trust model as any other compiled-in code.

Risk: shared lib/guest/vm/ opcodes evolve under different language ports' needs. Mitigation: the chiselling discipline (move to guest only on second use) ensures the shared opcodes are tested against at least two ports' actual usage before being considered stable.


Open questions

To be resolved during implementation, not blocking design approval:

  1. Multi-byte opcode encoding. If we need >256 opcodes total, the leading-byte 248-255 schema accommodates it. Do we need multi-byte at v1? Probably not — 48 extension opcodes is more than any single port should reasonably want.
  2. Does extension ordering matter? If two extensions register opcodes that read the same VM state, the order of registration could affect initialization. Probably not in practice; flag if it bites.
  3. Hot-reload of extensions. Out of scope for v1 (per non-goals). If wanted later, the registry would need teardown + re-registration; the gen_server code_change/3 model from Erlang Phase 7 is a precedent.
  4. Cross-extension opcode composition. Can guest_vm.OP_PERFORM invoke erlang.OP_RECEIVE_SCAN? In principle yes — handlers can do anything. The interface is clean; the question is whether we want any conventions to keep ergonomics tractable. Defer until composition appears in practice.

Implementation roadmap and sequencing

This is a sister workstream to loops/erlang. Driven by Erlang Phase 9. Single bounded loop on loops/sx-vm-extensions, ~1-2 weeks.

Recommended sequencing (one phase per loop fire):

  1. Phase A — dispatch fallthrough. Smallest viable change to sx_vm.ml.
  2. Phase B — extension registry module.
  3. Phase C — compiler-side opcode lookup primitive.
  4. Phase D — test extension demonstrating end-to-end flow.
  5. Phase E — JIT awareness (interpret-only routing).

After acceptance:

  • hosts/ocaml/lib/extensions/erlang.ml becomes the first real consumer — written by whoever takes over from the Erlang loop's stub dispatcher in lib/erlang/vm/dispatcher.sx. That's the integration moment that closes the loop.

Estimated total effort: 1-2 weeks for one focused engineer with OCaml SX VM familiarity.


Relationship to other plans

  • plans/erlang-on-sx.md Phase 9: unblocked by this work. Erlang loop develops opcodes against a stub dispatcher in lib/erlang/vm/; once this mechanism lands, swap stub for real registration via hosts/ocaml/lib/extensions/erlang.ml.
  • plans/fed-sx-design.md §17.5: documents this as Layer-1 prerequisite. The shared-opcode discipline (lib/guest/vm/) is designed on top of this mechanism's namespace allocation.
  • Future language ports (Haskell, Datalog, Smalltalk perf phases): will use the same mechanism. Each adds an extension module, claims an opcode range, registers handlers. The lib/guest/vm/ opcodes get cross-referenced when the second port's needs justify chiselling.
  • JIT roadmap (per project_jit_architecture.md memory): extension opcodes are interpreted in v1. JITing them is a logical follow-up but a separate project.

Progress log

Newest first.

  • 2026-05-15 — Phase C done. extension-opcode-id SX primitive registered from sx_vm_extensions.ml module init (avoids the sx_primitives ↔ sx_vm cycle by registering downstream of both). Accepts a string or symbol; returns Integer id for registered opcode names, Nil for unknown — so a missing extension at compile time degrades to a fallback rather than failure. 5 new foundation tests (extension-opcode-id primitive suite): registered lookup, unknown → nil, symbol arg, zero-arg rejection, integer-arg rejection. +5 pass vs Phase B baseline (4821 vs 4816), 1111 pre-existing failures unchanged. Conformance suites green: erlang 530/530, haskell 285/285, datalog 276/276, prolog 590/590, smalltalk 847/847, common-lisp 487/487, apl 562/562, js 148/148, forth 632/638 (pre-existing), tcl 3/4 (pre-existing), ocaml-on-sx unit 607/607.

  • 2026-05-14 — Phase B done. Added hosts/ocaml/lib/sx_vm_extension.ml (interface: handler type, extension_state extensible variant, EXTENSION module type) and sx_vm_extensions.ml (registry: register, dispatch, id_of_name, state_of_extension, _reset_for_tests). let () = install_dispatch () at module init replaces Phase A's stub with the real registry dispatch — Phase A behavior preserved (empty registry still raises Invalid_opcode for unregistered ops). Registry rejects opcode IDs outside 200-247, duplicate IDs, duplicate names, and duplicate extension names. 9 new foundation tests (vm-extension-registry suite): id_of_name resolve+miss, state_of_extension resolve+miss, end-to-end VM dispatch (push 42), opcode composition (push 42 → double → 84), duplicate-id / out-of-range / duplicate-name rejection. +9 pass vs Phase A baseline (4816 vs 4807), 1111 pre-existing failures unchanged. Conformance suites green: erlang 530/530, haskell 285/285, datalog 276/276, prolog 590/590, smalltalk 847/847, common-lisp 487/487, apl 562/562, js 148/148, forth 632/638 (pre-existing), tcl 3/4 (pre-existing), ocaml-on-sx unit 607/607.

  • 2026-05-14 — Phase A done. Added Invalid_opcode of int exception, extension_dispatch_ref (default raises Invalid_opcode op), and the | op when op >= 200 -> !extension_dispatch_ref op vm frame arm before the catch-all in sx_vm.ml. Partition comment documents 1-199 core / 200-247 extensions / 248-255 reserved (current core ceiling is OP_DEC = 175). 4 new foundation tests (3 × Invalid_opcode for opcodes 200/224/247, 1 × Eval_error for opcode 199 to pin the threshold). Foundation 64/64; full OCaml test suite +4 pass vs baseline (4807 vs 4803), 1111 pre-existing failures unchanged. Conformance suites green: erlang 530/530, haskell 285/285, datalog 276/276, prolog 590/590, smalltalk 847/847, common-lisp 305/305, apl 562/562, js 148/148, forth 632/638 (pre-existing), tcl 3/4 (pre-existing), ocaml-on-sx unit 607/607. (Lua 0/16 and ocaml-conformance baseline programs not exercised — pre-existing scoreboard state and multi-hour runtime respectively.)