rose-ash/plans/sx-vm-opcode-extension.md
giles cf597f1b5f vm-ext: phase A — extension dispatch fallthrough in sx_vm.ml
Adds Invalid_opcode of int exception and extension_dispatch_ref forward
ref (default raises Invalid_opcode op), plus the |op when op >= 200 arm
before the catch-all in the bytecode dispatch loop. Partition comment
documents 1-199 core / 200-247 extensions / 248-255 reserved.

Phase B will install the real registry's dispatch into the ref at module
init, replacing this stub.

Tests: 4 new foundation cases (Invalid_opcode for 200/224/247, Eval_error
for 199 to pin the threshold). +4 pass vs baseline, no regressions.
2026-05-14 22:29:50 +00:00

# SX VM Opcode Extension Mechanism
Mechanism in `hosts/ocaml/lib/` that lets language ports register specialized
bytecode opcodes without modifying the SX VM core. Direct prerequisite for
**erlang-on-sx Phase 9** (the BEAM analog) and a structural enabler for any
future language port that wants performance-critical opcodes.

Reference: `plans/erlang-on-sx.md` Phase 9, `plans/fed-sx-design.md` §17.5,
`hosts/ocaml/lib/sx_vm.ml` (current VM).

Status: **in progress** on `loops/sx-vm-extensions`.

---
## Goal
Allow language ports to register custom bytecode opcodes in the SX VM, with:
- **Zero overhead for core opcodes.** Existing opcodes (current ceiling 175,
see `sx_vm.ml`) must dispatch identically. No regression for any existing
language port or the core SX runtime.
- **One additional dispatch step for extension opcodes.** Acceptable cost; the
win comes from avoiding the general CEK machinery.
- **Per-extension state slot.** Erlang's process scheduler, Haskell's thunk
cache, etc. need somewhere to hang state alongside the VM.
- **Compiler awareness.** The bytecode compiler (`lib/compiler.sx`) must be
able to emit extension opcodes by name, looked up against the registered
set.
- **JIT compatibility.** Existing JIT (lazy lambda compilation) continues to
work for code paths using only core opcodes. Extension opcodes are
interpreted in v1; JITing them is a follow-up.
## Non-goals
- **Hot opcode reload.** Adding/replacing opcodes mid-runtime is not in
scope. Extensions are compile-time additions to the OCaml binary. (If
needed, that's a separate project.)
- **Per-instance opcode sets.** All running instances of the SX VM share
the same opcode set determined at build time. Selective opcode loading
per instance is out of scope.
- **Opcode hot-swap or supersession.** Once registered, opcodes are stable
for the lifetime of the binary.
- **Language-port isolation at the dispatch layer.** Two language ports can
see each other's opcodes (they share the dispatch table). Isolation is a
build-time concern — don't compile in extensions you don't trust.
---
## Why now
The Erlang-on-SX Phase 9 work needs this. Without it, Phases 9b-9g (the actual
opcode implementations) have nowhere to plug in. The Erlang loop hit this
dependency as a Blocker (`0abf05ed`); this design is what unblocks it.

It also enables the **shared opcode pattern** discussed in
`plans/fed-sx-design.md` §17.5: opcodes Erlang Phase 9 produces that other
ports could plausibly use (pattern match, perform/handle, record access) get
chiselled out to `lib/guest/vm/` when a second port has an actual second use.
Without the extension mechanism, each port would have to fork the SX VM core
or modify shared dispatch; neither is acceptable.

---
## Architectural overview
```
┌──────────────────────────────────────────────┐
│  SX VM core (hosts/ocaml/lib/sx_vm.ml)       │
│                                              │
│  ┌────────────────────────────────────────┐  │
│  │ Bytecode dispatch loop                 │  │
│  │                                        │  │
│  │ match op with                          │  │
│  │ | 1 (OP_CONST) -> ...                  │  │
│  │ | 2 (OP_NIL) -> ...                    │  │
│  │ | ...                                  │  │
│  │ | 175 -> ... (last core opcode)        │  │
│  │ | op when op >= 200 ->                 │  │
│  │     !extension_dispatch_ref op         │  │  ◄── new
│  │       vm frame                         │  │
│  └────────────────────────────────────────┘  │
│                                              │
│  ┌────────────────────────────────────────┐  │
│  │ Extension registry                     │  │
│  │   opcode_id -> handler                 │  │  ◄── Phase B
│  │   opcode_name -> opcode_id             │  │
│  │   extension_state per extension        │  │
│  └────────────────────────────────────────┘  │
└──────────────────────────────────────────────┘
                       │ register at startup
┌──────────────────────┴───────────────────────┐
│  Extension modules                           │
│    hosts/ocaml/lib/extensions/erlang.ml      │
│    hosts/ocaml/lib/extensions/haskell.ml     │
│    hosts/ocaml/lib/extensions/datalog.ml     │
│    hosts/ocaml/lib/extensions/guest_vm.ml    │  ◄── shared opcodes
└──────────────────────────────────────────────┘
```
### Opcode ID space partition

Current SX VM uses opcode IDs from 1 to 175 (per inspection of `sx_vm.ml`,
ceiling at OP_DEC = 175). We partition the 0-255 space:

| Range   | Use                                                               |
|---------|-------------------------------------------------------------------|
| 0       | reserved / NOP                                                    |
| 1-199   | **core opcodes**: owned by the SX VM, locked schema               |
| 200-247 | **extension opcodes**: registered by extensions (ports + shared)  |
| 248-255 | reserved for future expansion / multi-byte opcodes                |

This gives the core 24 free slots above the current 175 ceiling for future
core additions, and 48 slots for extensions. Erlang Phase 9 expects to need
fewer than 30 specialized opcodes, so this is comfortable headroom.
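
The partition can be checked mechanically. The following is a minimal sketch, not code from `sx_vm.ml`: the `opcode_class` type and `classify` function are hypothetical names introduced only to restate the table and confirm the slot counts quoted above.

```ocaml
(* Sketch only: classify an opcode byte by the partition table.
   The type and function names are hypothetical, not from sx_vm.ml. *)
type opcode_class =
  | Reserved_nop     (* 0 *)
  | Core             (* 1-199 *)
  | Extension        (* 200-247 *)
  | Reserved_future  (* 248-255 *)

let classify (op : int) : opcode_class =
  if op < 0 || op > 255 then invalid_arg "classify: opcode out of byte range"
  else if op = 0 then Reserved_nop
  else if op <= 199 then Core
  else if op <= 247 then Extension
  else Reserved_future

(* Count how many of the 256 byte values satisfy a predicate. *)
let count (p : int -> bool) : int =
  List.length (List.filter p (List.init 256 Fun.id))

(* Slots above the current ceiling (175) that are still core: 176-199. *)
let spare_core_slots = count (fun op -> classify op = Core && op > 175)

(* Slots available to extensions: 200-247. *)
let extension_slots = count (fun op -> classify op = Extension)
```

Under these definitions `spare_core_slots` is 24 and `extension_slots` is 48, matching the figures in the text.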
The plan originally proposed a finer split (`128-199` for `lib/guest/vm/`
shared, `200-247` for ports). That distinction is preserved at the **naming
level** (`guest_vm.OP_X` vs `erlang.OP_Y`) and policed by the registry
(duplicate IDs fail at startup), without consuming separate ID ranges. The
chiselling discipline (move an opcode to `guest_vm` when a second port uses
it) operates at the source level.
If we need more than 256 opcodes total, multi-byte opcodes (a leading 248-255
byte plus a second byte) extend the space without breaking the schema.
### Extension module signature
```ocaml
(* hosts/ocaml/lib/sx_vm_extension.ml *)

(** A handler for an extension opcode. Reads operands from bytecode,
    manipulates the VM stack, updates the frame's instruction pointer.
    May raise exceptions (which propagate via the existing VM error path). *)
type handler = vm -> frame -> unit

(** State an extension carries alongside the VM. Opaque to the VM core;
    extensions cast as needed. *)
type extension_state = ..

module type EXTENSION = sig
  (** Stable name for this extension (e.g. "erlang", "guest_vm"). *)
  val name : string

  (** Initialize per-instance state. Called once when the VM starts and the
      extension is loaded. *)
  val init : unit -> extension_state

  (** Opcodes this extension provides. Each is (opcode_id, opcode_name, handler).
      opcode_id must be in 200-247. Conflicts cause startup failure. *)
  val opcodes : extension_state -> (int * string * handler) list
end
```
### Registration and dispatch
```ocaml
(* hosts/ocaml/lib/sx_vm_extensions.ml *)

let extensions : (module EXTENSION) list ref = ref []
let states : (string, extension_state) Hashtbl.t = Hashtbl.create 8
let by_id : (int, handler) Hashtbl.t = Hashtbl.create 64
let by_name : (string, int) Hashtbl.t = Hashtbl.create 64

let register (m : (module EXTENSION)) =
  let module M = (val m) in
  let st = M.init () in
  Hashtbl.add states M.name st;
  List.iter (fun (id, name, h) ->
    if Hashtbl.mem by_id id then
      failwith (Printf.sprintf "Opcode %d (%s) already registered" id name);
    Hashtbl.add by_id id h;
    Hashtbl.add by_name name id
  ) (M.opcodes st);
  extensions := m :: !extensions

let dispatch op vm frame =
  match Hashtbl.find_opt by_id op with
  | Some handler -> handler vm frame
  | None -> raise (Invalid_opcode op)

let id_of_name name = Hashtbl.find_opt by_name name
let state_of_extension name = Hashtbl.find_opt states name
```
Phase B installs this dispatcher into `Sx_vm.extension_dispatch_ref` at
module init. Until then, the ref's default raises `Invalid_opcode op` for
any opcode ≥ 200, which is the Phase A test condition.
The dispatch path adds **one hashtable lookup per extension opcode**. That
cost is acceptable: Erlang's specialized opcodes are expected to win by more
than 100× over the general CEK machine, so the lookup overhead is negligible
by comparison.
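
The forward-reference pattern described above can be sketched in isolation. This is a toy model, not the real `sx_vm.ml`: the `vm` and `frame` record fields, the single core opcode `1`, and the `Eval_error` message text are assumptions made for illustration. Only `Invalid_opcode`, `extension_dispatch_ref`, and the `op >= 200` arm come from the plan.

```ocaml
(* Toy stand-ins for the real VM types; the field names are assumptions. *)
type vm = { mutable stack : int list }
type frame = { mutable ip : int }

exception Invalid_opcode of int
exception Eval_error of string

(* Forward reference: the default stub rejects every extension opcode. *)
let extension_dispatch_ref : (int -> vm -> frame -> unit) ref =
  ref (fun op _vm _frame -> raise (Invalid_opcode op))

(* One step of a miniature dispatch loop. Opcode 1 is a made-up core op
   that pushes 0; everything below 200 that is not core hits the catch-all. *)
let step (vm : vm) (frame : frame) (op : int) : unit =
  match op with
  | 1 ->
      vm.stack <- 0 :: vm.stack;
      frame.ip <- frame.ip + 1
  | op when op >= 200 -> !extension_dispatch_ref op vm frame
  | op -> raise (Eval_error (Printf.sprintf "unknown opcode %d" op))

(* Helper for experiments: does stepping [op] raise Invalid_opcode op? *)
let raises_invalid (op : int) : bool =
  try
    step { stack = [] } { ip = 0 } op;
    false
  with
  | Invalid_opcode n -> n = op
  | _ -> false
```

With the stub in place, `raises_invalid 200` holds while opcode 199 reaches the catch-all instead. Assigning a real dispatcher to `extension_dispatch_ref` changes the behaviour without touching `step`, which is the point of the indirection.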
### Bytecode compiler integration
The compiler (`lib/compiler.sx`) needs to know extension opcode IDs to emit
them. New SX primitive exposed to the compiler:
```sx
(extension-opcode-id "erlang.OP_PATTERN_TUPLE_2") ; → 200, or nil if not loaded
```
When the compiler wants to emit a specialized opcode, it queries by name. If
the extension isn't loaded, the compiler falls back to the general path
(emit a `CALL_PRIM` or general SX `case`). This means a language port's
optimization is opt-in per build, and missing extensions degrade to slower
correct execution rather than failure.
Naming convention: `<extension-name>.OP_<NAME>`. So `erlang.OP_PATTERN_TUPLE_2`,
`guest_vm.OP_PERFORM`, etc.
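
The lookup-then-fall-back behaviour can be modelled as follows. The real compiler is written in SX (`lib/compiler.sx`); this OCaml sketch only mirrors the decision it would make. The `instr` type, `emit_or_fallback`, `split_opcode_name`, and the fallback primitive name used in the usage note are all hypothetical.

```ocaml
(* Hypothetical registry contents: one Erlang opcode registered as 200. *)
let by_name : (string, int) Hashtbl.t = Hashtbl.create 8
let () = Hashtbl.replace by_name "erlang.OP_PATTERN_TUPLE_2" 200

(* What the compiler would emit; both constructors are illustrative. *)
type instr =
  | Ext of int           (* specialized extension opcode *)
  | Call_prim of string  (* general fallback path *)

(* Emit the extension opcode if it is registered, else the slow path. *)
let emit_or_fallback ~(opcode_name : string) ~(fallback_prim : string) : instr =
  match Hashtbl.find_opt by_name opcode_name with
  | Some id -> Ext id
  | None -> Call_prim fallback_prim

(* Split "<extension-name>.OP_<NAME>" at the first dot. Returns None when
   there is no dot or either side would be empty. *)
let split_opcode_name (s : string) : (string * string) option =
  match String.index_opt s '.' with
  | Some i when i > 0 && i < String.length s - 1 ->
      Some (String.sub s 0 i, String.sub s (i + 1) (String.length s - i - 1))
  | _ -> None
```

For example, `emit_or_fallback ~opcode_name:"erlang.OP_PATTERN_TUPLE_2" ~fallback_prim:"match-tuple"` yields `Ext 200`, while an unregistered name yields the `Call_prim` fallback, so a build without the extension still compiles correct (slower) code.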
### Per-extension state access
Some opcodes need state beyond the VM stack (Erlang's scheduler, mailbox
state, etc.). Extensions store state in their `init`-returned value, accessed
via `state_of_extension`:
```ocaml
let op_spawn vm frame =
  let st = Sx_vm_extensions.state_of_extension "erlang"
           |> Option.get
           |> Obj.magic in  (* extension casts to its known type *)
  let body = pop vm in
  let pid = Erlang_scheduler.spawn st body in
  push vm (pid_value pid);
  frame.ip <- frame.ip + 1
```

Shared scheduler state lives in the Erlang extension's state value. Other
extensions don't see it.

---
## Phase plan
Five sub-phases in dependency order. Each is testable in isolation.
### Phase A — Opcode ID partition + dispatch fallthrough
- [x] Define `exception Invalid_opcode of int` in `sx_vm.ml`.
- [x] Add `extension_dispatch_ref : (int -> vm -> frame -> unit) ref`
whose default handler raises `Invalid_opcode op`. Forward-declared in
the same style as the existing `jit_compile_ref`.
- [x] Add `| op when op >= 200 -> !extension_dispatch_ref op vm frame` arm
in the dispatch loop, immediately before the catch-all.
- [x] Document the partition in a comment near the top of the opcode list.
**Tests:**
- All existing OCaml VM/CEK tests pass unchanged (zero regression for core).
- Constructed bytecode using opcode 200 raises `Invalid_opcode 200` when no
extension is registered.
**Effort:** small. ~50 lines + tests.
### Phase B — Extension registry module
`hosts/ocaml/lib/sx_vm_extensions.ml` per the sketch above. Pure plumbing, no
opcodes yet. Phase B's module init installs the real `dispatch` into
`Sx_vm.extension_dispatch_ref`, replacing Phase A's stub.
- [ ] `Sx_vm_extension` interface module (handler type, EXTENSION sig).
- [ ] `Sx_vm_extensions` registry module (`register`, `dispatch`,
`id_of_name`, `state_of_extension`).
- [ ] Wire the registry's `dispatch` into `Sx_vm.extension_dispatch_ref` at
module init.
**Tests:**
- Register a test extension with one opcode; dispatch finds it.
- Duplicate opcode-id registration fails at startup.
- `id_of_name` and `state_of_extension` lookups work.
**Effort:** small. ~150 lines + tests.
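
The Phase B tests can be prototyped against a cut-down registry. This sketch is self-contained and uses toy `vm`/`frame` records (assumed field names). The range check in `register` is an addition for illustration, since the signature comment requires IDs in 200-247 but the registry code in this plan only checks for duplicates. `Test_state`, `Test_ext`, `nop`, and `fails` are hypothetical names.

```ocaml
type vm = { mutable stack : int list }
type frame = { mutable ip : int }
type handler = vm -> frame -> unit
type extension_state = ..

module type EXTENSION = sig
  val name : string
  val init : unit -> extension_state
  val opcodes : extension_state -> (int * string * handler) list
end

let by_id : (int, handler) Hashtbl.t = Hashtbl.create 64
let by_name : (string, int) Hashtbl.t = Hashtbl.create 64

(* Same shape as the plan's register, plus a range check (an assumption,
   not part of the plan's code) enforcing the documented 200-247 window. *)
let register (m : (module EXTENSION)) : unit =
  let module M = (val m) in
  let st = M.init () in
  List.iter
    (fun (id, name, h) ->
      if id < 200 || id > 247 then
        failwith (Printf.sprintf "Opcode %d (%s) outside 200-247" id name);
      if Hashtbl.mem by_id id then
        failwith (Printf.sprintf "Opcode %d (%s) already registered" id name);
      Hashtbl.add by_id id h;
      Hashtbl.add by_name name id)
    (M.opcodes st)

(* A trivial test extension with a single no-op opcode. *)
type extension_state += Test_state

let nop : handler = fun _vm frame -> frame.ip <- frame.ip + 1

module Test_ext = struct
  let name = "test_ext"
  let init () = Test_state
  let opcodes _st = [ (200, "test_ext.OP_TEST_NOP", nop) ]
end

(* True when [f] raises Failure, as register does on a rejected opcode. *)
let fails (f : unit -> unit) : bool =
  try f (); false with Failure _ -> true
```

Registering `Test_ext` once succeeds and makes the name resolvable; registering it a second time trips the duplicate check, which is the "fails at startup" case listed in the tests.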
### Phase C — Compiler-side opcode lookup primitive
Expose `extension-opcode-id` as an SX primitive in `hosts/ocaml/lib/`. The
compiler in `lib/compiler.sx` can call it to emit extension opcodes by name.
Does not require any extension to actually exist — the primitive returns
`nil` for unknown names, and the compiler falls back.
- [ ] Register `extension-opcode-id` in `sx_primitives.ml`.
- [ ] Returns `Integer id` when registered, `Nil` otherwise.
**Tests:**
- Primitive returns nil for unknown name.
- After registering a test extension, primitive returns the registered ID.
**Effort:** small. Single primitive registration + compiler-side use docs.
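
A minimal sketch of the primitive's contract follows. The `value` type is a stand-in for the real SX value type; only the `Integer` and `Nil` constructors are named in this plan, and `String` plus the argument-list calling convention are assumptions. How primitives are actually registered in `sx_primitives.ml` is not shown here.

```ocaml
(* Stand-in for the SX value type; String and the list-of-args calling
   convention are assumptions made for this sketch. *)
type value =
  | Nil
  | Integer of int
  | String of string

(* Name table that the registry would populate. *)
let by_name : (string, int) Hashtbl.t = Hashtbl.create 8

(* extension-opcode-id: Integer id for a registered name, Nil otherwise. *)
let extension_opcode_id (args : value list) : value =
  match args with
  | [ String name ] -> (
      match Hashtbl.find_opt by_name name with
      | Some id -> Integer id
      | None -> Nil)
  | _ -> invalid_arg "extension-opcode-id: expected one string argument"
```

Returning `Nil` rather than raising for unknown names is what lets the compiler treat a missing extension as "use the general path" instead of a compile error.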
### Phase D — Test extension demonstrating end-to-end flow
A dummy extension at `hosts/ocaml/lib/extensions/test_ext.ml` registering
one or two trivial opcodes (e.g. `OP_TEST_PUSH_42`, `OP_TEST_DOUBLE_TOS`).
Wired into the build, available when running tests.
Compiler test: write SX that triggers the test compiler-extension to emit
`OP_TEST_PUSH_42`, then verify the VM executes it correctly via
`bytecode-inspect` and `vm-trace`.
- [ ] `test_ext.ml` registers two opcodes.
- [ ] Wired into the build (extensions registered at startup).
- [ ] Bytecode emission via name lookup produces the right ID.
- [ ] `bytecode-inspect` shows the opcode by name.
**Tests:**
- Bytecode emission via name lookup produces the right ID.
- Execution produces the expected stack effect.
- `bytecode-inspect` shows the opcode by name.
- `vm-trace` correctly reports the extension opcode.
**Effort:** small. ~100 lines including build wiring.
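
The end-to-end flow can be rehearsed on a toy stack machine before the real wiring exists. Everything here is a sketch under stated assumptions: the `vm`/`frame` records, the choice of IDs 200 and 201, the operand-free bytecode format, and the `emit`/`run` helpers are hypothetical. Only the two opcode names come from the plan.

```ocaml
type vm = { mutable stack : int list }
type frame = { mutable ip : int }

let push vm v = vm.stack <- v :: vm.stack

let pop vm =
  match vm.stack with
  | v :: rest -> vm.stack <- rest; v
  | [] -> failwith "stack underflow"

(* The two trivial test opcodes; each advances ip past itself. *)
let op_test_push_42 vm frame = push vm 42; frame.ip <- frame.ip + 1
let op_test_double_tos vm frame = push vm (2 * pop vm); frame.ip <- frame.ip + 1

let by_id : (int, vm -> frame -> unit) Hashtbl.t = Hashtbl.create 8
let by_name : (string, int) Hashtbl.t = Hashtbl.create 8

(* IDs 200 and 201 are arbitrary picks from the extension range. *)
let () =
  List.iter
    (fun (id, name, h) ->
      Hashtbl.replace by_id id h;
      Hashtbl.replace by_name name id)
    [ (200, "test_ext.OP_TEST_PUSH_42", op_test_push_42);
      (201, "test_ext.OP_TEST_DOUBLE_TOS", op_test_double_tos) ]

(* Compiler side: resolve an opcode by name (raises Not_found if absent). *)
let emit (name : string) : int = Hashtbl.find by_name name

(* VM side: run operand-free bytecode and return the final stack. *)
let run (code : int array) : int list =
  let vm = { stack = [] } and frame = { ip = 0 } in
  while frame.ip < Array.length code do
    match Hashtbl.find_opt by_id code.(frame.ip) with
    | Some h -> h vm frame
    | None -> failwith "unknown opcode"
  done;
  vm.stack
```

Emitting `OP_TEST_PUSH_42` followed by `OP_TEST_DOUBLE_TOS` by name and passing the result to `run` leaves a single value, 84, on the stack, which covers both the "right ID" and "expected stack effect" tests in miniature.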
### Phase E — JIT awareness (interpreted-only for v1)
The JIT (lazy lambda compilation) currently compiles based on opcode ranges.
Extension opcodes (≥200) should fall through to interpretation, not be
JIT-compiled in v1.
- [ ] Mark extension opcodes as "interpret only" in the JIT pre-analysis.
- [ ] Lambda containing only core opcodes JIT-compiles as before.
- [ ] Lambda containing any extension opcode runs interpreted.
JITing extension opcodes is a follow-up project; v1 keeps the JIT scope
unchanged and just makes it correctly route mixed bytecode.
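
The routing rule reduces to a scan for any opcode at or above 200. The sketch below assumes a simplified stream with one opcode per array element and no inline operands. The real pre-analysis would need to skip operand bytes so that an operand value of 200 or more is not mistaken for an opcode. `plan`, `has_extension_opcode`, and `plan_for` are hypothetical names.

```ocaml
(* Simplifying assumption: each array element is an opcode, with no
   inline operands. Real bytecode would require operand-aware scanning. *)
let has_extension_opcode (ops : int array) : bool =
  Array.exists (fun op -> op >= 200) ops

type plan =
  | Jit_compile     (* core-only lambda: existing fast path *)
  | Interpret_only  (* contains an extension opcode: interpreted in v1 *)

let plan_for (ops : int array) : plan =
  if has_extension_opcode ops then Interpret_only else Jit_compile
```

A lambda body of only core opcodes maps to `Jit_compile`, and any body containing an opcode in the extension range, whether alone or mixed with core opcodes, maps to `Interpret_only`.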
**Tests:**
- Lambda with only core opcodes: JIT-compiled, fast path.
- Lambda with extension opcode: interpreted, correct result.
- Mixed lambda: interpreted, correct result.
**Effort:** small-medium. Requires understanding the JIT's pre-analysis
(per `project_jit_compilation.md` memory: "Lazy JIT implemented: lambda
bodies compiled on first VM call, cached, failures sentinel-marked").
Extension-opcode detection becomes another reason to mark a lambda
"interpret-only."

---
## Acceptance criteria
1. **Phase A-D pass their test suites.**
2. **Zero regression on existing SX VM tests.** All language-port test
suites currently passing on the architecture branch (Erlang 530+, Haskell
285+, Datalog 276+, Smalltalk 625+, the SX core test suite, etc.) still
pass.
3. **Test extension demonstrates the flow end-to-end.** SX source compiles
via the compiler with a registered extension opcode, executes through the
VM via the dispatch fallthrough, returns correct result.
4. **Documentation:** README in `hosts/ocaml/lib/extensions/` explaining the
pattern, with a worked example (the test extension is the canonical one).

After acceptance, the Erlang-on-SX Phase 9 work in `lib/erlang/vm/` can use
this mechanism. The Erlang loop's Blocker for 9a is resolved.

---
## Risk and mitigation
**Risk: regression in core opcode dispatch.** A misplaced `match` arm could
break something. *Mitigation:* run every existing language-port conformance
suite before merging.
**Risk: opcode ID conflicts as more extensions land.** If Erlang Phase 9
claims IDs 200-220 and Haskell wants 215-235, we have a problem.
*Mitigation:* maintain a registry document at
`hosts/ocaml/lib/extensions/README.md` listing claimed ID ranges per
extension. Convention: each extension claims a contiguous block at first
registration; collisions caught at startup with a clear error.
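
The contiguous-block convention makes conflicts checkable from the claims alone, before any opcode is registered. A minimal sketch, assuming each claim is an inclusive `[lo, hi]` range; the `claim` record and both functions are hypothetical helpers, not part of the registry design.

```ocaml
(* An extension's claimed block of opcode IDs, inclusive on both ends. *)
type claim = { ext : string; lo : int; hi : int }

(* Two inclusive ranges overlap when each starts no later than the other ends. *)
let overlaps (a : claim) (b : claim) : bool = a.lo <= b.hi && b.lo <= a.hi

(* Return the first pair of conflicting claims, in list order, if any. *)
let first_conflict (claims : claim list) : (claim * claim) option =
  let rec go = function
    | [] -> None
    | c :: rest -> (
        match List.find_opt (overlaps c) rest with
        | Some d -> Some (c, d)
        | None -> go rest)
  in
  go claims
```

Applied to the example above, claims of 200-220 for Erlang and 215-235 for Haskell are reported as a conflict, whereas 200-220 and 221-235 pass.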
**Risk: extension state types leak through `Obj.magic`.** The extension state
is type-erased in the registry. *Mitigation:* extensions cast in their own
opcode handlers, never expose state to other extensions or the VM core.
First-class modules / GADTs could add more type safety; deferred unless
this becomes a concrete pain point.
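
Because `extension_state` is declared as an extensible variant (`type extension_state = ..`), one lower-risk option is for each extension to add its own constructor and recover its state by pattern matching rather than `Obj.magic`. The sketch below is an assumption about how that could look. The `scheduler` record, the `Erlang_state` constructor, and `erlang_scheduler` are hypothetical.

```ocaml
type extension_state = ..

(* Hypothetical Erlang extension state, wrapped in its own constructor. *)
type scheduler = { mutable next_pid : int }
type extension_state += Erlang_state of scheduler

let states : (string, extension_state) Hashtbl.t = Hashtbl.create 8

(* Recover the typed state by pattern matching; a wrong constructor is a
   checked failure instead of undefined behaviour. *)
let erlang_scheduler () : scheduler =
  match Hashtbl.find_opt states "erlang" with
  | Some (Erlang_state s) -> s
  | Some _ -> failwith "erlang: unexpected state constructor"
  | None -> failwith "erlang: extension not registered"
```

The cost is one constructor match per state access; the benefit is that a name collision or mis-registration surfaces as a `Failure` rather than memory corruption.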
**Risk: extensions become a back door for kernel mutation.** An extension
opcode handler has full access to the VM. *Mitigation:* extensions are
build-time additions, not runtime; they're as trusted as the rest of the
binary. Operators audit at build time, not runtime. Same trust model as
any other compiled-in code.
**Risk: shared `lib/guest/vm/` opcodes evolve under different language
ports' needs.** *Mitigation:* the chiselling discipline (move to guest only
on second use) ensures the shared opcodes are tested against at least two
ports' actual usage before being considered stable.

---
## Open questions
To be resolved during implementation, not blocking design approval:
1. **Multi-byte opcode encoding.** If we need >256 opcodes total, the
leading-byte 248-255 schema accommodates it. Do we need multi-byte at
v1? Probably not — 48 extension opcodes is more than any single port
should reasonably want.
2. **Does extension ordering matter?** If two extensions register opcodes that
read the same VM state, the order of registration could affect
initialization. Probably not in practice; flag if it bites.
3. **Hot-reload of extensions.** Out of scope for v1 (per non-goals). If
wanted later, the registry would need teardown + re-registration; the
`gen_server` `code_change/3` model from Erlang Phase 7 is a precedent.
4. **Cross-extension opcode composition.** Can `guest_vm.OP_PERFORM` invoke
`erlang.OP_RECEIVE_SCAN`? In principle yes — handlers can do anything.
The interface is clean; the question is whether we want any conventions
to keep ergonomics tractable. Defer until composition appears in
practice.
---
## Implementation roadmap and sequencing
This is a sister workstream to `loops/erlang`. Driven by Erlang Phase 9.
Single bounded loop on `loops/sx-vm-extensions`, ~1-2 weeks.
Recommended sequencing (one phase per loop fire):
1. **Phase A** — dispatch fallthrough. Smallest viable change to `sx_vm.ml`.
2. **Phase B** — extension registry module.
3. **Phase C** — compiler-side opcode lookup primitive.
4. **Phase D** — test extension demonstrating end-to-end flow.
5. **Phase E** — JIT awareness (interpret-only routing).
After acceptance:
- **`hosts/ocaml/lib/extensions/erlang.ml`** becomes the *first real
consumer* — written by whoever takes over from the Erlang loop's stub
dispatcher in `lib/erlang/vm/dispatcher.sx`. That's the integration
moment that closes the loop.

Estimated total effort: 1-2 weeks for one focused engineer with OCaml SX VM
familiarity.

---
## Relationship to other plans
- **`plans/erlang-on-sx.md` Phase 9:** unblocked by this work. Erlang loop
develops opcodes against a stub dispatcher in `lib/erlang/vm/`; once this
mechanism lands, swap stub for real registration via
`hosts/ocaml/lib/extensions/erlang.ml`.
- **`plans/fed-sx-design.md` §17.5:** documents this as Layer-1 prerequisite.
The shared-opcode discipline (lib/guest/vm/) is designed on top of this
mechanism's namespace allocation.
- **Future language ports (Haskell, Datalog, Smalltalk perf phases):** will
use the same mechanism. Each adds an extension module, claims an opcode
range, registers handlers. The `lib/guest/vm/` opcodes get
cross-referenced when the second port's needs justify chiselling.
- **JIT roadmap (per `project_jit_architecture.md` memory):** extension
opcodes are interpreted in v1. JITing them is a logical follow-up but
a separate project.
---
## Progress log
Newest first.
- **2026-05-14** — Phase A done. Added `Invalid_opcode of int` exception,
`extension_dispatch_ref` (default raises `Invalid_opcode op`), and the
`| op when op >= 200 -> !extension_dispatch_ref op vm frame` arm before the
catch-all in `sx_vm.ml`. Partition comment documents 1-199 core / 200-247
extensions / 248-255 reserved (current core ceiling is OP_DEC = 175).
4 new foundation tests (3 × Invalid_opcode for opcodes 200/224/247, 1 ×
Eval_error for opcode 199 to pin the threshold). Foundation 64/64;
full OCaml test suite +4 pass vs baseline (4807 vs 4803), 1111 pre-existing
failures unchanged. Conformance suites green: erlang 530/530, haskell
285/285, datalog 276/276, prolog 590/590, smalltalk 847/847, common-lisp
305/305, apl 562/562, js 148/148, forth 632/638 (pre-existing), tcl 3/4
(pre-existing), ocaml-on-sx unit 607/607. (Lua 0/16 and ocaml-conformance
baseline programs not exercised — pre-existing scoreboard state and
multi-hour runtime respectively.)