# SX VM Opcode Extension Mechanism
Mechanism in `hosts/ocaml/evaluator/` that lets language ports register
specialized bytecode opcodes without modifying the SX VM core. Direct
prerequisite for **erlang-on-sx Phase 9** (the BEAM analog) and a structural
enabler for any future language port that wants performance-critical opcodes.
Reference: `plans/erlang-on-sx.md` Phase 9, `plans/fed-sx-design.md` §17.5,
`hosts/ocaml/lib/sx_vm.ml` (current VM).
Status: **design** — implementation pending. Sister workstream to the
`loops/erlang` loop, but lives in `hosts/`, not `lib/erlang/`.
---
## Goal
Allow language ports to register custom bytecode opcodes in the SX VM, with:
- **Zero overhead for core opcodes.** Existing 37 opcodes (per `sx_vm.ml`)
must dispatch identically. No regression for any existing language port or
the core SX runtime.
- **One additional dispatch step for extension opcodes.** Acceptable cost; the
win comes from avoiding the general CEK machinery.
- **Per-extension state slot.** Erlang's process scheduler, Haskell's thunk
cache, etc. need somewhere to hang state alongside the VM.
- **Compiler awareness.** The bytecode compiler (`lib/compiler.sx`) must be
able to emit extension opcodes by name, looked up against the registered
set.
- **JIT compatibility.** Existing JIT (lazy lambda compilation) continues to
work for code paths using only core opcodes. Extension opcodes are
interpreted in v1; JITing them is a follow-up.
## Non-goals
- **Hot opcode reload.** Adding/replacing opcodes mid-runtime is not in
scope. Extensions are compile-time additions to the OCaml binary. (If
needed, that's a separate project.)
- **Per-instance opcode sets.** All running instances of the SX VM share
the same opcode set determined at build time. Selective opcode loading
per instance is out of scope.
- **Opcode hot-swap or supersession.** Once registered, opcodes are stable
for the lifetime of the binary.
- **Language-port isolation at the dispatch layer.** Two language ports can
see each other's opcodes (they share the dispatch table). Isolation is a
build-time concern — don't compile in extensions you don't trust.
---
## Why now
The Erlang-on-SX Phase 9 work needs this. Without it, Phase 9b-9g (the actual
opcode implementations) have nowhere to plug in. The Erlang loop will hit
this dependency as a Blocker; this design is what unblocks it.
It also enables the **shared opcode pattern** discussed in
`plans/fed-sx-design.md` §17.5: opcodes Erlang Phase 9 produces that other ports could
plausibly use (pattern match, perform/handle, record access) get chiselled
out to `lib/guest/vm/` when a second port has an actual second use. Without
the extension mechanism, each port would have to fork the SX VM core or
modify shared dispatch — neither acceptable.
---
## Architectural overview
```
┌──────────────────────────────────────────┐
│  SX VM core (hosts/ocaml/lib/sx_vm.ml)   │
│                                          │
│  ┌────────────────────────────────────┐  │
│  │ Bytecode dispatch loop             │  │
│  │                                    │  │
│  │   match op with                    │  │
│  │   | 1 (OP_CONST) -> ...            │  │
│  │   | 2 (OP_NIL) -> ...              │  │
│  │   | ...                            │  │
│  │   | 127 -> ... (last core slot)    │  │
│  │   | op when op >= 128 ->           │  │
│  │       Extensions.dispatch op vm    │  │  ◄── new
│  │         frame                      │  │
│  └────────────────────────────────────┘  │
│                                          │
│  ┌────────────────────────────────────┐  │
│  │ Extension registry                 │  │
│  │   opcode_id -> handler             │  │  ◄── new
│  │   opcode_name -> opcode_id         │  │
│  │   extension_state per extension    │  │
│  └────────────────────────────────────┘  │
└──────────────────────────────────────────┘
                     │ register at startup
┌────────────────────┴─────────────────────┐
│  Extension modules                       │
│    hosts/ocaml/extensions/erlang.ml      │
│    hosts/ocaml/extensions/haskell.ml     │
│    hosts/ocaml/extensions/datalog.ml     │
│    hosts/ocaml/extensions/guest_vm.ml    │  ◄── shared opcodes
└──────────────────────────────────────────┘
```
### Opcode ID space partition
Current SX VM uses opcode IDs in roughly the range 1-162 (per inspection of
`sx_vm.ml`). We partition the 0-255 space:
| Range | Use |
|-------|-----|
| 0 | reserved / NOP |
| 1-127 | **core opcodes** — owned by the SX VM, locked schema |
| 128-199 | **`lib/guest/vm/` shared opcodes** — chiselled-out shared opcodes |
| 200-247 | **language-port opcodes** — registered by extensions |
| 248-255 | reserved for future expansion / multi-byte opcodes |
This gives 72 slots for shared opcodes (Phase 1-2 of `lib/guest/vm/` will
not exhaust this; we can renegotiate if it does), 48 slots for language-port
opcodes (shared by every port compiled into a given binary), and a clean
separation that makes it obvious which opcodes are stable (core), shared
(guest), or port-specific (extension).
If we need more than 256 opcodes total, multi-byte opcodes (a leading 248-255
byte plus a second byte) extend the space without breaking the schema.
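The partition can be written down as a small classifier, which also makes the slot counts easy to check. This is only an illustrative sketch: the `tier` type and the function names are hypothetical and do not come from `sx_vm.ml`.

```ocaml
(* Tiers of the 0-255 opcode space, following the partition table.
   Type and constructor names are hypothetical. *)
type tier =
  | Reserved_nop     (* 0 *)
  | Core             (* 1-127 *)
  | Guest_shared     (* 128-199 *)
  | Language_port    (* 200-247 *)
  | Reserved_future  (* 248-255 *)

(* Classify a raw opcode byte. Returns None outside 0-255. *)
let tier_of_opcode (op : int) : tier option =
  if op < 0 || op > 255 then None
  else if op = 0 then Some Reserved_nop
  else if op <= 127 then Some Core
  else if op <= 199 then Some Guest_shared
  else if op <= 247 then Some Language_port
  else Some Reserved_future

(* Number of IDs in an inclusive range. *)
let slots lo hi = hi - lo + 1

let () =
  Printf.printf "guest slots: %d, port slots: %d\n"
    (slots 128 199) (slots 200 247)
```

A check like `tier_of_opcode` could also back the range validation that the extension signature asks for at registration time.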
### Extension module signature
```ocaml
(* hosts/ocaml/lib/sx_vm_extension.ml *)

(** A handler for an extension opcode. Reads operands from bytecode,
    manipulates the VM stack, updates the frame's instruction pointer.
    May raise exceptions (which propagate via the existing VM error path). *)
type handler = vm -> frame -> unit

(** State an extension carries alongside the VM. Opaque to the VM core;
    extensions cast as needed. *)
type extension_state = ..

module type EXTENSION = sig
  (** Stable name for this extension (e.g. "erlang", "guest_vm"). *)
  val name : string

  (** Initialize per-instance state. Called once when the VM starts and the
      extension is loaded. *)
  val init : unit -> extension_state

  (** Opcodes this extension provides. Each is (opcode_id, opcode_name, handler).
      opcode_id must be in the range allowed for this extension's tier
      (128-199 for guest, 200-247 for ports). Conflicts cause startup failure. *)
  val opcodes : extension_state -> (int * string * handler) list
end
```
### Registration and dispatch
```ocaml
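(* hosts/ocaml/lib/sx_vm_extensions.ml *)
open Sx_vm_extension

(* Raised when no handler is registered for an opcode ID. Declared here in
   this sketch, since sx_vm.ml calls into this module. *)
exception Invalid_opcode of int

(* Registered extension modules, most recent first. *)
let extensions : (module EXTENSION) list ref = ref []

(* Extension name -> state returned by its init. *)
let states : (string, extension_state) Hashtbl.t = Hashtbl.create 8

(* Opcode ID -> handler, and opcode name -> opcode ID. *)
let by_id : (int, handler) Hashtbl.t = Hashtbl.create 64
let by_name : (string, int) Hashtbl.t = Hashtbl.create 64

(* Initialize an extension and record its opcodes. A duplicate opcode ID
   aborts startup. *)
let register (m : (module EXTENSION)) =
  let module M = (val m) in
  let st = M.init () in
  Hashtbl.add states M.name st;
  List.iter
    (fun (id, name, h) ->
      if Hashtbl.mem by_id id then
        failwith (Printf.sprintf "Opcode %d (%s) already registered" id name);
      Hashtbl.add by_id id h;
      Hashtbl.add by_name name id)
    (M.opcodes st);
  extensions := m :: !extensions

(* Called from the VM's fallthrough arm for opcodes outside the core range. *)
let dispatch op vm frame =
  match Hashtbl.find_opt by_id op with
  | Some handler -> handler vm frame
  | None -> raise (Invalid_opcode op)

let id_of_name name = Hashtbl.find_opt by_name name
let state_of_extension name = Hashtbl.find_opt states name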
```
The dispatch path adds **one hashtable lookup per extension opcode**.
That cost is acceptable: Erlang's specialized opcodes are expected to win by
more than 100× over the general CEK machine, so the lookup is negligible by
comparison.
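To make the cost split concrete, here is a minimal sketch of a dispatch loop with the fallthrough arm. The `vm` and `frame` records, the meaning of opcode `1`, and the handler registered at ID 200 are all toy assumptions for illustration; the real types live in `sx_vm.ml`.

```ocaml
(* Toy stand-ins for the real VM types; hypothetical. *)
type vm = { mutable stack : int list }
type frame = { code : int array; mutable ip : int }

exception Invalid_opcode of int

(* Extension handlers, keyed by opcode ID. *)
let by_id : (int, vm -> frame -> unit) Hashtbl.t = Hashtbl.create 64

let ext_dispatch op vm frame =
  match Hashtbl.find_opt by_id op with
  | Some h -> h vm frame
  | None -> raise (Invalid_opcode op)

(* Execute one instruction. Core opcodes are matched directly; only IDs
   of 128 and above pay for the table lookup. *)
let step vm frame =
  match frame.code.(frame.ip) with
  | 1 ->
    (* hypothetical core op: push the next byte as a constant *)
    vm.stack <- frame.code.(frame.ip + 1) :: vm.stack;
    frame.ip <- frame.ip + 2
  | op when op >= 128 -> ext_dispatch op vm frame
  | op -> raise (Invalid_opcode op)

let run vm frame =
  while frame.ip < Array.length frame.code do
    step vm frame
  done

(* Hypothetical extension opcode 200: negate the top of the stack. *)
let () =
  Hashtbl.replace by_id 200 (fun vm frame ->
    (match vm.stack with
     | x :: rest -> vm.stack <- (-x) :: rest
     | [] -> failwith "stack underflow");
    frame.ip <- frame.ip + 1)
```

Running the bytecode `[| 1; 7; 200 |]` through `run` pushes 7 via the core arm and then negates it via the table. The core arm never touches the hashtable, which is the property Phase A has to preserve.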
### Bytecode compiler integration
The compiler (`lib/compiler.sx`) needs to know extension opcode IDs to emit
them. New SX primitive exposed to the compiler:
```sx
(extension-opcode-id "erlang.OP_PATTERN_TUPLE_2") ; → 200, or nil if not loaded
```
When the compiler wants to emit a specialized opcode, it queries by name. If
the extension isn't loaded, the compiler falls back to the general path
(emit a `CALL_PRIM` or general SX `case`). This means a language port's
optimization is opt-in per build, and missing extensions degrade to slower
correct execution rather than failure.
Naming convention: `<extension-name>.OP_<NAME>`. So `erlang.OP_PATTERN_TUPLE_2`,
`guest_vm.OP_PERFORM`, etc.
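The lookup-then-fallback decision can be sketched as follows. The compiler itself is written in SX; this sketch uses OCaml only to stay in one language with the other examples. The `instr` type, `qualified`, `emit_specialized`, and the fallback primitive name `"erlang-match-tuple"` are hypothetical.

```ocaml
(* Hypothetical instruction type for the sketch. *)
type instr =
  | Ext of int            (* extension opcode, by numeric ID *)
  | Call_prim of string   (* general fallback path *)

(* Stand-in for the registry's name table. *)
let by_name : (string, int) Hashtbl.t = Hashtbl.create 64
let id_of_name name = Hashtbl.find_opt by_name name

(* Build a name following the <extension-name>.OP_<NAME> convention. *)
let qualified ~ext ~op = ext ^ ".OP_" ^ op

(* Emit the specialized opcode when its extension is loaded, otherwise
   fall back to the slower general path. *)
let emit_specialized ~ext ~op ~fallback =
  match id_of_name (qualified ~ext ~op) with
  | Some id -> Ext id
  | None -> Call_prim fallback

let () =
  (* Extension not loaded: the compiler degrades to the general path. *)
  assert (emit_specialized ~ext:"erlang" ~op:"PATTERN_TUPLE_2"
            ~fallback:"erlang-match-tuple"
          = Call_prim "erlang-match-tuple");
  (* Once the name is registered, the same call emits the opcode. *)
  Hashtbl.replace by_name "erlang.OP_PATTERN_TUPLE_2" 200;
  assert (emit_specialized ~ext:"erlang" ~op:"PATTERN_TUPLE_2"
            ~fallback:"erlang-match-tuple"
          = Ext 200)
```

Because the fallback is chosen at emit time, bytecode compiled without an extension never contains that extension's IDs.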
### Per-extension state access
Some opcodes need state beyond the VM stack (Erlang's scheduler, mailbox
state, etc.). Extensions store state in their `init`-returned value, accessed
via `state_of_extension`:
```ocaml
let op_spawn vm frame =
  let st = Sx_vm_extensions.state_of_extension "erlang"
           |> Option.get
           |> Obj.magic in (* extension casts to its known type *)
  let body = pop vm in
  let pid = Erlang_scheduler.spawn st body in
  push vm (pid_value pid);
  frame.ip <- frame.ip + 1
```
Shared scheduler state lives in the Erlang extension's state value. Other
extensions don't see it.
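Since `extension_state` is declared as an extensible variant (`= ..`), one possible alternative to the `Obj.magic` cast is for each extension to add its own constructor and pattern-match on it. The sketch below assumes that approach; the `scheduler` record, `Erlang_state`, and `spawn` are hypothetical placeholders, not the Erlang port's actual types.

```ocaml
type extension_state = ..

(* Stand-in for the registry's state table. *)
let states : (string, extension_state) Hashtbl.t = Hashtbl.create 8
let state_of_extension name = Hashtbl.find_opt states name

(* Hypothetical Erlang-side state: just a pid counter. *)
type scheduler = { mutable next_pid : int }

(* The extension adds its own constructor to the open type. *)
type extension_state += Erlang_state of scheduler

(* Recover the typed state by pattern matching rather than casting. *)
let erlang_scheduler () =
  match state_of_extension "erlang" with
  | Some (Erlang_state s) -> s
  | Some _ -> failwith "erlang: unexpected state constructor"
  | None -> failwith "erlang: extension not registered"

(* Allocate the next pid from the shared scheduler state. *)
let spawn () =
  let s = erlang_scheduler () in
  let pid = s.next_pid in
  s.next_pid <- pid + 1;
  pid

let () = Hashtbl.replace states "erlang" (Erlang_state { next_pid = 0 })
```

Other extensions can still look up the `"erlang"` entry, but without access to the `Erlang_state` constructor they cannot inspect what is inside it.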
---
## Phase plan
Five sub-phases in dependency order. Each is testable in isolation.
### Phase A — Opcode ID partition + dispatch fallthrough
Smallest viable change to `sx_vm.ml`:
- Add the `| op when op >= 128 -> Sx_vm_extensions.dispatch op vm frame`
fallthrough case.
- Document the partition in a comment at the top of the opcode list.
**Tests:**
- All existing SX VM tests pass unchanged (zero regression for core).
- Calling `dispatch 200 ...` with no extension registered raises
`Invalid_opcode 200`.
**Effort:** small. ~50 lines + tests.
### Phase B — Extension registry module
`hosts/ocaml/lib/sx_vm_extensions.ml` per the sketch above. Pure plumbing, no
opcodes yet.
**Tests:**
- Register a test extension with one opcode; dispatch finds it.
- Duplicate opcode-id registration fails at startup.
- `id_of_name` and `state_of_extension` lookups work.
**Effort:** small. ~150 lines + tests.
### Phase C — Compiler-side opcode lookup primitive
Expose `extension-opcode-id` as an SX primitive in `hosts/ocaml/lib/`. The
compiler in `lib/compiler.sx` can call it to emit extension opcodes by name.
Does not require any extension to actually exist — the primitive returns
`nil` for unknown names, and the compiler falls back.
**Tests:**
- Primitive returns nil for unknown name.
- After registering a test extension, primitive returns the registered ID.
**Effort:** small. Single primitive registration + compiler-side use docs.
### Phase D — Test extension demonstrating end-to-end flow
A dummy extension at `hosts/ocaml/extensions/test_ext.ml` registering one or
two trivial opcodes (e.g. `OP_TEST_PUSH_42`, `OP_TEST_DOUBLE_TOS`). Wired
into the build, available when running tests.
Compiler test: write SX that triggers the test compiler-extension to emit
`OP_TEST_PUSH_42`, then verify the VM executes it correctly via
`bytecode-inspect` and `vm-trace`.
**Tests:**
- Bytecode emission via name lookup produces the right ID.
- Execution produces the expected stack effect.
- `bytecode-inspect` shows the opcode by name.
- `vm-trace` correctly reports the extension opcode.
**Effort:** small. ~100 lines including build wiring.
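A test extension along these lines might look like the sketch below. It is self-contained, so the VM types and the registry are reduced to toy versions; the opcode IDs 200 and 201, the integer stack, and `run_named` are assumptions made for the example.

```ocaml
(* Toy stand-ins for the real VM types; hypothetical. *)
type vm = { mutable stack : int list }
type frame = { mutable ip : int }
type handler = vm -> frame -> unit
type extension_state = ..

module type EXTENSION = sig
  val name : string
  val init : unit -> extension_state
  val opcodes : extension_state -> (int * string * handler) list
end

type extension_state += Test_state

module Test_ext : EXTENSION = struct
  let name = "test_ext"
  let init () = Test_state

  (* Push the constant 42 and advance past the opcode. *)
  let push_42 (vm : vm) (frame : frame) =
    vm.stack <- 42 :: vm.stack;
    frame.ip <- frame.ip + 1

  (* Double the value on top of the stack. *)
  let double_tos (vm : vm) (frame : frame) =
    (match vm.stack with
     | x :: rest -> vm.stack <- (2 * x) :: rest
     | [] -> failwith "OP_TEST_DOUBLE_TOS: empty stack");
    frame.ip <- frame.ip + 1

  (* IDs 200 and 201 are assumed; any free slot in 200-247 would do. *)
  let opcodes _st =
    [ (200, "test_ext.OP_TEST_PUSH_42", push_42);
      (201, "test_ext.OP_TEST_DOUBLE_TOS", double_tos) ]
end

(* Minimal registry, without the duplicate checks of the full sketch. *)
let by_id : (int, handler) Hashtbl.t = Hashtbl.create 8
let by_name : (string, int) Hashtbl.t = Hashtbl.create 8

let register (m : (module EXTENSION)) =
  let module M = (val m) in
  List.iter
    (fun (id, name, h) ->
      Hashtbl.replace by_id id h;
      Hashtbl.replace by_name name id)
    (M.opcodes (M.init ()))

(* Resolve each opcode by name, run it, and return (stack, ip). *)
let run_named names =
  let vm = { stack = [] } and frame = { ip = 0 } in
  List.iter
    (fun n -> (Hashtbl.find by_id (Hashtbl.find by_name n)) vm frame)
    names;
  (vm.stack, frame.ip)

let () = register (module Test_ext)
```

Running `OP_TEST_PUSH_42` followed by `OP_TEST_DOUBLE_TOS` through `run_named` covers both the name lookup and the stack effect that the Phase D tests ask for.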
### Phase E — JIT awareness (interpreted-only for v1)
The JIT (lazy lambda compilation) currently compiles based on opcode ranges.
Extension opcodes (≥128) should fall through to interpretation, not be
JIT-compiled in v1.
- Mark extension opcodes as "interpret only" in the JIT pre-analysis.
- A lambda containing only core opcodes JIT-compiles as before.
- A lambda containing any extension opcode runs interpreted.
JITing extension opcodes is a follow-up project; v1 keeps the JIT scope
unchanged and just makes it correctly route mixed bytecode.
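The pre-analysis check could be as small as the scan sketched here. One subtlety is that operand bytes must be skipped, or an operand value of 128 or more would be mistaken for an extension opcode. The `operand_len` parameter, the `plan` type, and the demo operand table are hypothetical; the real JIT pre-analysis may be structured differently.

```ocaml
(* True if any opcode in [code] is an extension opcode (ID >= 128).
   [operand_len op] gives the number of operand bytes following [op];
   it is a hypothetical hook so that operands are not read as opcodes. *)
let uses_extension_opcode ~(operand_len : int -> int) (code : int array) : bool =
  let n = Array.length code in
  let rec scan ip =
    if ip >= n then false
    else
      let op = code.(ip) in
      if op >= 128 then true
      else scan (ip + 1 + operand_len op)
  in
  scan 0

type plan = Jit_compile | Interpret_only

(* Route a lambda body: any extension opcode forces interpretation. *)
let plan_for ~operand_len code =
  if uses_extension_opcode ~operand_len code then Interpret_only
  else Jit_compile

(* Demo operand table: opcode 1 takes one operand byte, others none. *)
let demo_operand_len = function 1 -> 1 | _ -> 0
```

With `demo_operand_len`, the body `[| 1; 200; 2 |]` stays JIT-eligible because 200 is an operand there, while `[| 1; 5; 200 |]` is marked interpret-only.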
**Tests:**
- Lambda with only core opcodes: JIT-compiled, fast path.
- Lambda with extension opcode: interpreted, correct result.
- Mixed lambda: interpreted, correct result.
**Effort:** small-medium. Requires understanding the JIT's pre-analysis
(per `project_jit_compilation.md` memory: "Lazy JIT implemented: lambda
bodies compiled on first VM call, cached, failures sentinel-marked").
Extension-opcode detection becomes another reason to mark a lambda
"interpret-only."
---
## Acceptance criteria
1. **Phase A-D pass their test suites.**
2. **Zero regression on existing SX VM tests.** All language-port test
suites currently passing on the architecture branch (Erlang 530+, Haskell
285+, Datalog 276+, Smalltalk 625+, the SX core test suite, etc.) still
pass.
3. **Test extension demonstrates the flow end-to-end.** SX source compiles
via the compiler with a registered extension opcode, executes through the
VM via the dispatch fallthrough, returns correct result.
4. **Documentation:** README in `hosts/ocaml/extensions/` explaining the
pattern, with a worked example (the test extension is the canonical one).
After acceptance, the Erlang-on-SX Phase 9 work in `lib/erlang/vm/` can use
this mechanism. The Erlang loop's Blocker for 9a is resolved.
---
## Risk and mitigation
**Risk: regression in core opcode dispatch.** A misplaced `match` arm could
break something. *Mitigation:* run every existing language-port test suite
before merging. The cost of this verification is real — probably an hour of
machine time — but cheaper than discovering it after the fact.
**Risk: opcode ID conflicts as more extensions land.** If Erlang Phase 9
claims IDs 200-220 and Haskell wants 215-235, we have a problem.
*Mitigation:* maintain a registry document at
`hosts/ocaml/extensions/README.md` listing claimed ID ranges per extension. Convention: each
extension claims a contiguous block at first registration; collisions caught
at startup with a clear error.
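A range-claim check of this kind can be sketched in a few lines, for example as a build-time or test-time lint over the ranges listed in the README. The `claim` record and the function names are hypothetical.

```ocaml
(* A contiguous block of opcode IDs claimed by an extension (inclusive). *)
type claim = { ext : string; lo : int; hi : int }

(* Two inclusive ranges overlap when each starts before the other ends. *)
let overlaps a b = a.lo <= b.hi && b.lo <= a.hi

(* Every pair of extensions whose claimed blocks overlap, in list order. *)
let conflicts (claims : claim list) : (string * string) list =
  let rec go acc = function
    | [] -> List.rev acc
    | c :: rest ->
      let hits = List.filter (overlaps c) rest in
      go (List.rev_append (List.map (fun h -> (c.ext, h.ext)) hits) acc) rest
  in
  go [] claims

(* True when a claim stays inside the language-port range 200-247. *)
let in_port_range c = c.lo >= 200 && c.hi <= 247 && c.lo <= c.hi
```

Applied to the scenario above (Erlang at 200-220, Haskell at 215-235), `conflicts` reports the single pair `("erlang", "haskell")`.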
**Risk: extension state types leak through `Obj.magic`.** The extension state
is type-erased in the registry. *Mitigation:* extensions cast in their own
opcode handlers, never expose state to other extensions or the VM core.
First-class modules / GADTs could add more type safety; deferred unless
this becomes a concrete pain point.
**Risk: extensions become a back door for kernel mutation.** An extension
opcode handler has full access to the VM. *Mitigation:* extensions are
build-time additions, not runtime; they're as trusted as the rest of the
binary. Operators audit at build time, not runtime. Same trust model as
any other compiled-in code.
**Risk: shared `lib/guest/vm/` opcodes evolve under different language
ports' needs.** *Mitigation:* the chiselling discipline (move to guest only
on second use) ensures the shared opcodes are tested against at least two
ports' actual usage before being considered stable.
---
## Open questions
To be resolved during implementation, not blocking design approval:
1. **Multi-byte opcode encoding.** If we need >256 opcodes total, the
leading-byte 248-255 schema accommodates it. Do we need multi-byte at
v1? Probably not — 200+ opcodes per port is more than any port should
reasonably want.
2. **Does extension ordering matter?** If two extensions register opcodes that
read the same VM state, ordering of registration could matter for
initialization. Probably not in practice; flag if it bites.
3. **Hot-reload of extensions.** Out of scope for v1 (per non-goals). If
wanted later, the registry would need teardown + re-registration; the
`gen_server` `code_change/3` model from Erlang Phase 7 is a precedent.
4. **Cross-extension opcode composition.** Can `guest_vm.OP_PERFORM` invoke
`erlang.OP_RECEIVE_SCAN`? In principle yes — handlers can do anything.
The interface is clean; the question is whether we want any conventions
to keep ergonomics tractable. Defer until composition appears in
practice.
---
## Implementation roadmap and sequencing
This is a sister workstream to `loops/erlang`. Probably best as a single
focused session (not a continuous loop — the work is bounded, ~1-2 weeks
of focused effort, not iterative).
Recommended sequencing:
1. **A + B + C land together** as a single PR — they're tightly coupled and
easier to test as a unit. Branch: `loops/sx-vm-extensions` or similar.
2. **D follows** in a second PR; demonstrates the end-to-end flow without
committing to any real language port's opcode design.
3. **E (JIT integration)** as a third PR, once the basic mechanism is
battle-tested.
4. **Extension scope check:** verify Erlang's Phase 9 sub-phases 9b-9g can
actually use this mechanism. If gaps surface, they're addressable
incrementally.
5. **`hosts/ocaml/extensions/erlang.ml`** then becomes the *first real
consumer* — written by whoever takes over from the Erlang loop's stub
dispatcher. That's the integration moment that closes the loop.
Estimated total effort: 1-2 weeks for one focused engineer with OCaml SX VM
familiarity. Much less if the implementer already knows `sx_vm.ml`.
---
## Relationship to other plans
- **`plans/erlang-on-sx.md` Phase 9:** unblocked by this work. Erlang loop
develops opcodes against a stub dispatcher in `lib/erlang/vm/`; once this
mechanism lands, swap stub for real registration via
`hosts/ocaml/extensions/erlang.ml`.
- **`plans/fed-sx-design.md` §17.5:** documents this as a Layer-1 prerequisite.
The shared-opcode discipline (`lib/guest/vm/`) builds on the 128-199 ID
range this mechanism reserves for shared opcodes.
- **Future language ports (Haskell, Datalog, Smalltalk perf phases):** will
use the same mechanism. Each adds an extension module, claims an opcode
range, registers handlers. The `lib/guest/vm/` opcodes get
cross-referenced when the second port's needs justify chiselling.
- **JIT roadmap (per `project_jit_architecture.md` memory):** extension
opcodes are interpreted in v1. JITing them is a logical follow-up but
a separate project.