vm-ext: bootstrap loops/sx-vm-extensions plan + loop briefing
plans/sx-vm-opcode-extension.md ports over from loops/erlang (f6a68656)
with the opcode partition adjusted to match real VM usage: 1-199 core
(current ceiling 175 = OP_DEC), 200-247 extensions, 248-255 reserved.
plans/agent-briefings/sx-vm-extensions-loop.md captures the per-fire
workflow and ground rules.
86
plans/agent-briefings/sx-vm-extensions-loop.md
Normal file
@@ -0,0 +1,86 @@
# sx-vm-extensions loop agent

Role: drives `plans/sx-vm-opcode-extension.md` to completion. One phase per
fire (A → B → C → D → E). Bounded loop — after Phase E acceptance, the loop
is done.

```
description: sx-vm-extensions queue loop
subagent_type: general-purpose
run_in_background: true
isolation: worktree (already on loops/sx-vm-extensions)
```

## What this loop is for

Mechanism in `hosts/ocaml/lib/` that lets language ports register specialized
bytecode opcodes without modifying the SX VM core. Direct prerequisite for
**erlang-on-sx Phase 9** (the BEAM analog) and a structural enabler for any
future language port that wants performance-critical opcodes.

## The queue

Per `plans/sx-vm-opcode-extension.md`, in order:

- **Phase A** — Opcode ID partition + dispatch fallthrough in `sx_vm.ml`.
  Add `Invalid_opcode of int` exception, `extension_dispatch_ref`, the
  `| op when op >= 200 -> !extension_dispatch_ref op vm frame` arm, and a
  partition comment near the opcode list.
- **Phase B** — Extension registry module (`sx_vm_extensions.ml`).
  `register`, `dispatch`, `id_of_name`, `state_of_extension`. Wire dispatch
  into Phase A's ref at module init.
- **Phase C** — Compiler-side opcode lookup primitive (`extension-opcode-id`).
- **Phase D** — Test extension at `hosts/ocaml/lib/extensions/test_ext.ml`,
  end-to-end SX → bytecode → VM dispatch flow.
- **Phase E** — JIT awareness: extension opcodes mark a lambda as
  interpret-only.

## Per-fire workflow (hard)

1. Read `plans/sx-vm-opcode-extension.md` — find the first un-ticked phase.
2. Implement the phase (only files in `hosts/ocaml/**` and the plan file).
3. Build via `sx_build target=ocaml`.
4. Run regression: every existing language-port conformance suite plus
   the OCaml unit tests. Each suite lives at `lib/<lang>/conformance.sh`;
   there were 13 at last count (apl, common-lisp, datalog, erlang, forth,
   guest, haskell, js, lua, ocaml, prolog, smalltalk, tcl).
5. If green, commit (short factual message — `vm-ext: phase A — dispatch
   fallthrough` style).
6. Tick the `[ ]` for the completed phase in the plan, append one dated
   line to the Progress log (newest first).
7. Stop. Wait for the next fire.

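Step 1 of the workflow can be pictured as a scan over the plan file. The sketch below is illustrative only: `first_unticked_phase` is a hypothetical helper, not something in the repo, and it assumes each phase is a `### Phase ...` heading followed by `- [ ]` / `- [x]` checklist lines.

```ocaml
(* Hypothetical helper, not part of the repo: finds the heading of the
   first phase that still has an un-ticked "- [ ]" checklist item. *)
let starts_with ~prefix s =
  let n = String.length prefix in
  String.length s >= n && String.sub s 0 n = prefix

let first_unticked_phase (lines : string list) : string option =
  let rec go current = function
    | [] -> None
    | line :: rest ->
      let line = String.trim line in
      if starts_with ~prefix:"### Phase " line then
        (* Remember which phase the following checklist belongs to. *)
        go (Some line) rest
      else if starts_with ~prefix:"- [ ]" line then
        (match current with
         | Some heading -> Some heading
         | None -> go current rest) (* checkbox outside any phase: skip *)
      else go current rest
  in
  go None lines
```

Returning `None` corresponds to the done condition: every phase is ticked and the loop stops.
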
## Ground rules (hard)

- **Scope:** only `hosts/ocaml/**` and `plans/sx-vm-opcode-extension.md`.
  Do **not** edit `lib/<lang>/**`, `spec/**`, `shared/**`, or any other
  language port's tests.
- **One phase per fire.** Don't combine phases even if a phase looks small.
  The point of the loop is incremental commits.
- **Commit locally only.** Do **not** push. Do **not** touch `main`.
- **Worktree:** you are on `loops/sx-vm-extensions` in
  `/root/rose-ash-loops/sx-vm-extensions`.
- **OCaml SX VM gotchas:**
  - `vm` and `frame` types are defined in `sx_vm.ml`, not `sx_types.ml`.
    Forward refs (like the existing `jit_compile_ref` pattern) are how
    sibling modules avoid circular dependency.
  - Current core opcode ceiling is 175 (OP_DEC). The extension threshold
    is 200, leaving 24 spare slots (176-199) for future core opcodes.
  - JIT compilation is lazy per-lambda. See `project_jit_compilation.md`
    in memory for the cache + sentinel pattern.
- **SX edits:** use `sx-tree` MCP tools only. No SX edits are expected for
  this loop, but the rule applies if one becomes necessary.
- **OCaml edits:** Edit/Write tools are fine — these aren't `.sx` files.

## Done condition

Phase E acceptance: all 13 (or however many exist at the time) language-port
conformance suites pass, OCaml unit tests pass, the test extension from
Phase D demonstrates end-to-end flow including JIT routing. Loop is
complete; mark and stop.

## After acceptance

Hand off to the Erlang loop: `hosts/ocaml/lib/extensions/erlang.ml` becomes
the first real consumer, written against this mechanism instead of the
Phase 9b stub dispatcher in `lib/erlang/vm/dispatcher.sx`.
459
plans/sx-vm-opcode-extension.md
Normal file
@@ -0,0 +1,459 @@
# SX VM Opcode Extension Mechanism

Mechanism in `hosts/ocaml/lib/` that lets language ports register specialized
bytecode opcodes without modifying the SX VM core. Direct prerequisite for
**erlang-on-sx Phase 9** (the BEAM analog) and a structural enabler for any
future language port that wants performance-critical opcodes.

Reference: `plans/erlang-on-sx.md` Phase 9, `plans/fed-sx-design.md` §17.5,
`hosts/ocaml/lib/sx_vm.ml` (current VM).

Status: **in progress** on `loops/sx-vm-extensions`.

---

## Goal

Allow language ports to register custom bytecode opcodes in the SX VM, with:

- **Zero overhead for core opcodes.** Existing opcodes (current ceiling 175,
  see `sx_vm.ml`) must dispatch identically. No regression for any existing
  language port or the core SX runtime.
- **One additional dispatch step for extension opcodes.** Acceptable cost; the
  win comes from avoiding the general CEK machinery.
- **Per-extension state slot.** Erlang's process scheduler, Haskell's thunk
  cache, etc. need somewhere to hang state alongside the VM.
- **Compiler awareness.** The bytecode compiler (`lib/compiler.sx`) must be
  able to emit extension opcodes by name, looked up against the registered
  set.
- **JIT compatibility.** Existing JIT (lazy lambda compilation) continues to
  work for code paths using only core opcodes. Extension opcodes are
  interpreted in v1; JITing them is a follow-up.

## Non-goals

- **Hot opcode reload.** Adding/replacing opcodes mid-runtime is not in
  scope. Extensions are compile-time additions to the OCaml binary. (If
  needed, that's a separate project.)
- **Per-instance opcode sets.** All running instances of the SX VM share
  the same opcode set determined at build time. Selective opcode loading
  per instance is out of scope.
- **Opcode hot-swap or supersession.** Once registered, opcodes are stable
  for the lifetime of the binary.
- **Language-port isolation at the dispatch layer.** Two language ports can
  see each other's opcodes (they share the dispatch table). Isolation is a
  build-time concern — don't compile in extensions you don't trust.

---

## Why now

The Erlang-on-SX Phase 9 work needs this. Without it, Phase 9b-9g (the actual
opcode implementations) have nowhere to plug in. The Erlang loop hit this
dependency as a Blocker (`0abf05ed`); this design is what unblocks it.

It also enables the **shared opcode pattern** discussed in
`plans/fed-sx-design.md` §17.5: opcodes Erlang Phase 9 produces that other
ports could plausibly use (pattern match, perform/handle, record access) get
chiselled out to `lib/guest/vm/` when a second port has an actual second use.
Without the extension mechanism, each port would have to fork the SX VM core
or modify shared dispatch — neither acceptable.

---

## Architectural overview

```
┌────────────────────────────────────────────┐
│  SX VM core (hosts/ocaml/lib/sx_vm.ml)     │
│                                            │
│  ┌──────────────────────────────────────┐  │
│  │  Bytecode dispatch loop              │  │
│  │                                      │  │
│  │  match op with                       │  │
│  │  | 1 (OP_CONST) -> ...               │  │
│  │  | 2 (OP_NIL)   -> ...               │  │
│  │  | ...                               │  │
│  │  | 175 -> ...   (last core opcode)   │  │
│  │  | op when op >= 200 ->              │  │
│  │      !extension_dispatch_ref op      │  │ ◄── new
│  │        vm frame                      │  │
│  └──────────────────────────────────────┘  │
│                                            │
│  ┌──────────────────────────────────────┐  │
│  │  Extension registry                  │  │
│  │    opcode_id   -> handler            │  │ ◄── Phase B
│  │    opcode_name -> opcode_id          │  │
│  │    extension_state per extension     │  │
│  └──────────────────────────────────────┘  │
└────────────────────────────────────────────┘
                      ▲
                      │ register at startup
┌─────────────────────┴──────────────────────┐
│  Extension modules                         │
│  hosts/ocaml/lib/extensions/erlang.ml      │
│  hosts/ocaml/lib/extensions/haskell.ml     │
│  hosts/ocaml/lib/extensions/datalog.ml     │
│  hosts/ocaml/lib/extensions/guest_vm.ml    │ ◄── shared opcodes
└────────────────────────────────────────────┘
```

### Opcode ID space partition

Current SX VM uses opcode IDs from 1 to 175 (per inspection of `sx_vm.ml`,
ceiling at OP_DEC = 175). We partition the 0-255 space:

| Range   | Use                                                               |
|---------|-------------------------------------------------------------------|
| 0       | reserved / NOP                                                    |
| 1-199   | **core opcodes** — owned by the SX VM, locked schema              |
| 200-247 | **extension opcodes** — registered by extensions (ports + shared) |
| 248-255 | reserved for future expansion / multi-byte opcodes                |

This gives the core 24 free slots above the current 175 ceiling for future
core additions, and 48 slots for extensions. Erlang Phase 9 expects to need
fewer than 30 specialized opcodes, so this is comfortable headroom.

The plan originally proposed a finer split (`128-199` for `lib/guest/vm/`
shared, `200-247` for ports). That distinction is preserved at the **naming
level** (`guest_vm.OP_X` vs `erlang.OP_Y`) and policed by the registry
(duplicate IDs fail at startup), without consuming separate ID ranges. The
chiselling discipline (move an opcode to `guest_vm` when a second port uses
it) operates at the source level.

If we need more than 256 opcodes total, multi-byte opcodes (a leading 248-255
byte plus a second byte) extend the space without breaking the schema.

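The partition and its headroom arithmetic can be checked with a few lines of OCaml. This is a sketch for illustration; `opcode_class` and `classify` are hypothetical names, not part of `sx_vm.ml`.

```ocaml
(* Hypothetical classifier mirroring the partition table above. *)
type opcode_class =
  | Reserved_nop      (* 0 *)
  | Core              (* 1-199 *)
  | Extension         (* 200-247 *)
  | Reserved_future   (* 248-255 *)
  | Out_of_range      (* does not fit in one byte *)

let classify (op : int) : opcode_class =
  if op < 0 || op > 255 then Out_of_range
  else if op = 0 then Reserved_nop
  else if op <= 199 then Core
  else if op <= 247 then Extension
  else Reserved_future

(* Headroom figures quoted in the text. *)
let current_core_ceiling = 175
let free_core_slots = 199 - current_core_ceiling   (* slots 176..199 *)
let extension_slots = 247 - 200 + 1                (* slots 200..247 *)
```

Under these definitions `free_core_slots` is 24 and `extension_slots` is 48, matching the numbers stated above.
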
### Extension module signature

```ocaml
(* hosts/ocaml/lib/sx_vm_extension.ml *)

(** A handler for an extension opcode. Reads operands from bytecode,
    manipulates the VM stack, updates the frame's instruction pointer.
    May raise exceptions (which propagate via the existing VM error path). *)
type handler = Sx_vm.vm -> Sx_vm.frame -> unit  (* vm/frame live in sx_vm.ml *)

(** State an extension carries alongside the VM. Opaque to the VM core;
    extensions cast as needed. *)
type extension_state = ..

module type EXTENSION = sig
  (** Stable name for this extension (e.g. "erlang", "guest_vm"). *)
  val name : string

  (** Initialize per-instance state. Called once when the VM starts and the
      extension is loaded. *)
  val init : unit -> extension_state

  (** Opcodes this extension provides. Each is (opcode_id, opcode_name, handler).
      opcode_id must be in 200-247. Conflicts cause startup failure. *)
  val opcodes : extension_state -> (int * string * handler) list
end
```

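To make the signature concrete, here is a minimal sketch of a module that satisfies it. Everything specific is an assumption for illustration: the `vm` and `frame` records are toy stand-ins for the real types in `sx_vm.ml`, and `Counter_ext`, `Counter_state` and `counter.OP_COUNT` are hypothetical names. It relies on the fact that `opcodes` receives the state, so handlers can close over it directly.

```ocaml
(* Toy stand-ins for the real Sx_vm types; illustration only. *)
type vm = { mutable stack : int list }
type frame = { mutable ip : int }

type handler = vm -> frame -> unit
type extension_state = ..

module type EXTENSION = sig
  val name : string
  val init : unit -> extension_state
  val opcodes : extension_state -> (int * string * handler) list
end

(* Hypothetical extension: one opcode that pushes how often it has run. *)
type extension_state += Counter_state of int ref

module Counter_ext : EXTENSION = struct
  let name = "counter"

  let init () = Counter_state (ref 0)

  let opcodes st =
    (* Recover the typed state once; the handler closes over it. *)
    let hits =
      match st with
      | Counter_state r -> r
      | _ -> invalid_arg "counter: unexpected state"
    in
    let op_count vm frame =
      incr hits;
      vm.stack <- !hits :: vm.stack;
      frame.ip <- frame.ip + 1
    in
    [ (200, "counter.OP_COUNT", op_count) ]
end
```

Because the handler captures `hits` when `opcodes` is called, this style needs no registry lookup or cast at dispatch time.
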
### Registration and dispatch

```ocaml
(* hosts/ocaml/lib/sx_vm_extensions.ml *)

open Sx_vm_extension  (* handler, extension_state, EXTENSION *)

let extensions : (module EXTENSION) list ref = ref []
let states : (string, extension_state) Hashtbl.t = Hashtbl.create 8
let by_id : (int, handler) Hashtbl.t = Hashtbl.create 64
let by_name : (string, int) Hashtbl.t = Hashtbl.create 64

let register (m : (module EXTENSION)) =
  let module M = (val m) in
  let st = M.init () in
  Hashtbl.add states M.name st;
  List.iter (fun (id, name, h) ->
    (* Enforce the 200-247 range promised by the EXTENSION signature. *)
    if id < 200 || id > 247 then
      failwith (Printf.sprintf "Opcode %d (%s) outside 200-247" id name);
    if Hashtbl.mem by_id id then
      failwith (Printf.sprintf "Opcode %d (%s) already registered" id name);
    Hashtbl.add by_id id h;
    Hashtbl.add by_name name id
  ) (M.opcodes st);
  extensions := m :: !extensions

let dispatch op vm frame =
  match Hashtbl.find_opt by_id op with
  | Some handler -> handler vm frame
  | None -> raise (Sx_vm.Invalid_opcode op)

let id_of_name name = Hashtbl.find_opt by_name name
let state_of_extension name = Hashtbl.find_opt states name

(* Replace Phase A's stub with the real dispatcher at module init. *)
let () = Sx_vm.extension_dispatch_ref := dispatch
```

Phase B installs this dispatcher into `Sx_vm.extension_dispatch_ref` at
module init. Until then, the ref's default raises `Invalid_opcode op` for
any opcode ≥ 200, which is the Phase A test condition.

The dispatch path adds **one hashtable lookup per extension opcode**.
Acceptable cost — and Erlang's specialized opcodes win >100× over going
through the general CEK machine, so the overhead is negligible by comparison.

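The forward-reference wiring between the two modules can be exercised in isolation. The following is a self-contained sketch under stated assumptions: the `vm` and `frame` records and the single core opcode are toy stand-ins, and both "sides" live in one file here, whereas the plan puts them in `sx_vm.ml` and `sx_vm_extensions.ml`.

```ocaml
(* Toy stand-ins; the real vm/frame types live in sx_vm.ml. *)
type vm = { mutable stack : int list }
type frame = { mutable ip : int }

exception Invalid_opcode of int

(* "Sx_vm" side: the default stub rejects every extension opcode. *)
let extension_dispatch_ref : (int -> vm -> frame -> unit) ref =
  ref (fun op _vm _frame -> raise (Invalid_opcode op))

(* Core dispatch with one made-up core opcode (1 = push the integer 1). *)
let step op vm frame =
  match op with
  | 1 ->
    vm.stack <- 1 :: vm.stack;
    frame.ip <- frame.ip + 1
  | op when op >= 200 -> !extension_dispatch_ref op vm frame
  | op -> raise (Invalid_opcode op)

(* "Sx_vm_extensions" side: table lookup, installed through the ref. *)
let by_id : (int, vm -> frame -> unit) Hashtbl.t = Hashtbl.create 64

let dispatch op vm frame =
  match Hashtbl.find_opt by_id op with
  | Some handler -> handler vm frame
  | None -> raise (Invalid_opcode op)

let install () = extension_dispatch_ref := dispatch
```

Before `install ()` runs, `step 200` raises `Invalid_opcode 200`, which is the Phase A test condition. Afterwards, the same call reaches whatever handler is stored in `by_id`.
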
### Bytecode compiler integration

The compiler (`lib/compiler.sx`) needs to know extension opcode IDs to emit
them. New SX primitive exposed to the compiler:

```sx
(extension-opcode-id "erlang.OP_PATTERN_TUPLE_2")  ; → 200, or nil if not loaded
```

When the compiler wants to emit a specialized opcode, it queries by name. If
the extension isn't loaded, the compiler falls back to the general path
(emit a `CALL_PRIM` or general SX `case`). This means a language port's
optimization is opt-in per build, and missing extensions degrade to slower
correct execution rather than failure.

Naming convention: `<extension-name>.OP_<NAME>`. So `erlang.OP_PATTERN_TUPLE_2`,
`guest_vm.OP_PERFORM`, etc.

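The query-then-fall-back decision can be modelled in a few lines. The real compiler is SX code in `lib/compiler.sx`; this OCaml version is only a model of the logic, and `emitted`, `emit`, `qualified` and the fallback primitive name are hypothetical.

```ocaml
(* Hypothetical model of what the compiler ends up emitting. *)
type emitted =
  | Ext_op of int            (* specialized extension opcode, by ID *)
  | General_call of string   (* general path, e.g. a CALL_PRIM target *)

(* Builds a name following the <extension-name>.OP_<NAME> convention. *)
let qualified ~(ext : string) ~(op : string) : string = ext ^ ".OP_" ^ op

(* [lookup] stands in for the extension-opcode-id primitive. *)
let emit ~(lookup : string -> int option) ~(op_name : string)
    ~(fallback_prim : string) : emitted =
  match lookup op_name with
  | Some id -> Ext_op id
  | None -> General_call fallback_prim
```

With the extension compiled in, `emit` yields `Ext_op 200` for the example name; without it, the same call degrades to the `General_call` branch instead of failing.
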
### Per-extension state access

Some opcodes need state beyond the VM stack (Erlang's scheduler, mailbox
state, etc.). Extensions store state in their `init`-returned value, accessed
via `state_of_extension`:

```ocaml
let op_spawn vm frame =
  let st = Sx_vm_extensions.state_of_extension "erlang"
           |> Option.get
           |> Obj.magic in  (* extension casts to its known type *)
  let body = pop vm in
  let pid = Erlang_scheduler.spawn st body in
  push vm (pid_value pid);
  frame.ip <- frame.ip + 1
```

Shared scheduler state lives in the Erlang extension's state value. Other
extensions don't see it.

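Since `extension_state` is declared as an extensible variant, one possible alternative to the cast is to pattern-match on the extension's own constructor. The sketch below is a suggestion, not the plan's method; `scheduler`, `Erlang_state`, `erlang_scheduler` and `spawn_pid` are hypothetical names, and the scheduler is reduced to a PID counter.

```ocaml
type extension_state = ..

(* Registry side, same shape as the registry sketch. *)
let states : (string, extension_state) Hashtbl.t = Hashtbl.create 8
let state_of_extension name = Hashtbl.find_opt states name

(* Hypothetical Erlang-side state: a counter standing in for the scheduler. *)
type scheduler = { mutable next_pid : int }
type extension_state += Erlang_state of scheduler

(* Typed accessor: no Obj.magic, and a clear error instead of a bad cast. *)
let erlang_scheduler () : scheduler =
  match state_of_extension "erlang" with
  | Some (Erlang_state s) -> s
  | Some _ -> failwith "erlang: state registered with unexpected constructor"
  | None -> failwith "erlang: extension not registered"

let spawn_pid () : int =
  let s = erlang_scheduler () in
  let pid = s.next_pid in
  s.next_pid <- pid + 1;
  pid
```

The trade-off is one extra match per access; in exchange, a mismatched or missing state fails with a message rather than undefined behaviour.
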
---

## Phase plan

Five sub-phases in dependency order. Each is testable in isolation.

### Phase A — Opcode ID partition + dispatch fallthrough

- [ ] Define `exception Invalid_opcode of int` in `sx_vm.ml`.
- [ ] Add `extension_dispatch_ref : (int -> vm -> frame -> unit) ref`
  whose default handler raises `Invalid_opcode op`. Forward-declared in
  the same style as the existing `jit_compile_ref`.
- [ ] Add `| op when op >= 200 -> !extension_dispatch_ref op vm frame` arm
  in the dispatch loop, immediately before the catch-all.
- [ ] Document the partition in a comment near the top of the opcode list.

**Tests:**
- All existing OCaml VM/CEK tests pass unchanged (zero regression for core).
- Constructed bytecode using opcode 200 raises `Invalid_opcode 200` when no
  extension is registered.

**Effort:** small. ~50 lines + tests.

### Phase B — Extension registry module

`hosts/ocaml/lib/sx_vm_extensions.ml` per the sketch above. Pure plumbing, no
opcodes yet. Phase B's module init installs the real `dispatch` into
`Sx_vm.extension_dispatch_ref`, replacing Phase A's stub.

- [ ] `Sx_vm_extension` interface module (handler type, EXTENSION sig).
- [ ] `Sx_vm_extensions` registry module (`register`, `dispatch`,
  `id_of_name`, `state_of_extension`).
- [ ] Wire the registry's `dispatch` into `Sx_vm.extension_dispatch_ref` at
  module init.

**Tests:**
- Register a test extension with one opcode; dispatch finds it.
- Duplicate opcode-id registration fails at startup.
- `id_of_name` and `state_of_extension` lookups work.

**Effort:** small. ~150 lines + tests.

### Phase C — Compiler-side opcode lookup primitive

Expose `extension-opcode-id` as an SX primitive in `hosts/ocaml/lib/`. The
compiler in `lib/compiler.sx` can call it to emit extension opcodes by name.

Does not require any extension to actually exist — the primitive returns
`nil` for unknown names, and the compiler falls back.

- [ ] Register `extension-opcode-id` in `sx_primitives.ml`.
- [ ] Returns `Integer id` when registered, `Nil` otherwise.

**Tests:**
- Primitive returns nil for unknown name.
- After registering a test extension, primitive returns the registered ID.

**Effort:** small. Single primitive registration + compiler-side use docs.

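On the host side, the primitive is essentially a thin wrapper over `id_of_name`. The following sketch assumes a toy `value` type: `Integer` and `Nil` follow the constructor names used in the checklist, while `String`, the argument-list calling convention and the function name `extension_opcode_id` are assumptions, since the real primitive interface in `sx_primitives.ml` is not shown here.

```ocaml
(* Toy value type; only Integer and Nil are named by the plan. *)
type value = Integer of int | String of string | Nil

(* Registry lookup, same shape as the registry sketch. *)
let by_name : (string, int) Hashtbl.t = Hashtbl.create 64
let id_of_name name = Hashtbl.find_opt by_name name

(* Hypothetical body of the extension-opcode-id primitive. *)
let extension_opcode_id (args : value list) : value =
  match args with
  | [ String name ] ->
    (match id_of_name name with
     | Some id -> Integer id
     | None -> Nil)   (* unknown name: the compiler falls back *)
  | _ -> invalid_arg "extension-opcode-id: expected one string argument"
```

Both Phase C tests map directly onto this shape: an empty registry gives `Nil`, and a populated one gives `Integer id`.
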
### Phase D — Test extension demonstrating end-to-end flow

A dummy extension at `hosts/ocaml/lib/extensions/test_ext.ml` registering
one or two trivial opcodes (e.g. `OP_TEST_PUSH_42`, `OP_TEST_DOUBLE_TOS`).
Wired into the build, available when running tests.

Compiler test: write SX that triggers the test compiler-extension to emit
`OP_TEST_PUSH_42`, then verify the VM executes it correctly via
`bytecode-inspect` and `vm-trace`.

- [ ] `test_ext.ml` registers two opcodes.
- [ ] Wired into the build (extensions registered at startup).
- [ ] Bytecode emission via name lookup produces the right ID.
- [ ] `bytecode-inspect` shows the opcode by name.

**Tests:**
- Bytecode emission via name lookup produces the right ID.
- Execution produces the expected stack effect.
- `bytecode-inspect` shows the opcode by name.
- `vm-trace` correctly reports the extension opcode.

**Effort:** small. ~100 lines including build wiring.

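The expected stack effects of the two test opcodes can be sketched against a toy VM. This is not `test_ext.ml` itself: the `vm` and `frame` records, the IDs 200 and 201, the `test_ext.` name prefix and the `run` loop are all assumptions made so the example is self-contained.

```ocaml
(* Toy VM: an integer stack and a frame over an opcode array. *)
type vm = { mutable stack : int list }
type frame = { code : int array; mutable ip : int }

exception Invalid_opcode of int

let op_test_push_42 vm frame =
  vm.stack <- 42 :: vm.stack;
  frame.ip <- frame.ip + 1

let op_test_double_tos vm frame =
  (match vm.stack with
   | top :: rest -> vm.stack <- (2 * top) :: rest
   | [] -> failwith "OP_TEST_DOUBLE_TOS: empty stack");
  frame.ip <- frame.ip + 1

(* (opcode_id, opcode_name, handler), as in the EXTENSION signature. *)
let opcodes = [
  (200, "test_ext.OP_TEST_PUSH_42", op_test_push_42);
  (201, "test_ext.OP_TEST_DOUBLE_TOS", op_test_double_tos);
]

(* Minimal interpreter loop: every opcode goes through the table. *)
let run (code : int array) : int list =
  let vm = { stack = [] } and frame = { code; ip = 0 } in
  while frame.ip < Array.length frame.code do
    let op = frame.code.(frame.ip) in
    match List.find_opt (fun (id, _, _) -> id = op) opcodes with
    | Some (_, _, handler) -> handler vm frame
    | None -> raise (Invalid_opcode op)
  done;
  vm.stack
```

Running `[| 200; 201 |]` pushes 42 and doubles it, leaving 84 on top of the stack, which is the kind of stack-effect assertion the Phase D tests call for.
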
### Phase E — JIT awareness (interpreted-only for v1)

The JIT (lazy lambda compilation) currently compiles based on opcode ranges.
Extension opcodes (≥200) should fall through to interpretation, not be
JIT-compiled in v1.

- [ ] Mark extension opcodes as "interpret only" in the JIT pre-analysis.
- [ ] Lambda containing only core opcodes JIT-compiles as before.
- [ ] Lambda containing any extension opcode runs interpreted.

JITing extension opcodes is a follow-up project; v1 keeps the JIT scope
unchanged and just makes it correctly route mixed bytecode.

**Tests:**
- Lambda with only core opcodes: JIT-compiled, fast path.
- Lambda with extension opcode: interpreted, correct result.
- Mixed lambda: interpreted, correct result.

**Effort:** small-medium. Requires understanding the JIT's pre-analysis
(per `project_jit_compilation.md` memory: "Lazy JIT implemented: lambda
bodies compiled on first VM call, cached, failures sentinel-marked").
Extension-opcode detection becomes another reason to mark a lambda
"interpret-only."

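The routing rule itself is small. The sketch below assumes the pre-analysis already has a decoded list of opcodes for the lambda body (operand bytes excluded, since an operand could itself be ≥ 200); `route`, `is_interpret_only` and `route_lambda` are hypothetical names, and the real JIT hook is not shown in this plan.

```ocaml
let extension_threshold = 200

(* Assumes [opcodes] is the decoded opcode stream without operand bytes. *)
let is_interpret_only (opcodes : int list) : bool =
  List.exists (fun op -> op >= extension_threshold) opcodes

type route = Jit | Interpret

(* One extension opcode anywhere in the body is enough to skip the JIT. *)
let route_lambda (opcodes : int list) : route =
  if is_interpret_only opcodes then Interpret else Jit
```

The three Phase E test cases (core-only, extension-only, mixed) correspond to the three possible inputs of `route_lambda`.
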
---

## Acceptance criteria

1. **Phases A-E pass their test suites.**
2. **Zero regression on existing SX VM tests.** All language-port test
   suites currently passing on the architecture branch (Erlang 530+, Haskell
   285+, Datalog 276+, Smalltalk 625+, the SX core test suite, etc.) still
   pass.
3. **Test extension demonstrates the flow end-to-end.** SX source compiles
   via the compiler with a registered extension opcode, executes through the
   VM via the dispatch fallthrough, returns correct result.
4. **Documentation:** README in `hosts/ocaml/lib/extensions/` explaining the
   pattern, with a worked example (the test extension is the canonical one).

After acceptance, the Erlang-on-SX Phase 9 work in `lib/erlang/vm/` can use
this mechanism. The Erlang loop's Blocker for 9a is resolved.

---

## Risk and mitigation

**Risk: regression in core opcode dispatch.** A misplaced `match` arm could
break something. *Mitigation:* run every existing language-port conformance
suite before merging.

**Risk: opcode ID conflicts as more extensions land.** If Erlang Phase 9
claims IDs 200-220 and Haskell wants 215-235, we have a problem.
*Mitigation:* maintain a registry document at
`hosts/ocaml/lib/extensions/README.md` listing claimed ID ranges per
extension. Convention: each extension claims a contiguous block at first
registration; collisions caught at startup with a clear error.

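The contiguous-block convention lends itself to a simple overlap check, for example in a unit test over the claimed ranges. This is a sketch only; `claim`, `overlaps`, `in_extension_range` and `first_conflict` are hypothetical names, and the plan does not say such a check exists.

```ocaml
(* A claimed contiguous block of opcode IDs, inclusive on both ends. *)
type claim = { ext : string; first : int; last : int }

(* Two inclusive ranges overlap when each starts before the other ends. *)
let overlaps a b = a.first <= b.last && b.first <= a.last

(* A claim must be well-formed and sit inside the 200-247 range. *)
let in_extension_range c = c.first <= c.last && c.first >= 200 && c.last <= 247

(* Returns the first pair of conflicting claims, if any. *)
let first_conflict (claims : claim list) : (claim * claim) option =
  let rec go = function
    | [] -> None
    | c :: rest ->
      (match List.find_opt (overlaps c) rest with
       | Some d -> Some (c, d)
       | None -> go rest)
  in
  go claims
```

With the example from the text, Erlang at 200-220 and Haskell at 215-235 are reported as a conflict, while moving Haskell to 221-235 clears it.
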
**Risk: extension state types leak through `Obj.magic`.** The extension state
is type-erased in the registry. *Mitigation:* extensions cast in their own
opcode handlers, never expose state to other extensions or the VM core.
First-class modules / GADTs could add more type safety; deferred unless
this becomes a concrete pain point.

**Risk: extensions become a back door for kernel mutation.** An extension
opcode handler has full access to the VM. *Mitigation:* extensions are
build-time additions, not runtime; they're as trusted as the rest of the
binary. Operators audit at build time, not runtime. Same trust model as
any other compiled-in code.

**Risk: shared `lib/guest/vm/` opcodes evolve under different language
ports' needs.** *Mitigation:* the chiselling discipline (move to guest only
on second use) ensures the shared opcodes are tested against at least two
ports' actual usage before being considered stable.

---

## Open questions

To be resolved during implementation, not blocking design approval:

1. **Multi-byte opcode encoding.** If we need >256 opcodes total, the
   leading-byte 248-255 schema accommodates it. Do we need multi-byte at
   v1? Probably not — 48 extension opcodes is more than any single port
   should reasonably want.
2. **Does extension ordering matter?** If two extensions register opcodes
   that read the same VM state, ordering of registration could matter for
   initialization. Probably not in practice; flag if it bites.
3. **Hot-reload of extensions.** Out of scope for v1 (per non-goals). If
   wanted later, the registry would need teardown + re-registration; the
   `gen_server` `code_change/3` model from Erlang Phase 7 is a precedent.
4. **Cross-extension opcode composition.** Can `guest_vm.OP_PERFORM` invoke
   `erlang.OP_RECEIVE_SCAN`? In principle yes — handlers can do anything.
   The interface is clean; the question is whether we want any conventions
   to keep ergonomics tractable. Defer until composition appears in
   practice.

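For the first question, one way the leading-byte scheme could decode is sketched below. The encoding is entirely an assumption: the plan only says "a leading 248-255 byte plus a second byte", so the page layout, the numbering from 256 upward, and the names `decoded` and `decode` are hypothetical.

```ocaml
(* Result of decoding one opcode: its logical ID and how many bytes it used. *)
type decoded = { opcode : int; width : int }

(* Assumed encoding: lead bytes 248..255 select one of 8 pages of 256
   opcodes; the second byte is the index within that page. Logical IDs
   for multi-byte opcodes start at 256 so they never collide with
   single-byte ones. *)
let decode (code : int array) (ip : int) : decoded =
  let b0 = code.(ip) in
  if b0 < 248 then { opcode = b0; width = 1 }
  else
    let page = b0 - 248 in
    let b1 = code.(ip + 1) in
    { opcode = 256 + (page * 256) + b1; width = 2 }
```

Under this assumed layout the eight lead bytes would add 8 × 256 = 2048 opcode IDs, and existing single-byte bytecode would decode unchanged.
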
---

## Implementation roadmap and sequencing

This is a sister workstream to `loops/erlang`. Driven by Erlang Phase 9.
Single bounded loop on `loops/sx-vm-extensions`, ~1-2 weeks.

Recommended sequencing (one phase per loop fire):

1. **Phase A** — dispatch fallthrough. Smallest viable change to `sx_vm.ml`.
2. **Phase B** — extension registry module.
3. **Phase C** — compiler-side opcode lookup primitive.
4. **Phase D** — test extension demonstrating end-to-end flow.
5. **Phase E** — JIT awareness (interpret-only routing).

After acceptance:

- **`hosts/ocaml/lib/extensions/erlang.ml`** becomes the *first real
  consumer* — written by whoever takes over from the Erlang loop's stub
  dispatcher in `lib/erlang/vm/dispatcher.sx`. That's the integration
  moment that closes the loop.

Estimated total effort: 1-2 weeks for one focused engineer with OCaml SX VM
familiarity.

---

## Relationship to other plans

- **`plans/erlang-on-sx.md` Phase 9:** unblocked by this work. Erlang loop
  develops opcodes against a stub dispatcher in `lib/erlang/vm/`; once this
  mechanism lands, swap stub for real registration via
  `hosts/ocaml/lib/extensions/erlang.ml`.
- **`plans/fed-sx-design.md` §17.5:** documents this as Layer-1 prerequisite.
  The shared-opcode discipline (lib/guest/vm/) is designed on top of this
  mechanism's namespace allocation.
- **Future language ports (Haskell, Datalog, Smalltalk perf phases):** will
  use the same mechanism. Each adds an extension module, claims an opcode
  range, registers handlers. The `lib/guest/vm/` opcodes get
  cross-referenced when the second port's needs justify chiselling.
- **JIT roadmap (per `project_jit_architecture.md` memory):** extension
  opcodes are interpreted in v1. JITing them is a logical follow-up but
  a separate project.

---

## Progress log

Newest first.