Compare commits

...

68 Commits

Author SHA1 Message Date
4fc73a97f4 go: lex.sx — keywords, ident/int/string/rune lits, comments, ops, ASI + 78 tests [consumes-lex]
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 23s
First Go-on-SX iteration. Tokenizer consumes lib/guest/lex.sx character-class
predicates. Automatic semicolon insertion per Go spec § Semicolons fires on
newline, EOF, and block comments containing a newline, after
ident/int/string/rune/{break,continue,fallthrough,return}/{++,--,),],}}.

Scoreboard + conformance.sh wired; lex 78/78. Plan Phase 1 sub-items
checked; floats/raw-strings/hex-ints still .

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 21:13:06 +00:00
0f7444e0d5 plans: Go-on-SX + sister lib/guest extraction plans (scheduler, bidirectional types)
- go-on-sx.md: rewrite of 2026-04-26 draft to integrate lib/guest framework.
  Adds Phase 3 (independent bidirectional type checker — first static-typed
  guest), Phase 10 (extraction enabler), chisel discipline, conformance
  scoreboard model. Phases 1-2 now consume lib/guest/core lex+pratt+ast.

- lib-guest-scheduler.md: NEW. Extraction plan for the fork/yield/block/
  resume scheduler shared by Erlang (addressed processes + mailboxes) and
  Go (anonymous channels + goroutines). Two-language rule blocks extraction
  until both consumers independently work; rejected-extraction is a valid
  outcome.

- lib-guest-static-types-bidirectional.md: NEW. Sister to lib/guest/hm.sx.
  Bidirectional checker kit (synth/check judgments, pluggable subtype +
  unify) for the languages HM doesn't fit — Go, Rust, TS, Swift, Kotlin,
  Scala 3, Hack. First consumer: Go-on-SX. Second TBD; recommendation
  TypeScript.

The three plans cross-reference each other. Go-on-SX implements scheduler +
checker independently of the kits; extraction is its own workstream once
two consumers exist.
2026-05-26 20:54:22 +00:00
abde5fbac1 Merge loops/erlang into architecture: Phase 8 host-primitive BIFs (crypto/cid/file:list_dir)
Wires the 3 previously-BLOCKED Phase 8 FFI BIFs against loops/fed-prims
primitives (merged at 380bc69f):

- crypto:hash/2 → crypto-sha256/sha512/sha3-256 (atom dispatch, raw-binary
  return via er-hex->bytes), +6 ffi tests
- cid:from_bytes/1 → CIDv1 raw-codec (0x55) + sha2-256 multihash assembled
  in SX; cid:to_string/1 → cid-from-sx of canonical er-format-value string,
  +7 ffi tests
- file:list_dir/1 → file-list-dir, {ok,[Binary]} / {error,Reason} reusing
  er-classify-file-error, +4 ffi tests

ffi suite 14 → 28 (3 BLOCKED negative-asserts flipped to functional tests).
httpc:request and sqlite:* remain BLOCKED — need HTTP-client and SQLite
host primitives which loops/fed-prims didn't deliver.

Full conformance 729/729 (eval 385, vm 78, ffi 28, all process suites).
2026-05-26 19:30:35 +00:00
b7fcd17e6e Merge remote-tracking branch 'origin/loops/erlang' into loops/erlang
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 3m3s
2026-05-18 22:03:43 +00:00
89ce7b857d erlang: wire file:list_dir/1 against file-list-dir (Phase 8, +4 ffi tests); 729/729, progress log 2026-05-18 22:01:03 +00:00
4591ac530b erlang: wire cid:from_bytes/1 + cid:to_string/1 against cid-from-bytes/cid-from-sx (Phase 8, +7 ffi tests) 2026-05-18 22:00:41 +00:00
250d0511c0 erlang: wire crypto:hash/2 against crypto-sha256/512/sha3-256 (Phase 8, +6 ffi tests) 2026-05-18 22:00:17 +00:00
380bc69f94 Merge loops/fed-prims into architecture: fed-sx host primitives (Phases A-I)
Pure-OCaml WASM-safe crypto/CID surface + native HTTP server:
- crypto-sha256/sha512 (FIPS 180-4), crypto-sha3-256 (FIPS 202)
- cbor-encode/decode (deterministic dag-cbor), cid-from-bytes/from-sx (CIDv1)
- ed25519-verify (RFC 8032), rsa-sha256-verify (PKCS#1 v1.5, RFC 8017)
- file-list-dir (native-safe), http-listen (native-only, bin/sx_server.ml)
Unblocks Erlang Phase 8 BIFs (erlang-on-sx.md blocker -> RESOLVED).
Merged: build green, 63 crypto tests pass, WASM boot OK, http test 6/6,
Erlang conformance 715/715, no regression.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 21:33:01 +00:00
77f17cc796 Merge loops/erlang into architecture: Phases 7-10 (hot reload, FFI BIFs, BIF registry, VM opcode extension + erlang_ext); fixes cyclic-env identity hang
# Conflicts:
#	hosts/ocaml/bin/run_tests.ml
#	plans/sx-vm-opcode-extension.md
2026-05-18 20:46:04 +00:00
4548461bfc fed-prims: Phase I — handoff (RESOLVED blocker + primitive->BIF mapping)
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 2m50s
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 18:48:35 +00:00
7d9dddcc80 fed-prims: Phase H — native-only http-listen HTTP/1.1 server + curl test
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 2m53s
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 18:25:24 +00:00
36be6bf44b fed-prims: Phase G — file-list-dir (Sys.readdir, sorted, native-safe)
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 2m52s
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 17:57:20 +00:00
c352d94cc6 erlang: log cyclic-env regression root-cause + fix in progress log 2026-05-18 17:34:24 +00:00
857fae1331 erlang: fix er-env-derived-from? to use identical? not = (cyclic-env hang on structural-= evaluators) 2026-05-18 17:33:48 +00:00
f8fc04840a fed-prims: Phase F — RSA-SHA256 PKCS#1 v1.5 verify, pure OCaml, RSA-2048 vector
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 3m9s
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 17:32:35 +00:00
76d1e9f53a fed-prims: Phase E — Ed25519 verify (RFC 8032), pure-OCaml bignum + edwards25519
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 3m2s
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 17:05:59 +00:00
d8b57784fe fed-prims: Phase D — CIDv1 (multihash + base32 multibase), pure OCaml, canonical IPFS vectors
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 3m2s
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 16:36:42 +00:00
bcaaa11916 fed-prims: Phase C — dag-cbor encode/decode, pure OCaml, RFC 8949 vectors + determinism
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 3m8s
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 16:10:36 +00:00
451bd4be62 fed-prims: Phase B — SHA3-256 (Keccak-f[1600]), pure OCaml, 4 NIST vectors
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 2m41s
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 15:43:51 +00:00
19932a42a9 fed-prims: Phase A — SHA-256 + SHA-512, pure OCaml, 7 NIST vectors
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 3m33s
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 15:17:35 +00:00
3629dd96a9 fed-prims: bootstrap plan + loop briefing
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 3m53s
Pure-OCaml crypto/CBOR/CID/Ed25519/RSA + native HTTP server in
hosts/ocaml/, the host-primitive surface Erlang Phase 8 BIFs and
fed-sx Milestone 1 are blocked on. WASM-safe lib boundary enforced.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 15:00:33 +00:00
a341041627 datalog: scoreboard bump (preserve before loops/erlang merge) 2026-05-18 14:48:00 +00:00
b073a82b33 erlang: Phase 10a — trace JIT/compiler architecture, scope into 10a.1-4, block on lib/compiler.sx 2026-05-15 09:03:50 +00:00
7996bcdacf erlang: 10b BIF-complete (10/18); control opcodes correctly gated on 10a + log 2026-05-15 08:59:11 +00:00
3b6241508c erlang: Phase 10b — ELEMENT + LISTS_REVERSE real (all 10 BIF opcodes done), +6 e2e tests 2026-05-15 08:58:41 +00:00
5774065341 erlang: 10b progress — 8/18 handlers real (hot-BIFs done) + log 2026-05-15 08:51:37 +00:00
708b5a2b12 erlang: Phase 10b — 7 more real hot-BIF handlers (HD/TL/TUPLE_SIZE/IS_*), +9 e2e tests 2026-05-15 08:51:01 +00:00
e6261c2519 erlang: mark 10b in-progress (vertical slice) + progress log 2026-05-15 08:44:29 +00:00
5c7ad01bd1 erlang: Phase 10b slice — real OP_BIF_LENGTH handler, end-to-end VM proof 2026-05-15 08:43:45 +00:00
33725de03b erlang: Phase 9g — ring bench on integrated binary (no regression); scope Phase 10 2026-05-15 08:36:05 +00:00
5fd358a7a7 erlang: Phase 9i — SX dispatcher consults extension-opcode-id (+6 vm tests, 715/715) 2026-05-15 08:30:52 +00:00
783e0cb5fe erlang: tick 9h + progress log 2026-05-15 08:25:32 +00:00
72896392c8 erlang: Phase 9h — erlang_ext.ml OCaml extension (opcodes 222-239, registered at startup) 2026-05-15 08:24:57 +00:00
12b56afcd3 erlang: Phase 9a integrated (cherry-pick + force-link); plan 9h/9i added 2026-05-15 08:11:55 +00:00
509197410f vm-ext: force-link Sx_vm_extensions into sx_server.exe (extension-opcode-id now live) 2026-05-15 08:10:33 +00:00
76614da154 vm-ext: phase E — JIT skips lambdas containing extension opcodes
Adds Sx_vm.bytecode_uses_extension_opcodes — an operand-aware
bytecode scanner that walks past CONST u16, CALL_PRIM u16+u8, and
CLOSURE u16+dynamic upvalue descriptors so operand bytes that happen
to be ≥200 don't false-positive as extension opcodes.

jit_compile_lambda calls the scanner on the inner closure's bytecode.
On hit it returns None — the lambda then runs through CEK
interpretation. The VM's dispatch fallthrough still routes the
extension opcodes themselves through the registry; this change just
prevents the JIT from claiming code it has no plan for.

Tests: 7 new foundation cases — pure core eligible, head/middle/
post-CLOSURE detection, CONST + CALL_PRIM + CLOSURE-descriptor false-
positive avoidance. +7 pass vs Phase D baseline, no regressions
across 11 conformance suites.

Loop complete: acceptance criteria 1-4 met. Hand-off to the Erlang
loop — lib/erlang/vm/dispatcher.sx's Phase 9b stub can now be
replaced with a real hosts/ocaml/lib/extensions/erlang.ml consumer.
2026-05-15 08:06:35 +00:00
4dfccc244d vm-ext: phase D — extensions/ subtree + test_ext + opcode_name lookup
lib/extensions/ becomes the new home for VM extensions, wired in via
(include_subdirs unqualified). README documents the registration
pattern, opcode-ID range conventions (200-209 guest_vm, 210-219
inline test, 220-229 test_ext, 230-247 ports), and naming rules.

extensions/test_ext.ml is the canonical worked example — two
operand-less opcodes (220 push 42, 221 double TOS) carrying a per-
extension state slot (TestExtState invocation counter). Test_ext.register
called from run_tests.ml at the start of the Phase D suite, on top of
the inline test_reg from earlier suites (disjoint opcode IDs).

Sx_vm.opcode_name now consults extension_opcode_name_ref (forward ref
in the same style as extension_dispatch_ref), so disassemble shows
extension opcodes by name instead of UNKNOWN_n. Registry maintains
name_of_id_table and installs the lookup at module init.

Tests: 5 new foundation cases — primitive resolves test_ext name,
end-to-end bytecode (push + double + return → 84), disassemble shows
"test_ext.OP_TEST_PUSH_42" / "test_ext.OP_TEST_DOUBLE_TOS",
unregistered ext opcodes still fall back to UNKNOWN_n, invocation
counter records the two dispatches. +5 pass vs Phase C baseline, no
regressions across 11 conformance suites.
2026-05-15 08:06:35 +00:00
58d7445559 vm-ext: phase C — extension-opcode-id SX primitive
Registers extension-opcode-id from sx_vm_extensions.ml module init.
Lives downstream of both sx_primitives and sx_vm to avoid a build
cycle. Accepts a string or symbol; returns Integer id when the opcode
is registered, Nil otherwise.

Compilers (lib/compiler.sx) call this to emit extension opcodes by
name. Returning Nil rather than failing on unknown names lets a port's
optimization opt in per-build — missing extensions degrade to slower
correct execution.

Tests: 5 new foundation cases — registered lookup, unknown → nil,
symbol arg, zero-arg + integer-arg rejection. +5 pass vs Phase B
baseline, no regressions across 11 conformance suites.
2026-05-15 08:06:35 +00:00
4e0a92ec00 vm-ext: phase B — extension registry module
sx_vm_extension.ml: handler type, extensible extension_state variant,
EXTENSION first-class module signature.

sx_vm_extensions.ml: register / dispatch / id_of_name /
state_of_extension. install_dispatch () runs at module init,
swapping Phase A's stub for the real registry. Rejects out-of-range
opcode IDs (must be 200-247), duplicate IDs, duplicate names, and
duplicate extension names.

Tests: 9 new foundation cases — lookup hits/misses, end-to-end VM
dispatch including opcode composition, all four rejection paths.
+9 pass vs Phase A baseline, no regressions across 11 conformance
suites.
2026-05-15 08:06:35 +00:00
85728621b0 vm-ext: phase A — extension dispatch fallthrough in sx_vm.ml
Adds Invalid_opcode of int exception and extension_dispatch_ref forward
ref (default raises Invalid_opcode op), plus the |op when op >= 200 arm
before the catch-all in the bytecode dispatch loop. Partition comment
documents 1-199 core / 200-247 extensions / 248-255 reserved.

Phase B will install the real registry's dispatch into the ref at module
init, replacing this stub.

Tests: 4 new foundation cases (Invalid_opcode for 200/224/247, Eval_error
for 199 to pin the threshold). +4 pass vs baseline, no regressions.
2026-05-15 08:06:35 +00:00
715fab86d2 Merge loops/sx-vm-extensions into architecture: hosts/ocaml VM opcode extension mechanism
5 phases (A-E) per plans/sx-vm-opcode-extension.md:

- A: Sx_vm dispatch fallthrough for opcodes ≥200 + Invalid_opcode + extension_dispatch_ref
- B: Sx_vm_extension interface + Sx_vm_extensions registry (register / dispatch /
     id_of_name / state_of_extension), installs into the dispatch_ref at module init
- C: extension-opcode-id SX primitive for compiler-side lookup
- D: lib/extensions/ subtree wired via include_subdirs, test_ext.ml as the canonical
     worked example, opcode_name forward-ref so disassemble shows ext opcodes by name
- E: bytecode_uses_extension_opcodes scanner + JIT skip path so lambdas containing
     extension opcodes run interpreted via CEK

26 new foundation tests across 5 suites, all green. Zero regressions across 11
language-port conformance suites (erlang 530, haskell 285, datalog 276, prolog 590,
smalltalk 847, common-lisp 487, apl 562, js 148, forth 632, tcl 3, ocaml-on-sx unit 607).

Hand-off: lib/erlang/vm/dispatcher.sx (Phase 9b stub) can now be replaced with a real
hosts/ocaml/lib/extensions/erlang.ml consumer.
2026-05-15 07:22:29 +00:00
f026177e63 vm-ext: phase E — JIT skips lambdas containing extension opcodes
Adds Sx_vm.bytecode_uses_extension_opcodes — an operand-aware
bytecode scanner that walks past CONST u16, CALL_PRIM u16+u8, and
CLOSURE u16+dynamic upvalue descriptors so operand bytes that happen
to be ≥200 don't false-positive as extension opcodes.

jit_compile_lambda calls the scanner on the inner closure's bytecode.
On hit it returns None — the lambda then runs through CEK
interpretation. The VM's dispatch fallthrough still routes the
extension opcodes themselves through the registry; this change just
prevents the JIT from claiming code it has no plan for.

Tests: 7 new foundation cases — pure core eligible, head/middle/
post-CLOSURE detection, CONST + CALL_PRIM + CLOSURE-descriptor false-
positive avoidance. +7 pass vs Phase D baseline, no regressions
across 11 conformance suites.

Loop complete: acceptance criteria 1-4 met. Hand-off to the Erlang
loop — lib/erlang/vm/dispatcher.sx's Phase 9b stub can now be
replaced with a real hosts/ocaml/lib/extensions/erlang.ml consumer.
2026-05-15 01:53:39 +00:00
f3192f7fda vm-ext: phase D — extensions/ subtree + test_ext + opcode_name lookup
lib/extensions/ becomes the new home for VM extensions, wired in via
(include_subdirs unqualified). README documents the registration
pattern, opcode-ID range conventions (200-209 guest_vm, 210-219
inline test, 220-229 test_ext, 230-247 ports), and naming rules.

extensions/test_ext.ml is the canonical worked example — two
operand-less opcodes (220 push 42, 221 double TOS) carrying a per-
extension state slot (TestExtState invocation counter). Test_ext.register
called from run_tests.ml at the start of the Phase D suite, on top of
the inline test_reg from earlier suites (disjoint opcode IDs).

Sx_vm.opcode_name now consults extension_opcode_name_ref (forward ref
in the same style as extension_dispatch_ref), so disassemble shows
extension opcodes by name instead of UNKNOWN_n. Registry maintains
name_of_id_table and installs the lookup at module init.

Tests: 5 new foundation cases — primitive resolves test_ext name,
end-to-end bytecode (push + double + return → 84), disassemble shows
"test_ext.OP_TEST_PUSH_42" / "test_ext.OP_TEST_DOUBLE_TOS",
unregistered ext opcodes still fall back to UNKNOWN_n, invocation
counter records the two dispatches. +5 pass vs Phase C baseline, no
regressions across 11 conformance suites.
2026-05-15 01:05:30 +00:00
57af0f386f vm-ext: phase C — extension-opcode-id SX primitive
Registers extension-opcode-id from sx_vm_extensions.ml module init.
Lives downstream of both sx_primitives and sx_vm to avoid a build
cycle. Accepts a string or symbol; returns Integer id when the opcode
is registered, Nil otherwise.

Compilers (lib/compiler.sx) call this to emit extension opcodes by
name. Returning Nil rather than failing on unknown names lets a port's
optimization opt in per-build — missing extensions degrade to slower
correct execution.

Tests: 5 new foundation cases — registered lookup, unknown → nil,
symbol arg, zero-arg + integer-arg rejection. +5 pass vs Phase B
baseline, no regressions across 11 conformance suites.
2026-05-15 00:16:03 +00:00
8c33a6f8d5 vm-ext: phase B — extension registry module
sx_vm_extension.ml: handler type, extensible extension_state variant,
EXTENSION first-class module signature.

sx_vm_extensions.ml: register / dispatch / id_of_name /
state_of_extension. install_dispatch () runs at module init,
swapping Phase A's stub for the real registry. Rejects out-of-range
opcode IDs (must be 200-247), duplicate IDs, duplicate names, and
duplicate extension names.

Tests: 9 new foundation cases — lookup hits/misses, end-to-end VM
dispatch including opcode composition, all four rejection paths.
+9 pass vs Phase A baseline, no regressions across 11 conformance
suites.
2026-05-14 23:28:24 +00:00
cf597f1b5f vm-ext: phase A — extension dispatch fallthrough in sx_vm.ml
Adds Invalid_opcode of int exception and extension_dispatch_ref forward
ref (default raises Invalid_opcode op), plus the |op when op >= 200 arm
before the catch-all in the bytecode dispatch loop. Partition comment
documents 1-199 core / 200-247 extensions / 248-255 reserved.

Phase B will install the real registry's dispatch into the ref at module
init, replacing this stub.

Tests: 4 new foundation cases (Invalid_opcode for 200/224/247, Eval_error
for 199 to pin the threshold). +4 pass vs baseline, no regressions.
2026-05-14 22:29:50 +00:00
183bfeebe1 vm-ext: bootstrap loops/sx-vm-extensions plan + loop briefing
plans/sx-vm-opcode-extension.md ports over from loops/erlang (f6a68656)
with the opcode partition adjusted to match real VM usage: 1-199 core
(current ceiling 175 = OP_DEC), 200-247 extensions, 248-255 reserved.

plans/agent-briefings/sx-vm-extensions-loop.md captures the per-fire
workflow and ground rules.
2026-05-14 22:29:15 +00:00
64b7263c5f erlang: Phase 9g — log perf-bench blocker on 9a; conformance half clean at 709/709 2026-05-14 21:28:10 +00:00
e8a5c2e1ba erlang: Phase 9f — hot-BIF opcode table (+18 vm tests) 2026-05-14 21:26:51 +00:00
3efd735283 erlang: Phase 9e — OP_SPAWN / OP_SEND + VM-process registry (+16 vm tests) 2026-05-14 21:20:37 +00:00
10623da0b0 erlang: Phase 9d — OP_RECEIVE_SCAN stub (+10 vm tests) 2026-05-14 21:13:40 +00:00
528b24a1cd erlang: Phase 9c — OP_PERFORM / OP_HANDLE stubs (+9 vm tests) 2026-05-14 21:08:12 +00:00
25924d6212 erlang: Phase 9b — stub VM dispatcher + 3 pattern opcodes (+19 vm tests) 2026-05-14 20:52:26 +00:00
0abf05ed83 erlang: log Phase 9a (opcode-extension) as Blocker — out of scope 2026-05-14 20:46:38 +00:00
f6a6865635 erlang: sync fed-sx + opcode-ext plans; add Phase 9 (specialized opcodes) 2026-05-14 20:45:05 +00:00
6636f9c170 erlang: extract ffi test suite (637/637, ffi 14/14) 2026-05-14 20:21:51 +00:00
29fd70f17a erlang: file:read_file/write_file/delete BIFs (+10 eval tests, 633/633) 2026-05-14 20:14:31 +00:00
3d092dd78e erlang: er-to-sx / er-of-sx term marshalling (+23 runtime tests) 2026-05-14 20:07:35 +00:00
2ee5e45515 erlang: migrate BIFs onto registry, delete cond dispatchers (600/600) 2026-05-14 19:41:30 +00:00
498d2533d8 erlang: Phase 8 BIF registry foundation (+18 runtime tests, 600/600) 2026-05-14 19:34:30 +00:00
925bbd0d42 erlang: Phase 7 capstone — full hot-reload ladder green (+5 eval tests) 2026-05-14 19:29:15 +00:00
b5e93df82e erlang: verify hot-reload call dispatch semantics (+6 eval tests) 2026-05-14 19:17:59 +00:00
582baf5bfd erlang: code:which/is_loaded/all_loaded introspection (+10 eval tests) 2026-05-14 19:08:34 +00:00
cd45ebcc7a erlang: code:purge/1 + code:soft_purge/1 (+10 eval tests) 2026-05-14 19:02:24 +00:00
89a6b30501 erlang: code:load_binary/3 hot-reload BIF (+8 eval tests) 2026-05-14 18:52:45 +00:00
0c389d4696 erlang: module-version slot (Phase 7 step 1, +13 runtime tests) 2026-05-14 17:35:02 +00:00
7602ec1a69 erlang: plan Phase 7 (hot code reload) + Phase 8 (FFI BIFs) 2026-05-14 16:19:34 +00:00
2db2d8e9f7 briefing: push to origin/loops/erlang after each commit
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 43s
2026-05-06 06:47:16 +00:00
45 changed files with 11101 additions and 230 deletions

View File

@@ -67,6 +67,14 @@ let rec deep_equal a b =
| NativeFn _, NativeFn _ -> a == b
| _ -> false
(* ====================================================================== *)
(* Test extensions for the VM extension registry suite (Phase B) *)
(* ====================================================================== *)
(* Extend the extensible variant from sx_vm_extension.ml so the test
extensions below can carry their own private state. *)
type Sx_vm_extension.extension_state += TestRegState of int ref
(* ====================================================================== *)
(* Build evaluator environment with test platform functions *)
(* ====================================================================== *)
@@ -1282,7 +1290,827 @@ let run_foundation_tests () =
let l = { l_params = ["x"]; l_body = Symbol "x"; l_closure = Sx_types.make_env (); l_name = None; l_compiled = None; l_call_count = 0; l_uid = Sx_types.next_lambda_uid () } in
assert_true "is_lambda" (Bool (Sx_types.is_lambda (Lambda l)));
ignore (Sx_types.set_lambda_name (Lambda l) "my-fn");
assert_eq "lambda name mutated" (String "my-fn") (lambda_name (Lambda l))
assert_eq "lambda name mutated" (String "my-fn") (lambda_name (Lambda l));
Printf.printf "\nSuite: crypto-sha2\n";
(* NIST FIPS 180-4 published vectors. *)
assert_eq "sha256 empty"
(String "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855")
(call "crypto-sha256" [String ""]);
assert_eq "sha256 abc"
(String "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad")
(call "crypto-sha256" [String "abc"]);
assert_eq "sha256 896-bit"
(String "248d6a61d20638b8e5c026930c3e6039a33ce45964ff2167f6ecedd419db06c1")
(call "crypto-sha256"
[String "abcdbcdecdefdefgefghfghighijhijkijkljklmklmnlmnomnopnopq"]);
assert_eq "sha256 1M 'a'"
(String "cdc76e5c9914fb9281a1c7e284d73e67f1809a48a497200e046d39ccc7112cd0")
(call "crypto-sha256" [String (String.make 1000000 'a')]);
assert_eq "sha512 empty"
(String "cf83e1357eefb8bdf1542850d66d8007d620e4050b5715dc83f4a921d36ce9ce47d0d13c5d85f2b0ff8318d2877eec2f63b931bd47417a81a538327af927da3e")
(call "crypto-sha512" [String ""]);
assert_eq "sha512 abc"
(String "ddaf35a193617abacc417349ae20413112e6fa4e89a97ea20a9eeee64b55d39a2192992a274fc1a836ba3c23a3feebbd454d4423643ce80e2a9ac94fa54ca49f")
(call "crypto-sha512" [String "abc"]);
assert_eq "sha512 896-bit"
(String "8e959b75dae313da8cf4f72814fc143f8f7779c6eb9f7fa17299aeadb6889018501d289e4900f7e4331b99dec4b5433ac7d329eeb6dd26545e96e55b874be909")
(call "crypto-sha512"
[String ("abcdefghbcdefghicdefghijdefghijkefghijklfghijklmghijklmn"
^ "hijklmnoijklmnopjklmnopqklmnopqrlmnopqrsmnopqrstnopqrstu")]);
Printf.printf "\nSuite: crypto-sha3\n";
(* NIST FIPS 202 published vectors. *)
assert_eq "sha3-256 empty"
(String "a7ffc6f8bf1ed76651c14756a061d662f580ff4de43b49fa82d80a4b80f8434a")
(call "crypto-sha3-256" [String ""]);
assert_eq "sha3-256 abc"
(String "3a985da74fe225b2045c172d6bd390bd855f086e3e9d525b46bfe24511431532")
(call "crypto-sha3-256" [String "abc"]);
assert_eq "sha3-256 896-bit"
(String "41c0dba2a9d6240849100376a8235e2c82e1b9998a999e21db32dd97496d3376")
(call "crypto-sha3-256"
[String "abcdbcdecdefdefgefghfghighijhijkijkljklmklmnlmnomnopnopq"]);
(* 1600-bit message: 0xa3 * 200 — exercises multi-block absorb (>136B). *)
assert_eq "sha3-256 1600-bit 0xa3"
(String "79f38adec5c20307a98ef76e8324afbfd46cfd81b22e3973c65fa1bd9de31787")
(call "crypto-sha3-256" [String (String.make 200 '\xa3')]);
Printf.printf "\nSuite: dag-cbor\n";
let mkdict pairs =
let d = Sx_types.make_dict () in
List.iter (fun (k, v) -> Hashtbl.replace d k v) pairs;
Dict d
in
let enc v = call "cbor-encode" [v] in
(* RFC 8949 Appendix A — minimal-length deterministic encoding. *)
assert_eq "cbor 0" (String "\x00") (enc (Integer 0));
assert_eq "cbor 23" (String "\x17") (enc (Integer 23));
assert_eq "cbor 24" (String "\x18\x18") (enc (Integer 24));
assert_eq "cbor 100" (String "\x18\x64") (enc (Integer 100));
assert_eq "cbor 1000" (String "\x19\x03\xe8") (enc (Integer 1000));
assert_eq "cbor 1000000"
(String "\x1a\x00\x0f\x42\x40") (enc (Integer 1000000));
assert_eq "cbor -1" (String "\x20") (enc (Integer (-1)));
assert_eq "cbor -100" (String "\x38\x63") (enc (Integer (-100)));
assert_eq "cbor -1000" (String "\x39\x03\xe7") (enc (Integer (-1000)));
assert_eq "cbor false" (String "\xf4") (enc (Bool false));
assert_eq "cbor true" (String "\xf5") (enc (Bool true));
assert_eq "cbor null" (String "\xf6") (enc Nil);
assert_eq "cbor \"\"" (String "\x60") (enc (String ""));
assert_eq "cbor \"a\"" (String "\x61\x61") (enc (String "a"));
assert_eq "cbor \"IETF\"" (String "\x64IETF") (enc (String "IETF"));
assert_eq "cbor []" (String "\x80") (enc (List []));
assert_eq "cbor [1,2,3]"
(String "\x83\x01\x02\x03")
(enc (List [Integer 1; Integer 2; Integer 3]));
assert_eq "cbor [1,[2,3],[4,5]]"
(String "\x83\x01\x82\x02\x03\x82\x04\x05")
(enc (List [Integer 1;
List [Integer 2; Integer 3];
List [Integer 4; Integer 5]]));
assert_eq "cbor {}" (String "\xa0") (enc (mkdict []));
assert_eq "cbor {a:1,b:[2,3]}"
(String "\xa2\x61\x61\x01\x61\x62\x82\x02\x03")
(enc (mkdict ["a", Integer 1; "b", List [Integer 2; Integer 3]]));
assert_eq "cbor {a..e:A..E}"
(String "\xa5\x61\x61\x61\x41\x61\x62\x61\x42\x61\x63\x61\x43\x61\x64\x61\x44\x61\x65\x61\x45")
(enc (mkdict ["a", String "A"; "b", String "B"; "c", String "C";
"d", String "D"; "e", String "E"]));
(* Determinism: insertion order + key length must not change bytes.
Sort is length-then-bytewise → a, c, bb. *)
let d1 = mkdict ["bb", Integer 2; "a", Integer 1; "c", Integer 3] in
let d2 = mkdict ["c", Integer 3; "bb", Integer 2; "a", Integer 1] in
assert_eq "cbor det order-invariant" (enc d1) (enc d2);
assert_eq "cbor det length-then-bytewise"
(String "\xa3\x61\x61\x01\x61\x63\x03\x62\x62\x62\x02")
(enc d1);
(* Round-trip: decode . encode = identity (structural). *)
let roundtrip name v =
assert_eq ("cbor rt " ^ name) v (call "cbor-decode" [enc v])
in
roundtrip "int" (Integer 42);
roundtrip "neg" (Integer (-99999));
roundtrip "str" (String "hello world");
roundtrip "bool" (Bool true);
roundtrip "nil" Nil;
roundtrip "nested"
(List [Integer 1; String "x"; List [Bool false; Nil]]);
roundtrip "dict"
(mkdict ["k", List [Integer 7]; "name", String "z"]);
Printf.printf "\nSuite: cid\n";
let mh_sha256 s = Sx_cid.multihash 0x12 (Sx_cid.unhex (Sx_sha2.sha256_hex s)) in
(* Authoritative vectors (independently derived; match well-known
IPFS CIDs). raw "abc" and raw "" — codec 0x55. *)
assert_eq "cid raw abc"
(String "bafkreif2pall7dybz7vecqka3zo24irdwabwdi4wc55jznaq75q7eaavvu")
(call "cid-from-bytes" [Integer 0x55; String (mh_sha256 "abc")]);
assert_eq "cid raw empty"
(String "bafkreihdwdcefgh4dqkjv67uzcmw7ojee6xedzdetojuzjevtenxquvyku")
(call "cid-from-bytes" [Integer 0x55; String (mh_sha256 "")]);
(* dag-cbor {} — canonical empty-map CID (sha2-256, codec 0x71). *)
assert_eq "cid dag-cbor {}"
(String "bafyreigbtj4x7ip5legnfznufuopl4sg4knzc2cof6duas4b3q2fy6swua")
(call "cid-from-sx" [mkdict []]);
(* Determinism: dict key insertion order must not change the CID. *)
let cda = call "cid-from-sx" [mkdict ["b", Integer 2; "a", Integer 1]] in
let cdb = call "cid-from-sx" [mkdict ["a", Integer 1; "b", Integer 2]] in
assert_eq "cid det order-invariant" cda cdb;
assert_true "cid multibase 'b' prefix"
(Bool (match call "cid-from-sx" [mkdict []] with
| String s -> String.length s > 1 && s.[0] = 'b'
| _ -> false));
Printf.printf "\nSuite: ed25519\n";
let hx = Sx_ed25519.unhex in
let edv pk msg sg = call "ed25519-verify"
[String (hx pk); String (hx msg); String (hx sg)] in
(* RFC 8032 §7.1 TEST 1-3 (deterministic; re-derived independently). *)
assert_eq "ed25519 RFC T1"
(Bool true)
(edv "d75a980182b10ab7d54bfed3c964073a0ee172f3daa62325af021a68f707511a"
""
"e5564300c360ac729086e2cc806e828a84877f1eb8e5d974d873e065224901555fb8821590a33bacc61e39701cf9b46bd25bf5f0595bbe24655141438e7a100b");
assert_eq "ed25519 RFC T2"
(Bool true)
(edv "3d4017c3e843895a92b70aa74d1b7ebc9c982ccf2ec4968cc0cd55f12af4660c"
"72"
"92a009a9f0d4cab8720e820b5f642540a2b27b5416503f8fb3762223ebdb69da085ac1e43e15996e458f3613d0f11d8c387b2eaeb4302aeeb00d291612bb0c00");
assert_eq "ed25519 RFC T3"
(Bool true)
(edv "fc51cd8e6218a1a38da47ed00230f0580816ed13ba3303ac5deb911548908025"
"af82"
"6291d657deec24024827e69c3abe01a30ce548a284743a445e3680d7db5ac3ac18ff9b538d16f290ae67f760984dc6594a7c15e9716ed28dc027beceea1ec40a");
(* Tampered message -> false. *)
assert_eq "ed25519 tampered msg"
(Bool false)
(edv "fc51cd8e6218a1a38da47ed00230f0580816ed13ba3303ac5deb911548908025"
"af83"
"6291d657deec24024827e69c3abe01a30ce548a284743a445e3680d7db5ac3ac18ff9b538d16f290ae67f760984dc6594a7c15e9716ed28dc027beceea1ec40a");
(* Tampered signature -> false. *)
assert_eq "ed25519 tampered sig"
(Bool false)
(edv "d75a980182b10ab7d54bfed3c964073a0ee172f3daa62325af021a68f707511a"
""
"f5564300c360ac729086e2cc806e828a84877f1eb8e5d974d873e065224901555fb8821590a33bacc61e39701cf9b46bd25bf5f0595bbe24655141438e7a100b");
(* Total: wrong-length pubkey / sig -> false, no exception. *)
assert_eq "ed25519 short pubkey"
(Bool false)
(call "ed25519-verify" [String "abc"; String ""; String (String.make 64 '\000')]);
assert_eq "ed25519 short sig"
(Bool false)
(call "ed25519-verify"
[String (hx "d75a980182b10ab7d54bfed3c964073a0ee172f3daa62325af021a68f707511a");
String ""; String "short"]);
assert_eq "ed25519 non-string args"
(Bool false)
(call "ed25519-verify" [Integer 1; Integer 2; Integer 3]);
Printf.printf "\nSuite: rsa-sha256\n";
(* Fixed RSA-2048 vector: one-off python-cryptography keygen +
PKCS1v15/SHA-256 sign of "fed-sx phase F rsa test". *)
let rhx = Sx_rsa.unhex in
let spki = rhx "30820122300d06092a864886f70d01010105000382010f003082010a0282010100a117b573480bce5a08b54a98384001df26d062e9173caaee2e3a2d0045c6d16f99b2a1e7fb60763f65f95f8c39ff82c18b8590338042914331db3440a06d2dbe65a2f82c82f37d293f67a8b57a1f9014b55150a093cfee90257ef3b4a215d5ab002579bd92b6fcb3536777d51b639347d01e307ddafb209073dd9b8d6a507157c44c624a19b3b9275931472462870ae02132630159132a85c1c889adfb358b6bbd3760ce3fffe6285964833a10ee436d5bc33dfab7f9ed630a74e9a32e5688f5a7797f7cc839ad2494dd1c4c4a8fab844cd26208794bf2602c16b9d12bde434066d8c0dd2d20489f4070f883bae2b4508ead4a1b80b44c576e9e37bdb5df69f10203010001" in
let rmsg = rhx "6665642d73782070686173652046207273612074657374" in
let rsig = rhx "5e1593d674ed15c0172546d38efdf1aebd252f4b0c0dfbe1f7996fd569d0bfd9f3e8689ea2b14aa45b5fc3f0a05d4f23c6b02b8820d71f6998ea3b5b0d071bb33142236e388b1226ece3ec447d33b38999f189c37564cf052cf038de94c67b2ddf9a97d5a73554bb88818f615824517209a4083258965adace55658f344104eaa0d5f2f44ea00cfac8754674aade87b40d955cccd1ccd9b7649a08b66ce3bc5dba2de96b3e859488ded3ef9fb3744a1e3495fd14841d8319b3cc08054c729d1c02739ee314eba2b20fac46e463f47eb67183d8455583eca73ba37448164612dd9cd77877135d30d12084c2843f986a5b8ad59c6600f9855b91d7cbdf7c6c4b0e" in
let rsav s m g = call "rsa-sha256-verify" [String s; String m; String g] in
assert_eq "rsa valid" (Bool true) (rsav spki rmsg rsig);
assert_eq "rsa tampered msg" (Bool false)
(rsav spki (rmsg ^ "x") rsig);
assert_eq "rsa tampered sig" (Bool false)
(rsav spki rmsg
(rhx "5f1593d674ed15c0172546d38efdf1aebd252f4b0c0dfbe1f7996fd569d0bfd9f3e8689ea2b14aa45b5fc3f0a05d4f23c6b02b8820d71f6998ea3b5b0d071bb33142236e388b1226ece3ec447d33b38999f189c37564cf052cf038de94c67b2ddf9a97d5a73554bb88818f615824517209a4083258965adace55658f344104eaa0d5f2f44ea00cfac8754674aade87b40d955cccd1ccd9b7649a08b66ce3bc5dba2de96b3e859488ded3ef9fb3744a1e3495fd14841d8319b3cc08054c729d1c02739ee314eba2b20fac46e463f47eb67183d8455583eca73ba37448164612dd9cd77877135d30d12084c2843f986a5b8ad59c6600f9855b91d7cbdf7c6c4b0e"));
assert_eq "rsa garbage spki" (Bool false)
(rsav "not der" rmsg rsig);
assert_eq "rsa non-string args" (Bool false)
(call "rsa-sha256-verify" [Integer 1; Integer 2; Integer 3]);
Printf.printf "\nSuite: file-list-dir\n";
let expect_err nm f =
(try ignore (f ());
incr fail_count; Printf.printf " FAIL: %s — no error\n" nm
with Eval_error _ ->
incr pass_count; Printf.printf " PASS: %s\n" nm
| _ ->
incr fail_count; Printf.printf " FAIL: %s — wrong exn\n" nm)
in
let tmp = Filename.temp_file "fld" "" in
Sys.remove tmp; Unix.mkdir tmp 0o755;
let touch n = let oc = open_out (Filename.concat tmp n) in close_out oc in
touch "b.txt"; touch "a.txt"; touch "c.txt";
assert_eq "file-list-dir sorted"
(List [String "a.txt"; String "b.txt"; String "c.txt"])
(call "file-list-dir" [String tmp]);
expect_err "file-list-dir missing"
(fun () -> call "file-list-dir" [String (Filename.concat tmp "nope")]);
expect_err "file-list-dir not-a-dir"
(fun () -> call "file-list-dir" [String (Filename.concat tmp "a.txt")]);
expect_err "file-list-dir arity"
(fun () -> call "file-list-dir" []);
(* best-effort cleanup *)
(try List.iter (fun n -> Sys.remove (Filename.concat tmp n))
["a.txt"; "b.txt"; "c.txt"]; Unix.rmdir tmp
with _ -> ());
Printf.printf "\nSuite: vm-extension-dispatch\n";
let make_bc op = ({
vc_arity = 0; vc_rest_arity = -1; vc_locals = 0;
vc_bytecode = [| op |]; vc_constants = [||];
vc_bytecode_list = None; vc_constants_list = None;
} : Sx_types.vm_code) in
let expect_invalid_opcode label op =
let globals = Hashtbl.create 1 in
try
let _ = Sx_vm.execute_module (make_bc op) globals in
incr fail_count;
Printf.printf " FAIL: %s — expected Invalid_opcode, got a result\n" label
with
| Sx_vm.Invalid_opcode n when n = op ->
incr pass_count;
Printf.printf " PASS: %s\n" label
| exn ->
incr fail_count;
Printf.printf " FAIL: %s — unexpected: %s\n" label (Printexc.to_string exn)
in
expect_invalid_opcode "opcode 200 raises Invalid_opcode 200" 200;
expect_invalid_opcode "opcode 224 raises Invalid_opcode 224" 224;
expect_invalid_opcode "opcode 247 raises Invalid_opcode 247" 247;
(* Opcode 199 sits just below the extension threshold — should fall to the
catch-all (Eval_error), proving the threshold is at 200, not 199. *)
let globals = Hashtbl.create 1 in
(try
let _ = Sx_vm.execute_module (make_bc 199) globals in
incr fail_count;
Printf.printf " FAIL: opcode 199 — expected Eval_error, got a result\n"
with
| Sx_vm.Invalid_opcode _ ->
incr fail_count;
Printf.printf " FAIL: opcode 199 routed to extension dispatch (threshold wrong)\n"
| Sx_types.Eval_error _ ->
incr pass_count;
Printf.printf " PASS: opcode 199 stays in core (catch-all)\n"
| exn ->
incr fail_count;
Printf.printf " FAIL: opcode 199 — unexpected: %s\n" (Printexc.to_string exn));
Printf.printf "\nSuite: vm-extension-registry\n";
(* Sx_vm_extensions self-installs its dispatcher at module init. Reset
the registry so prior loaded extensions don't interfere with this
test. *)
Sx_vm_extensions._reset_for_tests ();
let module TestExt : Sx_vm_extension.EXTENSION = struct
let name = "test_reg"
let init () = TestRegState (ref 0)
let opcodes _st = [
(210, "test_reg.OP_PUSH_42", (fun vm _frame ->
Sx_vm.push vm (Sx_types.Integer 42)));
(211, "test_reg.OP_DOUBLE_TOS", (fun vm _frame ->
let v = Sx_vm.pop vm in
match v with
| Sx_types.Integer n -> Sx_vm.push vm (Sx_types.Integer (n * 2))
| _ -> failwith "OP_DOUBLE_TOS: not an integer"));
]
end in
Sx_vm_extensions.register (module TestExt);
(match Sx_vm_extensions.id_of_name "test_reg.OP_PUSH_42" with
| Some 210 ->
incr pass_count;
Printf.printf " PASS: id_of_name resolves opcode\n"
| other ->
incr fail_count;
Printf.printf " FAIL: id_of_name: got %s\n"
(match other with Some n -> string_of_int n | None -> "None"));
(match Sx_vm_extensions.id_of_name "nonexistent.OP" with
| None ->
incr pass_count;
Printf.printf " PASS: id_of_name returns None for unknown\n"
| Some _ ->
incr fail_count;
Printf.printf " FAIL: id_of_name should return None for unknown\n");
(match Sx_vm_extensions.state_of_extension "test_reg" with
| Some (TestRegState _) ->
incr pass_count;
Printf.printf " PASS: state_of_extension returns extension state\n"
| _ ->
incr fail_count;
Printf.printf " FAIL: state_of_extension lookup\n");
(match Sx_vm_extensions.state_of_extension "nonexistent" with
| None ->
incr pass_count;
Printf.printf " PASS: state_of_extension None for unknown\n"
| Some _ ->
incr fail_count;
Printf.printf " FAIL: state_of_extension should be None\n");
(* End-to-end dispatch through the VM. Bytecode runs OP_PUSH_42 then
OP_RETURN (50); execute_module pops the result. *)
let make_bc_seq bytes = ({
vc_arity = 0; vc_rest_arity = -1; vc_locals = 0;
vc_bytecode = bytes; vc_constants = [||];
vc_bytecode_list = None; vc_constants_list = None;
} : Sx_types.vm_code) in
(let globals = Hashtbl.create 1 in
try
match Sx_vm.execute_module (make_bc_seq [| 210; 50 |]) globals with
| Integer 42 ->
incr pass_count;
Printf.printf " PASS: dispatch routes opcode 210 -> push 42\n"
| other ->
incr fail_count;
Printf.printf " FAIL: dispatch opcode 210: got %s\n"
(Sx_types.inspect other)
with exn ->
incr fail_count;
Printf.printf " FAIL: dispatch opcode 210 raised: %s\n"
(Printexc.to_string exn));
(* Compose two extension opcodes: PUSH_42 then DOUBLE_TOS then RETURN.
Verifies that successive extension dispatches share VM state. *)
(let globals = Hashtbl.create 1 in
try
match Sx_vm.execute_module (make_bc_seq [| 210; 211; 50 |]) globals with
| Integer 84 ->
incr pass_count;
Printf.printf " PASS: extension opcodes compose (42 -> 84)\n"
| other ->
incr fail_count;
Printf.printf " FAIL: composed opcodes: got %s\n"
(Sx_types.inspect other)
with exn ->
incr fail_count;
Printf.printf " FAIL: composed opcodes raised: %s\n"
(Printexc.to_string exn));
(* Duplicate opcode-id detection. *)
let module DupExt : Sx_vm_extension.EXTENSION = struct
let name = "dup_check"
let init () = TestRegState (ref 0)
let opcodes _st = [
(210, "dup_check.OP_X", (fun _vm _frame -> ()));
]
end in
(try
Sx_vm_extensions.register (module DupExt);
incr fail_count;
Printf.printf " FAIL: duplicate opcode id should have raised\n"
with Failure _ ->
incr pass_count;
Printf.printf " PASS: duplicate opcode id rejected\n");
(* Out-of-range opcode-id detection. *)
let module OutExt : Sx_vm_extension.EXTENSION = struct
let name = "out_of_range"
let init () = TestRegState (ref 0)
let opcodes _st = [
(300, "out_of_range.OP_X", (fun _vm _frame -> ()));
]
end in
(try
Sx_vm_extensions.register (module OutExt);
incr fail_count;
Printf.printf " FAIL: out-of-range opcode should have raised\n"
with Failure _ ->
incr pass_count;
Printf.printf " PASS: out-of-range opcode rejected\n");
(* Duplicate extension-name detection. *)
let module SameNameExt : Sx_vm_extension.EXTENSION = struct
let name = "test_reg" (* same as TestExt above *)
let init () = TestRegState (ref 0)
let opcodes _st = []
end in
(try
Sx_vm_extensions.register (module SameNameExt);
incr fail_count;
Printf.printf " FAIL: duplicate extension name should have raised\n"
with Failure _ ->
incr pass_count;
Printf.printf " PASS: duplicate extension name rejected\n");
Printf.printf "\nSuite: extension-opcode-id primitive\n";
let prim = Hashtbl.find Sx_primitives.primitives "extension-opcode-id" in
(* Known opcode (registered by TestExt above). *)
(match prim [String "test_reg.OP_PUSH_42"] with
| Integer 210 ->
incr pass_count;
Printf.printf " PASS: primitive returns Integer for registered opcode\n"
| other ->
incr fail_count;
Printf.printf " FAIL: registered opcode lookup: got %s\n"
(Sx_types.inspect other));
(* Unknown opcode → Nil. *)
(match prim [String "nonexistent.OP_X"] with
| Nil ->
incr pass_count;
Printf.printf " PASS: primitive returns nil for unknown opcode\n"
| other ->
incr fail_count;
Printf.printf " FAIL: unknown opcode lookup: got %s\n"
(Sx_types.inspect other));
(* Symbol arg also accepted (compilers may pass quoted symbols). *)
(match prim [Symbol "test_reg.OP_DOUBLE_TOS"] with
| Integer 211 ->
incr pass_count;
Printf.printf " PASS: primitive accepts Symbol args\n"
| other ->
incr fail_count;
Printf.printf " FAIL: symbol arg: got %s\n" (Sx_types.inspect other));
(* Wrong arity / type raises Eval_error. *)
(try
let _ = prim [] in
incr fail_count;
Printf.printf " FAIL: zero args should have raised\n"
with Sx_types.Eval_error _ ->
incr pass_count;
Printf.printf " PASS: zero args rejected\n");
(try
let _ = prim [Integer 42] in
incr fail_count;
Printf.printf " FAIL: integer arg should have raised\n"
with Sx_types.Eval_error _ ->
incr pass_count;
Printf.printf " PASS: integer arg rejected\n");
Printf.printf "\nSuite: extensions/test_ext (canonical extension)\n";
(* Phase D: the real test extension lives at lib/extensions/test_ext.ml.
Register it on top of the inline test_reg from earlier suites — the
two use disjoint opcode IDs (210/211 vs 220/221) so they coexist. *)
Test_ext.register ();
(* Lookup via the public primitive should now find OP_TEST_PUSH_42. *)
(match prim [String "test_ext.OP_TEST_PUSH_42"] with
| Integer 220 ->
incr pass_count;
Printf.printf " PASS: extension-opcode-id finds test_ext.OP_TEST_PUSH_42\n"
| other ->
incr fail_count;
Printf.printf " FAIL: opcode lookup: got %s\n" (Sx_types.inspect other));
(* End-to-end: PUSH_42 + DOUBLE_TOS + RETURN. *)
(let globals = Hashtbl.create 1 in
try
match Sx_vm.execute_module (make_bc_seq [| 220; 221; 50 |]) globals with
| Integer 84 ->
incr pass_count;
Printf.printf " PASS: extensions/test_ext bytecode executes (84)\n"
| other ->
incr fail_count;
Printf.printf " FAIL: test_ext bytecode result: got %s\n"
(Sx_types.inspect other)
with exn ->
incr fail_count;
Printf.printf " FAIL: test_ext bytecode raised: %s\n"
(Printexc.to_string exn));
(* Disassembly: opcode_name should resolve 220/221 via the registry,
not fall back to UNKNOWN_220 / UNKNOWN_221. disassemble returns a
Dict; the instruction list lives at key "bytecode". *)
(let code = make_bc_seq [| 220; 221; 50 |] in
let dis = Sx_vm.disassemble code in
let entries = match dis with
| Dict d -> (match Hashtbl.find_opt d "bytecode" with
| Some (List es) -> es
| _ -> [])
| _ -> []
in
let names = List.filter_map (fun entry -> match entry with
| Dict d ->
(match Hashtbl.find_opt d "opcode" with
| Some (String name) -> Some name
| _ -> None)
| _ -> None) entries
in
let has name = List.mem name names in
if has "test_ext.OP_TEST_PUSH_42" && has "test_ext.OP_TEST_DOUBLE_TOS" then begin
incr pass_count;
Printf.printf " PASS: disassemble shows extension opcode names\n"
end else begin
incr fail_count;
Printf.printf " FAIL: disassemble names: [%s]\n" (String.concat ", " names)
end);
(* Sanity: opcode_name on an unregistered extension opcode still
returns UNKNOWN_n. Pick 230 — out of test_ext's range. *)
(match Sx_vm.opcode_name 230 with
| "UNKNOWN_230" ->
incr pass_count;
Printf.printf " PASS: unregistered ext opcode falls back to UNKNOWN_n\n"
| other ->
incr fail_count;
Printf.printf " FAIL: opcode_name 230: got %s\n" other);
(* Per-extension state: invocation_count should reflect the two opcodes
that ran in the dispatch test above. *)
(match Test_ext.invocation_count () with
| Some n when n >= 2 ->
incr pass_count;
Printf.printf " PASS: extension state recorded %d invocations\n" n
| other ->
incr fail_count;
Printf.printf " FAIL: invocation_count: %s\n"
(match other with Some n -> string_of_int n | None -> "None"));
Printf.printf "\nSuite: extensions/erlang_ext (Phase 9h)\n";
(* Register the Erlang opcode namespace. Disjoint id range (200-217)
from test_ext (220/221) so they coexist. *)
Erlang_ext.register ();
(match prim [String "erlang.OP_PATTERN_TUPLE"] with
| Integer 222 ->
incr pass_count;
Printf.printf " PASS: extension-opcode-id erlang.OP_PATTERN_TUPLE = 222\n"
| other ->
incr fail_count;
Printf.printf " FAIL: erlang.OP_PATTERN_TUPLE: got %s\n"
(Sx_types.inspect other));
(match prim [String "erlang.OP_BIF_IS_TUPLE"] with
| Integer 239 ->
incr pass_count;
Printf.printf " PASS: extension-opcode-id erlang.OP_BIF_IS_TUPLE = 239\n"
| other ->
incr fail_count;
Printf.printf " FAIL: erlang.OP_BIF_IS_TUPLE: got %s\n"
(Sx_types.inspect other));
(match prim [String "erlang.OP_NONEXISTENT"] with
| Nil ->
incr pass_count;
Printf.printf " PASS: unknown erlang opcode -> nil\n"
| other ->
incr fail_count;
Printf.printf " FAIL: unknown erlang opcode: got %s\n"
(Sx_types.inspect other));
(* Phase 10b vertical slice: erlang.OP_BIF_LENGTH (230) is a REAL
handler. Build [CONST 0; OP_BIF_LENGTH; RETURN] with an Erlang
list [1,2,3] in the constant pool; expect Integer 3. Proves the
full path: bytecode -> Sx_vm extension fallthrough -> erlang_ext
handler -> correct stack result. *)
(let mk_dict kvs =
let h = Hashtbl.create 4 in
List.iter (fun (k, v) -> Hashtbl.replace h k v) kvs;
Sx_types.Dict h in
let er_nil = mk_dict [("tag", Sx_types.String "nil")] in
let er_cons hd tl =
mk_dict [("tag", Sx_types.String "cons");
("head", hd); ("tail", tl)] in
let lst = er_cons (Sx_types.Integer 1)
(er_cons (Sx_types.Integer 2)
(er_cons (Sx_types.Integer 3) er_nil)) in
let code = ({
vc_arity = 0; vc_rest_arity = -1; vc_locals = 0;
vc_bytecode = [| 1; 0; 0; 230; 50 |];
vc_constants = [| lst |];
vc_bytecode_list = None; vc_constants_list = None;
} : Sx_types.vm_code) in
let globals = Hashtbl.create 1 in
try
match Sx_vm.execute_module code globals with
| Integer 3 ->
incr pass_count;
Printf.printf " PASS: erlang.OP_BIF_LENGTH [1,2,3] -> 3 (real handler, end-to-end)\n"
| other ->
incr fail_count;
Printf.printf " FAIL: OP_BIF_LENGTH result: got %s\n"
(Sx_types.inspect other)
with exn ->
incr fail_count;
Printf.printf " FAIL: OP_BIF_LENGTH raised: %s\n"
(Printexc.to_string exn));
(* More real handlers (Phase 10b batch): build a list/tuple constant
and exercise HD/TL/TUPLE_SIZE/IS_* end-to-end through the VM. *)
(let mk_dict kvs =
let h = Hashtbl.create 4 in
List.iter (fun (k, v) -> Hashtbl.replace h k v) kvs;
Sx_types.Dict h in
let er_nil = mk_dict [("tag", Sx_types.String "nil")] in
let er_cons hd tl = mk_dict [("tag", Sx_types.String "cons");
("head", hd); ("tail", tl)] in
let er_tuple es = mk_dict [("tag", Sx_types.String "tuple");
("elements", Sx_types.List es)] in
let er_atom nm = mk_dict [("tag", Sx_types.String "atom");
("name", Sx_types.String nm)] in
let lst3 = er_cons (Sx_types.Integer 7)
(er_cons (Sx_types.Integer 8)
(er_cons (Sx_types.Integer 9) er_nil)) in
let tup3 = er_tuple [Sx_types.Integer 1; Sx_types.Integer 2;
Sx_types.Integer 3] in
let run consts bc =
let code = ({
vc_arity = 0; vc_rest_arity = -1; vc_locals = 0;
vc_bytecode = bc; vc_constants = consts;
vc_bytecode_list = None; vc_constants_list = None;
} : Sx_types.vm_code) in
Sx_vm.execute_module code (Hashtbl.create 1) in
let nm = function
| Sx_types.Dict d ->
(match Hashtbl.find_opt d "name" with
| Some (Sx_types.String s) -> s | _ -> "?")
| _ -> "?" in
let check label want got =
if got = want then begin
incr pass_count;
Printf.printf " PASS: %s\n" label
end else begin
incr fail_count;
Printf.printf " FAIL: %s: got %s\n" label (Sx_types.inspect got)
end in
(* HD [7,8,9] -> 7 *)
check "OP_BIF_HD [7,8,9] -> 7" (Sx_types.Integer 7)
(run [| lst3 |] [| 1;0;0; 231; 50 |]);
(* TL [7,8,9] -> [8,9], check its HD = 8 *)
check "OP_BIF_TL then HD -> 8" (Sx_types.Integer 8)
(run [| lst3 |] [| 1;0;0; 232; 231; 50 |]);
(* TUPLE_SIZE {1,2,3} -> 3 *)
check "OP_BIF_TUPLE_SIZE {1,2,3} -> 3" (Sx_types.Integer 3)
(run [| tup3 |] [| 1;0;0; 234; 50 |]);
(* IS_INTEGER 42 -> true ; IS_INTEGER [..] -> false *)
(match run [| Sx_types.Integer 42 |] [| 1;0;0; 236; 50 |] with
| v when nm v = "true" ->
incr pass_count; Printf.printf " PASS: OP_BIF_IS_INTEGER 42 -> true\n"
| v -> incr fail_count;
Printf.printf " FAIL: IS_INTEGER 42: got %s\n" (Sx_types.inspect v));
(match run [| lst3 |] [| 1;0;0; 236; 50 |] with
| v when nm v = "false" ->
incr pass_count; Printf.printf " PASS: OP_BIF_IS_INTEGER list -> false\n"
| v -> incr fail_count;
Printf.printf " FAIL: IS_INTEGER list: got %s\n" (Sx_types.inspect v));
(* IS_ATOM atom -> true ; IS_LIST nil -> true ; IS_TUPLE tuple -> true *)
(match run [| er_atom "ok" |] [| 1;0;0; 237; 50 |] with
| v when nm v = "true" ->
incr pass_count; Printf.printf " PASS: OP_BIF_IS_ATOM ok -> true\n"
| v -> incr fail_count;
Printf.printf " FAIL: IS_ATOM: got %s\n" (Sx_types.inspect v));
(match run [| er_nil |] [| 1;0;0; 238; 50 |] with
| v when nm v = "true" ->
incr pass_count; Printf.printf " PASS: OP_BIF_IS_LIST nil -> true\n"
| v -> incr fail_count;
Printf.printf " FAIL: IS_LIST nil: got %s\n" (Sx_types.inspect v));
(match run [| tup3 |] [| 1;0;0; 239; 50 |] with
| v when nm v = "true" ->
incr pass_count; Printf.printf " PASS: OP_BIF_IS_TUPLE {..} -> true\n"
| v -> incr fail_count;
Printf.printf " FAIL: IS_TUPLE: got %s\n" (Sx_types.inspect v));
(match run [| tup3 |] [| 1;0;0; 238; 50 |] with
| v when nm v = "false" ->
incr pass_count; Printf.printf " PASS: OP_BIF_IS_LIST tuple -> false\n"
| v -> incr fail_count;
Printf.printf " FAIL: IS_LIST tuple: got %s\n" (Sx_types.inspect v));
(* ELEMENT: element(2, {1,2,3}) -> 2. Calling convention: push
Index then Tuple; opcode pops Tuple (TOS) then Index. *)
check "OP_BIF_ELEMENT element(2,{1,2,3}) -> 2" (Sx_types.Integer 2)
(run [| Sx_types.Integer 2; tup3 |] [| 1;0;0; 1;1;0; 233; 50 |]);
check "OP_BIF_ELEMENT element(1,{1,2,3}) -> 1" (Sx_types.Integer 1)
(run [| Sx_types.Integer 1; tup3 |] [| 1;0;0; 1;1;0; 233; 50 |]);
(* ELEMENT out of range raises *)
(let raised =
(try ignore (run [| Sx_types.Integer 9; tup3 |]
[| 1;0;0; 1;1;0; 233; 50 |]); false
with Sx_types.Eval_error _ -> true) in
if raised then begin
incr pass_count;
Printf.printf " PASS: OP_BIF_ELEMENT out-of-range raises\n"
end else begin
incr fail_count;
Printf.printf " FAIL: OP_BIF_ELEMENT out-of-range should raise\n"
end);
(* LISTS_REVERSE [7,8,9] -> [9,8,7]; verify HD = 9 then HD of TL = 8 *)
check "OP_BIF_LISTS_REVERSE then HD -> 9" (Sx_types.Integer 9)
(run [| lst3 |] [| 1;0;0; 235; 231; 50 |]);
check "OP_BIF_LISTS_REVERSE then TL,HD -> 8" (Sx_types.Integer 8)
(run [| lst3 |] [| 1;0;0; 235; 232; 231; 50 |]);
(* reverse preserves length *)
check "OP_BIF_LISTS_REVERSE then LENGTH -> 3" (Sx_types.Integer 3)
(run [| lst3 |] [| 1;0;0; 235; 230; 50 |]));
(* A still-stubbed opcode (222 = erlang.OP_PATTERN_TUPLE) raises the
not-wired Eval_error — confirms the honest-failure path remains
for opcodes whose real handlers haven't landed. *)
(let globals = Hashtbl.create 1 in
try
ignore (Sx_vm.execute_module (make_bc_seq [| 222; 50 |]) globals);
incr fail_count;
Printf.printf " FAIL: erlang.OP_PATTERN_TUPLE dispatch should have raised\n"
with
| Sx_types.Eval_error msg
when (let needle = "not yet wired" in
let nl = String.length needle and ml = String.length msg in
let rec scan i =
if i + nl > ml then false
else if String.sub msg i nl = needle then true
else scan (i + 1)
in scan 0) ->
incr pass_count;
Printf.printf " PASS: erlang opcode dispatch raises not-wired error\n"
| exn ->
incr fail_count;
Printf.printf " FAIL: unexpected exn: %s\n" (Printexc.to_string exn));
(match Erlang_ext.dispatch_count () with
| Some n when n >= 1 ->
incr pass_count;
Printf.printf " PASS: erlang_ext state recorded %d dispatch(es)\n" n
| other ->
incr fail_count;
Printf.printf " FAIL: dispatch_count: %s\n"
(match other with Some n -> string_of_int n | None -> "None"));
Printf.printf "\nSuite: jit extension-opcode awareness\n";
let scan = Sx_vm.bytecode_uses_extension_opcodes in
let no_consts = [||] in
(* Pure core ops: scan reports false. *)
(* OP_TRUE OP_RETURN *)
if not (scan [| 3; 50 |] no_consts) then begin
incr pass_count;
Printf.printf " PASS: pure core bytecode is JIT-eligible\n"
end else begin
incr fail_count;
Printf.printf " FAIL: pure core bytecode flagged as extension\n"
end;
(* Extension opcode anywhere → true. *)
if scan [| 220; 50 |] no_consts then begin
incr pass_count;
Printf.printf " PASS: extension opcode detected at head\n"
end else begin
incr fail_count;
Printf.printf " FAIL: extension opcode at head missed\n"
end;
(* Mixed: core + extension → true. *)
if scan [| 3; 220; 50 |] no_consts then begin
incr pass_count;
Printf.printf " PASS: extension opcode detected after core ops\n"
end else begin
incr fail_count;
Printf.printf " FAIL: extension opcode after core ops missed\n"
end;
(* Operand bytes ≥200 must NOT trigger. CONST u16 with index 220
into a synthetic constant pool — the operand is 220 (lo) 0 (hi),
not an opcode. The pool entry at 220 is irrelevant for the scan. *)
let big_consts = Array.make 256 Nil in
if not (scan [| 1; 220; 0; 50 |] big_consts) then begin
incr pass_count;
Printf.printf " PASS: CONST operand ≥200 not a false positive\n"
end else begin
incr fail_count;
Printf.printf " FAIL: CONST operand ≥200 false-positives as ext op\n"
end;
(* CALL_PRIM has 3 operand bytes (u16 + u8); all ≥200 should not
trigger. *)
if not (scan [| 52; 220; 200; 200; 50 |] big_consts) then begin
incr pass_count;
Printf.printf " PASS: CALL_PRIM operands ≥200 not a false positive\n"
end else begin
incr fail_count;
Printf.printf " FAIL: CALL_PRIM operands ≥200 false-positive\n"
end;
(* CLOSURE with upvalue descriptors: scan must skip the 2 + 2*n
dynamic operand bytes. Build a synthetic constant pool with a
Dict at index 0 declaring upvalue-count 1, descriptors that are
≥200 — the scan should skip them and not trigger.
Bytecode layout: CLOSURE 0 0 desc_is_local desc_index RETURN
op lo hi 210 220 50
With upvalue-count = 1, scan must advance past the 2-byte CLOSURE
operand AND the 2 descriptor bytes (210, 220), landing on RETURN. *)
let cl_consts = Array.make 1 Nil in
let dict = Hashtbl.create 1 in
Hashtbl.replace dict "upvalue-count" (Integer 1);
cl_consts.(0) <- Dict dict;
if not (scan [| 51; 0; 0; 210; 220; 50 |] cl_consts) then begin
incr pass_count;
Printf.printf " PASS: CLOSURE upvalue descriptors ≥200 skipped\n"
end else begin
incr fail_count;
Printf.printf " FAIL: CLOSURE upvalue descriptors false-positive\n"
end;
(* Sanity: opcode after CLOSURE+descriptors that IS an extension
opcode triggers correctly. *)
if scan [| 51; 0; 0; 210; 220; 221; 50 |] cl_consts then begin
incr pass_count;
Printf.printf " PASS: extension opcode after CLOSURE detected\n"
end else begin
incr fail_count;
Printf.printf " FAIL: extension opcode after CLOSURE missed\n"
end
(* ====================================================================== *)

View File

@@ -18,6 +18,20 @@
open Sx_types
(* Force-link Sx_vm_extensions so its module-init runs: installs the
extension dispatch fallthrough and registers the `extension-opcode-id`
SX primitive. Without a reference here OCaml dead-code-eliminates the
module from sx_server.exe (it's only otherwise reached from run_tests),
leaving guest-language opcode extensions (Erlang Phase 9, etc.)
invisible to the runtime. The applied call is a harmless lookup. *)
let () = ignore (Sx_vm_extensions.id_of_name "")
(* Register the Erlang opcode extension (Phase 9h) so
`extension-opcode-id "erlang.OP_*"` resolves to the host ids the SX
stub dispatcher consults. Guarded: a double-register raises Failure,
which we swallow so a re-entered server process doesn't die. *)
let () = try Erlang_ext.register () with Failure _ -> ()
(* ====================================================================== *)
(* Font measurement via otfm — reads OpenType/TrueType font tables *)
(* ====================================================================== *)
@@ -708,6 +722,139 @@ let setup_evaluator_bridge env =
match args with
| [e; expr] -> Sx_ref.eval_expr expr e
| _ -> raise (Eval_error "eval-in-env: (env expr)"));
(* fed-sx Milestone 1 Step 8 transport. NATIVE ONLY — sockets +
threads; deliberately absent from the WASM kernel (registered
here in bin/, never in lib/sx_primitives.ml). Minimal HTTP/1.1,
Connection: close. handler : req-dict -> resp-dict where
req = {:method :path :query :headers :body},
resp = {:status :headers :body}. Never returns. *)
Sx_primitives.register "http-listen" (fun args ->
let strip_cr s =
let n = String.length s in
if n > 0 && s.[n - 1] = '\r' then String.sub s 0 (n - 1) else s
in
match args with
| [port_v; handler] ->
let port = match port_v with
| Integer n -> n
| Number f -> int_of_float f
| _ -> raise (Eval_error "http-listen: (port handler)") in
let sock = Unix.socket Unix.PF_INET Unix.SOCK_STREAM 0 in
Unix.setsockopt sock Unix.SO_REUSEADDR true;
Unix.bind sock
(Unix.ADDR_INET (Unix.inet_addr_loopback, port));
Unix.listen sock 64;
(* SX runtime is shared across threads — serialize handler calls. *)
let mtx = Mutex.create () in
let reason = function
| 200 -> "OK" | 201 -> "Created" | 204 -> "No Content"
| 301 -> "Moved Permanently" | 302 -> "Found"
| 400 -> "Bad Request" | 401 -> "Unauthorized"
| 403 -> "Forbidden" | 404 -> "Not Found"
| 405 -> "Method Not Allowed" | 500 -> "Internal Server Error"
| _ -> "OK" in
let handle fd =
(try
let ic = Unix.in_channel_of_descr fd in
let oc = Unix.out_channel_of_descr fd in
let reqline = strip_cr (input_line ic) in
(match String.split_on_char ' ' reqline with
| meth :: target :: _ ->
let path, query =
match String.index_opt target '?' with
| Some i ->
String.sub target 0 i,
String.sub target (i + 1)
(String.length target - i - 1)
| None -> target, "" in
let headers = Sx_types.make_dict () in
let clen = ref 0 in
let rec rdh () =
let h = strip_cr (input_line ic) in
if h = "" then ()
else begin
(match String.index_opt h ':' with
| Some i ->
let name =
String.lowercase_ascii
(String.trim (String.sub h 0 i)) in
let value =
String.trim
(String.sub h (i + 1)
(String.length h - i - 1)) in
Hashtbl.replace headers name (String value);
if name = "content-length" then
(try clen := int_of_string value with _ -> ())
| None -> ());
rdh ()
end in
rdh ();
let body =
if !clen > 0 then begin
let b = Bytes.create !clen in
really_input ic b 0 !clen;
Bytes.unsafe_to_string b
end else "" in
let req = Sx_types.make_dict () in
Hashtbl.replace req "method" (String meth);
Hashtbl.replace req "path" (String path);
Hashtbl.replace req "query" (String query);
Hashtbl.replace req "headers" (Dict headers);
Hashtbl.replace req "body" (String body);
Mutex.lock mtx;
let resp =
(try Sx_runtime.sx_call handler [Dict req]
with e -> Mutex.unlock mtx; raise e) in
Mutex.unlock mtx;
let getk k = match resp with
| Dict h -> Hashtbl.find_opt h k | _ -> None in
let status = match getk "status" with
| Some (Integer n) -> n
| Some (Number f) -> int_of_float f
| _ -> 200 in
let rbody = match getk "body" with
| Some (String s) -> s
| Some v -> Sx_types.value_to_string v
| None -> "" in
let rhdrs = match getk "headers" with
| Some (Dict h) ->
Hashtbl.fold (fun k v acc ->
(k, (match v with
| String s -> s
| v -> Sx_types.value_to_string v)) :: acc)
h []
| _ -> [] in
let buf = Buffer.create 256 in
Buffer.add_string buf
(Printf.sprintf "HTTP/1.1 %d %s\r\n" status
(reason status));
List.iter (fun (k, v) ->
Buffer.add_string buf
(Printf.sprintf "%s: %s\r\n" k v)) rhdrs;
if not (List.exists
(fun (k, _) ->
String.lowercase_ascii k = "content-type")
rhdrs)
then Buffer.add_string buf
"Content-Type: text/plain\r\n";
Buffer.add_string buf
(Printf.sprintf "Content-Length: %d\r\n"
(String.length rbody));
Buffer.add_string buf "Connection: close\r\n\r\n";
Buffer.add_string buf rbody;
output_string oc (Buffer.contents buf);
flush oc
| _ -> ())
with _ -> ());
(try Unix.close fd with _ -> ())
in
while true do
let fd, _ = Unix.accept sock in
ignore (Thread.create handle fd)
done;
Nil
| _ -> raise (Eval_error "http-listen: (port handler)"));
bind "trampoline" (fun args ->
match args with
| [v] ->

49
hosts/ocaml/bin/test_http.sh Executable file
View File

@@ -0,0 +1,49 @@
#!/usr/bin/env bash
# Phase H test — native-only http-listen primitive.
# Starts sx_server with a tiny SX echo handler, drives it with curl
# (GET / POST / 404 / custom header), asserts, then kills it.
set -u
cd "$(dirname "$0")/.."
SRV=_build/default/bin/sx_server.exe
PORT=${HTTP_TEST_PORT:-8911}
PASS=0
FAIL=0
ok() { echo " PASS: $1"; PASS=$((PASS+1)); }
bad() { echo " FAIL: $1$2"; FAIL=$((FAIL+1)); }
if [ ! -x "$SRV" ]; then
echo "build sx_server.exe first (dune build bin/sx_server.exe)"; exit 1
fi
H='(begin (define (h req) (if (= (get req "path") "/echo") {:status 200 :headers {"X-Echo" (get req "method")} :body (str "M=" (get req "method") " P=" (get req "path") " Q=" (get req "query") " B=" (get req "body"))} {:status 404 :body "nope"})) (http-listen '"$PORT"' h))'
ESC=${H//\"/\\\"}
{ printf '(epoch 1)\n(eval "%s")\n' "$ESC"; sleep 30; } | "$SRV" >/tmp/test_http_srv.out 2>&1 &
SVPID=$!
trap 'kill $SVPID 2>/dev/null; wait 2>/dev/null' EXIT
up=0
for _ in $(seq 1 50); do
curl -s -o /dev/null "http://127.0.0.1:$PORT/echo" 2>/dev/null && { up=1; break; }
sleep 0.2
done
[ "$up" = 1 ] || { echo " FAIL: server did not start"; cat /tmp/test_http_srv.out; exit 1; }
# GET with query + custom response header.
g=$(curl -s -i "http://127.0.0.1:$PORT/echo?x=1" | tr -d '\r')
echo "$g" | grep -q '^HTTP/1.1 200 OK' && ok "GET status 200" || bad "GET status" "$g"
echo "$g" | grep -q '^X-Echo: GET' && ok "GET custom header" || bad "GET header" "$g"
echo "$g" | grep -q '^M=GET P=/echo Q=x=1 B=$' && ok "GET echo body" || bad "GET body" "$g"
# POST with body.
p=$(curl -s -X POST --data 'hello' "http://127.0.0.1:$PORT/echo")
[ "$p" = 'M=POST P=/echo Q= B=hello' ] && ok "POST body echoed" || bad "POST body" "$p"
# 404 path.
n=$(curl -s -i "http://127.0.0.1:$PORT/missing" | tr -d '\r')
echo "$n" | grep -q '^HTTP/1.1 404 Not Found' && ok "404 status" || bad "404 status" "$n"
echo "$n" | grep -q '^nope$' && ok "404 body" || bad "404 body" "$n"
echo "Results: $PASS passed, $FAIL failed"
[ "$FAIL" = 0 ]

View File

@@ -2,3 +2,7 @@
(name sx)
(wrapped false)
(libraries re re.pcre unix))
; Pull in extension modules from lib/extensions/ (test_ext.ml, etc).
; See plans/sx-vm-opcode-extension.md.
(include_subdirs unqualified)

View File

@@ -0,0 +1,71 @@
# SX VM extensions
Each `*.ml` file here is a VM extension — a first-class OCaml module that
registers specialized bytecode opcodes with `Sx_vm_extensions`. See
[`plans/sx-vm-opcode-extension.md`](../../../../plans/sx-vm-opcode-extension.md)
for the design.
## Pattern
```ocaml
(* lib/extensions/myport.ml *)
open Sx_types
type Sx_vm_extension.extension_state += MyportState of { ... }
module M : Sx_vm_extension.EXTENSION = struct
let name = "myport"
let init () = MyportState { ... }
let opcodes _st = [
(id, "myport.OP_NAME", handler);
...
]
end
let register () = Sx_vm_extensions.register (module M)
```
Then call `Myport.register ()` once at startup from any binary that
should have the extension loaded.
## Opcode-ID allocation
Range 200-247 (per `Sx_vm_extensions.extension_min` /
`extension_max`). Conventions:
| Range | Use |
|---------|-------------------------------------------------------------------------|
| 200-209 | reserved for `lib/guest/vm/` shared opcodes (chiselled out on 2nd use) |
| 210-219 | inline test extensions defined in `bin/run_tests.ml` |
| 220-229 | this directory's `test_ext` (the canonical template) |
| 230-247 | first-come-first-served by language ports (Erlang first) |
When a port claims a contiguous block, document it in the table above.
The registry rejects collisions at startup with a loud error — there is
no silent shadowing.
## Naming
Always prefix opcode names with the extension name plus a dot:
`myport.OP_<NAME>`. The prefix is a hard convention so that multiple
extensions can share the global opcode-name namespace cleanly.
## State
`extension_state` is an extensible variant. Add your case (e.g.
`MyportState of { ... }`) at the top of your file, return it from
`init`, and pattern-match it inside your handlers. Other extensions
cannot see your state — the variant case is private to your module.
## Testing
`test_ext.ml` is the canonical worked example. `bin/run_tests.ml`
calls `Test_ext.register ()`, then drives bytecode that exercises the
opcodes end-to-end (push, double, dispatch, disassemble, invocation
counter). Mirror this shape when adding a real port's extension.
## Build wiring
`lib/dune` has `(include_subdirs unqualified)`, so any `.ml` you drop
in here is automatically part of the `sx` library. Module name follows
the filename verbatim (`test_ext.ml``Test_ext`).

View File

@@ -0,0 +1,278 @@
(** {1 [erlang_ext] — Erlang-on-SX VM opcode extension (Phase 9h)}
Registers the Erlang opcode namespace in [Sx_vm_extensions] so that
[extension-opcode-id "erlang.OP_*"] resolves to a stable id. The SX
stub dispatcher in [lib/erlang/vm/dispatcher.sx] consults these ids
(Phase 9i) and falls back to its own local ids when the host
extension is absent.
Opcode ids occupy 222-239 in the extension partition (200-247).
222+ is chosen to clear the test extensions' reserved ids
(test_reg 210/211, test_ext 220/221) so all three coexist in
run_tests; production sx_server only registers this one. Names
mirror the SX stub dispatcher exactly:
- 222 erlang.OP_PATTERN_TUPLE - 231 erlang.OP_BIF_HD
- 223 erlang.OP_PATTERN_LIST - 232 erlang.OP_BIF_TL
- 224 erlang.OP_PATTERN_BINARY - 233 erlang.OP_BIF_ELEMENT
- 225 erlang.OP_PERFORM - 234 erlang.OP_BIF_TUPLE_SIZE
- 226 erlang.OP_HANDLE - 235 erlang.OP_BIF_LISTS_REVERSE
- 227 erlang.OP_RECEIVE_SCAN - 236 erlang.OP_BIF_IS_INTEGER
- 228 erlang.OP_SPAWN - 237 erlang.OP_BIF_IS_ATOM
- 229 erlang.OP_SEND - 238 erlang.OP_BIF_IS_LIST
- 230 erlang.OP_BIF_LENGTH - 239 erlang.OP_BIF_IS_TUPLE
{2 Handler status}
The bytecode compiler does not yet emit these opcodes — Erlang
programs run through the general CEK path and the working
specialization path is the SX stub dispatcher. So every handler
here raises a descriptive [Eval_error] rather than silently
corrupting the VM stack. This keeps the extension honest: the
namespace is registered and disassembles by name, [extension-opcode-id]
works, but actually dispatching an opcode (which only happens once a
future phase teaches the compiler to emit them) fails loudly with a
pointer to the phase that will wire it. Real stack-machine handlers
land alongside compiler emission in a later phase. *)
open Sx_types
(** Per-instance state: invocation counter, purely to exercise the
[extension_state] machinery (mirrors [test_ext]). *)
type Sx_vm_extension.extension_state += ErlangExtState of {
mutable dispatched : int;
}
let not_wired name =
raise (Eval_error
(Printf.sprintf
"%s: bytecode emission not yet wired (Phase 9j) — \
Erlang runs via CEK; specialization path is the SX stub \
dispatcher in lib/erlang/vm/dispatcher.sx"
name))
module M : Sx_vm_extension.EXTENSION = struct
let name = "erlang"
let init () = ErlangExtState { dispatched = 0 }
let opcodes st =
let bump () = match st with
| ErlangExtState s -> s.dispatched <- s.dispatched + 1
| _ -> ()
in
let op id nm =
(id, nm, (fun (_vm : Sx_vm.vm) (_frame : Sx_vm.frame) ->
bump (); not_wired nm))
in
(* Phase 10b vertical slice: one REAL register-machine handler.
erlang.OP_BIF_LENGTH (230) — pops an Erlang list off the VM
stack and pushes its length. Proves the full path works:
extension-opcode-id -> bytecode -> Sx_vm dispatch fallthrough
-> this handler -> correct stack result. The remaining 17
opcodes still raise not_wired until their handlers + compiler
emission land. Erlang lists are tagged dicts:
nil = {"tag" -> String "nil"}
cons = {"tag" -> String "cons"; "head" -> v; "tail" -> v} *)
let er_tag d =
match Hashtbl.find_opt d "tag" with
| Some (String s) -> s | _ -> ""
in
let op_bif_length =
(230, "erlang.OP_BIF_LENGTH",
(fun (vm : Sx_vm.vm) (_frame : Sx_vm.frame) ->
bump ();
let v = Sx_vm.pop vm in
let rec walk acc node =
match node with
| Dict d ->
(match er_tag d with
| "nil" -> acc
| "cons" ->
(match Hashtbl.find_opt d "tail" with
| Some t -> walk (acc + 1) t
| None -> raise (Eval_error
"erlang.OP_BIF_LENGTH: cons cell without :tail"))
| _ -> raise (Eval_error
"erlang.OP_BIF_LENGTH: not a proper list"))
| _ -> raise (Eval_error
"erlang.OP_BIF_LENGTH: not a proper list")
in
Sx_vm.push vm (Integer (walk 0 v))))
in
(* Phase 10b — simple hot-BIF handlers. Erlang bool is the atom
{"tag"->"atom"; "name"->"true"|"false"}; mk_atom builds it. *)
let mk_atom nm =
let h = Hashtbl.create 2 in
Hashtbl.replace h "tag" (String "atom");
Hashtbl.replace h "name" (String nm);
Dict h
in
let er_bool b = mk_atom (if b then "true" else "false") in
let is_tag v t = match v with
| Dict d -> er_tag d = t
| _ -> false
in
let op_bif_hd =
(231, "erlang.OP_BIF_HD",
(fun (vm : Sx_vm.vm) _f ->
bump ();
match Sx_vm.pop vm with
| Dict d when er_tag d = "cons" ->
(match Hashtbl.find_opt d "head" with
| Some h -> Sx_vm.push vm h
| None -> raise (Eval_error "erlang.OP_BIF_HD: cons without :head"))
| _ -> raise (Eval_error "erlang.OP_BIF_HD: not a cons")))
in
let op_bif_tl =
(232, "erlang.OP_BIF_TL",
(fun (vm : Sx_vm.vm) _f ->
bump ();
match Sx_vm.pop vm with
| Dict d when er_tag d = "cons" ->
(match Hashtbl.find_opt d "tail" with
| Some t -> Sx_vm.push vm t
| None -> raise (Eval_error "erlang.OP_BIF_TL: cons without :tail"))
| _ -> raise (Eval_error "erlang.OP_BIF_TL: not a cons")))
in
let op_bif_tuple_size =
(234, "erlang.OP_BIF_TUPLE_SIZE",
(fun (vm : Sx_vm.vm) _f ->
bump ();
match Sx_vm.pop vm with
| Dict d when er_tag d = "tuple" ->
let n = match Hashtbl.find_opt d "elements" with
| Some (List es) -> List.length es
| Some (ListRef r) -> List.length !r
| _ -> raise (Eval_error
"erlang.OP_BIF_TUPLE_SIZE: tuple without :elements")
in
Sx_vm.push vm (Integer n)
| _ -> raise (Eval_error "erlang.OP_BIF_TUPLE_SIZE: not a tuple")))
in
let op_bif_is_integer =
(236, "erlang.OP_BIF_IS_INTEGER",
(fun (vm : Sx_vm.vm) _f ->
bump ();
let v = Sx_vm.pop vm in
Sx_vm.push vm (er_bool (match v with Integer _ -> true | _ -> false))))
in
let op_bif_is_atom =
(237, "erlang.OP_BIF_IS_ATOM",
(fun (vm : Sx_vm.vm) _f ->
bump ();
let v = Sx_vm.pop vm in
Sx_vm.push vm (er_bool (is_tag v "atom"))))
in
let op_bif_is_list =
(238, "erlang.OP_BIF_IS_LIST",
(fun (vm : Sx_vm.vm) _f ->
bump ();
let v = Sx_vm.pop vm in
Sx_vm.push vm (er_bool (is_tag v "cons" || is_tag v "nil"))))
in
let op_bif_is_tuple =
(239, "erlang.OP_BIF_IS_TUPLE",
(fun (vm : Sx_vm.vm) _f ->
bump ();
let v = Sx_vm.pop vm in
Sx_vm.push vm (er_bool (is_tag v "tuple"))))
in
(* element/2 and lists:reverse/1 — pure stack transforms (no
bytecode operands). Calling convention: args pushed left→right,
so element/2 stack is [.. Index Tuple] (Tuple on top). Erlang
element/2 is 1-indexed. *)
let op_bif_element =
(233, "erlang.OP_BIF_ELEMENT",
(fun (vm : Sx_vm.vm) _f ->
bump ();
let tup = Sx_vm.pop vm in
let idx = Sx_vm.pop vm in
match tup, idx with
| Dict d, Integer i when er_tag d = "tuple" ->
let es = match Hashtbl.find_opt d "elements" with
| Some (List es) -> es
| Some (ListRef r) -> !r
| _ -> raise (Eval_error
"erlang.OP_BIF_ELEMENT: tuple without :elements")
in
let n = List.length es in
if i < 1 || i > n then
raise (Eval_error
(Printf.sprintf
"erlang.OP_BIF_ELEMENT: index %d out of range 1..%d" i n))
else
Sx_vm.push vm (List.nth es (i - 1))
| _, Integer _ ->
raise (Eval_error "erlang.OP_BIF_ELEMENT: 2nd arg not a tuple")
| _ ->
raise (Eval_error "erlang.OP_BIF_ELEMENT: 1st arg not an integer")))
in
let op_bif_lists_reverse =
(235, "erlang.OP_BIF_LISTS_REVERSE",
(fun (vm : Sx_vm.vm) _f ->
bump ();
let v = Sx_vm.pop vm in
let mk_nil () =
let h = Hashtbl.create 1 in
Hashtbl.replace h "tag" (String "nil"); Dict h in
let mk_cons hd tl =
let h = Hashtbl.create 3 in
Hashtbl.replace h "tag" (String "cons");
Hashtbl.replace h "head" hd;
Hashtbl.replace h "tail" tl;
Dict h in
let rec rev acc node =
match node with
| Dict d ->
(match er_tag d with
| "nil" -> acc
| "cons" ->
let hd = match Hashtbl.find_opt d "head" with
| Some x -> x
| None -> raise (Eval_error
"erlang.OP_BIF_LISTS_REVERSE: cons without :head") in
let tl = match Hashtbl.find_opt d "tail" with
| Some x -> x
| None -> raise (Eval_error
"erlang.OP_BIF_LISTS_REVERSE: cons without :tail") in
rev (mk_cons hd acc) tl
| _ -> raise (Eval_error
"erlang.OP_BIF_LISTS_REVERSE: not a proper list"))
| _ -> raise (Eval_error
"erlang.OP_BIF_LISTS_REVERSE: not a proper list")
in
Sx_vm.push vm (rev (mk_nil ()) v)))
in
[
op 222 "erlang.OP_PATTERN_TUPLE";
op 223 "erlang.OP_PATTERN_LIST";
op 224 "erlang.OP_PATTERN_BINARY";
op 225 "erlang.OP_PERFORM";
op 226 "erlang.OP_HANDLE";
op 227 "erlang.OP_RECEIVE_SCAN";
op 228 "erlang.OP_SPAWN";
op 229 "erlang.OP_SEND";
op_bif_length;
op_bif_hd;
op_bif_tl;
op_bif_element;
op_bif_tuple_size;
op_bif_lists_reverse;
op_bif_is_integer;
op_bif_is_atom;
op_bif_is_list;
op_bif_is_tuple;
]
end
(** Register [erlang] in [Sx_vm_extensions]. Idempotent only by failing
loudly — calling twice raises [Failure]. sx_server calls this once
at startup. *)
let register () = Sx_vm_extensions.register (module M : Sx_vm_extension.EXTENSION)
(** Read the dispatch counter from the live registry state. [None] if
[register] hasn't run. *)
let dispatch_count () =
match Sx_vm_extensions.state_of_extension "erlang" with
| Some (ErlangExtState s) -> Some s.dispatched
| _ -> None

View File

@@ -0,0 +1,67 @@
(** {1 [test_ext] — canonical example VM extension}
A minimal extension demonstrating the registration pattern from
[plans/sx-vm-opcode-extension.md]. The opcode IDs (220, 221) sit at
the top of the extension range, well clear of anything a real
language port would claim.
Two operand-less opcodes:
- [test_ext.OP_TEST_PUSH_42] (220) — pushes the integer 42.
- [test_ext.OP_TEST_DOUBLE_TOS] (221) — pops the integer on TOS,
pushes 2× it.
These are the smallest stack manipulations that prove the extension
mechanism wires through end-to-end (registry → dispatch → human-
readable disassembly). Real ports (Erlang Phase 9, future Haskell
perf phases) replace this template with their own opcode set.
Loading: [Test_ext.register ()] adds the extension to
[Sx_vm_extensions]. Run-time binaries that want the test opcodes
available call this once at startup. Unit tests in
[bin/run_tests.ml] do exactly that. *)
open Sx_types
(** Per-instance state for [test_ext]. Counts how many times the
handlers ran — purely so the extension has *some* state, exercising
the [extension_state] machinery. *)
type Sx_vm_extension.extension_state += TestExtState of {
mutable invocations : int;
}
module M : Sx_vm_extension.EXTENSION = struct
let name = "test_ext"
let init () = TestExtState { invocations = 0 }
let opcodes st =
let bump () = match st with
| TestExtState s -> s.invocations <- s.invocations + 1
| _ -> ()
in
[
(220, "test_ext.OP_TEST_PUSH_42",
(fun vm _frame -> bump (); Sx_vm.push vm (Integer 42)));
(221, "test_ext.OP_TEST_DOUBLE_TOS",
(fun vm _frame ->
bump ();
let v = Sx_vm.pop vm in
match v with
| Integer n -> Sx_vm.push vm (Integer (n * 2))
| _ -> raise (Eval_error
"test_ext.OP_TEST_DOUBLE_TOS: TOS is not an integer")));
]
end
(** Register [test_ext] in [Sx_vm_extensions]. Idempotent only by
failing loudly — calling twice raises [Failure]. Binaries call this
once at startup; tests may [_reset_for_tests] then re-register. *)
let register () = Sx_vm_extensions.register (module M : Sx_vm_extension.EXTENSION)
(** Read the invocation counter from the live registry state. Returns
[None] if [register] hasn't been called yet. *)
let invocation_count () =
match Sx_vm_extensions.state_of_extension "test_ext" with
| Some (TestExtState s) -> Some s.invocations
| _ -> None

142
hosts/ocaml/lib/sx_cbor.ml Normal file
View File

@@ -0,0 +1,142 @@
(** dag-cbor encode / decode — pure OCaml, WASM-safe.
RFC 8949 deterministic subset as constrained by IPLD dag-cbor
(RFC 8742): unsigned/negative ints, text strings, arrays, maps
with keys sorted by **length-then-bytewise**, bool, null, and
tag 42 (CID link, decode-side passthrough). Floats are not
supported (no fed-sx shape needs them yet) — encoding a [Number]
or decoding a float head raises. Reference: RFC 8949 §3, §4.2. *)
open Sx_types
exception Cbor_error of string
(* ---- Encoder ---- *)
let write_head buf major v =
let m = major lsl 5 in
if v < 24 then
Buffer.add_char buf (Char.chr (m lor v))
else if v < 0x100 then begin
Buffer.add_char buf (Char.chr (m lor 24));
Buffer.add_char buf (Char.chr v)
end else if v < 0x10000 then begin
Buffer.add_char buf (Char.chr (m lor 25));
Buffer.add_char buf (Char.chr ((v lsr 8) land 0xFF));
Buffer.add_char buf (Char.chr (v land 0xFF))
end else if v < 0x100000000 then begin
Buffer.add_char buf (Char.chr (m lor 26));
for i = 3 downto 0 do
Buffer.add_char buf (Char.chr ((v lsr (8 * i)) land 0xFF))
done
end else begin
Buffer.add_char buf (Char.chr (m lor 27));
for i = 7 downto 0 do
Buffer.add_char buf (Char.chr ((v lsr (8 * i)) land 0xFF))
done
end
(* dag-cbor map key order: shorter key first, then bytewise. *)
let key_order a b =
let la = String.length a and lb = String.length b in
if la <> lb then compare la lb else compare a b
let rec encode_into buf (v : value) : unit =
match v with
| Integer n ->
if n >= 0 then write_head buf 0 n
else write_head buf 1 (-1 - n)
| String s ->
write_head buf 3 (String.length s);
Buffer.add_string buf s
| Symbol s | Keyword s ->
write_head buf 3 (String.length s);
Buffer.add_string buf s
| Bool false -> Buffer.add_char buf '\xf4'
| Bool true -> Buffer.add_char buf '\xf5'
| Nil -> Buffer.add_char buf '\xf6'
| List items ->
write_head buf 4 (List.length items);
List.iter (encode_into buf) items
| Dict d ->
let keys = Hashtbl.fold (fun k _ acc -> k :: acc) d [] in
let keys = List.sort_uniq key_order keys in
write_head buf 5 (List.length keys);
List.iter (fun k ->
write_head buf 3 (String.length k);
Buffer.add_string buf k;
encode_into buf (Hashtbl.find d k)) keys
| Number _ ->
raise (Cbor_error "cbor-encode: floats unsupported (dag-cbor subset)")
| _ ->
raise (Cbor_error
("cbor-encode: unencodable value " ^ type_of v))
let encode (v : value) : string =
let buf = Buffer.create 64 in
encode_into buf v;
Buffer.contents buf
(* ---- Decoder ---- *)
let decode (s : string) : value =
let pos = ref 0 in
let len = String.length s in
let byte () =
if !pos >= len then raise (Cbor_error "cbor-decode: truncated");
let c = Char.code s.[!pos] in incr pos; c
in
let read_uint ai =
if ai < 24 then ai
else if ai = 24 then byte ()
else if ai = 25 then let a = byte () in let b = byte () in (a lsl 8) lor b
else if ai = 26 then begin
let v = ref 0 in
for _ = 0 to 3 do v := (!v lsl 8) lor byte () done; !v
end else if ai = 27 then begin
let v = ref 0 in
for _ = 0 to 7 do v := (!v lsl 8) lor byte () done; !v
end else raise (Cbor_error "cbor-decode: bad additional info")
in
let read_bytes n =
if !pos + n > len then raise (Cbor_error "cbor-decode: truncated");
let r = String.sub s !pos n in pos := !pos + n; r
in
let rec item () =
let b = byte () in
let major = b lsr 5 and ai = b land 0x1f in
match major with
| 0 -> Integer (read_uint ai)
| 1 -> Integer (-1 - read_uint ai)
| 2 -> String (read_bytes (read_uint ai))
| 3 -> String (read_bytes (read_uint ai))
| 4 ->
let n = read_uint ai in
List (List.init n (fun _ -> item ()))
| 5 ->
let n = read_uint ai in
let d = make_dict () in
for _ = 1 to n do
let k = match item () with
| String k -> k
| _ -> raise (Cbor_error "cbor-decode: non-string map key")
in
Hashtbl.replace d k (item ())
done;
Dict d
| 6 ->
(* Tag: tag-42 CID link → pass the inner item through. *)
ignore (read_uint ai); item ()
| 7 ->
(match ai with
| 20 -> Bool false
| 21 -> Bool true
| 22 -> Nil
| 23 -> Nil
| _ ->
raise (Cbor_error
"cbor-decode: floats/simple unsupported (dag-cbor subset)"))
| _ -> raise (Cbor_error "cbor-decode: bad major type")
in
let v = item () in
v

66
hosts/ocaml/lib/sx_cid.ml Normal file
View File

@@ -0,0 +1,66 @@
(** CIDv1 computation — pure OCaml, WASM-safe.
Multihash + CIDv1 + multibase base32-lower (RFC 4648, no pad,
multibase prefix 'b'). Codecs: dag-cbor 0x71, raw 0x55. Hash
codes: sha2-256 0x12, sha3-256 0x16. Reference: the multiformats
specs (unsigned-varint, multihash, cid, multibase). No deps. *)
open Sx_types
(* Unsigned LEB128 (multiformats unsigned-varint). *)
let varint (n : int) : string =
let buf = Buffer.create 4 in
let n = ref n in
let cont = ref true in
while !cont do
let b = !n land 0x7f in
n := !n lsr 7;
if !n = 0 then (Buffer.add_char buf (Char.chr b); cont := false)
else Buffer.add_char buf (Char.chr (b lor 0x80))
done;
Buffer.contents buf
(* RFC 4648 base32 lowercase, no padding. *)
let b32_alpha = "abcdefghijklmnopqrstuvwxyz234567"
let base32_lower (s : string) : string =
let buf = Buffer.create ((String.length s * 8 + 4) / 5) in
let acc = ref 0 and bits = ref 0 in
String.iter (fun c ->
acc := (!acc lsl 8) lor (Char.code c);
bits := !bits + 8;
while !bits >= 5 do
bits := !bits - 5;
Buffer.add_char buf b32_alpha.[(!acc lsr !bits) land 0x1f]
done) s;
if !bits > 0 then
Buffer.add_char buf b32_alpha.[(!acc lsl (5 - !bits)) land 0x1f];
Buffer.contents buf
(* "abef" -> the 2 raw bytes. *)
let unhex (h : string) : string =
let n = String.length h / 2 in
let b = Bytes.create n in
for i = 0 to n - 1 do
Bytes.set b i
(Char.chr (int_of_string ("0x" ^ String.sub h (2 * i) 2)))
done;
Bytes.unsafe_to_string b
(* multihash = varint(code) || varint(len) || digest *)
let multihash (code : int) (digest : string) : string =
varint code ^ varint (String.length digest) ^ digest
(* CIDv1 = 0x01 || varint(codec) || multihash ; multibase 'b' base32. *)
let cidv1 (codec : int) (mh : string) : string =
"b" ^ base32_lower ("\x01" ^ varint codec ^ mh)
let codec_dag_cbor = 0x71
let mh_sha2_256 = 0x12
(* Canonicalize an SX value: dag-cbor encode -> sha2-256 ->
multihash -> CIDv1 (dag-cbor codec). *)
let cid_from_sx (v : value) : string =
let cbor = Sx_cbor.encode v in
let digest = unhex (Sx_sha2.sha256_hex cbor) in
cidv1 codec_dag_cbor (multihash mh_sha2_256 digest)

View File

@@ -0,0 +1,289 @@
(** Ed25519 signature verification — pure OCaml, WASM-safe.
RFC 8032 §5.1.7 cofactorless verify over edwards25519. Includes a
minimal arbitrary-precision unsigned bignum (no Zarith / no deps)
and twisted-Edwards extended-coordinate point arithmetic. Verify
is total: malformed inputs return [false], never raise. SHA-512
is reused from {!Sx_sha2}. Reference: RFC 8032, RFC 7748. *)
(* ---- Minimal bignum: int array, little-endian, base 2^26. ---- *)
let bits = 26
let base = 1 lsl bits
let mask = base - 1
type bn = int array (* normalized: no high zero limbs, length >= 1 *)
let norm (a : bn) : bn =
let n = ref (Array.length a) in
while !n > 1 && a.(!n - 1) = 0 do decr n done;
if !n = Array.length a then a else Array.sub a 0 !n
let bzero : bn = [| 0 |]
let of_int n : bn =
if n = 0 then bzero
else begin
let r = ref [] and n = ref n in
while !n > 0 do r := (!n land mask) :: !r; n := !n lsr bits done;
norm (Array.of_list (List.rev !r))
end
let is_zero (a : bn) = Array.length a = 1 && a.(0) = 0
let cmp (a : bn) (b : bn) : int =
let a = norm a and b = norm b in
let la = Array.length a and lb = Array.length b in
if la <> lb then compare la lb
else begin
let r = ref 0 and i = ref (la - 1) in
while !r = 0 && !i >= 0 do
if a.(!i) <> b.(!i) then r := compare a.(!i) b.(!i);
decr i
done; !r
end
let add (a : bn) (b : bn) : bn =
let la = Array.length a and lb = Array.length b in
let n = (max la lb) + 1 in
let r = Array.make n 0 in
let carry = ref 0 in
for i = 0 to n - 1 do
let s = !carry
+ (if i < la then a.(i) else 0)
+ (if i < lb then b.(i) else 0) in
r.(i) <- s land mask; carry := s lsr bits
done;
norm r
(* a - b, requires a >= b *)
let sub (a : bn) (b : bn) : bn =
let la = Array.length a and lb = Array.length b in
let r = Array.make la 0 in
let borrow = ref 0 in
for i = 0 to la - 1 do
let s = a.(i) - !borrow - (if i < lb then b.(i) else 0) in
if s < 0 then (r.(i) <- s + base; borrow := 1)
else (r.(i) <- s; borrow := 0)
done;
norm r
let mul (a : bn) (b : bn) : bn =
let la = Array.length a and lb = Array.length b in
let r = Array.make (la + lb) 0 in
for i = 0 to la - 1 do
let carry = ref 0 in
for j = 0 to lb - 1 do
let s = r.(i + j) + a.(i) * b.(j) + !carry in
r.(i + j) <- s land mask; carry := s lsr bits
done;
r.(i + lb) <- r.(i + lb) + !carry
done;
norm r
let numbits (a : bn) : int =
let a = norm a in
let hi = Array.length a - 1 in
if hi = 0 && a.(0) = 0 then 0
else begin
let b = ref 0 and v = ref a.(hi) in
while !v > 0 do incr b; v := !v lsr 1 done;
hi * bits + !b
end
let bit (a : bn) (i : int) : int =
let limb = i / bits and off = i mod bits in
if limb >= Array.length a then 0 else (a.(limb) lsr off) land 1
(* r = a mod m (m > 0), binary long division. *)
let bn_mod (a : bn) (m : bn) : bn =
if cmp a m < 0 then norm a
else begin
let r = ref bzero in
for i = numbits a - 1 downto 0 do
(* r = r*2 + bit *)
r := add !r !r;
if bit a i = 1 then r := add !r [| 1 |];
if cmp !r m >= 0 then r := sub !r m
done;
!r
end
let div_small (a : bn) (d : int) : bn =
let la = Array.length a in
let q = Array.make la 0 in
let rem = ref 0 in
for i = la - 1 downto 0 do
let cur = (!rem lsl bits) lor a.(i) in
q.(i) <- cur / d; rem := cur mod d
done;
norm q
let powmod (b0 : bn) (e : bn) (m : bn) : bn =
let result = ref [| 1 |] and b = ref (bn_mod b0 m) in
let nb = numbits e in
for i = 0 to nb - 1 do
if bit e i = 1 then result := bn_mod (mul !result !b) m;
b := bn_mod (mul !b !b) m
done;
!result
let of_bytes_le (s : string) : bn =
let acc = ref bzero in
for i = String.length s - 1 downto 0 do
acc := add (mul !acc (of_int 256)) (of_int (Char.code s.[i]))
done;
!acc
let to_bytes_le (a : bn) (n : int) : string =
let b = Bytes.make n '\000' in
let cur = ref (norm a) in
for i = 0 to n - 1 do
let q = div_small !cur 256 in
let r =
let qm = mul q (of_int 256) in
let d = sub !cur qm in
if is_zero d then 0 else d.(0)
in
Bytes.set b i (Char.chr r);
cur := q
done;
Bytes.unsafe_to_string b
(* ---- Field GF(p), p = 2^255 - 19 ---- *)
let p =
let twop255 = Array.make 11 0 in (* 11*26 = 286 > 255 *)
let limb = 255 / bits and off = 255 mod bits in
twop255.(limb) <- 1 lsl off;
sub (norm twop255) (of_int 19)
let fmod a = bn_mod a p
let fadd a b = fmod (add a b)
let fsub a b = fmod (add a (sub p (fmod b)))
let fmul a b = fmod (mul a b)
let fpow a e = powmod a e p
let finv a = fpow a (sub p (of_int 2)) (* Fermat: a^(p-2) *)
(* group order L = 2^252 + 27742317777372353535851937790883648493 *)
let ell =
of_bytes_le
"\xed\xd3\xf5\x5c\x1a\x63\x12\x58\xd6\x9c\xf7\xa2\xde\xf9\xde\x14\
\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x10"
(* d = -121665 / 121666 mod p *)
let dconst =
let inv666 = finv (of_int 121666) in
fmod (mul (fsub (of_int 0) (of_int 121665)) inv666)
(* sqrt(-1) = 2^((p-1)/4) mod p *)
let sqrtm1 = fpow (of_int 2) (div_small (sub p (of_int 1)) 4)
(* ---- edwards25519 points in extended coords (X,Y,Z,T) ---- *)
type pt = { x : bn; y : bn; z : bn; t : bn }
let identity = { x = bzero; y = of_int 1; z = of_int 1; t = bzero }
(* add-2008-hwcd-3, complete for a = -1 on ed25519 *)
let padd (p1 : pt) (p2 : pt) : pt =
let a = fmul (fsub p1.y p1.x) (fsub p2.y p2.x) in
let b = fmul (fadd p1.y p1.x) (fadd p2.y p2.x) in
let c = fmul (fmul p1.t (fmul (of_int 2) dconst)) p2.t in
let dd = fmul (fmul p1.z (of_int 2)) p2.z in
let e = fsub b a in
let f = fsub dd c in
let g = fadd dd c in
let h = fadd b a in
{ x = fmul e f; y = fmul g h; t = fmul e h; z = fmul f g }
let scalar_mul (n : bn) (q : pt) : pt =
let r = ref identity in
for i = numbits n - 1 downto 0 do
r := padd !r !r;
if bit n i = 1 then r := padd !r q
done;
!r
let pnegate (q : pt) : pt =
{ q with x = fsub (of_int 0) q.x; t = fsub (of_int 0) q.t }
(* Decompress a 32-byte little-endian point encoding. *)
let decompress (s : string) : pt option =
if String.length s <> 32 then None
else begin
let sign = (Char.code s.[31] lsr 7) land 1 in
let s' = Bytes.of_string s in
Bytes.set s' 31 (Char.chr (Char.code s.[31] land 0x7f));
let y = of_bytes_le (Bytes.unsafe_to_string s') in
if cmp y p >= 0 then None
else begin
let y2 = fmul y y in
let u = fsub y2 (of_int 1) in
let v = fadd (fmul dconst y2) (of_int 1) in
(* x = u v^3 (u v^7)^((p-5)/8) *)
let v3 = fmul (fmul v v) v in
let v7 = fmul (fmul v3 v3) v in
let exp = div_small (sub p (of_int 5)) 8 in
let x0 = fmul (fmul u v3) (fpow (fmul u v7) exp) in
let vx2 = fmul v (fmul x0 x0) in
let x =
if cmp vx2 u = 0 then Some x0
else if cmp vx2 (fsub (of_int 0) u) = 0 then Some (fmul x0 sqrtm1)
else None
in
match x with
| None -> None
| Some x ->
if is_zero x && sign = 1 then None
else begin
let x = if (bit x 0) <> sign then fsub (of_int 0) x else x in
Some { x; y; z = of_int 1; t = fmul x y }
end
end
end
(* Encode a point to 32-byte little-endian (y with x-parity bit). *)
let encode (q : pt) : string =
let zi = finv q.z in
let x = fmul q.x zi and y = fmul q.y zi in
let b = Bytes.of_string (to_bytes_le y 32) in
let last = Char.code (Bytes.get b 31) lor ((bit x 0) lsl 7) in
Bytes.set b 31 (Char.chr last);
Bytes.unsafe_to_string b
(* base point: y = 4/5 mod p, x even (sign 0). *)
let base_point =
let by = fmul (of_int 4) (finv (of_int 5)) in
match decompress (to_bytes_le by 32) with
| Some pt -> pt
| None -> failwith "ed25519: base point decompress failed"
let unhex (h : string) : string =
let n = String.length h / 2 in
let b = Bytes.create n in
for i = 0 to n - 1 do
Bytes.set b i
(Char.chr (int_of_string ("0x" ^ String.sub h (2 * i) 2)))
done;
Bytes.unsafe_to_string b
let sha512_bytes s = unhex (Sx_sha2.sha512_hex s)
(* RFC 8032 §5.1.7 cofactorless: encode([S]B - [k]A) == R. *)
let verify ~pubkey ~msg ~sig_ : bool =
if String.length pubkey <> 32 || String.length sig_ <> 64 then false
else
let rb = String.sub sig_ 0 32 in
let sb = String.sub sig_ 32 32 in
let s = of_bytes_le sb in
if cmp s ell >= 0 then false
else
match decompress pubkey with
| None -> false
| Some a ->
let h = sha512_bytes (rb ^ pubkey ^ msg) in
let k = bn_mod (of_bytes_le h) ell in
let sb_pt = scalar_mul s base_point in
let ka = scalar_mul k a in
let chk = padd sb_pt (pnegate ka) in
(try encode chk = rb with _ -> false)

View File

@@ -3237,6 +3237,21 @@ let () =
with Sys_error msg -> raise (Eval_error ("file-read: " ^ msg)))
| _ -> raise (Eval_error "file-read: (path)"));
(* fed-sx Step 3 segment replay. Sorted names, no "."/".." ;
errors prefixed like file-read (msg carries enoent/enotdir). *)
register "file-list-dir" (fun args ->
match args with
| [String path] ->
(try
let names = Sys.readdir path in
let names =
Array.to_list names
|> List.filter (fun n -> n <> "." && n <> "..") in
let names = List.sort compare names in
List (List.map (fun n -> String n) names)
with Sys_error msg -> raise (Eval_error ("file-list-dir: " ^ msg)))
| _ -> raise (Eval_error "file-list-dir: (path)"));
register "file-write" (fun args ->
match args with
| [String path; String content] ->
@@ -4158,4 +4173,61 @@ let () =
Sx_types.jit_skipped_count := 0;
Sx_types.jit_threshold_skipped_count := 0;
Sx_types.jit_evicted_count := 0;
Nil)
Nil);
(* fed-sx host primitives — pure-OCaml crypto (WASM-safe). *)
register "crypto-sha256" (fun args ->
match args with
| [String s] -> String (Sx_sha2.sha256_hex s)
| _ -> raise (Eval_error "crypto-sha256: (bytes)"));
register "crypto-sha512" (fun args ->
match args with
| [String s] -> String (Sx_sha2.sha512_hex s)
| _ -> raise (Eval_error "crypto-sha512: (bytes)"));
register "crypto-sha3-256" (fun args ->
match args with
| [String s] -> String (Sx_sha3.sha3_256_hex s)
| _ -> raise (Eval_error "crypto-sha3-256: (bytes)"));
register "cbor-encode" (fun args ->
match args with
| [v] ->
(try String (Sx_cbor.encode v)
with Sx_cbor.Cbor_error m -> raise (Eval_error m))
| _ -> raise (Eval_error "cbor-encode: (value)"));
register "cbor-decode" (fun args ->
match args with
| [String s] ->
(try Sx_cbor.decode s
with Sx_cbor.Cbor_error m -> raise (Eval_error m))
| _ -> raise (Eval_error "cbor-decode: (bytes)"));
register "cid-from-bytes" (fun args ->
match args with
| [Integer codec; String mh] ->
String (Sx_cid.cidv1 codec mh)
| _ -> raise (Eval_error "cid-from-bytes: (codec multihash-bytes)"));
register "cid-from-sx" (fun args ->
match args with
| [v] ->
(try String (Sx_cid.cid_from_sx v)
with Sx_cbor.Cbor_error m -> raise (Eval_error m))
| _ -> raise (Eval_error "cid-from-sx: (value)"));
(* Verify is total: any malformed input -> false, never raises. *)
register "ed25519-verify" (fun args ->
match args with
| [String pk; String msg; String sg] ->
Bool (try Sx_ed25519.verify ~pubkey:pk ~msg ~sig_:sg
with _ -> false)
| _ -> Bool false);
register "rsa-sha256-verify" (fun args ->
match args with
| [String spki; String msg; String sg] ->
Bool (try Sx_rsa.verify ~spki ~msg ~sig_:sg with _ -> false)
| _ -> Bool false)

220
hosts/ocaml/lib/sx_rsa.ml Normal file
View File

@@ -0,0 +1,220 @@
(** RSASSA-PKCS1-v1_5 verification with SHA-256 — pure OCaml,
WASM-safe. Self-contained minimal bignum (modexp only), a tiny
DER reader for SubjectPublicKeyInfo, and the fixed SHA-256
DigestInfo prefix. Verify only on public data — constant time
not required. Reference: RFC 8017 §8.2.2, §9.2. No deps. *)
(* ---- Minimal unsigned bignum: int array, little-endian, base 2^26 ---- *)
let bits = 26
let base = 1 lsl bits
let mask = base - 1
type bn = int array
let norm a =
let n = ref (Array.length a) in
while !n > 1 && a.(!n - 1) = 0 do decr n done;
if !n = Array.length a then a else Array.sub a 0 !n
let bzero : bn = [| 0 |]
let is_zero a = Array.length a = 1 && a.(0) = 0
let cmp a b =
let a = norm a and b = norm b in
let la = Array.length a and lb = Array.length b in
if la <> lb then compare la lb
else begin
let r = ref 0 and i = ref (la - 1) in
while !r = 0 && !i >= 0 do
if a.(!i) <> b.(!i) then r := compare a.(!i) b.(!i);
decr i
done; !r
end
let add a b =
let la = Array.length a and lb = Array.length b in
let n = (max la lb) + 1 in
let r = Array.make n 0 and carry = ref 0 in
for i = 0 to n - 1 do
let s = !carry + (if i < la then a.(i) else 0)
+ (if i < lb then b.(i) else 0) in
r.(i) <- s land mask; carry := s lsr bits
done;
norm r
let sub a b = (* requires a >= b *)
let la = Array.length a and lb = Array.length b in
let r = Array.make la 0 and borrow = ref 0 in
for i = 0 to la - 1 do
let s = a.(i) - !borrow - (if i < lb then b.(i) else 0) in
if s < 0 then (r.(i) <- s + base; borrow := 1)
else (r.(i) <- s; borrow := 0)
done;
norm r
let mul a b =
let la = Array.length a and lb = Array.length b in
let r = Array.make (la + lb) 0 in
for i = 0 to la - 1 do
let carry = ref 0 in
for j = 0 to lb - 1 do
let s = r.(i + j) + a.(i) * b.(j) + !carry in
r.(i + j) <- s land mask; carry := s lsr bits
done;
r.(i + lb) <- r.(i + lb) + !carry
done;
norm r
let numbits a =
let a = norm a in
let hi = Array.length a - 1 in
if hi = 0 && a.(0) = 0 then 0
else begin
let b = ref 0 and v = ref a.(hi) in
while !v > 0 do incr b; v := !v lsr 1 done;
hi * bits + !b
end
let bit a i =
let limb = i / bits and off = i mod bits in
if limb >= Array.length a then 0 else (a.(limb) lsr off) land 1
let bn_mod a m = (* binary long division, m > 0 *)
if cmp a m < 0 then norm a
else begin
let r = ref bzero in
for i = numbits a - 1 downto 0 do
r := add !r !r;
if bit a i = 1 then r := add !r [| 1 |];
if cmp !r m >= 0 then r := sub !r m
done;
!r
end
let powmod b0 e m =
let result = ref [| 1 |] and b = ref (bn_mod b0 m) in
for i = 0 to numbits e - 1 do
if bit e i = 1 then result := bn_mod (mul !result !b) m;
b := bn_mod (mul !b !b) m
done;
!result
let of_bytes_be (s : string) : bn =
let acc = ref bzero in
for i = 0 to String.length s - 1 do
acc := add (mul !acc [| 256 |]) [| Char.code s.[i] |]
done;
!acc
let div_small a d =
let la = Array.length a in
let q = Array.make la 0 and rem = ref 0 in
for i = la - 1 downto 0 do
let cur = (!rem lsl bits) lor a.(i) in
q.(i) <- cur / d; rem := cur mod d
done;
norm q
let to_bytes_be (a : bn) (n : int) : string =
let b = Bytes.make n '\000' in
let cur = ref (norm a) in
for i = n - 1 downto 0 do
let q = div_small !cur 256 in
let r =
let d = sub !cur (mul q [| 256 |]) in
if is_zero d then 0 else d.(0)
in
Bytes.set b i (Char.chr r);
cur := q
done;
Bytes.unsafe_to_string b
(* ---- Minimal DER reader (for SubjectPublicKeyInfo) ---- *)
exception Der of string
(* Returns (tag, content_start, content_len, next). *)
let der_tlv s pos =
if pos + 2 > String.length s then raise (Der "short");
let tag = Char.code s.[pos] in
let l0 = Char.code s.[pos + 1] in
let len, hdr =
if l0 < 0x80 then l0, 2
else begin
let nb = l0 land 0x7f in
if pos + 2 + nb > String.length s then raise (Der "short len");
let v = ref 0 in
for i = 0 to nb - 1 do
v := (!v lsl 8) lor Char.code s.[pos + 2 + i]
done;
!v, 2 + nb
end
in
(tag, pos + hdr, len, pos + hdr + len)
(* SPKI DER -> (n, e) as bignums. *)
let parse_spki (der : string) : bn * bn =
let tag, c, _l, _ = der_tlv der 0 in
if tag <> 0x30 then raise (Der "spki: outer not SEQUENCE");
(* AlgorithmIdentifier SEQUENCE — skip. *)
let _, _, _, after_alg = der_tlv der c in
(* BIT STRING. *)
let bt, bc, bl, _ = der_tlv der after_alg in
if bt <> 0x03 then raise (Der "spki: expected BIT STRING");
(* First content byte = unused bits (must be 0). *)
let rpk_start = bc + 1 in
ignore bl;
let st, sc, _, _ = der_tlv der rpk_start in
if st <> 0x30 then raise (Der "spki: RSAPublicKey not SEQUENCE");
let nt, nc, nl, after_n = der_tlv der sc in
if nt <> 0x02 then raise (Der "spki: modulus not INTEGER");
let et, ec, el, _ = der_tlv der after_n in
if et <> 0x02 then raise (Der "spki: exponent not INTEGER");
let n = of_bytes_be (String.sub der nc nl) in
let e = of_bytes_be (String.sub der ec el) in
(n, e)
(* SHA-256 DigestInfo DER prefix (RFC 8017 §9.2 note 1). *)
let sha256_digestinfo_prefix =
"\x30\x31\x30\x0d\x06\x09\x60\x86\x48\x01\x65\x03\x04\x02\x01\x05\x00\x04\x20"
let unhex h =
let n = String.length h / 2 in
let b = Bytes.create n in
for i = 0 to n - 1 do
Bytes.set b i (Char.chr (int_of_string ("0x" ^ String.sub h (2 * i) 2)))
done;
Bytes.unsafe_to_string b
(* RSASSA-PKCS1-v1_5 verify with SHA-256. Total: any malformed
input yields false (caller wraps, but be defensive here too). *)
let verify ~spki ~msg ~sig_ : bool =
try
let n, e = parse_spki spki in
let k = (numbits n + 7) / 8 in
if String.length sig_ <> k then false
else begin
let s = of_bytes_be sig_ in
if cmp s n >= 0 then false
else begin
let m = powmod s e n in
let em = to_bytes_be m k in
(* EM = 0x00 01 FF..FF 00 || DigestInfo || H *)
let h = unhex (Sx_sha2.sha256_hex msg) in
let t = sha256_digestinfo_prefix ^ h in
let tlen = String.length t in
if k < tlen + 11 then false
else begin
let ok = ref (em.[0] = '\x00' && em.[1] = '\x01') in
let ps_end = k - tlen - 1 in
for i = 2 to ps_end - 1 do
if em.[i] <> '\xff' then ok := false
done;
if em.[ps_end] <> '\x00' then ok := false;
if String.sub em (ps_end + 1) tlen <> t then ok := false;
!ok
end
end
end
with _ -> false

212
hosts/ocaml/lib/sx_sha2.ml Normal file
View File

@@ -0,0 +1,212 @@
(** SHA-2 (SHA-256, SHA-512) — pure OCaml, WASM-safe.
No C stubs, no external deps. Used by the fed-sx host primitives
[crypto-sha256] / [crypto-sha512]. Reference: FIPS 180-4. *)
(* ---- SHA-256 (FIPS 180-4 §6.2). 32-bit words held in native int,
masked to 32 bits after every arithmetic op. ---- *)
let mask32 = 0xFFFFFFFF
let k256 = [|
0x428a2f98; 0x71374491; 0xb5c0fbcf; 0xe9b5dba5;
0x3956c25b; 0x59f111f1; 0x923f82a4; 0xab1c5ed5;
0xd807aa98; 0x12835b01; 0x243185be; 0x550c7dc3;
0x72be5d74; 0x80deb1fe; 0x9bdc06a7; 0xc19bf174;
0xe49b69c1; 0xefbe4786; 0x0fc19dc6; 0x240ca1cc;
0x2de92c6f; 0x4a7484aa; 0x5cb0a9dc; 0x76f988da;
0x983e5152; 0xa831c66d; 0xb00327c8; 0xbf597fc7;
0xc6e00bf3; 0xd5a79147; 0x06ca6351; 0x14292967;
0x27b70a85; 0x2e1b2138; 0x4d2c6dfc; 0x53380d13;
0x650a7354; 0x766a0abb; 0x81c2c92e; 0x92722c85;
0xa2bfe8a1; 0xa81a664b; 0xc24b8b70; 0xc76c51a3;
0xd192e819; 0xd6990624; 0xf40e3585; 0x106aa070;
0x19a4c116; 0x1e376c08; 0x2748774c; 0x34b0bcb5;
0x391c0cb3; 0x4ed8aa4a; 0x5b9cca4f; 0x682e6ff3;
0x748f82ee; 0x78a5636f; 0x84c87814; 0x8cc70208;
0x90befffa; 0xa4506ceb; 0xbef9a3f7; 0xc67178f2 |]
let rotr32 x n = ((x lsr n) lor (x lsl (32 - n))) land mask32
let sha256_hex (msg : string) : string =
let h = [| 0x6a09e667; 0xbb67ae85; 0x3c6ef372; 0xa54ff53a;
0x510e527f; 0x9b05688c; 0x1f83d9ab; 0x5be0cd19 |] in
let len = String.length msg in
(* Padded length: multiple of 64 bytes. *)
let bitlen = len * 8 in
let padlen =
let r = (len + 1) mod 64 in
if r <= 56 then 56 - r else 120 - r
in
let total = len + 1 + padlen + 8 in
let buf = Bytes.make total '\000' in
Bytes.blit_string msg 0 buf 0 len;
Bytes.set buf len '\x80';
(* 64-bit big-endian bit length (we cap at OCaml int range). *)
for i = 0 to 7 do
Bytes.set buf (total - 1 - i)
(Char.chr ((bitlen lsr (8 * i)) land 0xFF))
done;
let w = Array.make 64 0 in
let nblocks = total / 64 in
for b = 0 to nblocks - 1 do
let base = b * 64 in
for t = 0 to 15 do
let o = base + t * 4 in
w.(t) <-
(Char.code (Bytes.get buf o) lsl 24)
lor (Char.code (Bytes.get buf (o + 1)) lsl 16)
lor (Char.code (Bytes.get buf (o + 2)) lsl 8)
lor (Char.code (Bytes.get buf (o + 3)))
done;
for t = 16 to 63 do
let s0 =
(rotr32 w.(t - 15) 7) lxor (rotr32 w.(t - 15) 18)
lxor (w.(t - 15) lsr 3) in
let s1 =
(rotr32 w.(t - 2) 17) lxor (rotr32 w.(t - 2) 19)
lxor (w.(t - 2) lsr 10) in
w.(t) <- (w.(t - 16) + s0 + w.(t - 7) + s1) land mask32
done;
let a = ref h.(0) and bb = ref h.(1) and c = ref h.(2)
and d = ref h.(3) and e = ref h.(4) and f = ref h.(5)
and g = ref h.(6) and hh = ref h.(7) in
for t = 0 to 63 do
let s1 =
(rotr32 !e 6) lxor (rotr32 !e 11) lxor (rotr32 !e 25) in
let ch = (!e land !f) lxor ((lnot !e land mask32) land !g) in
let t1 = (!hh + s1 + ch + k256.(t) + w.(t)) land mask32 in
let s0 =
(rotr32 !a 2) lxor (rotr32 !a 13) lxor (rotr32 !a 22) in
let maj = (!a land !bb) lxor (!a land !c) lxor (!bb land !c) in
let t2 = (s0 + maj) land mask32 in
hh := !g; g := !f; f := !e;
e := (!d + t1) land mask32;
d := !c; c := !bb; bb := !a;
a := (t1 + t2) land mask32
done;
h.(0) <- (h.(0) + !a) land mask32;
h.(1) <- (h.(1) + !bb) land mask32;
h.(2) <- (h.(2) + !c) land mask32;
h.(3) <- (h.(3) + !d) land mask32;
h.(4) <- (h.(4) + !e) land mask32;
h.(5) <- (h.(5) + !f) land mask32;
h.(6) <- (h.(6) + !g) land mask32;
h.(7) <- (h.(7) + !hh) land mask32
done;
let out = Buffer.create 64 in
Array.iter (fun x -> Buffer.add_string out (Printf.sprintf "%08x" x)) h;
Buffer.contents out
(* ---- SHA-512 (FIPS 180-4 §6.4). 64-bit words via Int64.
128-bit length append; we only support messages whose bit length
fits in 64 bits (high word is always zero). ---- *)
let k512 = [|
0x428a2f98d728ae22L; 0x7137449123ef65cdL; 0xb5c0fbcfec4d3b2fL;
0xe9b5dba58189dbbcL; 0x3956c25bf348b538L; 0x59f111f1b605d019L;
0x923f82a4af194f9bL; 0xab1c5ed5da6d8118L; 0xd807aa98a3030242L;
0x12835b0145706fbeL; 0x243185be4ee4b28cL; 0x550c7dc3d5ffb4e2L;
0x72be5d74f27b896fL; 0x80deb1fe3b1696b1L; 0x9bdc06a725c71235L;
0xc19bf174cf692694L; 0xe49b69c19ef14ad2L; 0xefbe4786384f25e3L;
0x0fc19dc68b8cd5b5L; 0x240ca1cc77ac9c65L; 0x2de92c6f592b0275L;
0x4a7484aa6ea6e483L; 0x5cb0a9dcbd41fbd4L; 0x76f988da831153b5L;
0x983e5152ee66dfabL; 0xa831c66d2db43210L; 0xb00327c898fb213fL;
0xbf597fc7beef0ee4L; 0xc6e00bf33da88fc2L; 0xd5a79147930aa725L;
0x06ca6351e003826fL; 0x142929670a0e6e70L; 0x27b70a8546d22ffcL;
0x2e1b21385c26c926L; 0x4d2c6dfc5ac42aedL; 0x53380d139d95b3dfL;
0x650a73548baf63deL; 0x766a0abb3c77b2a8L; 0x81c2c92e47edaee6L;
0x92722c851482353bL; 0xa2bfe8a14cf10364L; 0xa81a664bbc423001L;
0xc24b8b70d0f89791L; 0xc76c51a30654be30L; 0xd192e819d6ef5218L;
0xd69906245565a910L; 0xf40e35855771202aL; 0x106aa07032bbd1b8L;
0x19a4c116b8d2d0c8L; 0x1e376c085141ab53L; 0x2748774cdf8eeb99L;
0x34b0bcb5e19b48a8L; 0x391c0cb3c5c95a63L; 0x4ed8aa4ae3418acbL;
0x5b9cca4f7763e373L; 0x682e6ff3d6b2b8a3L; 0x748f82ee5defb2fcL;
0x78a5636f43172f60L; 0x84c87814a1f0ab72L; 0x8cc702081a6439ecL;
0x90befffa23631e28L; 0xa4506cebde82bde9L; 0xbef9a3f7b2c67915L;
0xc67178f2e372532bL; 0xca273eceea26619cL; 0xd186b8c721c0c207L;
0xeada7dd6cde0eb1eL; 0xf57d4f7fee6ed178L; 0x06f067aa72176fbaL;
0x0a637dc5a2c898a6L; 0x113f9804bef90daeL; 0x1b710b35131c471bL;
0x28db77f523047d84L; 0x32caab7b40c72493L; 0x3c9ebe0a15c9bebcL;
0x431d67c49c100d4cL; 0x4cc5d4becb3e42b6L; 0x597f299cfc657e2aL;
0x5fcb6fab3ad6faecL; 0x6c44198c4a475817L |]
let ( &: ) = Int64.logand
let ( |: ) = Int64.logor
let ( ^: ) = Int64.logxor
let ( +: ) = Int64.add
let lnot64 = Int64.lognot
let rotr64 x n =
(Int64.shift_right_logical x n) |: (Int64.shift_left x (64 - n))
let sha512_hex (msg : string) : string =
let h = [| 0x6a09e667f3bcc908L; 0xbb67ae8584caa73bL;
0x3c6ef372fe94f82bL; 0xa54ff53a5f1d36f1L;
0x510e527fade682d1L; 0x9b05688c2b3e6c1fL;
0x1f83d9abfb41bd6bL; 0x5be0cd19137e2179L |] in
let len = String.length msg in
let bitlen = len * 8 in
(* Pad to a multiple of 128 bytes; 16-byte big-endian length. *)
let padlen =
let r = (len + 1) mod 128 in
if r <= 112 then 112 - r else 240 - r
in
let total = len + 1 + padlen + 16 in
let buf = Bytes.make total '\000' in
Bytes.blit_string msg 0 buf 0 len;
Bytes.set buf len '\x80';
for i = 0 to 7 do
Bytes.set buf (total - 1 - i)
(Char.chr ((bitlen lsr (8 * i)) land 0xFF))
done;
let w = Array.make 80 0L in
let nblocks = total / 128 in
for b = 0 to nblocks - 1 do
let base = b * 128 in
for t = 0 to 15 do
let o = base + t * 8 in
let v = ref 0L in
for j = 0 to 7 do
v := Int64.logor (Int64.shift_left !v 8)
(Int64.of_int (Char.code (Bytes.get buf (o + j))))
done;
w.(t) <- !v
done;
for t = 16 to 79 do
let s0 =
(rotr64 w.(t - 15) 1) ^: (rotr64 w.(t - 15) 8)
^: (Int64.shift_right_logical w.(t - 15) 7) in
let s1 =
(rotr64 w.(t - 2) 19) ^: (rotr64 w.(t - 2) 61)
^: (Int64.shift_right_logical w.(t - 2) 6) in
w.(t) <- w.(t - 16) +: s0 +: w.(t - 7) +: s1
done;
let a = ref h.(0) and bb = ref h.(1) and c = ref h.(2)
and d = ref h.(3) and e = ref h.(4) and f = ref h.(5)
and g = ref h.(6) and hh = ref h.(7) in
for t = 0 to 79 do
let s1 = (rotr64 !e 14) ^: (rotr64 !e 18) ^: (rotr64 !e 41) in
let ch = (!e &: !f) ^: ((lnot64 !e) &: !g) in
let t1 = !hh +: s1 +: ch +: k512.(t) +: w.(t) in
let s0 = (rotr64 !a 28) ^: (rotr64 !a 34) ^: (rotr64 !a 39) in
let maj = (!a &: !bb) ^: (!a &: !c) ^: (!bb &: !c) in
let t2 = s0 +: maj in
hh := !g; g := !f; f := !e;
e := !d +: t1;
d := !c; c := !bb; bb := !a;
a := t1 +: t2
done;
h.(0) <- h.(0) +: !a;
h.(1) <- h.(1) +: !bb;
h.(2) <- h.(2) +: !c;
h.(3) <- h.(3) +: !d;
h.(4) <- h.(4) +: !e;
h.(5) <- h.(5) +: !f;
h.(6) <- h.(6) +: !g;
h.(7) <- h.(7) +: !hh
done;
let out = Buffer.create 128 in
Array.iter
(fun x -> Buffer.add_string out (Printf.sprintf "%016Lx" x)) h;
Buffer.contents out

107
hosts/ocaml/lib/sx_sha3.ml Normal file
View File

@@ -0,0 +1,107 @@
(** SHA-3 (SHA3-256) — pure OCaml, WASM-safe.
Keccak-f[1600] permutation + SHA-3 multi-rate padding (domain byte
0x06, NOT the legacy Keccak 0x01). Reference: FIPS 202. No deps. *)
let ( ^: ) = Int64.logxor
let ( &: ) = Int64.logand
let lnot64 = Int64.lognot
let rotl64 x n =
if n = 0 then x
else
Int64.logor (Int64.shift_left x n) (Int64.shift_right_logical x (64 - n))
(* FIPS 202 Table 2 — ρ rotation offsets, indexed lane = x + 5*y. *)
let rho = [|
0; 1; 62; 28; 27;
36; 44; 6; 55; 20;
3; 10; 43; 25; 39;
41; 45; 15; 21; 8;
18; 2; 61; 56; 14 |]
(* FIPS 202 §3.2.5 — round constants RC[0..23] for ι. *)
let rc = [|
0x0000000000000001L; 0x0000000000008082L; 0x800000000000808aL;
0x8000000080008000L; 0x000000000000808bL; 0x0000000080000001L;
0x8000000080008081L; 0x8000000000008009L; 0x000000000000008aL;
0x0000000000000088L; 0x0000000080008009L; 0x000000008000000aL;
0x000000008000808bL; 0x800000000000008bL; 0x8000000000008089L;
0x8000000000008003L; 0x8000000000008002L; 0x8000000000000080L;
0x000000000000800aL; 0x800000008000000aL; 0x8000000080008081L;
0x8000000000008080L; 0x0000000080000001L; 0x8000000080008008L |]
let keccak_f (a : int64 array) : unit =
let c = Array.make 5 0L and d = Array.make 5 0L in
let b = Array.make 25 0L in
for round = 0 to 23 do
(* θ *)
for x = 0 to 4 do
c.(x) <- a.(x) ^: a.(x + 5) ^: a.(x + 10)
^: a.(x + 15) ^: a.(x + 20)
done;
for x = 0 to 4 do
d.(x) <- c.((x + 4) mod 5) ^: (rotl64 c.((x + 1) mod 5) 1)
done;
for x = 0 to 4 do
for y = 0 to 4 do
a.(x + 5 * y) <- a.(x + 5 * y) ^: d.(x)
done
done;
(* ρ and π: B[y, 2x+3y] = rotl(A[x,y], rho[x,y]) *)
for x = 0 to 4 do
for y = 0 to 4 do
let nx = y and ny = (2 * x + 3 * y) mod 5 in
b.(nx + 5 * ny) <- rotl64 a.(x + 5 * y) rho.(x + 5 * y)
done
done;
(* χ *)
for y = 0 to 4 do
for x = 0 to 4 do
a.(x + 5 * y) <-
b.(x + 5 * y)
^: ((lnot64 b.((x + 1) mod 5 + 5 * y))
&: b.((x + 2) mod 5 + 5 * y))
done
done;
(* ι *)
a.(0) <- a.(0) ^: rc.(round)
done
let sha3_256_hex (msg : string) : string =
let rate = 136 (* bytes: (1600 - 2*256) / 8 *) in
let len = String.length msg in
(* pad10*1 with SHA-3 domain byte 0x06; last byte ORed with 0x80. *)
let q = rate - (len mod rate) in
let padded = Bytes.make (len + q) '\000' in
Bytes.blit_string msg 0 padded 0 len;
if q = 1 then
Bytes.set padded len '\x86'
else begin
Bytes.set padded len '\x06';
Bytes.set padded (len + q - 1) '\x80'
end;
let total = Bytes.length padded in
let a = Array.make 25 0L in
let nblocks = total / rate in
for blk = 0 to nblocks - 1 do
let base = blk * rate in
(* Absorb: XOR rate bytes into the state, little-endian lanes. *)
for j = 0 to rate - 1 do
let lane = j / 8 and sh = (j mod 8) * 8 in
let byte = Int64.of_int (Char.code (Bytes.get padded (base + j))) in
a.(lane) <- a.(lane) ^: (Int64.shift_left byte sh)
done;
keccak_f a
done;
(* Squeeze 32 bytes (fits in the first 4 lanes; rate > 32). *)
let out = Buffer.create 64 in
for j = 0 to 31 do
let lane = j / 8 and sh = (j mod 8) * 8 in
let byte =
Int64.to_int
(Int64.logand (Int64.shift_right_logical a.(lane) sh) 0xFFL)
in
Buffer.add_string out (Printf.sprintf "%02x" byte)
done;
Buffer.contents out

View File

@@ -44,6 +44,11 @@ type vm = {
ip past OP_PERFORM, stack ready for a result push). *)
exception VmSuspended of value * vm
(** Raised by the extension dispatch fallthrough when an opcode in the
extension range (≥ 200) is encountered with no handler registered.
Carries the offending opcode id. See plans/sx-vm-opcode-extension.md. *)
exception Invalid_opcode of int
(* Register the VM suspension converter so sx_runtime.sx_apply_cek can
catch VmSuspended and convert it to CekPerformRequest without a
direct dependency on this module. *)
@@ -57,6 +62,21 @@ let () = Sx_types._convert_vm_suspension := (fun exn ->
let jit_compile_ref : (lambda -> (string, value) Hashtbl.t -> vm_closure option) ref =
ref (fun _ _ -> None)
(** Forward reference for extension opcode dispatch — Phase B installs the
real registry's dispatch function here at module init. Until then, any
opcode in the extension range raises [Invalid_opcode]. Same forward-ref
pattern as [jit_compile_ref] above; keeps [Sx_vm_extensions] free to
depend on [Sx_vm]'s [vm] / [frame] types without a cycle. *)
let extension_dispatch_ref : (int -> vm -> frame -> unit) ref =
ref (fun op _vm _frame -> raise (Invalid_opcode op))
(** Forward reference for extension opcode → name lookup, used by
[opcode_name] / [disassemble] for human-readable disassembly. The
registry installs a real lookup at module init; default returns
[None] (then [opcode_name] falls back to "UNKNOWN_n"). *)
let extension_opcode_name_ref : (int -> string option) ref =
ref (fun _ -> None)
(* JIT threshold and counters live in Sx_types so primitives can read them
without creating a sx_primitives → sx_vm dependency cycle. *)
@@ -875,6 +895,15 @@ and run vm =
let request = pop vm in
raise (VmSuspended (request, vm))
(* ---- Extension dispatch fallthrough ----
Opcode partition (see plans/sx-vm-opcode-extension.md):
0 reserved / NOP
1-199 core opcodes (current ceiling 175 = OP_DEC)
200-247 extension opcodes (registered via Sx_vm_extensions)
248-255 reserved for future expansion / multi-byte
Any opcode ≥ 200 routes through the extension registry. *)
| op when op >= 200 -> !extension_dispatch_ref op vm frame
| opcode ->
raise (Eval_error (Printf.sprintf "VM: unknown opcode %d at ip=%d"
opcode (frame.ip - 1)))
@@ -1027,6 +1056,62 @@ let _jit_is_broken_name n =
|| n = "hs-repeat-while" || n = "hs-repeat-until"
|| n = "hs-for-each" || n = "hs-put!"
(** Scan bytecode for any extension opcode (≥ 200, the registry's
[Sx_vm_extensions.extension_min]). Walks operand bytes correctly
so values that happen to be ≥200 (e.g. a CONST u16 index pointing
into a large pool) do not trigger false positives. CLOSURE's
dynamic upvalue descriptors are read from the constant pool entry
at the same index it pushes.
Used by [jit_compile_lambda] (Phase E of the opcode-extension
plan): a lambda whose compiled body contains any extension opcode
is routed through interpretation rather than JIT. Extensions
interpret their opcodes via the registry; the JIT does not
currently know how to compile them.
Operand-size logic mirrors [opcode_operand_size] (which is defined
later, in the disassembly section); inlined here so this helper can
sit before [jit_compile_lambda] in the file. *)
let bytecode_uses_extension_opcodes (bc : int array) (consts : value array) =
let core_operand_size = function
| 1 | 20 | 21 | 64 | 65 | 128 -> 2 (* u16 *)
| 16 | 17 | 18 | 19 | 48 | 49 | 144 -> 1 (* u8 *)
| 32 | 33 | 34 | 35 -> 2 (* i16 *)
| 52 -> 3 (* CALL_PRIM: u16 + u8 *)
| _ -> 0
in
let len = Array.length bc in
let ip = ref 0 in
let found = ref false in
while not !found && !ip < len do
let op = bc.(!ip) in
if op >= 200 then found := true
else begin
ip := !ip + 1;
let extra = match op with
| 51 (* CLOSURE *) when !ip + 1 < len ->
let lo = bc.(!ip) in
let hi = bc.(!ip + 1) in
let idx = lo lor (hi lsl 8) in
let uv_count =
if idx < Array.length consts then
(match consts.(idx) with
| Dict d ->
(match Hashtbl.find_opt d "upvalue-count" with
| Some (Integer n) -> n
| Some (Number n) -> int_of_float n
| _ -> 0)
| _ -> 0)
else 0
in
2 + uv_count * 2
| _ -> core_operand_size op
in
ip := !ip + extra
end
done;
!found
let jit_compile_lambda (l : lambda) globals =
let fn_name = match l.l_name with Some n -> n | None -> "<anon>" in
if !_jit_compiling then (
@@ -1089,8 +1174,18 @@ let jit_compile_lambda (l : lambda) globals =
if idx < Array.length outer_code.vc_constants then
let inner_val = outer_code.vc_constants.(idx) in
let code = code_from_value inner_val in
Some { vm_code = code; vm_upvalues = [||];
vm_name = l.l_name; vm_env_ref = effective_globals; vm_closure_env = Some l.l_closure }
(* Phase E: if the inner lambda's bytecode contains any
extension opcode (≥200), skip JIT and let the lambda run
interpreted via CEK. Extension opcodes dispatch correctly
through the VM's registry fallthrough, but the JIT has no
knowledge of them and shouldn't claim ownership. *)
if bytecode_uses_extension_opcodes code.vc_bytecode code.vc_constants then begin
Printf.eprintf "[jit] SKIP %s: bytecode uses extension opcodes (interpret-only in v1)\n%!"
fn_name;
None
end else
Some { vm_code = code; vm_upvalues = [||];
vm_name = l.l_name; vm_env_ref = effective_globals; vm_closure_env = Some l.l_closure }
else begin
Printf.eprintf "[jit] FAIL %s: closure index %d out of bounds (pool=%d)\n%!"
fn_name idx (Array.length outer_code.vc_constants);
@@ -1200,7 +1295,12 @@ let opcode_name = function
| 164 -> "EQ" | 165 -> "LT" | 166 -> "GT" | 167 -> "NOT"
| 168 -> "LEN" | 169 -> "FIRST" | 170 -> "REST" | 171 -> "NTH"
| 172 -> "CONS" | 173 -> "NEG" | 174 -> "INC" | 175 -> "DEC"
| n -> Printf.sprintf "UNKNOWN_%d" n
| n ->
(* Extension opcodes (≥200) get their human-readable name from the
registry; defaults to UNKNOWN_n if the extension isn't loaded. *)
(match !extension_opcode_name_ref n with
| Some name -> name
| None -> Printf.sprintf "UNKNOWN_%d" n)
(** Number of extra operand bytes consumed by each opcode.
Returns (format, total_bytes) where format describes the operand types. *)

View File

@@ -0,0 +1,48 @@
(** {1 VM extension interface}
Type definitions for VM bytecode extensions. See
[plans/sx-vm-opcode-extension.md].
An extension is a first-class module of type [EXTENSION]: it has a
stable [name], an [init] that returns its private state, and an
[opcodes] function that lists the opcodes it provides.
Opcode handlers receive the live [vm] and the active [frame]. They
read operands via [Sx_vm.read_u8] / [read_u16], manipulate the stack
via [push] / [pop] / [peek], and update the frame's [ip] as needed. *)
(** A handler for an extension opcode. Reads operands from bytecode,
manipulates the VM stack, updates the frame's instruction pointer.
May raise exceptions (which propagate via the existing VM error path). *)
type handler = Sx_vm.vm -> Sx_vm.frame -> unit
(** State an extension carries alongside the VM. Opaque to the VM core;
extensions extend this with their own constructor and cast as needed.
Extensible variant — extensions add cases:
{[
type Sx_vm_extension.extension_state +=
| ErlangState of erlang_scheduler
]} *)
type extension_state = ..
(** An extension is a first-class module of this signature. *)
module type EXTENSION = sig
(** Stable name for this extension (e.g. ["erlang"], ["guest_vm"]).
Used as the lookup key in the registry and as the prefix for opcode
names ([erlang.OP_PATTERN_TUPLE_2] etc). *)
val name : string
(** Initialize per-instance state. Called once when [register] is
invoked on this extension. *)
val init : unit -> extension_state
(** Opcodes this extension provides. Each is
[(opcode_id, opcode_name, handler)].
[opcode_id] must be in the range 200-247 (the extension partition;
see the partition comment at the top of [Sx_vm]'s dispatch loop).
Conflicts with already-registered opcodes cause [register] to
fail. *)
val opcodes : extension_state -> (int * string * handler) list
end

View File

@@ -0,0 +1,120 @@
(** {1 VM extension registry}
Holds the live registry of extension opcodes and installs the
[dispatch] function into [Sx_vm.extension_dispatch_ref] at module
init time, replacing Phase A's stub.
See [plans/sx-vm-opcode-extension.md] and [Sx_vm_extension] for the
extension interface. *)
open Sx_vm_extension
(** The opcode range an extension is allowed to claim.
Mirrors the partition comment in [Sx_vm]. *)
let extension_min = 200
let extension_max = 247
(** opcode_id → handler *)
let by_id : (int, handler) Hashtbl.t = Hashtbl.create 64
(** opcode_name → opcode_id *)
let by_name : (string, int) Hashtbl.t = Hashtbl.create 64
(** opcode_id → opcode_name (reverse of [by_name]; used by
[Sx_vm.opcode_name] for disassembly). *)
let name_of_id_table : (int, string) Hashtbl.t = Hashtbl.create 64
(** extension_name → state *)
let states : (string, extension_state) Hashtbl.t = Hashtbl.create 8
(** Registered extension names, newest first. *)
let extensions : string list ref = ref []
(** Dispatch an extension opcode to its registered handler. Raises
[Sx_vm.Invalid_opcode] if no handler is registered for [op]. *)
let dispatch op vm frame =
match Hashtbl.find_opt by_id op with
| Some handler -> handler vm frame
| None -> raise (Sx_vm.Invalid_opcode op)
(** Register an extension. Fails if the extension name is already
registered, or if any opcode_id is outside the extension range or
collides with an already-registered opcode. *)
let register (m : (module EXTENSION)) =
let module M = (val m) in
if Hashtbl.mem states M.name then
failwith (Printf.sprintf
"Sx_vm_extensions: extension %S already registered" M.name);
let st = M.init () in
let ops = M.opcodes st in
List.iter (fun (id, opname, _h) ->
if id < extension_min || id > extension_max then
failwith (Printf.sprintf
"Sx_vm_extensions: opcode %d (%s) outside extension range %d-%d"
id opname extension_min extension_max);
if Hashtbl.mem by_id id then
failwith (Printf.sprintf
"Sx_vm_extensions: opcode %d (%s) already registered" id opname);
if Hashtbl.mem by_name opname then
failwith (Printf.sprintf
"Sx_vm_extensions: opcode name %S already registered" opname)
) ops;
Hashtbl.add states M.name st;
List.iter (fun (id, opname, h) ->
Hashtbl.add by_id id h;
Hashtbl.add by_name opname id;
Hashtbl.add name_of_id_table id opname
) ops;
extensions := M.name :: !extensions
(** Look up the opcode_id for an opcode_name. Returns [None] if no
extension provides that opcode. *)
let id_of_name name = Hashtbl.find_opt by_name name
(** Look up the opcode_name for an opcode_id. Returns [None] if no
extension provides that opcode. Used by disassembly. *)
let name_of_id id = Hashtbl.find_opt name_of_id_table id
(** Look up the state of an extension by name. Returns [None] if the
extension is not registered. *)
let state_of_extension name = Hashtbl.find_opt states name
(** Names of all registered extensions, newest first. *)
let registered_extensions () = !extensions
(** Test-only: clear the registry. Used by unit tests to isolate
extensions between test cases. The dispatch_ref is left in place. *)
let _reset_for_tests () =
Hashtbl.clear by_id;
Hashtbl.clear by_name;
Hashtbl.clear name_of_id_table;
Hashtbl.clear states;
extensions := []
(** Install our [dispatch] into [Sx_vm.extension_dispatch_ref] and our
[name_of_id] into [Sx_vm.extension_opcode_name_ref], replacing
the Phase A stubs. Idempotent. Called automatically at module init. *)
let install_dispatch () =
Sx_vm.extension_dispatch_ref := dispatch;
Sx_vm.extension_opcode_name_ref := name_of_id
let () = install_dispatch ()
(** Compiler-side opcode lookup: register the [extension-opcode-id]
primitive. Compilers ([lib/compiler.sx]) call this to emit
extension opcodes by name. Returns [Integer id] when registered,
[Nil] otherwise — so missing extensions degrade to a fallback
rather than failure. *)
let () =
Sx_primitives.register "extension-opcode-id" (fun args ->
match args with
| [Sx_types.String name] ->
(match id_of_name name with
| Some id -> Sx_types.Integer id
| None -> Sx_types.Nil)
| [Sx_types.Symbol name] ->
(match id_of_name name with
| Some id -> Sx_types.Integer id
| None -> Sx_types.Nil)
| _ -> raise (Sx_types.Eval_error
"extension-opcode-id: expected one string or symbol"))

View File

@@ -16,5 +16,5 @@
{"name":"magic","passed":37,"failed":0,"total":37},
{"name":"demo","passed":21,"failed":0,"total":21}
],
"generated": "2026-05-11T09:40:12+00:00"
"generated": "2026-05-14T20:30:05+00:00"
}

View File

@@ -33,3 +33,54 @@ least: persistent (path-copying) envs, an inline scheduler that
doesn't call/cc on the common path (msg-already-in-mailbox), and a
linked-list mailbox. None of those are in scope for the Phase 3
checkbox — captured here as the floor we're starting from.
## Phase 9 status (2026-05-14)
Specialized opcodes 9b9f landed as **stub dispatchers** in
`lib/erlang/vm/dispatcher.sx`: `OP_PATTERN_TUPLE/LIST/BINARY`,
`OP_PERFORM/HANDLE`, `OP_RECEIVE_SCAN`, `OP_SPAWN/SEND`, and ten
`OP_BIF_*` hot dispatch entries. Each opcode's handler is a thin
wrapper over the existing `er-match-*` / `er-bif-*` / runtime impls,
so **the perf numbers above are unchanged** — same per-hop cost, same
scheduler. The stubs exist to nail down opcode IDs, operand contracts,
and tests against `er-match!` parity *before* 9a (the OCaml
opcode-extension mechanism in `hosts/ocaml/evaluator/`) lands.
When 9a integrates and the bytecode compiler can emit these opcodes
at hot call sites, the real speedup story (~3000× ring throughput,
~1000× spawn) starts. Until then this file documents the
pre-integration ceiling. 72 vm-suite tests guard the stub correctness;
full conformance is **709/709** with the stub infrastructure loaded.
## Phase 9g — post-integration bench (2026-05-15)
9a (vm-ext mechanism), 9h (`erlang_ext.ml` registering `erlang.OP_*`
ids 222-239), and 9i (SX dispatcher consulting `extension-opcode-id`)
are now integrated and built into `hosts/ocaml/_build/default/bin/sx_server.exe`.
Re-ran the ring ladder on that binary:
| N (processes) | Hops | Wall-clock | Throughput |
|---|---|---|---|
| 10 | 10 | 938ms | 11 hops/s |
| 100 | 100 | 2772ms | 36 hops/s |
| 500 | 500 | 14190ms | 35 hops/s |
| 1000 | 1000 | 31814ms | 31 hops/s |
**Numbers are unchanged from the pre-integration baseline** — and that
is the expected, correct result. The opcode handlers (both the SX stub
dispatcher and the OCaml `erlang_ext` module) wrap the existing
`er-match-*` / `er-bif-*` / scheduler implementations 1-to-1, and the
**bytecode compiler does not yet emit `erlang.OP_*` opcodes**, so every
hop still goes through the general CEK path exactly as before. The
unchanged numbers therefore double as a no-regression check: the full
extension wiring (cherry-picked vm-ext A-E + force-link + erlang_ext +
SX bridge) added zero per-hop cost. Conformance **715/715** on this
binary.
The ~3000×/~1000× targets remain gated on a **future phase (Phase 10 —
bytecode emission)**: teach `lib/compiler.sx` (or the Erlang
transpiler) to emit `erlang.OP_PATTERN_TUPLE` etc. at hot call sites,
then give `erlang_ext.ml` real register-machine handlers instead of the
current honest not-wired raise. That is a substantial standalone phase,
tracked in `plans/erlang-on-sx.md`. 9g's deliverable — *honest
measurement + recorded numbers on the integrated binary* — is complete.

View File

@@ -36,6 +36,8 @@ SUITES=(
"bank|er-bank-test-pass|er-bank-test-count"
"echo|er-echo-test-pass|er-echo-test-count"
"fib|er-fib-test-pass|er-fib-test-count"
"ffi|er-ffi-test-pass|er-ffi-test-count"
"vm|er-vm-test-pass|er-vm-test-count"
)
cat > "$TMPFILE" << 'EPOCHS'
@@ -56,6 +58,9 @@ cat > "$TMPFILE" << 'EPOCHS'
(load "lib/erlang/tests/programs/bank.sx")
(load "lib/erlang/tests/programs/echo.sx")
(load "lib/erlang/tests/programs/fib_server.sx")
(load "lib/erlang/vm/dispatcher.sx")
(load "lib/erlang/tests/ffi.sx")
(load "lib/erlang/tests/vm.sx")
(epoch 100)
(eval "(list er-test-pass er-test-count)")
(epoch 101)
@@ -74,6 +79,10 @@ cat > "$TMPFILE" << 'EPOCHS'
(eval "(list er-echo-test-pass er-echo-test-count)")
(epoch 108)
(eval "(list er-fib-test-pass er-fib-test-count)")
(epoch 109)
(eval "(list er-ffi-test-pass er-ffi-test-count)")
(epoch 110)
(eval "(list er-vm-test-pass er-vm-test-count)")
EPOCHS
timeout 600 "$SX_SERVER" < "$TMPFILE" > "$OUTFILE" 2>&1

View File

@@ -853,6 +853,112 @@
(define er-modules-get (fn () (nth er-modules 0)))
(define er-modules-reset! (fn () (set-nth! er-modules 0 {})))
(define er-mk-module-slot
(fn (mod-env old-env version)
{:current mod-env :old old-env :version version :tag "module"}))
(define er-module-current-env (fn (slot) (get slot :current)))
(define er-module-old-env (fn (slot) (get slot :old)))
(define er-module-version (fn (slot) (get slot :version)))
;; ── FFI BIF registry (Phase 8) ───────────────────────────────────
;; Global dict from "Module/Name/Arity" key to {:module :name :arity :fn :pure?}.
;; Replaces the giant cond chain in transpile.sx#er-apply-remote-bif over time —
;; Phase 8 BIFs (crypto / cid / file / httpc / sqlite) all register here.
(define er-bif-registry (list {}))
(define er-bif-registry-get (fn () (nth er-bif-registry 0)))
(define er-bif-registry-reset! (fn () (set-nth! er-bif-registry 0 {})))
(define er-bif-key
(fn (module name arity)
(str module "/" name "/" arity)))
(define er-register-bif!
(fn (module name arity sx-fn)
(dict-set! (er-bif-registry-get) (er-bif-key module name arity)
{:module module :name name :arity arity :fn sx-fn :pure? false})
(er-mk-atom "ok")))
(define er-register-pure-bif!
(fn (module name arity sx-fn)
(dict-set! (er-bif-registry-get) (er-bif-key module name arity)
{:module module :name name :arity arity :fn sx-fn :pure? true})
(er-mk-atom "ok")))
(define er-lookup-bif
(fn (module name arity)
(let ((reg (er-bif-registry-get)) (k (er-bif-key module name arity)))
(if (dict-has? reg k) (get reg k) nil))))
(define er-list-bifs
(fn () (keys (er-bif-registry-get))))
;; ── term marshalling (Phase 8) ───────────────────────────────────
;; Bridge Erlang term values (tagged dicts) and SX-native values for
;; FFI BIFs to call out into platform primitives. Conversions:
;;
;; Erlang SX-native
;; ───────────────────────── ────────────────
;; atom {:tag "atom" :name S} ↔ symbol (make-symbol S)
;; nil {:tag "nil"} ↔ '()
;; cons {:tag "cons" :head :tail} → list of marshalled elements
;; tuple {:tag "tuple" :elements} → list of marshalled elements
;; binary {:tag "binary" :bytes} ↔ SX string
;; integer / float / boolean ↔ passthrough
;; SX string on the way back → binary
;;
;; Pids, refs, funs pass through unchanged — they have no SX-native
;; equivalent and are opaque to FFI primitives.
(define er-cons-to-sx-list
(fn (v)
(cond
(er-nil? v) (list)
(er-cons? v)
(let ((tail (er-cons-to-sx-list (get v :tail)))
(head (er-to-sx (get v :head))))
(let ((out (list head)))
(for-each
(fn (i) (append! out (nth tail i)))
(range 0 (len tail)))
out))
:else (list v))))
(define er-to-sx
(fn (v)
(cond
(er-atom? v) (make-symbol (get v :name))
(er-nil? v) (list)
(er-cons? v) (er-cons-to-sx-list v)
(er-tuple? v)
(let ((out (list)) (es (get v :elements)))
(for-each
(fn (i) (append! out (er-to-sx (nth es i))))
(range 0 (len es)))
out)
(er-binary? v) (list->string (map integer->char (get v :bytes)))
:else v)))
(define er-of-sx
(fn (v)
(let ((ty (type-of v)))
(cond
(= ty "symbol") (er-mk-atom (str v))
(= ty "string") (er-mk-binary (map char->integer (string->list v)))
(= ty "list")
(let ((out (er-mk-nil)))
(for-each
(fn (i)
(set! out
(er-mk-cons (er-of-sx (nth v (- (- (len v) 1) i))) out)))
(range 0 (len v)))
out)
(= ty "nil") (er-mk-nil)
:else v))))
;; Load an Erlang module declaration. Source must start with
;; `-module(Name).` and contain function definitions. Functions
;; sharing a name (different arities) get their clauses concatenated
@@ -897,7 +1003,15 @@
((all-clauses (get by-name k)))
(er-env-bind! mod-env k (er-mk-fun all-clauses mod-env))))
(keys by-name))
(dict-set! (er-modules-get) mod-name mod-env)
(let ((registry (er-modules-get)))
(if (dict-has? registry mod-name)
(let ((existing-slot (get registry mod-name)))
(dict-set! registry mod-name
(er-mk-module-slot mod-env
(er-module-current-env existing-slot)
(+ (er-module-version existing-slot) 1))))
(dict-set! registry mod-name
(er-mk-module-slot mod-env nil 1))))
(er-mk-atom mod-name)))))
(define
@@ -905,7 +1019,7 @@
(fn
(mod name vs)
(let
((mod-env (get (er-modules-get) mod)))
((mod-env (er-module-current-env (get (er-modules-get) mod))))
(if
(not (dict-has? mod-env name))
(raise
@@ -1189,16 +1303,266 @@
:else (er-mk-atom "undefined")))
:else (error "Erlang: ets:info: arity"))))
(define
er-apply-ets-bif
(fn
(name vs)
(cond
(= name "new") (er-bif-ets-new vs)
(= name "insert") (er-bif-ets-insert vs)
(= name "lookup") (er-bif-ets-lookup vs)
(= name "delete") (er-bif-ets-delete vs)
(= name "tab2list") (er-bif-ets-tab2list vs)
(= name "info") (er-bif-ets-info vs)
:else (error
(str "Erlang: undefined 'ets:" name "/" (len vs) "'")))))
;; ── file module (Phase 8 FFI) ────────────────────────────────────
;; Synchronous file IO. Filenames must be SX strings (or Erlang
;; binaries/char-code lists coercible to strings via er-source-to-string).
;; Returns `{ok, Binary}` / `ok` on success, `{error, Reason}` on failure
;; where Reason is one of `enoent`, `eacces`, `enotdir`, `posix_error`.
(define er-classify-file-error
(fn (msg)
(let ((s (str msg)))
(cond
(string-contains? s "No such") (er-mk-atom "enoent")
(string-contains? s "Permission denied") (er-mk-atom "eacces")
(string-contains? s "Not a directory") (er-mk-atom "enotdir")
(string-contains? s "Is a directory") (er-mk-atom "eisdir")
:else (er-mk-atom "posix_error")))))
(define er-bif-file-read-file
(fn (vs)
(let ((path (er-source-to-string (nth vs 0))))
(cond
(= path nil)
(er-mk-tuple (list (er-mk-atom "error") (er-mk-atom "badarg")))
:else
(let ((res (list nil)) (err (list nil)))
(guard (c (:else (set-nth! err 0 c)))
(set-nth! res 0 (file-read path)))
(cond
(not (= (nth err 0) nil))
(er-mk-tuple (list (er-mk-atom "error")
(er-classify-file-error (nth err 0))))
:else
(er-mk-tuple (list (er-mk-atom "ok")
(er-mk-binary (map char->integer (string->list (nth res 0))))))))))))
(define er-bif-file-write-file
(fn (vs)
(let ((path (er-source-to-string (nth vs 0)))
(data (er-source-to-string (nth vs 1))))
(cond
(or (= path nil) (= data nil))
(er-mk-tuple (list (er-mk-atom "error") (er-mk-atom "badarg")))
:else
(let ((err (list nil)))
(guard (c (:else (set-nth! err 0 c)))
(file-write path data))
(cond
(not (= (nth err 0) nil))
(er-mk-tuple (list (er-mk-atom "error")
(er-classify-file-error (nth err 0))))
:else (er-mk-atom "ok")))))))
(define er-bif-file-delete
(fn (vs)
(let ((path (er-source-to-string (nth vs 0))))
(cond
(= path nil)
(er-mk-tuple (list (er-mk-atom "error") (er-mk-atom "badarg")))
:else
(let ((err (list nil)))
(guard (c (:else (set-nth! err 0 c)))
(file-delete path))
(cond
(not (= (nth err 0) nil))
(er-mk-tuple (list (er-mk-atom "error")
(er-classify-file-error (nth err 0))))
:else (er-mk-atom "ok")))))))
;; ── crypto / cid / file:list_dir (Phase 8 FFI — host primitives) ──
;; Wired against loops/fed-prims host primitives (see plans Blockers
;; "RESOLVED 2026-05-18"). Term marshalling at the boundary:
;; Erlang binary/string/charlist -> SX byte-string via er-source-to-string;
;; results -> Erlang binary via er-mk-binary.
(define er-hexval
(fn (c)
(let ((v (char->integer c)))
(cond
(and (>= v 48) (<= v 57)) (- v 48) ;; 0-9
(and (>= v 97) (<= v 102)) (+ 10 (- v 97)) ;; a-f
(and (>= v 65) (<= v 70)) (+ 10 (- v 65)) ;; A-F
:else 0))))
(define er-hex->bytes
(fn (hex)
(let ((cs (string->list hex)) (out (list)) (n (string-length hex)))
(for-each
(fn (i)
(append! out
(+ (* 16 (er-hexval (nth cs (* i 2))))
(er-hexval (nth cs (+ (* i 2) 1))))))
(range 0 (truncate (/ n 2))))
out)))
;; crypto:hash(Type, Data) -> raw digest binary. Type is an Erlang
;; atom (sha256 | sha512 | sha3_256). Bad type / non-binary -> badarg.
(define er-bif-crypto-hash
(fn (vs)
(let ((ty (nth vs 0)) (data (er-source-to-string (nth vs 1))))
(cond
(or (not (er-atom? ty)) (= data nil))
(raise (er-mk-error-marker (er-mk-atom "badarg")))
:else
(let ((name (get ty :name)))
(let ((hex (cond
(= name "sha256") (crypto-sha256 data)
(= name "sha512") (crypto-sha512 data)
(= name "sha3_256") (crypto-sha3-256 data)
:else nil)))
(cond
(= hex nil) (raise (er-mk-error-marker (er-mk-atom "badarg")))
:else (er-mk-binary (er-hex->bytes hex)))))))))
;; cid:from_bytes(Bin) -> CIDv1 (raw codec 0x55, sha2-256 multihash)
;; as an Erlang binary string.
(define er-bif-cid-from-bytes
(fn (vs)
(let ((data (er-source-to-string (nth vs 0))))
(cond
(= data nil) (raise (er-mk-error-marker (er-mk-atom "badarg")))
:else
(let ((digest (er-hex->bytes (crypto-sha256 data))))
(let ((mh (list->string
(map integer->char (append (list 18 32) digest)))))
(er-mk-binary
(map char->integer
(string->list (cid-from-bytes 85 mh))))))))))
;; cid:to_string(Term) -> canonical CIDv1 (dag-cbor) of the term,
;; as an Erlang binary string.
(define er-bif-cid-to-string
(fn (vs)
;; Canonical CID of the term's stable string form. (cbor-encode
;; rejects symbols, so er-to-sx of compound terms is unencodable;
;; er-format-value yields a canonical SX string per term value.)
(er-mk-binary
(map char->integer
(string->list (cid-from-sx (er-format-value (nth vs 0))))))))
;; file:list_dir(Path) -> {ok, [Binary]} | {error, Reason}
(define er-bif-file-list-dir
(fn (vs)
(let ((path (er-source-to-string (nth vs 0))))
(cond
(= path nil)
(er-mk-tuple (list (er-mk-atom "error") (er-mk-atom "badarg")))
:else
(let ((res (list nil)) (err (list nil)))
(guard (c (:else (set-nth! err 0 c)))
(set-nth! res 0 (file-list-dir path)))
(cond
(not (= (nth err 0) nil))
(er-mk-tuple (list (er-mk-atom "error")
(er-classify-file-error (nth err 0))))
:else
(er-mk-tuple (list (er-mk-atom "ok")
(er-of-sx (nth res 0))))))))))
;; ── builtin BIF registrations (Phase 8 migration) ────────────────
;; Populates `er-bif-registry` with every existing built-in BIF. Each
;; entry is keyed by "Module/Name/Arity"; multi-arity BIFs register
;; once per arity. Called eagerly at the end of runtime.sx so the
;; registry is ready before any erlang-eval-ast call.
(define er-register-builtin-bifs!
(fn ()
;; erlang module — type predicates (all pure)
(er-register-pure-bif! "erlang" "is_integer" 1 er-bif-is-integer)
(er-register-pure-bif! "erlang" "is_atom" 1 er-bif-is-atom)
(er-register-pure-bif! "erlang" "is_list" 1 er-bif-is-list)
(er-register-pure-bif! "erlang" "is_tuple" 1 er-bif-is-tuple)
(er-register-pure-bif! "erlang" "is_number" 1 er-bif-is-number)
(er-register-pure-bif! "erlang" "is_float" 1 er-bif-is-float)
(er-register-pure-bif! "erlang" "is_boolean" 1 er-bif-is-boolean)
(er-register-pure-bif! "erlang" "is_pid" 1 er-bif-is-pid)
(er-register-pure-bif! "erlang" "is_reference" 1 er-bif-is-reference)
(er-register-pure-bif! "erlang" "is_binary" 1 er-bif-is-binary)
(er-register-pure-bif! "erlang" "is_function" 1 er-bif-is-function)
(er-register-pure-bif! "erlang" "is_function" 2 er-bif-is-function)
;; erlang module — pure data ops
(er-register-pure-bif! "erlang" "length" 1 er-bif-length)
(er-register-pure-bif! "erlang" "hd" 1 er-bif-hd)
(er-register-pure-bif! "erlang" "tl" 1 er-bif-tl)
(er-register-pure-bif! "erlang" "element" 2 er-bif-element)
(er-register-pure-bif! "erlang" "tuple_size" 1 er-bif-tuple-size)
(er-register-pure-bif! "erlang" "byte_size" 1 er-bif-byte-size)
(er-register-pure-bif! "erlang" "atom_to_list" 1 er-bif-atom-to-list)
(er-register-pure-bif! "erlang" "list_to_atom" 1 er-bif-list-to-atom)
(er-register-pure-bif! "erlang" "abs" 1 er-bif-abs)
(er-register-pure-bif! "erlang" "min" 2 er-bif-min)
(er-register-pure-bif! "erlang" "max" 2 er-bif-max)
(er-register-pure-bif! "erlang" "tuple_to_list" 1 er-bif-tuple-to-list)
(er-register-pure-bif! "erlang" "list_to_tuple" 1 er-bif-list-to-tuple)
(er-register-pure-bif! "erlang" "integer_to_list" 1 er-bif-integer-to-list)
(er-register-pure-bif! "erlang" "list_to_integer" 1 er-bif-list-to-integer)
;; erlang module — process / runtime (side-effecting)
(er-register-bif! "erlang" "self" 0 er-bif-self)
(er-register-bif! "erlang" "spawn" 1 er-bif-spawn)
(er-register-bif! "erlang" "spawn" 3 er-bif-spawn)
(er-register-bif! "erlang" "exit" 1 er-bif-exit)
(er-register-bif! "erlang" "exit" 2 er-bif-exit)
(er-register-bif! "erlang" "make_ref" 0 er-bif-make-ref)
(er-register-bif! "erlang" "link" 1 er-bif-link)
(er-register-bif! "erlang" "unlink" 1 er-bif-unlink)
(er-register-bif! "erlang" "monitor" 2 er-bif-monitor)
(er-register-bif! "erlang" "demonitor" 1 er-bif-demonitor)
(er-register-bif! "erlang" "process_flag" 2 er-bif-process-flag)
(er-register-bif! "erlang" "register" 2 er-bif-register)
(er-register-bif! "erlang" "unregister" 1 er-bif-unregister)
(er-register-bif! "erlang" "whereis" 1 er-bif-whereis)
(er-register-bif! "erlang" "registered" 0 er-bif-registered)
;; erlang module — exception raising (modelled as side-effecting)
(er-register-bif! "erlang" "throw" 1
(fn (vs) (raise (er-mk-throw-marker (er-bif-arg1 vs "throw")))))
(er-register-bif! "erlang" "error" 1
(fn (vs) (raise (er-mk-error-marker (er-bif-arg1 vs "error")))))
;; lists module — all pure
(er-register-pure-bif! "lists" "reverse" 1 er-bif-lists-reverse)
(er-register-pure-bif! "lists" "map" 2 er-bif-lists-map)
(er-register-pure-bif! "lists" "foldl" 3 er-bif-lists-foldl)
(er-register-pure-bif! "lists" "seq" 2 er-bif-lists-seq)
(er-register-pure-bif! "lists" "seq" 3 er-bif-lists-seq)
(er-register-pure-bif! "lists" "sum" 1 er-bif-lists-sum)
(er-register-pure-bif! "lists" "nth" 2 er-bif-lists-nth)
(er-register-pure-bif! "lists" "last" 1 er-bif-lists-last)
(er-register-pure-bif! "lists" "member" 2 er-bif-lists-member)
(er-register-pure-bif! "lists" "append" 2 er-bif-lists-append)
(er-register-pure-bif! "lists" "filter" 2 er-bif-lists-filter)
(er-register-pure-bif! "lists" "any" 2 er-bif-lists-any)
(er-register-pure-bif! "lists" "all" 2 er-bif-lists-all)
(er-register-pure-bif! "lists" "duplicate" 2 er-bif-lists-duplicate)
;; io module — side-effecting (writes to io buffer)
(er-register-bif! "io" "format" 1 er-bif-io-format)
(er-register-bif! "io" "format" 2 er-bif-io-format)
;; ets module — side-effecting (mutates table state)
(er-register-bif! "ets" "new" 2 er-bif-ets-new)
(er-register-bif! "ets" "insert" 2 er-bif-ets-insert)
(er-register-bif! "ets" "lookup" 2 er-bif-ets-lookup)
(er-register-bif! "ets" "delete" 1 er-bif-ets-delete)
(er-register-bif! "ets" "delete" 2 er-bif-ets-delete)
(er-register-bif! "ets" "tab2list" 1 er-bif-ets-tab2list)
(er-register-bif! "ets" "info" 2 er-bif-ets-info)
;; code module — side-effecting (mutates module registry, kills procs)
(er-register-bif! "code" "load_binary" 3 er-bif-code-load-binary)
(er-register-bif! "code" "purge" 1 er-bif-code-purge)
(er-register-bif! "code" "soft_purge" 1 er-bif-code-soft-purge)
(er-register-bif! "code" "which" 1 er-bif-code-which)
(er-register-bif! "code" "is_loaded" 1 er-bif-code-is-loaded)
(er-register-bif! "code" "all_loaded" 0 er-bif-code-all-loaded)
;; file module
(er-register-bif! "file" "read_file" 1 er-bif-file-read-file)
(er-register-bif! "file" "write_file" 2 er-bif-file-write-file)
(er-register-bif! "file" "delete" 1 er-bif-file-delete)
;; Phase 8 FFI — host-primitive BIFs (loops/fed-prims)
(er-register-pure-bif! "crypto" "hash" 2 er-bif-crypto-hash)
(er-register-pure-bif! "cid" "from_bytes" 1 er-bif-cid-from-bytes)
(er-register-pure-bif! "cid" "to_string" 1 er-bif-cid-to-string)
(er-register-bif! "file" "list_dir" 1 er-bif-file-list-dir)
(er-mk-atom "ok")))
;; Register everything at load time.
(er-register-builtin-bifs!)

View File

@@ -1,16 +1,18 @@
{
"language": "erlang",
"total_pass": 530,
"total": 530,
"total_pass": 729,
"total": 729,
"suites": [
{"name":"tokenize","pass":62,"total":62,"status":"ok"},
{"name":"parse","pass":52,"total":52,"status":"ok"},
{"name":"eval","pass":346,"total":346,"status":"ok"},
{"name":"runtime","pass":39,"total":39,"status":"ok"},
{"name":"eval","pass":385,"total":385,"status":"ok"},
{"name":"runtime","pass":93,"total":93,"status":"ok"},
{"name":"ring","pass":4,"total":4,"status":"ok"},
{"name":"ping-pong","pass":4,"total":4,"status":"ok"},
{"name":"bank","pass":8,"total":8,"status":"ok"},
{"name":"echo","pass":7,"total":7,"status":"ok"},
{"name":"fib","pass":8,"total":8,"status":"ok"}
{"name":"fib","pass":8,"total":8,"status":"ok"},
{"name":"ffi","pass":28,"total":28,"status":"ok"},
{"name":"vm","pass":78,"total":78,"status":"ok"}
]
}

View File

@@ -1,18 +1,20 @@
# Erlang-on-SX Scoreboard
**Total: 530 / 530 tests passing**
**Total: 729 / 729 tests passing**
| | Suite | Pass | Total |
|---|---|---|---|
| ✅ | tokenize | 62 | 62 |
| ✅ | parse | 52 | 52 |
| ✅ | eval | 346 | 346 |
| ✅ | runtime | 39 | 39 |
| ✅ | eval | 385 | 385 |
| ✅ | runtime | 93 | 93 |
| ✅ | ring | 4 | 4 |
| ✅ | ping-pong | 4 | 4 |
| ✅ | bank | 8 | 8 |
| ✅ | echo | 7 | 7 |
| ✅ | fib | 8 | 8 |
| ✅ | ffi | 28 | 28 |
| ✅ | vm | 78 | 78 |
Generated by `lib/erlang/conformance.sh`.

View File

@@ -1125,6 +1125,222 @@
(er-eval-test "lists:duplicate val"
(nm (ev "hd(lists:duplicate(3, marker))")) "marker")
;; ── Phase 7: code:load_binary/3 ───────────────────────────────
(er-modules-reset!)
(er-eval-test "code:load_binary ok tag"
(nm (ev "element(1, code:load_binary(cl1, \"cl1.erl\", \"-module(cl1). foo() -> 1.\"))"))
"module")
(er-eval-test "code:load_binary ok name"
(nm (ev "element(2, code:load_binary(cl1, \"cl1.erl\", \"-module(cl1). foo() -> 1.\"))"))
"cl1")
(er-eval-test "code:load_binary then call"
(ev "cl1:foo()") 1)
(er-eval-test "code:load_binary reload v2"
(ev "code:load_binary(cl1, \"cl1.erl\", \"-module(cl1). foo() -> 99.\"), cl1:foo()")
99)
(er-eval-test "code:load_binary name mismatch tag"
(nm (ev "element(1, code:load_binary(cl2, \"x.erl\", \"-module(other). f() -> 0.\"))"))
"error")
(er-eval-test "code:load_binary name mismatch reason"
(nm (ev "element(2, code:load_binary(cl2, \"x.erl\", \"-module(other). f() -> 0.\"))"))
"module_name_mismatch")
(er-eval-test "code:load_binary badfile on garbage"
(nm (ev "element(2, code:load_binary(cl3, \"x.erl\", \"this is not erlang\"))"))
"badfile")
(er-eval-test "code:load_binary non-atom mod is badarg"
(nm (ev "element(2, code:load_binary(\"cl1\", \"x.erl\", \"-module(cl1). f() -> 0.\"))"))
"badarg")
;; ── Phase 7: code:purge/1 + code:soft_purge/1 ───────────────────
(er-modules-reset!)
;; purge unknown module → false
(er-eval-test "code:purge unknown"
(nm (ev "code:purge(nope)")) "false")
;; load, then purge without old version → false (nothing to purge)
(er-eval-test "code:purge no old"
(nm (ev "code:load_binary(pg1, \"pg1\", \"-module(pg1). v() -> 1.\"), code:purge(pg1)"))
"false")
;; load v1, load v2 (creates :old), purge with no live procs → true
(er-eval-test "code:purge after reload"
(nm (ev "code:load_binary(pg2, \"pg2\", \"-module(pg2). v() -> 1.\"), code:load_binary(pg2, \"pg2\", \"-module(pg2). v() -> 2.\"), code:purge(pg2)"))
"true")
;; idempotent: purging again returns false (already purged)
(er-eval-test "code:purge twice"
(nm (ev "code:load_binary(pg3, \"pg3\", \"-module(pg3). v() -> 1.\"), code:load_binary(pg3, \"pg3\", \"-module(pg3). v() -> 2.\"), code:purge(pg3), code:purge(pg3)"))
"false")
;; purge returns true whenever an :old slot exists, regardless of process tracking
;; (proper "kill lingering" semantics requires spawn/3 which is still stubbed)
(er-eval-test "code:purge with old slot present"
(nm (ev "code:load_binary(pg4, \"pg4\", \"-module(pg4). loop() -> receive stop -> ok end.\"),
Pid = spawn(fun () -> pg4:loop() end),
code:load_binary(pg4, \"pg4\", \"-module(pg4). loop() -> receive stop -> done end.\"),
code:purge(pg4)"))
"true")
;; soft_purge unknown → true (nothing to purge)
(er-eval-test "code:soft_purge unknown"
(nm (ev "code:soft_purge(nope)")) "true")
;; soft_purge with no old version → true
(er-eval-test "code:soft_purge no old"
(nm (ev "code:load_binary(sp1, \"sp1\", \"-module(sp1). v() -> 1.\"), code:soft_purge(sp1)"))
"true")
;; soft_purge with old + no lingering procs → true (clears :old)
(er-eval-test "code:soft_purge clean"
(nm (ev "code:load_binary(sp2, \"sp2\", \"-module(sp2). v() -> 1.\"), code:load_binary(sp2, \"sp2\", \"-module(sp2). v() -> 2.\"), code:soft_purge(sp2)"))
"true")
;; non-atom Mod is badarg (raise)
(er-eval-test "code:purge badarg"
(nm (ev "try code:purge(\"str\") catch error:badarg -> ok end")) "ok")
(er-eval-test "code:soft_purge badarg"
(nm (ev "try code:soft_purge(123) catch error:badarg -> ok end")) "ok")
;; ── Phase 7: code:which/1 + code:is_loaded/1 + code:all_loaded/0 ──
(er-modules-reset!)
(er-eval-test "code:which non_existing"
(nm (ev "code:which(nope)")) "non_existing")
(er-eval-test "code:which after load"
(nm (ev "code:load_binary(wh1, \"wh1\", \"-module(wh1). v() -> 1.\"), code:which(wh1)"))
"loaded")
(er-eval-test "code:is_loaded missing"
(nm (ev "code:is_loaded(nope)")) "false")
(er-eval-test "code:is_loaded tag"
(nm (ev "code:load_binary(il1, \"il1\", \"-module(il1). v() -> 1.\"), element(1, code:is_loaded(il1))"))
"file")
(er-eval-test "code:is_loaded value"
(nm (ev "code:load_binary(il2, \"il2\", \"-module(il2). v() -> 1.\"), element(2, code:is_loaded(il2))"))
"loaded")
(er-modules-reset!)
(er-eval-test "code:all_loaded empty"
(ev "length(code:all_loaded())") 0)
(er-modules-reset!)
(er-eval-test "code:all_loaded count"
(ev "code:load_binary(al1, \"al1\", \"-module(al1). v() -> 1.\"),
code:load_binary(al2, \"al2\", \"-module(al2). v() -> 1.\"),
length(code:all_loaded())")
2)
(er-eval-test "code:all_loaded first entry tag"
(nm (ev "code:load_binary(al3, \"al3\", \"-module(al3). v() -> 1.\"),
element(2, hd(code:all_loaded()))"))
"loaded")
(er-eval-test "code:which badarg"
(nm (ev "try code:which(\"str\") catch error:badarg -> ok end")) "ok")
(er-eval-test "code:is_loaded badarg"
(nm (ev "try code:is_loaded(123) catch error:badarg -> ok end")) "ok")
;; ── Phase 7: hot-reload call dispatch semantics ──────────────────
;; Cross-module M:F() calls always hit the CURRENT version;
;; local F() calls inside a module body resolve through the env
;; the function closed over (i.e. the version it was loaded with).
(er-modules-reset!)
;; M:F always hits current
(er-eval-test "cross-mod after reload v2"
(ev "code:load_binary(hr1, \"hr1\", \"-module(hr1). f() -> 1.\"),
code:load_binary(hr1, \"hr1\", \"-module(hr1). f() -> 2.\"),
hr1:f()")
2)
;; Local call inside reloaded module body resolves via fresh mod-env
;; (a() does a local b(); b() got upgraded too)
(er-eval-test "local call inside reloaded module body"
(ev "code:load_binary(hr2, \"hr2\", \"-module(hr2). a() -> b(). b() -> 1.\"),
code:load_binary(hr2, \"hr2\", \"-module(hr2). a() -> b(). b() -> 99.\"),
hr2:a()")
99)
;; Fun captured BEFORE reload, with local-call body, keeps v1 semantics
(er-eval-test "captured fun keeps closed-over env (local call)"
(ev "code:load_binary(hr3, \"hr3\", \"-module(hr3). get_fn() -> fun () -> b() end. b() -> 1.\"),
Fn = hr3:get_fn(),
code:load_binary(hr3, \"hr3\", \"-module(hr3). get_fn() -> fun () -> b() end. b() -> 99.\"),
Fn()")
1)
;; Fun captured BEFORE reload, with CROSS-mod body, sees v2's current
(er-eval-test "captured fun follows cross-mod to current"
(ev "code:load_binary(hr4, \"hr4\", \"-module(hr4). get_xref() -> fun () -> hr4:b() end. b() -> 1.\"),
Fn = hr4:get_xref(),
code:load_binary(hr4, \"hr4\", \"-module(hr4). get_xref() -> fun () -> hr4:b() end. b() -> 99.\"),
Fn()")
99)
;; Two captured funs from two different vintages
(er-eval-test "two funs from two vintages stay independent"
(ev "code:load_binary(hr5, \"hr5\", \"-module(hr5). gf() -> fun () -> v() end. v() -> 10.\"),
F1 = hr5:gf(),
code:load_binary(hr5, \"hr5\", \"-module(hr5). gf() -> fun () -> v() end. v() -> 20.\"),
F2 = hr5:gf(),
F1() + F2()")
30)
;; Version slot bumps correctly when a captured fun stays alive
(er-eval-test "version bumps despite captured funs"
(ev "code:load_binary(hr6, \"hr6\", \"-module(hr6). gf() -> fun () -> v() end. v() -> 1.\"),
_Pinned = hr6:gf(),
code:load_binary(hr6, \"hr6\", \"-module(hr6). gf() -> fun () -> v() end. v() -> 2.\"),
code:load_binary(hr6, \"hr6\", \"-module(hr6). gf() -> fun () -> v() end. v() -> 3.\"),
hr6:v()")
3)
;; ── Phase 7 capstone: full hot-reload ladder ───────────────────
;; Load v1 → spawn from inside module → load v2 → cross-mod hits v2 →
;; local call inside v1 process still resolves v1 → soft_purge refuses
;; while v1 procs alive → purge kills them.
;;
;; All stages must run in a single erlang-eval-ast call: each call resets
;; the scheduler (er-sched-init!) so cross-call Pid handles would point at
;; reaped processes.
(er-modules-reset!)
(define er-rt-cap-prog "code:load_binary(cap, \"cap.erl\", \"-module(cap). start() -> spawn(fun () -> loop() end). loop() -> receive {ping, From} -> From ! {pong, v1}, loop(); stop -> done end. tag() -> v1.\"), Tag1 = cap:tag(), Pid1 = cap:start(), code:load_binary(cap, \"cap.erl\", \"-module(cap). start() -> spawn(fun () -> loop() end). loop() -> receive {ping, From} -> From ! {pong, v2}, loop(); stop -> done end. tag() -> v2.\"), Tag2 = cap:tag(), _Pid2 = cap:start(), Soft1 = code:soft_purge(cap), Hard = code:purge(cap), Soft2 = code:soft_purge(cap), {Tag1, Tag2, Soft1, Hard, Soft2}")
(define er-rt-cap-result (ev er-rt-cap-prog))
(er-eval-test "capstone v1 tag direct"
(get (nth (get er-rt-cap-result :elements) 0) :name) "v1")
(er-eval-test "capstone v2 tag"
(get (nth (get er-rt-cap-result :elements) 1) :name) "v2")
(er-eval-test "capstone soft_purge while v1 alive = false"
(get (nth (get er-rt-cap-result :elements) 2) :name) "false")
(er-eval-test "capstone hard purge = true"
(get (nth (get er-rt-cap-result :elements) 3) :name) "true")
(er-eval-test "capstone soft_purge clean after hard = true"
(get (nth (get er-rt-cap-result :elements) 4) :name) "true")
(define
er-eval-test-summary
(str "eval " er-eval-test-pass "/" er-eval-test-count))

178
lib/erlang/tests/ffi.sx Normal file
View File

@@ -0,0 +1,178 @@
;; Phase 8 FFI BIF tests — one round-trip per BIF.
;; Each BIF lives in lib/erlang/runtime.sx (registered with
;; er-bif-registry) and wraps an SX-host primitive.
(define er-ffi-test-count 0)
(define er-ffi-test-pass 0)
(define er-ffi-test-fails (list))
(define
er-ffi-test
(fn
(name actual expected)
(set! er-ffi-test-count (+ er-ffi-test-count 1))
(if
(= actual expected)
(set! er-ffi-test-pass (+ er-ffi-test-pass 1))
(append! er-ffi-test-fails {:name name :expected expected :actual actual}))))
(define ffi-ev erlang-eval-ast)
(define ffi-nm (fn (v) (get v :name)))
;; ── file:read_file/1 + file:write_file/2 ────────────────────────
(er-ffi-test
"file:write_file ok"
(ffi-nm (ffi-ev "file:write_file(\"/tmp/er-ffi-1.txt\", \"hello\")"))
"ok")
(er-ffi-test
"file:read_file ok tag"
(ffi-nm (ffi-ev "element(1, file:read_file(\"/tmp/er-ffi-1.txt\"))"))
"ok")
(er-ffi-test
"file:read_file payload is binary"
(ffi-nm
(ffi-ev
"case file:read_file(\"/tmp/er-ffi-1.txt\") of {ok, B} -> is_binary(B) end"))
"true")
(er-ffi-test
"file:read_file content byte_size"
(ffi-ev
"case file:read_file(\"/tmp/er-ffi-1.txt\") of {ok, B} -> byte_size(B) end")
5)
(er-ffi-test
"file:read_file missing enoent"
(ffi-nm (ffi-ev "element(2, file:read_file(\"/tmp/er-ffi-no-such-xyz\"))"))
"enoent")
(er-ffi-test
"file:write_file bad path enoent"
(ffi-nm
(ffi-ev "element(2, file:write_file(\"/tmp/er-ffi-no-dir-xyz/x\", \"y\"))"))
"enoent")
(er-ffi-test
"file:write_file binary payload"
(ffi-ev
"file:write_file(\"/tmp/er-ffi-2.bin\", <<1, 2, 3, 4, 5>>), case file:read_file(\"/tmp/er-ffi-2.bin\") of {ok, B} -> byte_size(B) end")
5)
;; ── file:delete/1 ────────────────────────────────────────────────
(er-ffi-test
"file:delete ok"
(ffi-nm
(ffi-ev
"file:write_file(\"/tmp/er-ffi-del.txt\", \"x\"), file:delete(\"/tmp/er-ffi-del.txt\")"))
"ok")
(er-ffi-test
"file:read_file after delete enoent"
(ffi-nm
(ffi-ev
"file:write_file(\"/tmp/er-ffi-del2.txt\", \"x\"), file:delete(\"/tmp/er-ffi-del2.txt\"), element(2, file:read_file(\"/tmp/er-ffi-del2.txt\"))"))
"enoent")
(er-ffi-test
"crypto:hash sha256 -> 32-byte binary"
(ffi-ev "byte_size(crypto:hash(sha256, <<97,98,99>>))")
32)
(er-ffi-test
"crypto:hash sha512 -> 64-byte binary"
(ffi-ev "byte_size(crypto:hash(sha512, <<97,98,99>>))")
64)
(er-ffi-test
"crypto:hash sha3_256 is_binary"
(ffi-nm (ffi-ev "is_binary(crypto:hash(sha3_256, <<120>>))"))
"true")
(er-ffi-test
"crypto:hash deterministic"
(ffi-nm (ffi-ev "crypto:hash(sha256, <<97>>) =:= crypto:hash(sha256, <<97>>)"))
"true")
(er-ffi-test
"crypto:hash distinct inputs distinct digests"
(ffi-nm (ffi-ev "crypto:hash(sha256, <<97>>) =/= crypto:hash(sha256, <<98>>)"))
"true")
(er-ffi-test
"crypto:hash bad type -> error:badarg"
(ffi-nm (ffi-ev "try crypto:hash(md5, <<120>>) catch error:badarg -> ok end"))
"ok")
(er-ffi-test
"cid:from_bytes is_binary"
(ffi-nm (ffi-ev "is_binary(cid:from_bytes(<<97,98,99>>))"))
"true")
(er-ffi-test
"cid:from_bytes deterministic"
(ffi-nm (ffi-ev "cid:from_bytes(<<97,98,99>>) =:= cid:from_bytes(<<97,98,99>>)"))
"true")
(er-ffi-test
"cid:from_bytes distinct inputs distinct CIDs"
(ffi-nm (ffi-ev "cid:from_bytes(<<97,98,99>>) =/= cid:from_bytes(<<97,98,100>>)"))
"true")
(er-ffi-test
"cid:from_bytes non-binary -> error:badarg"
(ffi-nm (ffi-ev "try cid:from_bytes(42) catch error:badarg -> ok end"))
"ok")
(er-ffi-test
"cid:to_string is_binary"
(ffi-nm (ffi-ev "is_binary(cid:to_string({ok, 42}))"))
"true")
(er-ffi-test
"cid:to_string deterministic"
(ffi-nm (ffi-ev "cid:to_string(foo) =:= cid:to_string(foo)"))
"true")
(er-ffi-test
"cid:to_string distinct terms distinct CIDs"
(ffi-nm (ffi-ev "cid:to_string(foo) =/= cid:to_string(bar)"))
"true")
(er-ffi-test
"file:list_dir ok tag"
(ffi-nm (ffi-ev "element(1, file:list_dir(\"lib/erlang\"))"))
"ok")
(er-ffi-test
"file:list_dir non-empty"
(ffi-nm (ffi-ev "case file:list_dir(\"lib/erlang\") of {ok, L} -> length(L) > 3 end"))
"true")
(er-ffi-test
"file:list_dir entries are binaries"
(ffi-nm (ffi-ev "case file:list_dir(\"lib/erlang\") of {ok, L} -> is_binary(hd(L)) end"))
"true")
(er-ffi-test
"file:list_dir missing enoent"
(ffi-nm (ffi-ev "element(2, file:list_dir(\"/no/such/dir/xyz\"))"))
"enoent")
;; ── Still deferred (no host primitive): httpc (HTTP client, v2),
;; sqlite-* (v2 indexes). Assert NOT registered so a future iteration
;; that wires them without updating this suite fails fast.
(er-ffi-test
"httpc:request unregistered"
(er-lookup-bif "httpc" "request" 4)
nil)
(er-ffi-test
"sqlite:exec unregistered"
(er-lookup-bif "sqlite" "exec" 2)
nil)
(define
er-ffi-test-summary
(str "ffi " er-ffi-test-pass "/" er-ffi-test-count))

View File

@@ -134,6 +134,144 @@
(er-sched-current-pid)
nil)
;; ── Phase 7: module-version slots ───────────────────────────────
(er-modules-reset!)
(define er-rt-slot1 (er-mk-module-slot (er-env-new) nil 1))
(er-rt-test "slot tag" (get er-rt-slot1 :tag) "module")
(er-rt-test "slot version" (er-module-version er-rt-slot1) 1)
(er-rt-test "slot old nil" (er-module-old-env er-rt-slot1) nil)
(er-rt-test "slot current not nil" (= (er-module-current-env er-rt-slot1) nil) false)
(erlang-load-module "-module(hr1). a() -> 1.")
(define er-rt-reg (er-modules-get))
(er-rt-test "registry has hr1" (dict-has? er-rt-reg "hr1") true)
(er-rt-test "v1 on first load" (er-module-version (get er-rt-reg "hr1")) 1)
(er-rt-test "v1 old is nil" (er-module-old-env (get er-rt-reg "hr1")) nil)
(er-rt-test "v1 current not nil" (= (er-module-current-env (get er-rt-reg "hr1")) nil) false)
(define er-rt-env-v1 (er-module-current-env (get er-rt-reg "hr1")))
(erlang-load-module "-module(hr1). a() -> 2.")
(er-rt-test "v2 on second load" (er-module-version (get er-rt-reg "hr1")) 2)
(er-rt-test "v2 old is v1 env" (er-module-old-env (get er-rt-reg "hr1")) er-rt-env-v1)
(er-rt-test "v2 current is new" (= (er-module-current-env (get er-rt-reg "hr1")) er-rt-env-v1) false)
(erlang-load-module "-module(hr1). a() -> 3.")
(er-rt-test "v3 on third load" (er-module-version (get er-rt-reg "hr1")) 3)
(er-modules-reset!)
(er-rt-test "registry-reset clears" (dict-has? (er-modules-get) "hr1") false)
;; ── Phase 8: FFI BIF registry ──────────────────────────────────
(er-bif-registry-reset!)
(er-rt-test "empty registry" (len (er-list-bifs)) 0)
(er-rt-test "lookup miss" (er-lookup-bif "crypto" "hash" 2) nil)
(er-register-bif! "fake" "echo" 1 (fn (vs) (nth vs 0)))
(er-rt-test "register grows registry" (len (er-list-bifs)) 1)
(define er-rt-bif-hit (er-lookup-bif "fake" "echo" 1))
(er-rt-test "lookup hit module" (get er-rt-bif-hit :module) "fake")
(er-rt-test "lookup hit name" (get er-rt-bif-hit :name) "echo")
(er-rt-test "lookup hit arity" (get er-rt-bif-hit :arity) 1)
(er-rt-test "lookup hit pure?" (get er-rt-bif-hit :pure?) false)
(er-rt-test "fn invocable" ((get er-rt-bif-hit :fn) (list 42)) 42)
;; Re-register replaces (same key)
(er-register-bif! "fake" "echo" 1 (fn (vs) "replaced"))
(er-rt-test "re-register same key, count unchanged" (len (er-list-bifs)) 1)
(er-rt-test "re-register replaces fn"
((get (er-lookup-bif "fake" "echo" 1) :fn) (list 99)) "replaced")
;; Pure variant
(er-register-pure-bif! "fake" "pure" 2 (fn (vs) (+ (nth vs 0) (nth vs 1))))
(er-rt-test "pure registered separately, count 2" (len (er-list-bifs)) 2)
(er-rt-test "pure flag true"
(get (er-lookup-bif "fake" "pure" 2) :pure?) true)
(er-rt-test "pure fn invocable"
((get (er-lookup-bif "fake" "pure" 2) :fn) (list 7 8)) 15)
;; Arity disambiguation: same module+name, different arity = distinct entries
(er-register-bif! "fake" "echo" 2 (fn (vs) (list (nth vs 0) (nth vs 1))))
(er-rt-test "arity disambiguation count" (len (er-list-bifs)) 3)
(er-rt-test "arity-1 lookup still works"
((get (er-lookup-bif "fake" "echo" 1) :fn) (list 11)) "replaced")
(er-rt-test "arity-2 lookup independent"
(len ((get (er-lookup-bif "fake" "echo" 2) :fn) (list 1 2))) 2)
;; Reset clears the registry
(er-bif-registry-reset!)
(er-rt-test "reset clears" (len (er-list-bifs)) 0)
(er-rt-test "reset lookup nil" (er-lookup-bif "fake" "echo" 1) nil)
;; ── Phase 8: term marshalling (er-to-sx / er-of-sx) ─────────────
;; er-to-sx: Erlang → SX
(er-rt-test "to-sx atom" (er-to-sx (er-mk-atom "foo")) (make-symbol "foo"))
(er-rt-test "to-sx atom is symbol" (type-of (er-to-sx (er-mk-atom "x"))) "symbol")
(er-rt-test "to-sx nil" (er-to-sx (er-mk-nil)) (list))
(er-rt-test "to-sx integer passthrough" (er-to-sx 42) 42)
(er-rt-test "to-sx float passthrough" (er-to-sx 3.14) 3.14)
(er-rt-test "to-sx boolean passthrough" (er-to-sx true) true)
(er-rt-test "to-sx binary → string"
(er-to-sx (er-mk-binary (list 104 105 33))) "hi!")
(er-rt-test "to-sx cons → list"
(er-to-sx (er-mk-cons 1 (er-mk-cons 2 (er-mk-cons 3 (er-mk-nil))))) (list 1 2 3))
(er-rt-test "to-sx tuple → list"
(er-to-sx (er-mk-tuple (list 1 2 3))) (list 1 2 3))
(er-rt-test "to-sx nested cons"
(er-to-sx (er-mk-cons (er-mk-atom "a") (er-mk-cons 7 (er-mk-nil))))
(list (make-symbol "a") 7))
;; er-of-sx: SX → Erlang
(er-rt-test "of-sx symbol"
(get (er-of-sx (make-symbol "ok")) :name) "ok")
(er-rt-test "of-sx symbol is atom"
(er-atom? (er-of-sx (make-symbol "x"))) true)
(er-rt-test "of-sx string is binary"
(er-binary? (er-of-sx "hi")) true)
(er-rt-test "of-sx string bytes"
(get (er-of-sx "hi") :bytes) (list 104 105))
(er-rt-test "of-sx integer passthrough"
(er-of-sx 42) 42)
(er-rt-test "of-sx empty list → nil"
(er-nil? (er-of-sx (list))) true)
(er-rt-test "of-sx list → cons chain length"
(er-list-length (er-of-sx (list 1 2 3 4))) 4)
(er-rt-test "of-sx list head/tail"
(get (er-of-sx (list 10 20)) :head) 10)
;; Round-trips
(er-rt-test "rtrip integer" (er-to-sx (er-of-sx 99)) 99)
(er-rt-test "rtrip atom"
(get (er-of-sx (er-to-sx (er-mk-atom "abc"))) :name) "abc")
(er-rt-test "rtrip binary bytes"
(get (er-of-sx (er-to-sx (er-mk-binary (list 1 2 3)))) :bytes) (list 1 2 3))
(er-rt-test "rtrip cons-of-ints length"
(er-list-length (er-of-sx (er-to-sx
(er-mk-cons 1 (er-mk-cons 2 (er-mk-cons 3 (er-mk-nil))))))) 3)
;; Tuples don't round-trip exactly (er-to-sx flattens tuples to lists);
;; documented one-way conversion.
(er-rt-test "to-sx of tuple loses tag"
(er-cons? (er-of-sx (er-to-sx (er-mk-tuple (list 1 2 3))))) true)
;; Re-populate built-in BIFs so subsequent test files (ring, ping-pong, etc.)
;; can call length/spawn/etc. The migration onto the registry means a reset
;; here would otherwise break the rest of the conformance suite.
(er-register-builtin-bifs!)
(define
er-rt-test-summary
(str "runtime " er-rt-test-pass "/" er-rt-test-count))

403
lib/erlang/tests/vm.sx Normal file
View File

@@ -0,0 +1,403 @@
;; Phase 9 — stub VM opcode dispatcher tests.
;; Verifies the dispatcher shape (mirrors plans/sx-vm-opcode-extension.md
;; for when 9a integrates) and the three pattern-match opcodes (9b)
;; route to the correct er-match-* impl.
(define er-vm-test-count 0)
(define er-vm-test-pass 0)
(define er-vm-test-fails (list))
(define
er-vm-test
(fn
(name actual expected)
(set! er-vm-test-count (+ er-vm-test-count 1))
(if
(= actual expected)
(set! er-vm-test-pass (+ er-vm-test-pass 1))
(append! er-vm-test-fails {:name name :expected expected :actual actual}))))
;; ── dispatcher core ─────────────────────────────────────────────
(er-vm-test
"tuple opcode registered"
(= (er-vm-lookup-opcode-by-id 128) nil)
false)
(er-vm-test
"tuple opcode name"
(get (er-vm-lookup-opcode-by-id 128) :name)
"OP_PATTERN_TUPLE")
(er-vm-test
"list opcode by name"
(get (er-vm-lookup-opcode-by-name "OP_PATTERN_LIST") :id)
129)
(er-vm-test
"binary opcode by name"
(get (er-vm-lookup-opcode-by-name "OP_PATTERN_BINARY") :id)
130)
(er-vm-test "lookup miss by id" (er-vm-lookup-opcode-by-id 999) nil)
(er-vm-test "lookup miss by name" (er-vm-lookup-opcode-by-name "OP_NOPE") nil)
(er-vm-test
"opcode list has 3+"
(>= (len (er-vm-list-opcodes)) 3)
true)
;; ── OP_PATTERN_TUPLE ────────────────────────────────────────────
;; Pattern: {ok, X} matches value {ok, 42} → X bound to 42
(define er-vm-t1-env (er-env-new))
(define er-vm-t1-pat {:type "tuple" :elements (list {:type "atom" :value "ok"} {:name "X" :type "var"})})
(define er-vm-t1-val (er-mk-tuple (list (er-mk-atom "ok") 42)))
(er-vm-test
"OP_PATTERN_TUPLE match"
(er-vm-dispatch 128 (list er-vm-t1-pat er-vm-t1-val er-vm-t1-env))
true)
(er-vm-test "OP_PATTERN_TUPLE binds var" (get er-vm-t1-env "X") 42)
;; Same pattern against {error, ...} → false
(define er-vm-t2-env (er-env-new))
(define er-vm-t2-val (er-mk-tuple (list (er-mk-atom "error") 7)))
(er-vm-test
"OP_PATTERN_TUPLE no-match"
(er-vm-dispatch 128 (list er-vm-t1-pat er-vm-t2-val er-vm-t2-env))
false)
;; Wrong arity tuple — pattern has 2 elements, value has 3
(define er-vm-t3-env (er-env-new))
(define
er-vm-t3-val
(er-mk-tuple (list (er-mk-atom "ok") 1 2)))
(er-vm-test
"OP_PATTERN_TUPLE arity mismatch"
(er-vm-dispatch 128 (list er-vm-t1-pat er-vm-t3-val er-vm-t3-env))
false)
;; ── OP_PATTERN_LIST (cons) ──────────────────────────────────────
;; Pattern: [H | T] matches [1, 2, 3] → H=1, T=[2,3]
(define er-vm-l1-env (er-env-new))
(define er-vm-l1-pat {:type "cons" :tail {:name "T" :type "var"} :head {:name "H" :type "var"}})
(define
er-vm-l1-val
(er-mk-cons
1
(er-mk-cons 2 (er-mk-cons 3 (er-mk-nil)))))
(er-vm-test
"OP_PATTERN_LIST match"
(er-vm-dispatch 129 (list er-vm-l1-pat er-vm-l1-val er-vm-l1-env))
true)
(er-vm-test "OP_PATTERN_LIST binds head" (get er-vm-l1-env "H") 1)
(er-vm-test
"OP_PATTERN_LIST tail is cons"
(er-cons? (get er-vm-l1-env "T"))
true)
;; [H|T] against empty list → false
(define er-vm-l2-env (er-env-new))
(er-vm-test
"OP_PATTERN_LIST no-match on nil"
(er-vm-dispatch 129 (list er-vm-l1-pat (er-mk-nil) er-vm-l2-env))
false)
;; ── OP_PATTERN_BINARY ───────────────────────────────────────────
;; Pattern <<A:8>> against <<42>> → A bound to 42
(define er-vm-b1-env (er-env-new))
(define er-vm-b1-pat {:type "binary" :segments (list {:value {:name "A" :type "var"} :size {:type "integer" :value "8"} :spec "integer"})})
(define er-vm-b1-val (er-mk-binary (list 42)))
(er-vm-test
"OP_PATTERN_BINARY match"
(er-vm-dispatch 130 (list er-vm-b1-pat er-vm-b1-val er-vm-b1-env))
true)
(er-vm-test
"OP_PATTERN_BINARY binds segment"
(get er-vm-b1-env "A")
42)
;; Same pattern against wrong-size binary (2 bytes) → false
(define er-vm-b2-env (er-env-new))
(define er-vm-b2-val (er-mk-binary (list 42 99)))
(er-vm-test
"OP_PATTERN_BINARY size mismatch"
(er-vm-dispatch 130 (list er-vm-b1-pat er-vm-b2-val er-vm-b2-env))
false)
;; ── dispatch error path ────────────────────────────────────────
(define er-vm-err-caught (list nil))
(guard
(c (:else (set-nth! er-vm-err-caught 0 (str c))))
(er-vm-dispatch 999 (list)))
(er-vm-test
"unknown opcode raises"
(string-contains? (str (nth er-vm-err-caught 0)) "unknown opcode")
true)
;; ── Phase 9c — OP_PERFORM / OP_HANDLE ───────────────────────────
(er-vm-test "perform opcode by id"
(get (er-vm-lookup-opcode-by-id 131) :name) "OP_PERFORM")
(er-vm-test "handle opcode by id"
(get (er-vm-lookup-opcode-by-id 132) :name) "OP_HANDLE")
(define er-vm-pf-caught (list nil))
(guard (c (:else (set-nth! er-vm-pf-caught 0 c)))
(er-vm-dispatch 131 (list "yield" (list 42))))
(er-vm-test "perform raises tagged"
(get (nth er-vm-pf-caught 0) :tag) "vm-effect")
(er-vm-test "perform effect name"
(get (nth er-vm-pf-caught 0) :effect) "yield")
(er-vm-test "perform args carried"
(nth (get (nth er-vm-pf-caught 0) :args) 0) 42)
(er-vm-test "handle catches matching effect"
(er-vm-dispatch 132
(list
(fn () (er-vm-dispatch 131 (list "yield" (list 7))))
"yield"
(fn (args) (+ (nth args 0) 100))))
107)
(er-vm-test "handle no-effect returns thunk result"
(er-vm-dispatch 132
(list
(fn () 99)
"yield"
(fn (args) "handler ran")))
99)
(define er-vm-rt-caught (list nil))
(guard (c (:else (set-nth! er-vm-rt-caught 0 c)))
(er-vm-dispatch 132
(list
(fn () (er-vm-dispatch 131 (list "other" (list))))
"yield"
(fn (args) "wrong"))))
(er-vm-test "handle rethrows non-matching"
(get (nth er-vm-rt-caught 0) :effect) "other")
(er-vm-test "nested handles separate effect names"
(er-vm-dispatch 132
(list
(fn ()
(er-vm-dispatch 132
(list
(fn () (er-vm-dispatch 131 (list "b" (list 5))))
"a"
(fn (args) "inner-handled"))))
"b"
(fn (args) (+ (nth args 0) 1000))))
1005)
;; ── Phase 9d — OP_RECEIVE_SCAN ──────────────────────────────────
(er-vm-test "receive-scan opcode by id"
(get (er-vm-lookup-opcode-by-id 133) :name) "OP_RECEIVE_SCAN")
;; Pattern: receive {ok, X} -> X end against mailbox [{error, 1}, {ok, 42}, foo]
(define er-vm-r1-env (er-env-new))
(define er-vm-r1-clauses
(list
{:pattern {:type "tuple"
:elements (list
{:type "atom" :value "ok"}
{:type "var" :name "X"})}
:guards (list)
:body (list {:type "var" :name "X"})}))
(define er-vm-r1-mbox
(list
(er-mk-tuple (list (er-mk-atom "error") 1))
(er-mk-tuple (list (er-mk-atom "ok") 42))
(er-mk-atom "foo")))
(define er-vm-r1-result
(er-vm-dispatch 133 (list er-vm-r1-clauses er-vm-r1-mbox er-vm-r1-env)))
(er-vm-test "scan finds match"
(get er-vm-r1-result :matched) true)
(er-vm-test "scan reports correct index"
(get er-vm-r1-result :index) 1)
(er-vm-test "scan binds var"
(get er-vm-r1-env "X") 42)
(er-vm-test "scan leaves body unevaluated"
(= (get er-vm-r1-result :body) nil) false)
;; No match case
(define er-vm-r2-env (er-env-new))
(define er-vm-r2-mbox (list (er-mk-atom "nope") 99))
(define er-vm-r2-result
(er-vm-dispatch 133 (list er-vm-r1-clauses er-vm-r2-mbox er-vm-r2-env)))
(er-vm-test "scan no-match"
(get er-vm-r2-result :matched) false)
(er-vm-test "scan no-match leaves env clean"
(dict-has? er-vm-r2-env "X") false)
;; Empty mailbox
(define er-vm-r3-result
(er-vm-dispatch 133 (list er-vm-r1-clauses (list) (er-env-new))))
(er-vm-test "scan empty mailbox"
(get er-vm-r3-result :matched) false)
;; First-match wins (arrival order)
(define er-vm-r4-env (er-env-new))
(define er-vm-r4-mbox
(list
(er-mk-tuple (list (er-mk-atom "ok") 1))
(er-mk-tuple (list (er-mk-atom "ok") 2))))
(define er-vm-r4-result
(er-vm-dispatch 133 (list er-vm-r1-clauses er-vm-r4-mbox er-vm-r4-env)))
(er-vm-test "scan first-match wins (index 0)"
(get er-vm-r4-result :index) 0)
(er-vm-test "scan binds first match's var"
(get er-vm-r4-env "X") 1)
;; ── Phase 9e — OP_SPAWN / OP_SEND ───────────────────────────────
(er-vm-procs-reset!)
(er-vm-test "spawn opcode by id"
(get (er-vm-lookup-opcode-by-id 134) :name) "OP_SPAWN")
(er-vm-test "send opcode by id"
(get (er-vm-lookup-opcode-by-id 135) :name) "OP_SEND")
(define er-vm-fn (fn () "body"))
(define er-vm-p1 (er-vm-dispatch 134 (list er-vm-fn (list))))
(define er-vm-p2 (er-vm-dispatch 134 (list er-vm-fn (list "arg"))))
(er-vm-test "spawn returns pid 0 first"
er-vm-p1 0)
(er-vm-test "spawn returns pid 1 second"
er-vm-p2 1)
(er-vm-test "proc count is 2"
(er-vm-proc-count) 2)
(er-vm-test "spawned proc state runnable"
(er-vm-proc-state er-vm-p1) "runnable")
(er-vm-test "spawned proc mailbox empty"
(len (er-vm-proc-mailbox er-vm-p1)) 0)
(er-vm-test "spawned proc has 8 registers"
(len (get (er-vm-proc-get er-vm-p1) :registers)) 8)
;; OP_SEND appends to target's mailbox, preserves arrival order.
(er-vm-test "send returns true on valid pid"
(er-vm-dispatch 135 (list er-vm-p1 "msg1")) true)
(er-vm-dispatch 135 (list er-vm-p1 "msg2")
)
(er-vm-dispatch 135 (list er-vm-p1 "msg3"))
(er-vm-test "mailbox length after 3 sends"
(len (er-vm-proc-mailbox er-vm-p1)) 3)
(er-vm-test "mailbox preserves order — first"
(nth (er-vm-proc-mailbox er-vm-p1) 0) "msg1")
(er-vm-test "mailbox preserves order — last"
(nth (er-vm-proc-mailbox er-vm-p1) 2) "msg3")
;; send to nonexistent pid returns false (doesn't crash)
(er-vm-test "send to unknown pid is false"
(er-vm-dispatch 135 (list 99999 "x")) false)
;; Isolation: msgs to p1 don't appear in p2's mailbox
(er-vm-test "isolation — p2 mailbox empty"
(len (er-vm-proc-mailbox er-vm-p2)) 0)
;; reset clears
(er-vm-procs-reset!)
(er-vm-test "reset clears procs"
(er-vm-proc-count) 0)
(er-vm-test "reset resets pid counter"
(er-vm-dispatch 134 (list er-vm-fn (list))) 0)
;; ── Phase 9f — hot-BIF dispatch table ───────────────────────────
;; Each opcode skips the registry lookup and calls the underlying
;; er-bif-* directly. Verify each returns the same result as going
;; through er-apply-bif.
(er-vm-test "BIF_LENGTH opcode by id"
(get (er-vm-lookup-opcode-by-id 136) :name) "OP_BIF_LENGTH")
(er-vm-test "BIF_LENGTH on 3-cons"
(er-vm-dispatch 136
(list (er-mk-cons 1 (er-mk-cons 2 (er-mk-cons 3 (er-mk-nil))))))
3)
(er-vm-test "BIF_HD on cons"
(er-vm-dispatch 137 (list (er-mk-cons 99 (er-mk-nil)))) 99)
(er-vm-test "BIF_TL is cons"
(er-cons? (er-vm-dispatch 138
(list (er-mk-cons 1 (er-mk-cons 2 (er-mk-nil)))))) true)
(er-vm-test "BIF_ELEMENT pulls index"
(er-vm-dispatch 139 (list 2 (er-mk-tuple (list "a" "b" "c")))) "b")
(er-vm-test "BIF_TUPLE_SIZE on 4-tuple"
(er-vm-dispatch 140 (list (er-mk-tuple (list 1 2 3 4)))) 4)
(er-vm-test "BIF_LISTS_REVERSE preserves elements"
(er-list-length (er-vm-dispatch 141
(list (er-mk-cons 1 (er-mk-cons 2 (er-mk-cons 3 (er-mk-nil))))))) 3)
(er-vm-test "BIF_LISTS_REVERSE actually reverses"
(get (er-vm-dispatch 141
(list (er-mk-cons 1 (er-mk-cons 2 (er-mk-cons 3 (er-mk-nil)))))) :head) 3)
(er-vm-test "BIF_IS_INTEGER true on int"
(get (er-vm-dispatch 142 (list 42)) :name) "true")
(er-vm-test "BIF_IS_INTEGER false on float"
(get (er-vm-dispatch 142 (list 3.14)) :name) "false")
(er-vm-test "BIF_IS_ATOM true"
(get (er-vm-dispatch 143 (list (er-mk-atom "ok"))) :name) "true")
(er-vm-test "BIF_IS_ATOM false on int"
(get (er-vm-dispatch 143 (list 7)) :name) "false")
(er-vm-test "BIF_IS_LIST true on cons"
(get (er-vm-dispatch 144
(list (er-mk-cons 1 (er-mk-nil)))) :name) "true")
(er-vm-test "BIF_IS_LIST true on nil"
(get (er-vm-dispatch 144 (list (er-mk-nil))) :name) "true")
(er-vm-test "BIF_IS_LIST false on tuple"
(get (er-vm-dispatch 144 (list (er-mk-tuple (list)))) :name) "false")
(er-vm-test "BIF_IS_TUPLE true"
(get (er-vm-dispatch 145 (list (er-mk-tuple (list 1)))) :name) "true")
(er-vm-test "BIF_IS_TUPLE false on int"
(get (er-vm-dispatch 145 (list 5)) :name) "false")
;; Sanity: total opcode count grew (3 patterns + perform + handle +
;; receive-scan + spawn + send + 10 hot-BIFs = 16+ registered).
(er-vm-test "opcode list has 16+"
(>= (len (er-vm-list-opcodes)) 16) true)
;; ── Phase 9i — host opcode-id resolution ────────────────────────
;; Requires a binary with the erlang_ext extension registered (9h).
;; The loop runs conformance against exactly that binary.
(er-vm-test "host id: OP_PATTERN_TUPLE = 222"
(er-vm-host-opcode-id "erlang.OP_PATTERN_TUPLE") 222)
(er-vm-test "host id: OP_BIF_IS_TUPLE = 239"
(er-vm-host-opcode-id "erlang.OP_BIF_IS_TUPLE") 239)
(er-vm-test "host id: unknown name -> nil"
(er-vm-host-opcode-id "erlang.OP_NOPE") nil)
(er-vm-test "effective id prefers host when present"
(er-vm-effective-opcode-id "erlang.OP_BIF_LENGTH" 136) 230)
(er-vm-test "effective id falls back to stub on nil"
(er-vm-effective-opcode-id "erlang.OP_NOPE" 999) 999)
;; The full erlang.OP_* namespace resolves to the contiguous 222-239 block.
(er-vm-test "host ids contiguous 222..239"
(let ((names (list "erlang.OP_PATTERN_TUPLE" "erlang.OP_PATTERN_LIST"
"erlang.OP_PATTERN_BINARY" "erlang.OP_PERFORM"
"erlang.OP_HANDLE" "erlang.OP_RECEIVE_SCAN"
"erlang.OP_SPAWN" "erlang.OP_SEND"
"erlang.OP_BIF_LENGTH" "erlang.OP_BIF_HD"
"erlang.OP_BIF_TL" "erlang.OP_BIF_ELEMENT"
"erlang.OP_BIF_TUPLE_SIZE" "erlang.OP_BIF_LISTS_REVERSE"
"erlang.OP_BIF_IS_INTEGER" "erlang.OP_BIF_IS_ATOM"
"erlang.OP_BIF_IS_LIST" "erlang.OP_BIF_IS_TUPLE"))
(ok (list true)))
(for-each
(fn (i)
(when (not (= (er-vm-host-opcode-id (nth names i)) (+ 222 i)))
(set-nth! ok 0 false)))
(range 0 (len names)))
(nth ok 0))
true)
(define er-vm-test-summary (str "vm " er-vm-test-pass "/" er-vm-test-count))

View File

@@ -669,96 +669,23 @@
(define
er-apply-bif
(fn
(name vs)
(cond
(= name "is_integer") (er-bif-is-integer vs)
(= name "is_atom") (er-bif-is-atom vs)
(= name "is_list") (er-bif-is-list vs)
(= name "is_tuple") (er-bif-is-tuple vs)
(= name "is_number") (er-bif-is-number vs)
(= name "is_float") (er-bif-is-float vs)
(= name "is_boolean") (er-bif-is-boolean vs)
(= name "length") (er-bif-length vs)
(= name "hd") (er-bif-hd vs)
(= name "tl") (er-bif-tl vs)
(= name "element") (er-bif-element vs)
(= name "tuple_size") (er-bif-tuple-size vs)
(= name "atom_to_list") (er-bif-atom-to-list vs)
(= name "list_to_atom") (er-bif-list-to-atom vs)
(= name "is_pid") (er-bif-is-pid vs)
(= name "is_reference") (er-bif-is-reference vs)
(= name "is_binary") (er-bif-is-binary vs)
(= name "byte_size") (er-bif-byte-size vs)
(= name "abs") (er-bif-abs vs)
(= name "min") (er-bif-min vs)
(= name "max") (er-bif-max vs)
(= name "tuple_to_list") (er-bif-tuple-to-list vs)
(= name "list_to_tuple") (er-bif-list-to-tuple vs)
(= name "integer_to_list") (er-bif-integer-to-list vs)
(= name "list_to_integer") (er-bif-list-to-integer vs)
(= name "is_function") (er-bif-is-function vs)
(= name "self") (er-bif-self vs)
(= name "spawn") (er-bif-spawn vs)
(= name "exit") (er-bif-exit vs)
(= name "make_ref") (er-bif-make-ref vs)
(= name "link") (er-bif-link vs)
(= name "unlink") (er-bif-unlink vs)
(= name "monitor") (er-bif-monitor vs)
(= name "demonitor") (er-bif-demonitor vs)
(= name "process_flag") (er-bif-process-flag vs)
(= name "register") (er-bif-register vs)
(= name "unregister") (er-bif-unregister vs)
(= name "whereis") (er-bif-whereis vs)
(= name "registered") (er-bif-registered vs)
(= name "throw") (raise (er-mk-throw-marker (er-bif-arg1 vs "throw")))
(= name "error") (raise (er-mk-error-marker (er-bif-arg1 vs "error")))
:else (error
(str "Erlang: undefined function '" name "/" (len vs) "'")))))
(fn (name vs)
(let ((entry (er-lookup-bif "erlang" name (len vs))))
(if (not (= entry nil))
((get entry :fn) vs)
(error (str "Erlang: undefined function '" name "/" (len vs) "'"))))))
(define
er-apply-remote-bif
(fn
(mod name vs)
(fn (mod name vs)
(cond
(dict-has? (er-modules-get) mod)
(er-apply-user-module mod name vs)
(= mod "lists") (er-apply-lists-bif name vs)
(= mod "io") (er-apply-io-bif name vs)
(= mod "erlang") (er-apply-bif name vs)
(= mod "ets") (er-apply-ets-bif name vs)
:else (error
(str "Erlang: undefined module '" mod "'")))))
(define
er-apply-lists-bif
(fn
(name vs)
(cond
(= name "reverse") (er-bif-lists-reverse vs)
(= name "map") (er-bif-lists-map vs)
(= name "foldl") (er-bif-lists-foldl vs)
(= name "seq") (er-bif-lists-seq vs)
(= name "sum") (er-bif-lists-sum vs)
(= name "nth") (er-bif-lists-nth vs)
(= name "last") (er-bif-lists-last vs)
(= name "member") (er-bif-lists-member vs)
(= name "append") (er-bif-lists-append vs)
(= name "filter") (er-bif-lists-filter vs)
(= name "any") (er-bif-lists-any vs)
(= name "all") (er-bif-lists-all vs)
(= name "duplicate") (er-bif-lists-duplicate vs)
:else (error
(str "Erlang: undefined 'lists:" name "/" (len vs) "'")))))
(define
er-apply-io-bif
(fn
(name vs)
(cond
(= name "format") (er-bif-io-format vs)
:else (error
(str "Erlang: undefined 'io:" name "/" (len vs) "'")))))
(er-apply-user-module mod name vs)
:else
(let ((entry (er-lookup-bif mod name (len vs))))
(if (not (= entry nil))
((get entry :fn) vs)
(error (str "Erlang: undefined remote function '" mod ":" name "/" (len vs) "'")))))))
(define
er-bif-arg1
@@ -1911,3 +1838,180 @@
(fn (_) (set! out (er-mk-cons v out)))
(range 0 n))
out))))
;; ── code module (Phase 7 hot-reload) ─────────────────────────────
(define er-source-walk-bytes!
(fn (n bytes-box)
(cond
(er-nil? n) true
(er-cons? n)
(let ((h (get n :head)))
(cond
(= (type-of h) "number")
(do (append! (nth bytes-box 0) h)
(er-source-walk-bytes! (get n :tail) bytes-box))
:else (do (set-nth! bytes-box 0 nil) false)))
:else (do (set-nth! bytes-box 0 nil) false))))
(define er-source-to-string
(fn (v)
(cond
(= (type-of v) "string") v
(er-binary? v) (list->string (map integer->char (get v :bytes)))
(or (er-nil? v) (er-cons? v))
(let ((box (list (list))))
(er-source-walk-bytes! v box)
(cond
(= (nth box 0) nil) nil
:else (list->string (map integer->char (nth box 0)))))
:else nil)))
(define er-bif-code-load-binary
(fn (vs)
(let ((mod-arg (nth vs 0)) (src-arg (nth vs 2)))
(cond
(not (er-atom? mod-arg))
(er-mk-tuple (list (er-mk-atom "error") (er-mk-atom "badarg")))
:else
(let ((src-str (er-source-to-string src-arg)))
(cond
(= src-str nil)
(er-mk-tuple (list (er-mk-atom "error") (er-mk-atom "badarg")))
:else
(let ((result-box (list nil)) (failed-box (list false)))
(guard
(c (:else (set-nth! failed-box 0 true)))
(set-nth! result-box 0 (erlang-load-module src-str)))
(cond
(nth failed-box 0)
(er-mk-tuple
(list (er-mk-atom "error") (er-mk-atom "badfile")))
(not (= (get (nth result-box 0) :name) (get mod-arg :name)))
(er-mk-tuple
(list (er-mk-atom "error") (er-mk-atom "module_name_mismatch")))
:else
(er-mk-tuple (list (er-mk-atom "module") mod-arg))))))))))
(define er-env-derived-from?
(fn (env target-env)
;; Object-identity check, NOT value `=`. On evaluators where dict `=`
;; is structural/deep, comparing closure envs (which are large and
;; cyclic — a module fun's env references the fun) does not terminate.
;; `identical?` is pointer identity on every host and is the actual
;; intended semantics: "is this the same env object".
(cond
(identical? env target-env) true
:else
(let ((ks (keys env)) (found-ref (list false)))
(for-each
(fn (i)
(when (not (nth found-ref 0))
(let ((v (get env (nth ks i))))
(when (and (er-fun? v) (identical? (get v :env) target-env))
(set-nth! found-ref 0 true)))))
(range 0 (len ks)))
(nth found-ref 0)))))
(define er-procs-on-env
(fn (target-env)
(let ((all-keys (keys (er-sched-processes)))
(matches (list)))
(for-each
(fn (i)
(let ((proc (get (er-sched-processes) (nth all-keys i))))
(let ((init-fun (get proc :initial-fun)))
(when (and (not (= init-fun nil))
(er-fun? init-fun)
(er-env-derived-from? (get init-fun :env) target-env)
(not (= (get proc :state) "dead")))
(append! matches (get proc :pid))))))
(range 0 (len all-keys)))
matches)))
(define er-bif-code-purge
(fn (vs)
(let ((mod-arg (nth vs 0)))
(cond
(not (er-atom? mod-arg))
(raise (er-mk-error-marker (er-mk-atom "badarg")))
:else
(let ((registry (er-modules-get)) (mod-name (get mod-arg :name)))
(cond
(not (dict-has? registry mod-name)) (er-mk-atom "false")
:else
(let ((slot (get registry mod-name)))
(cond
(= (er-module-old-env slot) nil) (er-mk-atom "false")
:else
(let ((procs (er-procs-on-env (er-module-old-env slot))))
(for-each
(fn (i) (er-cascade-exit! (nth procs i) (er-mk-atom "killed")))
(range 0 (len procs)))
(dict-set! registry mod-name
(er-mk-module-slot (er-module-current-env slot) nil
(er-module-version slot)))
(er-mk-atom "true"))))))))))
(define er-bif-code-soft-purge
(fn (vs)
(let ((mod-arg (nth vs 0)))
(cond
(not (er-atom? mod-arg))
(raise (er-mk-error-marker (er-mk-atom "badarg")))
:else
(let ((registry (er-modules-get)) (mod-name (get mod-arg :name)))
(cond
(not (dict-has? registry mod-name)) (er-mk-atom "true")
:else
(let ((slot (get registry mod-name)))
(cond
(= (er-module-old-env slot) nil) (er-mk-atom "true")
:else
(let ((procs (er-procs-on-env (er-module-old-env slot))))
(cond
(> (len procs) 0) (er-mk-atom "false")
:else
(do
(dict-set! registry mod-name
(er-mk-module-slot (er-module-current-env slot) nil
(er-module-version slot)))
(er-mk-atom "true"))))))))))))
(define er-bif-code-which
(fn (vs)
(let ((mod-arg (nth vs 0)))
(cond
(not (er-atom? mod-arg))
(raise (er-mk-error-marker (er-mk-atom "badarg")))
(dict-has? (er-modules-get) (get mod-arg :name))
(er-mk-atom "loaded")
:else (er-mk-atom "non_existing")))))
(define er-bif-code-is-loaded
(fn (vs)
(let ((mod-arg (nth vs 0)))
(cond
(not (er-atom? mod-arg))
(raise (er-mk-error-marker (er-mk-atom "badarg")))
(dict-has? (er-modules-get) (get mod-arg :name))
(er-mk-tuple (list (er-mk-atom "file") (er-mk-atom "loaded")))
:else (er-mk-atom "false")))))
(define er-bif-code-all-loaded
(fn (vs)
(let ((registry (er-modules-get))
(ks (keys (er-modules-get)))
(out (er-mk-nil)))
(for-each
(fn (i)
(let ((k (nth ks (- (- (len ks) 1) i))))
(set! out
(er-mk-cons
(er-mk-tuple
(list (er-mk-atom k) (er-mk-atom "loaded")))
out))))
(range 0 (len ks)))
out)))

313
lib/erlang/vm/dispatcher.sx Normal file
View File

@@ -0,0 +1,313 @@
;; Erlang VM — stub opcode dispatcher (Phase 9).
;;
;; Mimics the OCaml-side EXTENSION shape from
;; plans/sx-vm-opcode-extension.md so opcodes 9b-9g can be designed
;; and tested in SX before 9a (`hosts/ocaml/`) lands the real
;; registration plumbing. When 9a is available, these stubs become
;; the cross-host SX-side mirror of the C/OCaml handlers and the
;; bytecode compiler emits them directly.
;;
;; Opcode IDs follow the plan's tier partition:
;; 0-127 reserved for SX core
;; 128-199 guest extensions (e.g. erlang, lua)
;; 200-247 port-/platform-specific
;;
;; Erlang owns 128-159 for now.
(define er-vm-opcodes (list {}))
(define er-vm-opcodes-get (fn () (nth er-vm-opcodes 0)))
(define
er-vm-opcodes-reset!
(fn () (set-nth! er-vm-opcodes 0 {})))
(define
er-vm-register-opcode!
(fn
(id name handler)
(dict-set! (er-vm-opcodes-get) (str id) {:name name :id id :handler handler})
(er-mk-atom "ok")))
(define
er-vm-lookup-opcode-by-id
(fn
(id)
(let
((reg (er-vm-opcodes-get)) (k (str id)))
(if (dict-has? reg k) (get reg k) nil))))
(define
er-vm-lookup-opcode-by-name
(fn
(name)
(let
((reg (er-vm-opcodes-get))
(ks (keys (er-vm-opcodes-get)))
(found (list nil)))
(for-each
(fn
(i)
(let
((entry (get reg (nth ks i))))
(when
(= (get entry :name) name)
(set-nth! found 0 entry))))
(range 0 (len ks)))
(nth found 0))))
(define er-vm-list-opcodes (fn () (keys (er-vm-opcodes-get))))
;; ── Phase 9i — host opcode-id resolution ────────────────────────
;; When the OCaml `erlang_ext` extension is registered (Phase 9h), the
;; runtime exposes `extension-opcode-id` which maps an "erlang.OP_*"
;; name to the host-assigned id (222-239). We consult it so the SX
;; side and the OCaml side agree on ids; when it returns nil (name not
;; registered) we fall back to the stub-local id.
;;
;; NOTE: this requires a binary with the VM extension mechanism (the
;; vm-ext phase-A..E cherry-pick + Sx_vm_extensions force-link). The
;; loop builds and runs against exactly that binary
;; (hosts/ocaml/_build/default/bin/sx_server.exe). `extension-opcode-id`
;; resolves lazily at call time, so merely loading this file is safe;
;; only invoking the resolver on a binary that lacks the primitive
;; would raise.
(define er-vm-host-opcode-id
(fn (ext-name)
(extension-opcode-id ext-name)))
(define er-vm-effective-opcode-id
(fn (ext-name stub-id)
(let ((host (extension-opcode-id ext-name)))
(cond
(= host nil) stub-id
:else host))))
(define
er-vm-dispatch
(fn
(id operands)
(let
((entry (er-vm-lookup-opcode-by-id id)))
(if
(= entry nil)
(error (str "Erlang VM: unknown opcode id " id))
((get entry :handler) operands)))))
(define
er-vm-dispatch-by-name
(fn
(name operands)
(let
((entry (er-vm-lookup-opcode-by-name name)))
(if
(= entry nil)
(error (str "Erlang VM: unknown opcode name '" name "'"))
((get entry :handler) operands)))))
;; ── Phase 9c — effect opcodes (perform / handle) ────────────────
;; Stub algebraic-effects-style operators. OP_PERFORM raises a tagged
;; exception; OP_HANDLE wraps a thunk in `guard` and catches matching
;; effects, passing the args to the handler. The real specialization
;; (constant-time effect dispatch, single-shot vs multi-shot continuations)
;; lands when 9a integrates.
(define er-vm-effect-marker?
(fn (c effect-name)
(and (= (type-of c) "dict")
(= (get c :tag) "vm-effect")
(= (get c :effect) effect-name))))
(define er-vm-op-perform
(fn (operands)
(raise {:tag "vm-effect" :effect (nth operands 0) :args (nth operands 1)})))
(define er-vm-op-handle
(fn (operands)
(let ((thunk (nth operands 0))
(effect-name (nth operands 1))
(handler (nth operands 2))
(result (list nil))
(caught (list false))
(rethrow (list nil)))
(guard
(c
(:else
(cond
(er-vm-effect-marker? c effect-name)
(do (set-nth! caught 0 true)
(set-nth! result 0 (handler (get c :args))))
:else (set-nth! rethrow 0 c))))
(set-nth! result 0 (thunk)))
(cond
(not (= (nth rethrow 0) nil)) (raise (nth rethrow 0))
:else (nth result 0)))))
;; ── Phase 9d — receive scan opcode ────────────────────────────
;; Selective receive primitive. Scans a mailbox value-list in arrival
;; order; for each value, tries each clause's pattern (binding into
;; env on success); on match returns `{:matched true :index N :body B}`
;; — the caller decides what to do with the index (queue-delete) and
;; the body (eval in the now-mutated env). On miss returns
;; `{:matched false}`, the caller arranges suspension (via OP_PERFORM).
;;
;; Operands: (clauses mbox-list env)
;; clauses — list of {:pattern :guards :body} dicts
;; mbox-list — SX list of message values
;; env — env dict (mutated on match)
(define er-vm-receive-try-clauses
(fn (clauses msg env i)
(cond
(>= i (len clauses)) {:matched false}
:else
(let ((c (nth clauses i)) (snap (er-env-copy env)))
(cond
(and
(er-match! (get c :pattern) msg env)
(er-eval-guards (get c :guards) env))
{:matched true :body (get c :body)}
:else
(do (er-env-restore! env snap)
(er-vm-receive-try-clauses clauses msg env (+ i 1))))))))
(define er-vm-receive-scan-loop
(fn (clauses mbox env i)
(cond
(>= i (len mbox)) {:matched false}
:else
(let ((msg (nth mbox i))
(cr (er-vm-receive-try-clauses clauses msg env 0)))
(cond
(get cr :matched) {:matched true :index i :body (get cr :body)}
:else (er-vm-receive-scan-loop clauses mbox env (+ i 1)))))))
(define er-vm-op-receive-scan
(fn (operands)
(er-vm-receive-scan-loop (nth operands 0) (nth operands 1) (nth operands 2) 0)))
;; ── Phase 9e — spawn / send + lightweight scheduler ─────────────
;; Stub register-machine process layout for the eventual fast scheduler.
;; A VM-process is `{:id :registers :mailbox :state :initial-fn :initial-args}`.
;; Registers is a vector (SX list, mutated via set-nth!) — fixed slot count
;; per process so cells don't grow during execution. Mailbox is an SX list.
;; State is one of "runnable" / "waiting" / "dead". This sits PARALLEL to
;; the existing `er-scheduler` (which is the language-level scheduler) —
;; the VM scheduler will eventually take over once 9a integrates and
;; bytecode-compiled Erlang runs against it.
(define er-vm-procs (list {}))
(define er-vm-procs-get (fn () (nth er-vm-procs 0)))
(define er-vm-procs-reset!
(fn () (do (set-nth! er-vm-procs 0 {}) (set-nth! er-vm-next-pid 0 0))))
(define er-vm-next-pid (list 0))
(define er-vm-proc-new!
(fn (initial-fn initial-args)
(let ((pid (nth er-vm-next-pid 0)))
(set-nth! er-vm-next-pid 0 (+ pid 1))
(let ((proc
{:id pid
:registers (list nil nil nil nil nil nil nil nil)
:mailbox (list)
:state "runnable"
:initial-fn initial-fn
:initial-args initial-args}))
(dict-set! (er-vm-procs-get) (str pid) proc)
pid))))
(define er-vm-proc-get (fn (pid) (get (er-vm-procs-get) (str pid))))
(define er-vm-proc-send!
(fn (pid msg)
(let ((proc (er-vm-proc-get pid)))
(cond
(= proc nil) false
:else
(do
(dict-set! proc :mailbox (append (get proc :mailbox) (list msg)))
(when (= (get proc :state) "waiting")
(dict-set! proc :state "runnable"))
true)))))
(define er-vm-proc-mailbox (fn (pid) (get (er-vm-proc-get pid) :mailbox)))
(define er-vm-proc-state (fn (pid) (get (er-vm-proc-get pid) :state)))
(define er-vm-proc-count (fn () (len (keys (er-vm-procs-get)))))
(define er-vm-op-spawn
(fn (operands)
(er-vm-proc-new! (nth operands 0) (nth operands 1))))
(define er-vm-op-send
(fn (operands)
(er-vm-proc-send! (nth operands 0) (nth operands 1))))
;; ── Phase 9f — hot-BIF dispatch table ──────────────────────────
;; Specialized opcodes for the BIFs that the bytecode compiler emits
;; on hot call sites. The handler is the underlying `er-bif-*` impl
;; directly — same `(vs)` signature as the dispatcher uses for
;; operands, so the cost is the opcode-id → handler hop with no
;; registry-key string lookup. Cold BIFs continue going through the
;; general path (`er-apply-bif` / `er-lookup-bif`).
;;
;; Opcodes 136-159 reserved for hot BIFs.
;; ── Phase 9b — pattern-match opcodes ────────────────────────────
;; Each handler takes a list (pattern-ast value env) and returns
;; true/false, mutating env on success (same contract as the
;; existing er-match-tuple / er-match-cons / er-match-binary).
;; Wire these as wrappers for now; the real opcodes will eventually
;; have register-machine semantics and skip the AST-walk overhead.
(define
er-vm-register-erlang-opcodes!
(fn
()
(er-vm-register-opcode!
128
"OP_PATTERN_TUPLE"
(fn
(operands)
(er-match-tuple
(nth operands 0)
(nth operands 1)
(nth operands 2))))
(er-vm-register-opcode!
129
"OP_PATTERN_LIST"
(fn
(operands)
(er-match-cons
(nth operands 0)
(nth operands 1)
(nth operands 2))))
(er-vm-register-opcode!
130
"OP_PATTERN_BINARY"
(fn
(operands)
(er-match-binary
(nth operands 0)
(nth operands 1)
(nth operands 2))))
(er-vm-register-opcode! 131 "OP_PERFORM" er-vm-op-perform)
(er-vm-register-opcode! 132 "OP_HANDLE" er-vm-op-handle)
(er-vm-register-opcode! 133 "OP_RECEIVE_SCAN" er-vm-op-receive-scan)
(er-vm-register-opcode! 134 "OP_SPAWN" er-vm-op-spawn)
(er-vm-register-opcode! 135 "OP_SEND" er-vm-op-send)
;; Phase 9f — hot BIFs
(er-vm-register-opcode! 136 "OP_BIF_LENGTH" er-bif-length)
(er-vm-register-opcode! 137 "OP_BIF_HD" er-bif-hd)
(er-vm-register-opcode! 138 "OP_BIF_TL" er-bif-tl)
(er-vm-register-opcode! 139 "OP_BIF_ELEMENT" er-bif-element)
(er-vm-register-opcode! 140 "OP_BIF_TUPLE_SIZE" er-bif-tuple-size)
(er-vm-register-opcode! 141 "OP_BIF_LISTS_REVERSE" er-bif-lists-reverse)
(er-vm-register-opcode! 142 "OP_BIF_IS_INTEGER" er-bif-is-integer)
(er-vm-register-opcode! 143 "OP_BIF_IS_ATOM" er-bif-is-atom)
(er-vm-register-opcode! 144 "OP_BIF_IS_LIST" er-bif-is-list)
(er-vm-register-opcode! 145 "OP_BIF_IS_TUPLE" er-bif-is-tuple)
(er-mk-atom "ok")))
(er-vm-register-erlang-opcodes!)

133
lib/go/conformance.sh Executable file
View File

@@ -0,0 +1,133 @@
#!/usr/bin/env bash
# Go-on-SX conformance runner.
#
# Loads every Go-on-SX test suite via the epoch protocol, collects
# pass/fail counts, and writes lib/go/scoreboard.json + .md.
#
# Usage:
# bash lib/go/conformance.sh # run all suites
# bash lib/go/conformance.sh -v # verbose per-suite
set -uo pipefail
cd "$(git rev-parse --show-toplevel)"
SX_SERVER="${SX_SERVER:-hosts/ocaml/_build/default/bin/sx_server.exe}"
if [ ! -x "$SX_SERVER" ]; then
SX_SERVER="/root/rose-ash/hosts/ocaml/_build/default/bin/sx_server.exe"
fi
if [ ! -x "$SX_SERVER" ]; then
echo "ERROR: sx_server.exe not found." >&2
exit 1
fi
VERBOSE="${1:-}"
TMPFILE=$(mktemp)
OUTFILE=$(mktemp)
trap "rm -f $TMPFILE $OUTFILE" EXIT
# Each suite: name | pass-counter | total-counter
SUITES=(
"lex|go-test-pass|go-test-count"
)
cat > "$TMPFILE" <<'EPOCHS'
(epoch 1)
(load "lib/guest/lex.sx")
(load "lib/go/lex.sx")
(load "lib/go/tests/lex.sx")
EPOCHS
idx=0
for entry in "${SUITES[@]}"; do
name="${entry%%|*}"
pass_var=$(echo "$entry" | awk -F'|' '{print $2}')
total_var=$(echo "$entry" | awk -F'|' '{print $3}')
epoch=$((100 + idx))
echo "(epoch $epoch)" >> "$TMPFILE"
echo "(eval \"(list $pass_var $total_var)\")" >> "$TMPFILE"
idx=$((idx + 1))
done
"$SX_SERVER" < "$TMPFILE" > "$OUTFILE" 2>&1
parse_pair() {
local epoch="$1"
local line
line=$(grep -A1 "^(ok-len $epoch " "$OUTFILE" | tail -1)
echo "$line" | sed -E 's/[()]//g'
}
TOTAL_PASS=0
TOTAL_COUNT=0
JSON_SUITES=""
MD_ROWS=""
idx=0
for entry in "${SUITES[@]}"; do
name="${entry%%|*}"
epoch=$((100 + idx))
pair=$(parse_pair "$epoch")
pass=$(echo "$pair" | awk '{print $1}')
count=$(echo "$pair" | awk '{print $2}')
if [ -z "$pass" ] || [ -z "$count" ]; then
pass=0
count=0
fi
TOTAL_PASS=$((TOTAL_PASS + pass))
TOTAL_COUNT=$((TOTAL_COUNT + count))
status="ok"
marker="✅"
if [ "$pass" != "$count" ]; then
status="fail"
marker="❌"
fi
if [ "$VERBOSE" = "-v" ]; then
printf " %-12s %s/%s\n" "$name" "$pass" "$count"
fi
if [ -n "$JSON_SUITES" ]; then JSON_SUITES+=","; fi
JSON_SUITES+=$'\n '
JSON_SUITES+="{\"name\":\"$name\",\"pass\":$pass,\"total\":$count,\"status\":\"$status\"}"
MD_ROWS+="| $marker | $name | $pass | $count |"$'\n'
idx=$((idx + 1))
done
printf '\nGo-on-SX conformance: %d / %d\n' "$TOTAL_PASS" "$TOTAL_COUNT"
cat > lib/go/scoreboard.json <<JSON
{
"language": "go",
"total_pass": $TOTAL_PASS,
"total": $TOTAL_COUNT,
"suites": [$JSON_SUITES,
{"name":"parse","pass":0,"total":0,"status":"pending"},
{"name":"types","pass":0,"total":0,"status":"pending"},
{"name":"eval","pass":0,"total":0,"status":"pending"},
{"name":"runtime","pass":0,"total":0,"status":"pending"},
{"name":"stdlib","pass":0,"total":0,"status":"pending"},
{"name":"e2e","pass":0,"total":0,"status":"pending"}
]
}
JSON
cat > lib/go/scoreboard.md <<MD
# Go-on-SX Scoreboard
**Total: ${TOTAL_PASS} / ${TOTAL_COUNT} tests passing**
| | Suite | Pass | Total |
|---|---|---|---|
$MD_ROWS|| parse | 0 | 0 |
|| types | 0 | 0 |
|| eval | 0 | 0 |
|| runtime | 0 | 0 |
|| stdlib | 0 | 0 |
|| e2e | 0 | 0 |
Generated by \`lib/go/conformance.sh\`.
MD
if [ "$TOTAL_PASS" -eq "$TOTAL_COUNT" ]; then
exit 0
else
exit 1
fi

371
lib/go/lex.sx Normal file
View File

@@ -0,0 +1,371 @@
;; lib/go/lex.sx — Go tokenizer with automatic semicolon insertion.
;;
;; Consumes lib/guest/lex.sx character-class predicates.
;;
;; Tokens: {:type T :value V :pos P}
;; Types:
;; "ident" — identifiers (foo, _bar, mixedCase)
;; "keyword" — one of the 25 Go keywords
;; "int" — integer literals (decimal only this iteration)
;; "string" — interpreted string literals "..."
;; "rune" — rune literals 'x' (single char + simple escapes)
;; "op" — operators & punctuation; :value is the literal text
;; "semi" — explicit ';' or auto-inserted (Go spec § Semicolons)
;; "eof" — end-of-input sentinel
;;
;; ASI (Go spec § Semicolons): a newline (or EOF, or a block comment
;; containing a newline) emits a ";semi" if the previous emitted token's
;; type is ident/int/string/rune, or its value is one of
;; {break, continue, fallthrough, return, ++, --, ), ], }}.
;;
;; All scanner locals are gl- prefixed: SX host primitives (peek/emit/etc.)
;; silently shadow guest-language defines. See feedback_sx_bind_clash.
(define
go-keywords
(list
"break"
"case"
"chan"
"const"
"continue"
"default"
"defer"
"else"
"fallthrough"
"for"
"func"
"go"
"goto"
"if"
"import"
"interface"
"map"
"package"
"range"
"return"
"select"
"struct"
"switch"
"type"
"var"))
(define go-keyword? (fn (s) (some (fn (k) (= k s)) go-keywords)))
(define go-asi-keywords (list "break" "continue" "fallthrough" "return"))
(define go-asi-ops (list "++" "--" ")" "]" "}"))
(define
go-asi-trigger?
(fn
(tok)
(if
(= tok nil)
false
(let
((ty (get tok :type)) (v (get tok :value)))
(or
(= ty "ident")
(= ty "int")
(= ty "string")
(= ty "rune")
(and (= ty "keyword") (some (fn (k) (= k v)) go-asi-keywords))
(and (= ty "op") (some (fn (o) (= o v)) go-asi-ops)))))))
(define
go-tokenize
(fn
(src)
(let
((tokens (list)) (pos 0) (src-len (len src)))
(define
gl-peek
(fn
(offset)
(if (< (+ pos offset) src-len) (nth src (+ pos offset)) nil)))
(define gl-cur (fn () (gl-peek 0)))
(define gl-advance! (fn (n) (set! pos (+ pos n))))
(define
gl-last
(fn
()
(if
(= (len tokens) 0)
nil
(nth tokens (- (len tokens) 1)))))
(define gl-emit! (fn (type value start) (append! tokens {:type type :value value :pos start})))
(define
gl-maybe-asi!
(fn
(at)
(when (go-asi-trigger? (gl-last)) (gl-emit! "semi" "\n" at))))
(define
gl-skip-line!
(fn
()
(when
(and (< pos src-len) (not (= (gl-cur) "\n")))
(gl-advance! 1)
(gl-skip-line!))))
(define
gl-skip-block!
(fn
(saw-nl)
(cond
(>= pos src-len)
saw-nl
(and (= (gl-cur) "*") (= (gl-peek 1) "/"))
(do (gl-advance! 2) saw-nl)
:else (let
((is-nl (= (gl-cur) "\n")))
(gl-advance! 1)
(gl-skip-block! (or saw-nl is-nl))))))
(define
gl-read-ident!
(fn
(start)
(when
(and (< pos src-len) (lex-ident-char? (gl-cur)))
(gl-advance! 1)
(gl-read-ident! start))
(slice src start pos)))
(define
gl-read-digits!
(fn
()
(when
(and (< pos src-len) (lex-digit? (gl-cur)))
(gl-advance! 1)
(gl-read-digits!))))
(define
gl-read-string!
(fn
()
(gl-advance! 1)
(let
((chars (list)))
(define
gl-string-loop
(fn
()
(cond
(>= pos src-len)
nil
(= (gl-cur) "\"")
(gl-advance! 1)
(= (gl-cur) "\\")
(do
(gl-advance! 1)
(when
(< pos src-len)
(let
((ch (gl-cur)))
(cond
(= ch "n")
(append! chars "\n")
(= ch "t")
(append! chars "\t")
(= ch "r")
(append! chars "\r")
(= ch "\\")
(append! chars "\\")
(= ch "\"")
(append! chars "\"")
(= ch "'")
(append! chars "'")
:else (append! chars ch))
(gl-advance! 1)))
(gl-string-loop))
:else (do
(append! chars (gl-cur))
(gl-advance! 1)
(gl-string-loop)))))
(gl-string-loop)
(join "" chars))))
(define
gl-read-rune!
(fn
()
(gl-advance! 1)
(let
((chars (list)))
(cond
(and (< pos src-len) (= (gl-cur) "\\"))
(do
(gl-advance! 1)
(when
(< pos src-len)
(let
((ch (gl-cur)))
(cond
(= ch "n")
(append! chars "\n")
(= ch "t")
(append! chars "\t")
(= ch "r")
(append! chars "\r")
(= ch "\\")
(append! chars "\\")
(= ch "'")
(append! chars "'")
(= ch "\"")
(append! chars "\"")
:else (append! chars ch))
(gl-advance! 1))))
(< pos src-len)
(do (append! chars (gl-cur)) (gl-advance! 1)))
(when
(and (< pos src-len) (= (gl-cur) "'"))
(gl-advance! 1))
(join "" chars))))
(define
gl-match-op
(fn
()
(let
((c0 (gl-cur))
(c1 (gl-peek 1))
(c2 (gl-peek 2)))
(cond
(and (= c0 "<") (= c1 "<") (= c2 "="))
"<<="
(and (= c0 ">") (= c1 ">") (= c2 "="))
">>="
(and (= c0 "&") (= c1 "^") (= c2 "="))
"&^="
(and (= c0 ".") (= c1 ".") (= c2 "."))
"..."
(and (= c0 "=") (= c1 "="))
"=="
(and (= c0 "!") (= c1 "="))
"!="
(and (= c0 "<") (= c1 "="))
"<="
(and (= c0 ">") (= c1 "="))
">="
(and (= c0 "&") (= c1 "&"))
"&&"
(and (= c0 "|") (= c1 "|"))
"||"
(and (= c0 "+") (= c1 "+"))
"++"
(and (= c0 "-") (= c1 "-"))
"--"
(and (= c0 "<") (= c1 "<"))
"<<"
(and (= c0 ">") (= c1 ">"))
">>"
(and (= c0 "+") (= c1 "="))
"+="
(and (= c0 "-") (= c1 "="))
"-="
(and (= c0 "*") (= c1 "="))
"*="
(and (= c0 "/") (= c1 "="))
"/="
(and (= c0 "%") (= c1 "="))
"%="
(and (= c0 "&") (= c1 "="))
"&="
(and (= c0 "|") (= c1 "="))
"|="
(and (= c0 "^") (= c1 "="))
"^="
(and (= c0 ":") (= c1 "="))
":="
(and (= c0 "<") (= c1 "-"))
"<-"
(and (= c0 "&") (= c1 "^"))
"&^"
(or
(= c0 "+")
(= c0 "-")
(= c0 "*")
(= c0 "/")
(= c0 "%")
(= c0 "&")
(= c0 "|")
(= c0 "^")
(= c0 "<")
(= c0 ">")
(= c0 "=")
(= c0 "!")
(= c0 "(")
(= c0 ")")
(= c0 "{")
(= c0 "}")
(= c0 "[")
(= c0 "]")
(= c0 ",")
(= c0 ".")
(= c0 ":"))
c0
:else nil))))
(define
gl-scan!
(fn
()
(cond
(>= pos src-len)
nil
(= (gl-cur) "\n")
(do (gl-maybe-asi! pos) (gl-advance! 1) (gl-scan!))
(lex-space? (gl-cur))
(do (gl-advance! 1) (gl-scan!))
(and (= (gl-cur) "/") (= (gl-peek 1) "/"))
(do (gl-advance! 2) (gl-skip-line!) (gl-scan!))
(and (= (gl-cur) "/") (= (gl-peek 1) "*"))
(do
(gl-advance! 2)
(let
((saw-nl (gl-skip-block! false)))
(when saw-nl (gl-maybe-asi! pos)))
(gl-scan!))
(= (gl-cur) ";")
(do
(gl-emit! "semi" ";" pos)
(gl-advance! 1)
(gl-scan!))
(lex-ident-start? (gl-cur))
(do
(let
((start pos))
(gl-read-ident! start)
(let
((word (slice src start pos)))
(gl-emit!
(if (go-keyword? word) "keyword" "ident")
word
start)))
(gl-scan!))
(lex-digit? (gl-cur))
(do
(let
((start pos))
(gl-read-digits!)
(gl-emit! "int" (slice src start pos) start))
(gl-scan!))
(= (gl-cur) "\"")
(let
((start pos) (v (gl-read-string!)))
(gl-emit! "string" v start)
(gl-scan!))
(= (gl-cur) "'")
(let
((start pos) (v (gl-read-rune!)))
(gl-emit! "rune" v start)
(gl-scan!))
:else (let
((op (gl-match-op)))
(cond
op
(do
(gl-emit! "op" op pos)
(gl-advance! (len op))
(gl-scan!))
:else (do (gl-advance! 1) (gl-scan!)))))))
(gl-scan!)
(gl-maybe-asi! pos)
(gl-emit! "eof" nil pos)
tokens)))

14
lib/go/scoreboard.json Normal file
View File

@@ -0,0 +1,14 @@
{
"language": "go",
"total_pass": 78,
"total": 78,
"suites": [
{"name":"lex","pass":78,"total":78,"status":"ok"},
{"name":"parse","pass":0,"total":0,"status":"pending"},
{"name":"types","pass":0,"total":0,"status":"pending"},
{"name":"eval","pass":0,"total":0,"status":"pending"},
{"name":"runtime","pass":0,"total":0,"status":"pending"},
{"name":"stdlib","pass":0,"total":0,"status":"pending"},
{"name":"e2e","pass":0,"total":0,"status":"pending"}
]
}

15
lib/go/scoreboard.md Normal file
View File

@@ -0,0 +1,15 @@
# Go-on-SX Scoreboard
**Total: 78 / 78 tests passing**
| | Suite | Pass | Total |
|---|---|---|---|
| ✅ | lex | 78 | 78 |
| ⬜ | parse | 0 | 0 |
| ⬜ | types | 0 | 0 |
| ⬜ | eval | 0 | 0 |
| ⬜ | runtime | 0 | 0 |
| ⬜ | stdlib | 0 | 0 |
| ⬜ | e2e | 0 | 0 |
Generated by `lib/go/conformance.sh`.

204
lib/go/tests/lex.sx Normal file
View File

@@ -0,0 +1,204 @@
;; Go tokenizer tests.
(define go-test-count 0)
(define go-test-pass 0)
(define go-test-fails (list))
(define gtok-type (fn (t) (get t :type)))
(define gtok-value (fn (t) (get t :value)))
(define tok-types (fn (src) (map gtok-type (go-tokenize src))))
(define tok-values (fn (src) (map gtok-value (go-tokenize src))))
(define
go-test
(fn
(name actual expected)
(set! go-test-count (+ go-test-count 1))
(if
(= actual expected)
(set! go-test-pass (+ go-test-pass 1))
(append! go-test-fails {:name name :expected expected :actual actual}))))
;; ── empty / whitespace ────────────────────────────────────────────
(go-test "empty source" (tok-types "") (list "eof"))
(go-test "spaces only" (tok-types " ") (list "eof"))
(go-test "tabs only" (tok-types "\t\t") (list "eof"))
(go-test
"newline only — no prior token, no ASI"
(tok-types "\n")
(list "eof"))
;; ── identifiers ───────────────────────────────────────────────────
(go-test "ident: simple" (tok-values "foo") (list "foo" "\n" nil))
(go-test
"ident: underscore prefix"
(tok-values "_bar")
(list "_bar" "\n" nil))
(go-test "ident: mixed case" (tok-values "fooBar") (list "fooBar" "\n" nil))
(go-test "ident: with digits" (tok-values "x123") (list "x123" "\n" nil))
(go-test "ident: type tag" (tok-types "foo") (list "ident" "semi" "eof"))
;; ── keywords (all 25) ─────────────────────────────────────────────
(go-test "kw: break" (tok-types "break") (list "keyword" "semi" "eof"))
(go-test "kw: case" (tok-types "case") (list "keyword" "eof"))
(go-test "kw: chan" (tok-types "chan") (list "keyword" "eof"))
(go-test "kw: const" (tok-types "const") (list "keyword" "eof"))
(go-test "kw: continue" (tok-types "continue") (list "keyword" "semi" "eof"))
(go-test "kw: default" (tok-types "default") (list "keyword" "eof"))
(go-test "kw: defer" (tok-types "defer") (list "keyword" "eof"))
(go-test "kw: else" (tok-types "else") (list "keyword" "eof"))
(go-test
"kw: fallthrough"
(tok-types "fallthrough")
(list "keyword" "semi" "eof"))
(go-test "kw: for" (tok-types "for") (list "keyword" "eof"))
(go-test "kw: func" (tok-types "func") (list "keyword" "eof"))
(go-test "kw: go" (tok-types "go") (list "keyword" "eof"))
(go-test "kw: goto" (tok-types "goto") (list "keyword" "eof"))
(go-test "kw: if" (tok-types "if") (list "keyword" "eof"))
(go-test "kw: import" (tok-types "import") (list "keyword" "eof"))
(go-test "kw: interface" (tok-types "interface") (list "keyword" "eof"))
(go-test "kw: map" (tok-types "map") (list "keyword" "eof"))
(go-test "kw: package" (tok-types "package") (list "keyword" "eof"))
(go-test "kw: range" (tok-types "range") (list "keyword" "eof"))
(go-test "kw: return" (tok-types "return") (list "keyword" "semi" "eof"))
(go-test "kw: select" (tok-types "select") (list "keyword" "eof"))
(go-test "kw: struct" (tok-types "struct") (list "keyword" "eof"))
(go-test "kw: switch" (tok-types "switch") (list "keyword" "eof"))
(go-test "kw: type" (tok-types "type") (list "keyword" "eof"))
(go-test "kw: var" (tok-types "var") (list "keyword" "eof"))
;; ── integer literals ──────────────────────────────────────────────
(go-test "int: zero" (tok-values "0") (list "0" "\n" nil))
(go-test "int: small" (tok-values "42") (list "42" "\n" nil))
(go-test "int: bigger" (tok-values "123456") (list "123456" "\n" nil))
(go-test "int: type" (tok-types "42") (list "int" "semi" "eof"))
;; ── string literals ───────────────────────────────────────────────
(go-test "string: empty" (tok-values "\"\"") (list "" "\n" nil))
(go-test "string: hello" (tok-values "\"hello\"") (list "hello" "\n" nil))
(go-test
"string: with space"
(tok-values "\"hi there\"")
(list "hi there" "\n" nil))
(go-test "string: escape n" (tok-values "\"a\\nb\"") (list "a\nb" "\n" nil))
(go-test "string: escape quote" (tok-values "\"a\\\"b\"") (list "a\"b" "\n" nil))
(go-test
"string: escape backslash"
(tok-values "\"a\\\\b\"")
(list "a\\b" "\n" nil))
(go-test "string: type" (tok-types "\"x\"") (list "string" "semi" "eof"))
;; ── rune literals ─────────────────────────────────────────────────
(go-test "rune: simple" (tok-values "'a'") (list "a" "\n" nil))
(go-test "rune: escape" (tok-values "'\\n'") (list "\n" "\n" nil))
(go-test "rune: type" (tok-types "'a'") (list "rune" "semi" "eof"))
;; ── comments ──────────────────────────────────────────────────────
(go-test "line comment" (tok-types "// ignored") (list "eof"))
(go-test "line comment then code" (tok-values "// hi\nx") (list "x" "\n" nil))
(go-test "block comment" (tok-types "/* a b c */") (list "eof"))
(go-test
"block comment inline"
(tok-values "x /* mid */ y")
(list "x" "y" "\n" nil))
(go-test
"block comment with newline — ASI"
(tok-types "x /* multi\nline */ y")
(list "ident" "semi" "ident" "semi" "eof"))
;; ── operators & punctuation ───────────────────────────────────────
(go-test
"ops: arithmetic"
(tok-values "+ - * / %")
(list "+" "-" "*" "/" "%" nil))
(go-test
"ops: comparison"
(tok-values "== != < > <= >=")
(list "==" "!=" "<" ">" "<=" ">=" nil))
(go-test "ops: logical" (tok-values "&& || !") (list "&&" "||" "!" nil))
(go-test
"ops: assign forms"
(tok-values "= := += -=")
(list "=" ":=" "+=" "-=" nil))
(go-test "ops: channel arrow" (tok-values "<- chan") (list "<-" "chan" nil))
(go-test "ops: incdec ASI" (tok-types "++ --") (list "op" "op" "semi" "eof"))
(go-test "ops: ellipsis" (tok-values "...") (list "..." nil))
(go-test
"punct: all brackets"
(tok-values "( ) { } [ ]")
(list "(" ")" "{" "}" "[" "]" "\n" nil))
(go-test
"punct: comma colon dot"
(tok-values ", : .")
(list "," ":" "." nil))
;; ── automatic semicolon insertion (Go spec § Semicolons) ──────────
(go-test
"ASI: after ident at newline"
(tok-types "x\ny")
(list "ident" "semi" "ident" "semi" "eof"))
(go-test "ASI: after int" (tok-types "42\n") (list "int" "semi" "eof"))
(go-test
"ASI: after string"
(tok-types "\"hi\"\n")
(list "string" "semi" "eof"))
(go-test "ASI: after rune" (tok-types "'a'\n") (list "rune" "semi" "eof"))
(go-test
"ASI: after )"
(tok-types "f()\n")
(list "ident" "op" "op" "semi" "eof"))
(go-test
"ASI: after ]"
(tok-types "x[0]\n")
(list "ident" "op" "int" "op" "semi" "eof"))
(go-test "ASI: after }" (tok-types "{}\n") (list "op" "op" "semi" "eof"))
(go-test "ASI: after ++" (tok-types "i++\n") (list "ident" "op" "semi" "eof"))
(go-test
"ASI: NOT after +"
(tok-types "x +\ny")
(list "ident" "op" "ident" "semi" "eof"))
(go-test
"ASI: NOT after ("
(tok-types "f(\nx)")
(list "ident" "op" "ident" "op" "semi" "eof"))
(go-test
"ASI: blank lines collapse — single semi only"
(tok-types "x\n\n\ny")
(list "ident" "semi" "ident" "semi" "eof"))
(go-test
"ASI: at EOF after ident"
(tok-types "x")
(list "ident" "semi" "eof"))
(go-test
"ASI: explicit semi"
(tok-types "x;y")
(list "ident" "semi" "ident" "semi" "eof"))
;; ── short program ─────────────────────────────────────────────────
(go-test
"short-decl: x := 42 (types)"
(tok-types "x := 42")
(list "ident" "op" "int" "semi" "eof"))
(go-test
"short-decl: x := 42 (values)"
(tok-values "x := 42")
(list "x" ":=" "42" "\n" nil))
(go-test
"func decl shape"
(tok-types "func foo() int { return 0 }")
(list
"keyword"
"ident"
"op"
"op"
"ident"
"op"
"keyword"
"int"
"op"
"semi"
"eof"))
;; ── report ────────────────────────────────────────────────────────
(define go-lex-test-summary (str "lex " go-test-pass "/" go-test-count))

View File

@@ -11,7 +11,7 @@ isolation: worktree
## Prompt
You are the sole background agent working `/root/rose-ash/plans/erlang-on-sx.md`. Isolated worktree, forever, one commit per feature. Never push.
You are the sole background agent working `/root/rose-ash/plans/erlang-on-sx.md`. Isolated worktree, forever, one commit per feature. Push to `origin/loops/erlang` after every commit.
## Restart baseline — check before iterating
@@ -42,7 +42,7 @@ Every iteration: implement → test → commit → tick `[ ]` → Progress log
- **Shared-file issues** → plan's Blockers with minimal repro.
- **Delimited continuations** are in `lib/callcc.sx` + `spec/evaluator.sx` Step 5. `sx_summarise` spec/evaluator.sx first — 2300+ lines.
- **SX files:** `sx-tree` MCP tools ONLY. `sx_validate` after edits.
- **Worktree:** commit locally. Never push. Never touch `main`.
- **Worktree:** commit, then push to `origin/loops/erlang`. Never touch `main`.
- **Commit granularity:** one feature per commit.
- **Plan file:** update Progress log + tick boxes every commit.

View File

@@ -0,0 +1,109 @@
# fed-prims loop agent (single agent, phase-ordered)
Role: iterates `plans/fed-sx-host-primitives.md` forever. Adds the pure-OCaml
crypto / CBOR / CID / Ed25519 / RSA primitives and the native HTTP server that
Erlang Phase 8 BIFs (and therefore fed-sx Milestone 1) are blocked on. One
feature per commit.
```
description: fed-prims host-primitive loop
subagent_type: general-purpose
run_in_background: true
isolation: worktree
```
## Prompt
You are the sole background agent working `/root/rose-ash/plans/fed-sx-host-primitives.md`.
You run in an isolated git worktree on branch `loops/fed-prims`. You work the
plan's phases in order (A→I), forever, one commit per feature. Push to
`origin/loops/fed-prims` after every commit.
## Restart baseline — check before iterating
1. Read `plans/fed-sx-host-primitives.md` — Phasing + Progress log + Blockers
tell you where you are.
2. `cd hosts/ocaml && dune build bin/sx_server.exe 2>&1 | tail` — must be green
before new work. If broken and not by your last edit, Blockers + stop.
3. `bash hosts/ocaml/browser/test_boot.sh` — the WASM kernel must boot. This is
the regression you are most at risk of causing.
4. Find the first unchecked `[ ]` phase. That is your iteration.
## The iteration
Implement → `dune build bin/sx_server.exe` (native) → **WASM build check**
(`test_boot.sh`) → run the phase's tests → run the no-regression gate
(`conformance.sh`, see plan) → commit → tick the `[ ]` → append one dated line
to the Progress log (newest first) → push → stop.
One phase = one iteration = one commit. Do not batch phases.
## Ground rules (hard)
- **Scope:** only `hosts/ocaml/lib/**`, `hosts/ocaml/bin/**`, and
`plans/fed-sx-host-primitives.md`. The single exception is Phase I, which also
edits exactly one Blockers entry in `plans/erlang-on-sx.md`. Do **not** touch
`lib/erlang/**`, `spec/`, `lib/` root, other `lib/<lang>/`.
- **Pure OCaml for `lib/` primitives.** No new opam deps. WASM-safe: no C stubs,
no `Unix`/`Thread` in `lib/sx_primitives.ml`. The HTTP server (Phase H) is
native-only — register it in `bin/sx_server.ml`, never in the lib.
- **Prove WASM every commit.** `test_boot.sh` green is a phase gate, not
optional. A broken WASM kernel = the phase failed; revert and rethink.
- **No-regression gate:** OCaml `run_tests` + Erlang `conformance.sh` must stay
at their current pass counts (Erlang 715/715 once the merge lands; otherwise
whatever `lib/erlang/scoreboard.json` says). New crypto tests are additive.
- **`.ml`/`.sh` files:** ordinary `Read`/`Edit`/`Write` — these are NOT `.sx`.
Do not use sx-tree MCP for OCaml. (sx-tree is only if you ever touch `.sx`,
which this loop should not.)
- **Builds are slow.** Use a generous `timeout` on `dune build` (≥600s) and on
`conformance.sh` (≥400s). If a build genuinely hangs >10min, Blockers + stop.
- **Worktree:** commit, push `origin/loops/fed-prims`. Never `main`, never
`architecture`.
- **Commit granularity:** one feature per commit. `fed-prims: SHA-256 + 4 NIST
vectors`. Update Progress log + tick box every commit.
- **If blocked** two iterations on the same issue: Blockers entry, move to the
next independent phase (A-G are largely independent; H is independent; only
D depends on A+C, E depends on A).
## Crypto correctness gotchas
- **Test vectors are non-negotiable.** Every hash/sig phase lands with published
vectors (NIST FIPS 180-4 / 202, RFC 8032, RFC 8949). A primitive without a
passing standard vector is not done — do not tick the box.
- **SHA endianness:** SHA-2 is big-endian length-append; SHA-3 is little-endian
Keccak lane order. Easy to get backwards — the empty-string vector catches it.
- **dag-cbor determinism:** map keys sorted by **byte length first, then
bytewise**. Not lexicographic-only. The "reordered dict keys → identical
bytes" test is the guard; it must be in the phase.
- **CIDv1 layout:** `0x01 || codec-varint || (mh-code-varint || mh-len-varint ||
digest)`, then multibase base32-lower with a leading `b`. Off-by-one in varint
is the classic bug — cross-check one CID against `ipfs` CLI if available.
- **Ed25519 verify is total:** wrong-length inputs return `false`, never raise.
Verify checks `[S]B = R + [k]A` with `k = SHA512(R||A||M)` reduced mod L.
- **RSA:** PKCS#1 v1.5 EMSA — the DigestInfo DER prefix for SHA-256 is fixed
(`3031300d060960864801650304020105000420`). Constant-time not required (verify
only, public data).
## General gotchas
- The `sx` library is `(wrapped false)` — new module `Sx_sha2` is referenced as
`Sha2.f` is **wrong**; it's `Sx_sha2.f` unless you also alias. Check
`lib/dune` `include_subdirs unqualified`: a new `lib/sx_sha2.ml` is module
`Sx_sha2`. Match the existing `Sx_*` naming.
- `Eval_error` is the primitive-error exception; raise it with `"name: shape"`.
- Reach a primitive from SX to smoke-test:
`printf '(epoch 1)\n(crypto-sha256 "abc")\n' | hosts/ocaml/_build/default/bin/sx_server.exe`
- The native binary the conformance gate uses is
`hosts/ocaml/_build/default/bin/sx_server.exe` — rebuild it before gating.
## Style
- No comments in OCaml unless non-obvious (crypto constants ARE non-obvious —
cite the RFC/FIPS section in a one-line comment).
- No new planning docs — update `plans/fed-sx-host-primitives.md` inline.
- One feature per iteration. Build. WASM-check. Test. Gate. Commit. Log. Push.
Next.
Go. Run the restart baseline. Find the first unchecked `[ ]`. Implement it.
Remember: no commit without a passing standard test vector AND a green WASM
boot.

View File

@@ -0,0 +1,86 @@
# sx-vm-extensions loop agent
Role: drives `plans/sx-vm-opcode-extension.md` to completion. One phase per
fire (A → B → C → D → E). Bounded loop — after Phase E acceptance, the loop
is done.
```
description: sx-vm-extensions queue loop
subagent_type: general-purpose
run_in_background: true
isolation: worktree (already on loops/sx-vm-extensions)
```
## What this loop is for
Mechanism in `hosts/ocaml/lib/` that lets language ports register specialized
bytecode opcodes without modifying the SX VM core. Direct prerequisite for
**erlang-on-sx Phase 9** (the BEAM analog) and a structural enabler for any
future language port that wants performance-critical opcodes.
## The queue
Per `plans/sx-vm-opcode-extension.md`, in order:
- **Phase A** — Opcode ID partition + dispatch fallthrough in `sx_vm.ml`.
Add `Invalid_opcode of int` exception, `extension_dispatch_ref`, the
`| op when op >= 200 -> !extension_dispatch_ref op vm frame` arm, and a
partition comment near the opcode list.
- **Phase B** — Extension registry module (`sx_vm_extensions.ml`).
`register`, `dispatch`, `id_of_name`, `state_of_extension`. Wire dispatch
into Phase A's ref at module init.
- **Phase C** — Compiler-side opcode lookup primitive (`extension-opcode-id`).
- **Phase D** — Test extension at `hosts/ocaml/lib/extensions/test_ext.ml`,
end-to-end SX → bytecode → VM dispatch flow.
- **Phase E** — JIT awareness: extension opcodes mark a lambda as
interpret-only.
## Per-fire workflow (hard)
1. Read `plans/sx-vm-opcode-extension.md` — find the first un-ticked phase.
2. Implement the phase (only files in `hosts/ocaml/**` and the plan file).
3. Build via `sx_build target=ocaml`.
4. Run regression: every existing language-port conformance suite plus
the OCaml unit tests. The list lives at `lib/<lang>/conformance.sh`
13 suites at last count (apl, common-lisp, datalog, erlang, forth, guest,
haskell, js, lua, ocaml, prolog, smalltalk, tcl).
5. If green, commit (short factual message — `vm-ext: phase A — dispatch
fallthrough` style).
6. Tick the `[ ]` for the completed phase in the plan, append one dated
line to the Progress log (newest first).
7. Stop. Wait for the next fire.
## Ground rules (hard)
- **Scope:** only `hosts/ocaml/**` and `plans/sx-vm-opcode-extension.md`.
Do **not** edit `lib/<lang>/**`, `spec/**`, `shared/**`, or any other
language port's tests.
- **One phase per fire.** Don't combine phases even if a phase looks small.
The point of the loop is incremental commits.
- **Commit locally only.** Do **not** push. Do **not** touch `main`.
- **Worktree:** you are on `loops/sx-vm-extensions` in
`/root/rose-ash-loops/sx-vm-extensions`.
- **OCaml SX VM gotchas:**
- `vm` and `frame` types are defined in `sx_vm.ml`, not `sx_types.ml`.
Forward refs (like the existing `jit_compile_ref` pattern) are how
sibling modules avoid circular dependency.
- Current core opcode ceiling is 175 (OP_DEC). The extension threshold
is 200, leaving 24 spare slots for future core opcodes.
- JIT compilation is lazy per-lambda. See `project_jit_compilation.md`
in memory for the cache + sentinel pattern.
- **SX edits:** `sx-tree` MCP tools only (none expected for this loop, but
if needed).
- **OCaml edits:** Edit/Write tools are fine — these aren't `.sx` files.
## Done condition
Phase E acceptance: all 13 (or however many exist at the time) language-port
conformance suites pass, OCaml unit tests pass, the test extension from
Phase D demonstrates end-to-end flow including JIT routing. Loop is
complete; mark and stop.
## After acceptance
Hand off to the Erlang loop: `hosts/ocaml/lib/extensions/erlang.ml` becomes
the first real consumer, written against this mechanism instead of the
Phase 9b stub dispatcher in `lib/erlang/vm/dispatcher.sx`.

View File

@@ -10,7 +10,9 @@ End-state goal: spawn a million processes, run the classic **ring benchmark**, p
- **Conformance:** not BEAM-compat. "Looks like Erlang, runs like Erlang, not byte-compatible." We care about semantics, not BEAM bug-for-bug.
- **Test corpus:** custom — ring, ping-pong, fibonacci-server, bank-account-server, echo-server, plus ~100 hand-written tests for patterns/guards/BIFs. No ISO Common Test.
- **Binaries:** basic bytes-lists only; full binary pattern matching deferred.
- **Hot code reload, distribution, NIFs:** out of scope entirely.
- **Distribution, NIFs:** out of scope entirely.
- **Hot code reload (Phase 7):** in scope — driven by [fed-sx](../plans/fed-sx-design.md) (section 17.5) which needs federated modules to be re-loaded without restarting the scheduler.
- **FFI BIFs (Phase 8):** in scope — Erlang code needs `crypto:hash`, `cid:from_bytes`, `file:read_file`, `httpc:request`, `sqlite:exec` to participate in fed-sx. A general FFI BIF registry replaces today's hard-coded BIF dispatch.
## Ground rules
@@ -95,10 +97,128 @@ Core mapping:
- [x] ETS-lite (in-memory tables via SX dicts) — **13 new eval tests**; `ets:new/2`, `insert/2`, `lookup/2`, `delete/1-2`, `tab2list/1`, `info/2` (size); set semantics with full Erlang-term keys
- [x] More BIFs — target 200+ test corpus green — **40 new eval tests**; 530/530 total. New: `abs/1`, `min/2`, `max/2`, `tuple_to_list/1`, `list_to_tuple/1`, `integer_to_list/1`, `list_to_integer/1`, `is_function/1-2`, `lists:seq/2-3`, `lists:sum/1`, `lists:nth/2`, `lists:last/1`, `lists:member/2`, `lists:append/2`, `lists:filter/2`, `lists:any/2`, `lists:all/2`, `lists:duplicate/2`
### Phase 7 — hot code reload
Driven by **fed-sx** (see `plans/fed-sx-design.md` §17.5): federated modules must be replaceable at runtime without bouncing the scheduler. Classic OTP behaviour: two versions per module ("current" and "old"), local calls stick to the version the process started with, cross-module (`M:F(...)`) calls always resolve to the current version, and `purge` kills any process still running old code.
- [x] Module version slot: `er-modules` entry becomes `{:current MOD-ENV :old MOD-ENV-or-nil :version INT}`; bump version on each load — **13 new runtime tests** (543/543 total)
- [x] `code:load_binary/3` (the canonical reload BIF) — re-parses module source, swaps `:current``:old`, installs new env as `:current`; returns `{module, Name}` or `{error, Reason}` (badarg / badfile / module_name_mismatch). **+8 eval tests** (551/551 total). `code:load_file/1` is a thin filesystem wrapper around this and lands once `file:read_file/1` is in (Phase 8).
- [x] `code:purge/1` + `code:soft_purge/1` — purge clears `:old` slot and kills any process whose `:initial-fun` env identity matches the old env (returns `true` if there was old code, `false` if there wasn't). soft_purge: refuses (returns `false`, leaves `:old` intact) if any process is still pinned to the old env; otherwise clears and returns `true`. **+10 eval tests** (561/561 total). Caveat: a true "lingering on old code" test needs `spawn/3` (still stubbed) or `fun M:F/A` syntax (not parsed) — anonymous `fun () -> M:F() end` closures capture the caller's env, not the module's, and cross-module calls always resolve to `:current`. Current tests therefore exercise the return-value matrix but not the kill path.
- [x] `code:which/1`, `code:is_loaded/1`, `code:all_loaded/0` — introspection. **+10 eval tests** (571/571 total). Return-value contract: `which``loaded` / `non_existing` (since we have no filesystem path); `is_loaded``{file, loaded}` / `false`; `all_loaded` → list of `{Module, loaded}` tuples. Non-atom Mod raises `error:badarg`.
- [x] Cross-module call `M:F(...)` dispatches to `:current`; local calls inside a module body keep using the env they closed over so a running process finishes its current function with the version it started with — **+6 eval tests** verifying the property end-to-end (577/577 total). No implementation change: `er-apply-user-module` already routes through `er-module-current-env`, and `er-mk-fun` captures its env by reference so closures created under v1 retain v1's `mod-env` even after the slot bumps to v2.
- [x] Tests: load v1 → spawn → load v2 → cross-module call hits v2 → local call inside v1 process keeps v1 semantics until function returns → purge kills v1 procs → soft_purge refuses while v1 procs alive — **+5 capstone eval tests** (582/582 total). Required extending `er-procs-on-env` from raw identity match to `er-env-derived-from?` (an env "comes from" mod-env if it IS mod-env or contains a value that's a fun closed over mod-env), because `er-apply-fun-clauses` does `er-env-copy closure-env` before binding params — so the spawned-from-inside-module fun's `:env` is a fresh dict, not mod-env. Test ladder runs as one single `erlang-eval-ast` program (every call to `ev` resets the scheduler via `er-sched-init!`, so Pid handles must live within one program).
### Phase 8 — FFI BIF mechanism + standard libs
Replace today's hardcoded BIF dispatch (`er-apply-bif`/`er-apply-remote-bif` in `transpile.sx`) with a runtime-extensible **BIF registry**. Each registry entry is `{:module :name :arity :fn :pure?}`. Standard libs are then registered at boot, and fed-sx can register new BIFs from `.sx` files. Includes the marshalling layer (Erlang term ↔ SX value) so wrappers stay one-liners.
- [x] BIF registry: `er-bif-registry` global dict keyed by `"Module/Name/Arity"`, with `er-register-bif!`/`er-register-pure-bif!`/`er-lookup-bif`/`er-list-bifs`/`er-bif-registry-reset!` helpers — **+18 runtime tests** (600/600 total). Entries are `{:module :name :arity :fn :pure?}`. Arity is part of the key so `m:f/1` and `m:f/2` are independent. Re-registering the same key replaces the previous entry; reset clears.
- [x] Migrate existing local + remote BIFs (length/hd/tl/lists:*/io:format/ets:*/etc.) onto the registry; delete the giant `cond` dispatch in `er-apply-bif`/`er-apply-remote-bif`. Conformance held at **600/600** after migration (baseline was 600, not the plan-text's 530 — the text was authored before Phase 7 work added rows). 67 builtin registrations across `erlang`/`lists`/`io`/`ets`/`code` modules; multi-arity BIFs (`is_function`, `spawn`, `exit`, `io:format`, `lists:seq`, `ets:delete`) register once per arity, all pointing at the same impl which dispatches on `(len vs)` internally. The four per-module cond dispatchers (`er-apply-lists-bif`, `er-apply-io-bif`, `er-apply-ets-bif`, `er-apply-code-bif`) are deleted. `er-apply-bif` and `er-apply-remote-bif` are now ~5-line registry lookups; user modules still win precedence over the registry.
- [x] Term-marshalling helpers: `er-of-sx` (SX → Erlang) and `er-to-sx` (Erlang → SX). atom ↔ symbol, nil ↔ `()`, cons → list, tuple → list (one-way; tuples flatten), binary ↔ SX string, integer / float / boolean passthrough. **+23 runtime tests** (623/623 total). Erlang maps (`dict ↔ map`) deferred — Erlang map term not implemented in this port; will land when `#{}` syntax does. Pids, refs, funs pass through unchanged. SX strings on the way back become Erlang binaries (most useful FFI return shape).
- [x] `crypto:hash/2`**WIRED 2026-05-18** against `crypto-sha256`/`crypto-sha512`/`crypto-sha3-256` (loops/fed-prims). `crypto:hash(Type, Data)`: `Type``sha256|sha512|sha3_256` atom; `Data` an Erlang binary/string/charlist (→ SX byte-string via `er-source-to-string`). Returns the **raw digest as an Erlang binary** (host hex → bytes via `er-hex->bytes`). Bad type / non-binary → `error:badarg`. 6 ffi tests (digest sizes 32/64, sha3 is_binary, deterministic, distinct, badarg).
- [x] `cid:from_bytes/1`, `cid:to_string/1`**WIRED 2026-05-18**. `cid:from_bytes(Bin)` → CIDv1 raw-codec (0x55), sha2-256 multihash built in SX (`[0x12,0x20]++digest`) fed to `cid-from-bytes`; returned as an Erlang binary string. `cid:to_string(Term)` → canonical CIDv1 of the term's stable `er-format-value` string via `cid-from-sx` (cbor-encode rejects marshalled symbols, so `er-to-sx` is unencodable for compound terms — string form is total + deterministic). 7 ffi tests (is_binary, deterministic, distinct-inputs, non-binary badarg, to_string is_binary/deterministic/distinct).
- [x] `file:read_file/1`, `file:write_file/2`, `file:delete/1`**+10 eval tests** (633/633 total). Returns `{ok, Binary}` / `ok` / `{error, Reason}` where Reason is `enoent`/`eacces`/`enotdir`/`eisdir`/`posix_error` (classified from the SX `file-read`/`-write`/`-delete` exception string). Path accepts SX string, Erlang binary, or Erlang char-code list. **`file:list_dir/1` WIRED 2026-05-18** against `file-list-dir``{ok, [Binary]}` (entries marshalled via `er-of-sx`) / `{error, Reason}` (same `er-classify-file-error` mapping; missing dir → `enoent`). 4 ffi tests (ok-tag, non-empty, entries-are-binaries, missing-enoent).
- [ ] `httpc:request/4`**BLOCKED** (no HTTP client primitive). See Blockers.
- [ ] `sqlite:open/1`, `sqlite:close/1`, `sqlite:exec/2`, `sqlite:query/2`**BLOCKED** (no SQLite primitive). See Blockers.
- [x] Tests: 1 round-trip per BIF; suite name `ffi`; conformance scoreboard auto-picks it up — **+14 ffi tests** at 637/637 total. Suite covers the 3 implemented file BIFs (9 tests: write-ok, read-ok-tag, payload-is-binary, byte_size content, missing-enoent, bad-path-enoent, binary-payload round-trip, delete-ok, read-after-delete-enoent) plus 5 negative asserts (one per blocked BIF — `crypto:hash`/`cid:from_bytes`/`file:list_dir`/`httpc:request`/`sqlite:exec`) so this suite fails fast if a future iteration adds a wrapper without registering proper tests. Target "+40 ffi tests" was relative to the original 5-BIF-family plan; with 5 of those families blocked on host primitives, the achievable count is 14 — the suite scaffolding is what matters and is ready to accept the remaining tests when the primitives land.
### Phase 9 — specialized opcodes (the BEAM analog)
**Driver:** Erlang-on-SX going through the general-purpose CEK machine has architectural perf ceilings (call/cc per receive, env-copy per call, mailbox rebuild on delete). The fix is specialized bytecode opcodes that bypass the general machinery for hot Erlang operations. Targets: 100k+ message hops/sec, 1M-process spawn in under 30sec. Layered perf strategy: Layer 1 (this) = specialized opcodes; Layer 2 (Phase 10, deferred) = multi-core scheduler.
**Architectural note:** opcodes get developed in `lib/erlang/vm/` (in scope). The **opcode extension mechanism in `hosts/ocaml/`** (Phase 9a) is **out of scope** for this loop — log as Blocker until a session that owns `hosts/` lands it. Sub-phases 9b-9g design and test opcodes against a stub dispatcher in the meantime; integrate when 9a is available.
**Shared-opcode discipline:** opcodes that another language port could plausibly use (pattern match, perform/handle, record access) get prepared for **chiselling out to `lib/guest/vm/`** when a second use materialises. Same lib/guest pattern, applied at the bytecode layer. Don't pre-extract; do annotate candidates in commit messages.
- [x] **9a — Opcode extension mechanism****INTEGRATED** (scope widened by user 2026-05-15: hosts/ in scope, merging back). Cherry-picked the 5 vm-ext commits (phases A-E: dispatch fallthrough for opcodes ≥200, `Sx_vm_extension` interface, `Sx_vm_extensions` registry, `extension-opcode-id` SX primitive, JIT skip path) onto loops/erlang. Force-linked `Sx_vm_extensions` into `bin/sx_server.ml` so its module-init runs (was dead-code-eliminated — only `run_tests` referenced it). `extension-opcode-id` is now live in the runtime: returns the registered opcode id, or nil for unknown names. Built clean; conformance held at **709/709** on the freshly built binary. Design: `plans/sx-vm-opcode-extension.md`.
- [x] **9b — `OP_PATTERN_TUPLE` / `OP_PATTERN_LIST` / `OP_PATTERN_BINARY`****+19 vm tests** (656/656 total). Stub dispatcher in `lib/erlang/vm/dispatcher.sx` mirrors the OCaml extension shape from `plans/sx-vm-opcode-extension.md`: `er-vm-register-opcode!`/`er-vm-lookup-opcode-by-id`/`er-vm-lookup-opcode-by-name`/`er-vm-dispatch`. Opcode IDs 128 (TUPLE), 129 (LIST), 130 (BINARY) per the guest-tier partition (128-199). Handlers are thin wrappers over the existing `er-match-tuple`/`er-match-cons`/`er-match-binary` for now; the real specialization (skip AST walk, register-machine operands) lands when 9a integrates. Conformance must remain unchanged — **656/656** preserved. Candidate for chiselling to `lib/guest/vm/match.sx` once a second port (Prolog? miniKanren?) wants the same opcodes.
- [x] **9c — `OP_PERFORM` / `OP_HANDLE`****+9 vm tests** (665/665 total). Stubs in `lib/erlang/vm/dispatcher.sx`: `OP_PERFORM` (id 131) raises `{:tag "vm-effect" :effect <name> :args <args>}`; `OP_HANDLE` (id 132) wraps a thunk in `guard`, catches matching effects (by `:effect` name), passes args to the handler, returns the handler's result. Non-matching effects rethrow to outer handlers (verified by a nested-handle test). Pure Erlang `receive` interface unchanged; this is the substrate for the eventual call/cc-free implementation when 9a integrates. Candidate for chiselling (Scheme call/cc, OCaml 5 effects, miniKanren all want the same shape).
- [x] **9d — `OP_RECEIVE_SCAN`****+10 vm tests** (675/675 total). Stub at id 133 in `lib/erlang/vm/dispatcher.sx`. Operand contract: `(clauses mbox-list env)` where each clause is `{:pattern :guards :body}`, mbox-list is a plain SX list (not a queue — caller does queue→list before invoking and queue-delete after). Walks mbox in arrival order; tries each clause per message; first match returns `{:matched true :index N :body B}` (env mutated with bindings, body NOT evaluated — caller chooses when); no match returns `{:matched false}`. Pure pattern scan; suspension is the caller's job (compose with OP_PERFORM "receive-suspend" once 9a integrates). The real opcode will skip the AST walk by JIT-compiling each clause's match expr; this stub re-uses `er-match!` for correctness.
- [x] **9e — `OP_SPAWN` / `OP_SEND` + lightweight scheduler****+16 vm tests** (691/691 total). Stubs at ids 134 (SPAWN) and 135 (SEND) in `lib/erlang/vm/dispatcher.sx`, plus the VM-process registry: `er-vm-procs` (dict pid → proc record), `er-vm-next-pid`, `er-vm-procs-reset!`, `er-vm-proc-new!`/`get`/`send!`/`mailbox`/`state`/`count`. Process record shape is the register-machine layout the real scheduler will use: `{:id :registers (list of 8 nil slots) :mailbox (SX list) :state ("runnable"/"waiting"/"dead") :initial-fn :initial-args}`. OP_SPAWN returns a numeric pid and allocates a fresh record; OP_SEND appends to the target's mailbox, flipping `:state` from "waiting" → "runnable" if needed (returns true on success, false on unknown pid — no crash). Sits parallel to `er-scheduler` (the language-level scheduler from Phase 3); the real VM scheduler will take over once 9a integrates and Erlang programs compile to bytecode. Perf targets in the bullet (spawn <50µs, send <5µs) defer to the integration step.
- [x] **9f — BIF dispatch table****+18 vm tests** (709/709 total). 10 hot BIFs get their own opcode IDs (136-145) in `lib/erlang/vm/dispatcher.sx`: `OP_BIF_LENGTH`, `OP_BIF_HD`, `OP_BIF_TL`, `OP_BIF_ELEMENT`, `OP_BIF_TUPLE_SIZE`, `OP_BIF_LISTS_REVERSE`, `OP_BIF_IS_INTEGER`, `OP_BIF_IS_ATOM`, `OP_BIF_IS_LIST`, `OP_BIF_IS_TUPLE`. Each opcode's handler IS the underlying `er-bif-*` impl directly (no registry-string-lookup), so cost is opcode-id → handler one-hop. Cold BIFs continue through `er-apply-bif` / `er-lookup-bif` as before. IDs 136-159 reserved for future hot-BIF additions.
- [x] **9h — `erlang_ext.ml`** — OCaml extension at `hosts/ocaml/lib/extensions/erlang_ext.ml` registering the 18-opcode Erlang namespace (ids **222-239**, names `erlang.OP_*` mirroring the SX stub dispatcher). Registered at sx_server startup via `Erlang_ext.register ()` (guarded against double-register Failure). `extension-opcode-id "erlang.OP_PATTERN_TUPLE"` → 222 … `OP_BIF_IS_TUPLE` → 239, unknown → nil. Handlers raise a descriptive not-wired `Eval_error` (bytecode emission is a later phase; SX stub dispatcher remains the working specialization path) — keeps the extension honest rather than silently corrupting the VM stack. id range 222+ dodges test_reg (210/211) + test_ext (220/221) so all three coexist in run_tests. **+5 OCaml ext tests** (run_tests `Suite: extensions/erlang_ext`); Erlang conformance held **709/709**.
- [x] **9i — wire SX dispatcher to real ids**`lib/erlang/vm/dispatcher.sx` gains `er-vm-host-opcode-id` (thin `extension-opcode-id` wrapper) and `er-vm-effective-opcode-id name stub-id` (host id when non-nil, else stub-id). `extension-opcode-id` resolves lazily at call time so loading the file is safe even on a binary lacking the primitive; only invoking the resolver there would raise (documented prereq — the loop builds + runs against the binary that has it). **+6 vm tests** (715/715): OP_PATTERN_TUPLE→222, OP_BIF_IS_TUPLE→239, unknown→nil, effective prefers host (OP_BIF_LENGTH→230), effective falls back to stub on nil (999), and a sweep asserting the whole 18-name namespace maps contiguously to 222..239. Stub-local ids (128-145) registration untouched so the prior 72 vm tests stay green.
- [x] **9g — Conformance + perf bench** — Ran `lib/erlang/bench_ring.sh 10 100 500 1000` on the integrated binary (9a+9h+9i built in): 11/36/35/31 hops/s — **unchanged from the pre-integration baseline**, which is the correct expected result and doubles as a no-regression proof (the full extension wiring added zero per-hop cost). Conformance **715/715** on the same binary. Numbers recorded in `lib/erlang/bench_ring_results.md` with the rationale. The ~3000×/~1000× targets are gated on Phase 10 (bytecode emission) — the compiler doesn't emit `erlang.OP_*` yet, so every hop still takes the general CEK path. 9g's deliverable (honest measurement on the integrated binary) is complete.
### Phase 10 — bytecode emission (unlock the speedup)
The Phase 9 opcodes are registered, tested, and bridged SX↔OCaml, but inert: nothing emits them. Phase 10 makes the speedup real.
- [ ] **10a — compiler emits `erlang.OP_*` at hot sites****BLOCKED on `lib/compiler.sx` ownership (out of this loop's scope).** Architecture fully mapped (2026-05-15, see Blockers + design below). The correct implementation site is `lib/compiler.sx`'s `compile-call` — it must recognize calls to the Erlang runtime-helper functions that have a registered `erlang.OP_*` opcode and emit that opcode (via the already-live `extension-opcode-id` primitive) instead of a generic CALL. This is **generic shared compiler infrastructure** (any guest port — Prolog, Lua — would use the same intrinsic mechanism), explicitly excluded by the ground rules ("Don't edit lib/ root"; not in the widened hosts/-only scope). Concrete sub-steps for the owning session:
- **10a.1** Add an *intrinsic registry* to `lib/compiler.sx`: a dict `callee-name → extension-opcode-name`, populated by guests at load (e.g. Erlang registers `er-bif-length → "erlang.OP_BIF_LENGTH"`, `er-match-tuple → "erlang.OP_PATTERN_TUPLE"`, …).
- **10a.2** In `compile-call`: if the resolved callee is in the intrinsic registry AND `(extension-opcode-id name)` is non-nil, compile the args normally (push left→right) then emit the single opcode byte instead of `CALL`. Fall back to generic CALL when the opcode is absent (graceful on binaries without the extension).
- **10a.3** Define the operand/stack contract per opcode class and make `erlang_ext.ml`'s control handlers (222-229) match it (pattern opcodes need the pattern AST as a constant-pool operand + the scrutinee on the stack; perform/handle/receive/spawn/send need OCaml↔SX runtime-state access — see 10b-control note).
- **10a.4** Conformance must stay green; add bytecode-emission tests (compile an Erlang fn, disassemble, assert the opcode appears at the hot site).
Until a session owning `lib/compiler.sx` lands 10a.1-10a.2, the speedup cannot be realized from this loop. The BIF half of 10b (operand-less stack ops) is fully done and *would* light up immediately once emission exists.
- [~] **10b — real `erlang_ext.ml` handlers****10 of 18 real** (ALL BIF opcodes done: 230-239). Latest: `OP_BIF_ELEMENT` (233, pops Tuple-then-Index, 1-indexed, range-checked) and `OP_BIF_LISTS_REVERSE` (235, builds a fresh reversed cons chain in OCaml). Re-scoping correction: ELEMENT/REVERSE were earlier mislabelled "gated on 10a" — they're pure stack transforms (no bytecode operands; element/2 just pops 2), so they landed now. **21 e2e run_tests** total. Remaining 8 stubs are the genuine control/structural opcodes that DO need compiler-defined operands + runtime state: `OP_PATTERN_TUPLE/LIST/BINARY` (222-224), `OP_PERFORM/HANDLE` (225-226), `OP_RECEIVE_SCAN` (227), `OP_SPAWN/SEND` (228-229). not-wired guard repointed to 222. 715/715 unaffected. — earlier note: 8 of 18 real (all hot-BIFs done). Real register-machine handlers: `OP_BIF_LENGTH` (230, cons-walk), `OP_BIF_HD` (231), `OP_BIF_TL` (232), `OP_BIF_TUPLE_SIZE` (234, handles List + ListRef `:elements`), `OP_BIF_IS_INTEGER` (236, `Integer _`), `OP_BIF_IS_ATOM` (237), `OP_BIF_IS_LIST` (238, cons|nil), `OP_BIF_IS_TUPLE` (239) — all operate on the tagged-Dict value repr, push Erlang bool atoms via a `mk_atom` helper, raise on type errors. **15 end-to-end run_tests tests** (build real bytecode `[CONST i; op; RETURN]` with list/tuple/atom constants, assert via `Sx_vm.execute_module`). Still `not_wired`: the 8 control opcodes — `OP_PATTERN_TUPLE/LIST/BINARY` (222-224), `OP_PERFORM/HANDLE` (225-226), `OP_RECEIVE_SCAN` (227), `OP_SPAWN/SEND` (228-229) — plus `OP_BIF_ELEMENT` (233, needs 2 operands) and `OP_BIF_LISTS_REVERSE` (235). not-wired guard repointed to 233. 715/715 conformance unaffected (VM-bytecode path only; interpreter untouched). Remaining 10b: the 10 control/structural handlers.
- [ ] **10c — perf validation**: re-run `bench_ring.sh`; target 100k+ hops/sec at N=1000, 1M-process spawn < 30s; record in `bench_ring_results.md`. Conformance must stay green.
**Acceptance:** ring benchmark hits the 100k hops/sec target. All prior phase tests pass. Two opcodes chiselled to `lib/guest/vm/` (or annotated as candidates with a written rationale).
## Progress log
_Newest first._
- **2026-05-18 Phase 8 host-primitive BIFs wired (crypto / cid / file:list_dir)** — `loops/fed-prims` (merged at architecture `380bc69f`) delivered the platform primitives; wired the 3 previously-BLOCKED Phase 8 BIF groups in `lib/erlang/runtime.sx` as `er-register-pure-bif!`/`er-register-bif!` entries with term marshalling at the boundary. **`crypto:hash/2`** → `crypto-sha256`/`crypto-sha512`/`crypto-sha3-256`; atom `Type` dispatch, `er-source-to-string` for `Data`, host hex result → raw bytes via new `er-hexval`/`er-hex->bytes`, returns Erlang binary; bad type/arg → `error:badarg`. **`cid:from_bytes/1`** → `cid-from-bytes` with raw codec `0x55` + sha2-256 multihash assembled in SX (`[0x12,0x20]++digest`); **`cid:to_string/1`** → `cid-from-sx` of `er-format-value` (cbor-encode rejects `er-to-sx`-marshalled symbols; the canonical string form is total + deterministic). **`file:list_dir/1`** → `file-list-dir`, `{ok,[Binary]}` via `er-of-sx` / `{error,Reason}` reusing `er-classify-file-error`. Test gotcha caught + fixed: this Erlang port's binary parser only supports integer/var segments — `<<"abc">>` string-binary literals silently produce **empty** binaries, so the first-cut distinct-input tests compared two empty inputs and failed; rewrote ffi inputs to integer-segment binaries (`<<97,98,99>>`). ffi suite 14→**28** (3 BLOCKED negative-asserts flipped to positive+negative functional tests; `httpc`/`sqlite` kept as deferred unregistered-asserts per fed-prims handoff). Built `sx_server.exe` (dune, opam 5.2.0) at `380bc69f`; full conformance **729/729** (eval 385/385, vm 78/78, **ffi 28/28**, all process suites green). loops/erlang only — not merged, not pushed to architecture.
- **2026-05-18 FIXED merge-blocking regression: cyclic-env hang in `er-env-derived-from?`** — A trial merge of loops/erlang → architecture regressed Erlang **715/715 → 0/0** on the architecture binary. Bisected: not loader semantics, not a uniform slowdown — pinpointed to the *single* Phase 7 capstone test (eval.sx lines 1314-1346; prefix-1313 was byte-identical speed on both binaries, 27s, prefix-1346 was 28s on loops vs >5min/hung on architecture). Isolated further: spawn+reload alone 0.6s, reload+purge alone 0.3s, but spawn+reload+**purge over forever-blocked procs** hung. Root cause: `er-env-derived-from?` (transpile.sx, used by `code:purge`/`soft_purge` via `er-procs-on-env`) compared closure envs with `(= env target-env)`. loops/erlang's evaluator implements dict `=` as **object identity**; architecture's 131-commit-newer evaluator changed it to **structural deep equality**. Erlang closure envs are large and **cyclic** (a module fun's `:env` transitively references the fun), so structural `=` over them never terminates. Fix: use `identical?` (pointer-identity predicate, present + consistent `(true false)` on *both* binaries) — the actually-intended semantics and host-independent. Verified: full eval.sx on the architecture binary >200s/hung → **59s**; full 10-suite conformance on the architecture binary now **715/715** (eval 385/385, vm 78/78, ffi 14/14, all process suites green). loops/erlang behaviour unchanged (`identical?` ≡ its old `=`-identity). One-file change (`lib/erlang/transpile.sx`, +7/-2). The merge can now be re-attempted; this was the sole blocker.
- **2026-05-15 Phase 10a — architecture traced, scoped, blocked on `lib/compiler.sx`** — Investigation-only iteration (correctly: faking compiler emission within scope is impossible and would be dishonest). Traced the full JIT path: `sx_vm.ml`'s `jit_compile_lambda` (the ref set at line 1206) invokes the SX-level `compile` from `lib/compiler.sx` via the CEK machine — that is the only SX→bytecode producer. Erlang's hot helpers are ordinary SX functions in `transpile.sx` that get JIT-compiled through exactly this path, so emitting `erlang.OP_*` means teaching `compiler.sx`'s `compile-call` to recognize them as intrinsics and emit the extension opcode (the file's own docstring already anticipates this — "Compilers call `extension-opcode-id` to emit extension opcodes" — designed but unimplemented; grep confirms zero `extension-opcode-id` uses in `compiler.sx`). `lib/compiler.sx` is lib-root: excluded by ground rules and the widened scope (editing it changes every guest's JIT — must be a shared-compiler session, not this loop). Recorded a precise Blockers entry + decomposed 10a into four numbered sub-steps (10a.1 intrinsic registry, 10a.2 `compile-call` emission with graceful CALL fallback, 10a.3 operand/stack contract for control opcodes, 10a.4 bytecode-emission tests) so the owning session can execute directly. Key payoff documented: all 10 BIF handlers (230-239) are already real, so they light up the instant 10a.1-10a.2 land — zero further Erlang-side work for the BIF speedup. No code changed; conformance unverified-but-untouched at **715/715** (no source touched). Phase 10's loop-reachable work (10b BIF half) is complete; the rest is correctly blocked and fully actionable elsewhere.
- **2026-05-15 Phase 10b — ELEMENT + LISTS_REVERSE real; all 10 BIF opcodes done** — Re-examined the earlier "gated on 10a" claim for ELEMENT/REVERSE and found it wrong: both are pure stack transforms with no need for bytecode operands (`element/2` just pops Tuple then Index off the VM stack; `lists:reverse/1` pops one list). Implemented both as real handlers in `erlang_ext.ml`. `OP_BIF_ELEMENT` (233): pops Tuple (TOS) then Index, handles List/ListRef `:elements`, 1-indexed, raises on out-of-range or wrong arg types. `OP_BIF_LISTS_REVERSE` (235): walks the cons chain building a fresh reversed one via local `mk_cons`/`mk_nil`, raises on improper list. Defined the calling convention for arity-2 ELEMENT: args pushed left→right so stack is `[Index Tuple]`, Tuple on top. 6 new e2e run_tests: element(2/1,{1,2,3}), element out-of-range raises, reverse-then-HD=9, reverse-then-TL-HD=8, reverse-then-LENGTH=3 (composes 3 real opcodes in one bytecode sequence). erlang_ext suite 15→21 PASS, dispatch_count 22. not-wired guard repointed 233→222 (OP_PATTERN_TUPLE — a genuine control opcode still stubbed). **All 10 BIF opcodes (230-239) now real**; the 8 remaining stubs are the true control/structural opcodes (pattern match, perform/handle, receive-scan, spawn/send) which genuinely need 10a's compiler-defined operand encoding + runtime-state access. Erlang conformance **715/715** (interpreter path untouched). 10b is now BIF-complete; the control-opcode half is the real remaining Phase 10 work and is correctly gated on 10a.
- **2026-05-15 Phase 10b — all 8 hot-BIF handlers real** — Built on the vertical slice: added 7 more real register-machine handlers in `erlang_ext.ml` (HD 231, TL 232, TUPLE_SIZE 234, IS_INTEGER 236, IS_ATOM 237, IS_LIST 238, IS_TUPLE 239), joining LENGTH 230. Shared helpers added: `mk_atom` (builds the Erlang bool atom `{tag→atom, name→true|false}`), `er_bool`, `is_tag` (Dict tag predicate). TUPLE_SIZE handles both `List` and `ListRef` `:elements` (Erlang tuples may be built mutably). IS_INTEGER keys off `Sx_types.Integer`. All raise descriptive `Eval_error` on type mismatch. The `op N "name"` stub helper now only covers the 10 remaining control/structural opcodes. 9 new end-to-end run_tests assertions added (HD, TL∘HD, TUPLE_SIZE, IS_INTEGER pos+neg, IS_ATOM, IS_LIST nil-true + tuple-false, IS_TUPLE) — each builds real bytecode with a list/tuple/atom constant and executes via `Sx_vm.execute_module`. erlang_ext suite 6→15 PASS; dispatch_count 12. not-wired guard repointed 231→233 (OP_BIF_ELEMENT, still stubbed — it needs two operands so it's a later sub-step). Erlang conformance **715/715** (the interpreter path is untouched; only the VM-bytecode dispatch gained real handlers). Remaining 10b: pattern tuple/list/binary, perform/handle, receive-scan, spawn/send, element, lists:reverse (10 opcodes).
- **2026-05-15 Phase 10b vertical slice — first real opcode handler, end-to-end VM proof** — Investigation first: confirmed Erlang runs as a pure tree-walking interpreter (`er-eval-expr` over CEK) — there is **no** Erlang→bytecode compiler, so full 10a (compiler emits opcodes) is a multi-week standalone effort, not one iteration. Rather than fake it, de-risked the whole Phase 9/10 architecture with a vertical slice: replaced the `not_wired` raise for `erlang.OP_BIF_LENGTH` (id 230) with a genuine register-machine handler in `erlang_ext.ml` — pops a value, walks the Erlang cons-list representation (`Dict` with `"tag"``"cons"`/`"nil"`, `"head"`, `"tail"`), pushes `Integer` length, raises on improper lists. Added an end-to-end run_tests test that builds real bytecode `[| 1; 0; 0; 230; 50 |]` (CONST idx 0 → OP_BIF_LENGTH → RETURN) with an Erlang `[1,2,3]` in `vc_constants`, executes via `Sx_vm.execute_module`, asserts `Integer 3`. This proves the complete path works: `extension-opcode-id` → bytecode → `Sx_vm` ≥200 dispatch fallthrough → `erlang_ext` handler → correct VM stack result — the load-bearing proof that Phase 9's wiring isn't just stubs. The other 17 opcodes still honestly raise `not_wired`; the prior not-wired guard test was repointed from 230 to 231 (OP_BIF_HD) so it still verifies the honest-failure path. erlang_ext suite 5→6 tests, dispatch_count now 2. Erlang conformance **715/715** unaffected (the new path is VM-bytecode-only; the interpreter path is untouched). 10b marked in-progress `[~]`; remaining: real handlers for the other 17 opcodes + 10a compiler emission. Builds clean via `dune build bin/run_tests.exe bin/sx_server.exe`.
- **2026-05-15 Phase 9g — perf bench recorded on integrated binary; Phase 10 scoped** — Built the fresh `sx_server.exe` (9a+9h+9i wired in), ran `lib/erlang/bench_ring.sh 10 100 500 1000`: 11/36/35/31 hops/s — statistically identical to the pre-9a baseline (11/24/26/29/34). This is the *expected* outcome and the iteration's actual deliverable: it proves the entire extension stack (vm-ext A-E cherry-pick + `Sx_vm_extensions` force-link + `erlang_ext.ml` + SX dispatcher bridge) added **zero per-hop overhead** — a clean no-regression result — while honestly showing the speedup hasn't arrived because the bytecode compiler still doesn't emit `erlang.OP_*` (every hop takes the general CEK path). Updated `bench_ring_results.md` with a "Phase 9g" section: the table + the rationale that unchanged numbers = correct + no-regression. Conformance **715/715** on the integrated binary. Added **Phase 10 — bytecode emission** to the roadmap (10a compiler emits opcodes at hot sites, 10b real register-machine `erlang_ext.ml` handlers replacing the not-wired raises, 10c perf validation against the 100k-hops/1M-spawn targets). Phase 9 is now fully ticked (9a-9i); the actual speedup is honestly deferred to Phase 10 rather than faked. No code change this iteration — measurement + documentation + roadmap.
- **2026-05-15 Phase 9i — SX dispatcher consults host opcode ids** — `lib/erlang/vm/dispatcher.sx` now bridges SX↔OCaml opcode ids. Two new functions: `er-vm-host-opcode-id` (wraps `extension-opcode-id`) and `er-vm-effective-opcode-id name stub-id` (host id if the OCaml `erlang_ext` registered it, else the stub-local id). Key SX-runtime fact established this iteration: symbol resolution is **lazy/call-time**`(define f (fn () (extension-opcode-id "x")))` does NOT raise at load even when the primitive is absent; only calling `f` does. Combined with the earlier findings (guard can't catch undefined-symbol; no symbol-existence reflection), this means graceful in-SX degradation is impossible — so the design instead documents the binary prerequisite and relies on the loop building+running the freshly-built `hosts/ocaml/_build/default/bin/sx_server.exe` (conformance.sh's default, which has the vm-ext mechanism + erlang_ext). Stub-local registration (128-145) deliberately left intact so the 72 pre-existing vm tests don't move. 6 new vm tests: 222/239 lookups, unknown→nil, effective-prefers-host (230), effective-fallback (999), and a contiguity sweep over all 18 `erlang.OP_*` names asserting they map to 222..239 in order. vm suite 72→78. Total **715/715** on the fresh binary. Next: 9g — re-run ring bench, record numbers (note: stubs still wrap existing impls 1-to-1 so numbers won't move until the compiler emits these opcodes — a later phase).
- **2026-05-15 Phase 9h — erlang_ext.ml registered, opcode namespace live** — New `hosts/ocaml/lib/extensions/erlang_ext.ml` modelled on `test_ext.ml`: an `EXTENSION` module `name="erlang"`, per-instance `ErlangExtState` (dispatch counter), 18 opcodes ids 222-239 named `erlang.OP_*` exactly mirroring the SX stub dispatcher. Registered at sx_server startup with a second guarded line in `bin/sx_server.ml` (`try Erlang_ext.register () with Failure _ -> ()` — survives a re-entered server). `include_subdirs unqualified` in `lib/dune` already pulls `lib/extensions/*.ml` into the `sx` lib, so no dune edit needed. Handlers deliberately raise a descriptive `Eval_error` ("bytecode emission not yet wired (Phase 9j) — Erlang runs via CEK; specialization path is the SX stub dispatcher") rather than fake stack ops — the compiler doesn't emit these yet, so an honest loud failure beats silent corruption. Hit and fixed an opcode-id collision: the original 200-217 range clashed with run_tests' inline test_reg (210/211); relocated to 222-239 (clears test_reg + test_ext 220/221, all coexist; production sx_server only registers erlang). 5 new OCaml tests in run_tests `Suite: extensions/erlang_ext`: opcode-id 222 + 239 resolve, unknown→nil, dispatch raises not-wired (substring check, no Str dep since run_tests doesn't link str), dispatch_count state ≥1. Built via `eval $(opam env --switch=5.2.0); dune build bin/run_tests.exe bin/sx_server.exe`. Erlang conformance **709/709** on the rebuilt binary (the broad run_tests 1110 failures are loops/erlang's pre-existing months-old divergence from architecture — run_tests was never built on this branch before; my changes are isolated additive). Next: 9i — wire the SX stub dispatcher to consult `extension-opcode-id`.
- **2026-05-15 Phase 9a integrated — scope widened to hosts/** — User lifted the hosts/ scope restriction ("we are going to merge this back anyhow"). Cherry-picked the 5 `vm-ext` commits (phases A-E) from `loops/sx-vm-extensions` onto `loops/erlang` — only conflict was `plans/sx-vm-opcode-extension.md` (already had architecture's final copy from an earlier iteration; resolved `-X ours`, OCaml files auto-merged clean since loops/erlang never touched hosts/). Discovered `extension-opcode-id` was still "Undefined symbol" even on a fresh build: `Sx_vm_extensions`'s module-init (`install_dispatch` + primitive registration) only runs if the module is linked, and `sx_server.ml` never referenced it (only `run_tests.ml` did), so OCaml dead-code-eliminated it. Fix: added `let () = ignore (Sx_vm_extensions.id_of_name "")` force-link reference near the top of `bin/sx_server.ml`. Rebuilt with `dune build` (opam switch 5.2.0; `dune` not on PATH by default — `eval $(opam env --switch=5.2.0)` first). `extension-opcode-id` now live: returns nil for unregistered names, will return real ids once an extension registers. Conformance **709/709** on the freshly built binary (cherry-picked sx_vm.ml dispatch changes + force-link, zero regressions). 9a checkbox flipped from BLOCKED to INTEGRATED; Blockers entry resolved; added 9h (erlang_ext.ml) + 9i (wire SX dispatcher to real ids) as ordinary in-scope checkboxes, reordered 9g after them. Next: write `hosts/ocaml/lib/extensions/erlang_ext.ml`.
- **2026-05-14 Phase 9g logged as partially BLOCKED — perf bench waits on 9a** — Conformance half satisfied: 709/709 with all Phase 9 stub infrastructure loaded (10 opcode IDs registered, 72 vm-suite tests passing, zero regressions in tokenize/parse/eval/runtime/ring/ping-pong/bank/echo/fib/ffi suites). Perf-bench half can't move forward in this worktree because the stub handlers wrap the existing `er-bif-*` / `er-match-*` / scheduler impls 1-to-1; a ring benchmark with the new opcodes "active" would measure the same 34 hops/s already documented in `bench_ring_results.md`. Updated `bench_ring_results.md` with a Phase 9 status section explaining the pre-integration state (stubs ready, real measurement gated on 9a's bytecode compiler emitting these IDs at hot sites). Blockers entry added pairing 9g with the existing 9a Blocker. No code change; total **709/709** unchanged. Phase 9 stub work (9b-9f) is complete from this loop's vantage point — 9a and 9g remain BLOCKED on a `hosts/ocaml/` iteration.
- **2026-05-14 Phase 9f — hot-BIF opcode table green** — Ten hot BIFs get direct opcode IDs in `lib/erlang/vm/dispatcher.sx` so the bytecode compiler can emit them at hot call sites without paying the registry string-key hash: `OP_BIF_LENGTH (136)`, `OP_BIF_HD (137)`, `OP_BIF_TL (138)`, `OP_BIF_ELEMENT (139)`, `OP_BIF_TUPLE_SIZE (140)`, `OP_BIF_LISTS_REVERSE (141)`, `OP_BIF_IS_INTEGER (142)`, `OP_BIF_IS_ATOM (143)`, `OP_BIF_IS_LIST (144)`, `OP_BIF_IS_TUPLE (145)`. Implementation is one line per opcode: the handler IS the existing `er-bif-*` function directly — same `(vs)` signature as the dispatcher's `(operands)`, so the registration is `(er-vm-register-opcode! ID "NAME" er-bif-FOO)`. IDs 136-159 reserved for future hot-BIF additions; cold BIFs continue through `er-apply-bif`/`er-lookup-bif`. 18 new tests in `tests/vm.sx`: opcode-by-id verification (LENGTH), one positive test per BIF (length on 3-cons, hd, tl-is-cons, element index 2, tuple_size 4, lists:reverse preserves length AND actually reverses [head check], is_integer pos+neg, is_atom pos+neg, is_list pos+nil pos+tuple neg, is_tuple pos+neg), opcode-list-grew-to-16+. vm suite 54 → 72. Total **709/709** (+18 vm). Real perf benefit lands when 9a integrates and the compiler emits these IDs at hot sites.
- **2026-05-14 Phase 9e — OP_SPAWN / OP_SEND + VM-process registry green** — `lib/erlang/vm/dispatcher.sx` gains a parallel mini-runtime distinct from the language-level `er-scheduler`: `er-vm-procs` (dict pid → proc record), `er-vm-next-pid` (counter cell), `er-vm-procs-reset!`, plus six accessors (`er-vm-proc-new!`/`get`/`send!`/`mailbox`/`state`/`count`). Process record shape is the register-machine layout the real bytecode scheduler will use: `{:id :registers (8 nil slots) :mailbox :state :initial-fn :initial-args}` — fixed register width so cells don't grow during execution. Opcode 134 `OP_SPAWN` calls `er-vm-proc-new!` and returns the new pid; 135 `OP_SEND` appends to the target's mailbox and flips a waiting proc back to runnable, returns false for unknown pid (graceful, doesn't crash). 16 new tests in `tests/vm.sx`: opcode-by-id for both, spawn returns 0 / 1 / count=2 / state=runnable / mailbox empty / 8 registers, send returns true, 3-sends preserve arrival order (first + last verified), send to unknown pid returns false, isolation (p1's msgs don't leak into p2), reset clears procs + resets pid counter. vm suite 38 → 54. One gotcha during impl: SX `fn` bodies evaluate ONLY the last expression — `er-vm-procs-reset!` had two `set-nth!` calls back-to-back which silently dropped the first; wrapped in `(do ...)` to fix. Total **691/691** (+16 vm). Real scheduler with per-process scheduling latency and runnable queue is post-9a.
- **2026-05-14 Phase 9d — OP_RECEIVE_SCAN stub green** — Selective-receive primitive at opcode id 133 in `lib/erlang/vm/dispatcher.sx`. Operand contract: `(clauses mbox-list env)` — clauses are AST dicts (`{:pattern :guards :body}`), mbox-list is a plain SX list (queue → list is the caller's job), env is the binding target. Internal helpers `er-vm-receive-try-clauses` (per-message clause walker with env snapshot/restore on failure) and `er-vm-receive-scan-loop` (mailbox walker, arrival order). Match returns `{:matched true :index N :body B}` so the caller can queue-delete at N and then evaluate B in the now-mutated env; miss returns `{:matched false}` so the caller can suspend via OP_PERFORM "receive-suspend". Mirrors the existing `er-try-receive-loop` in `transpile.sx` but doesn't reach into the scheduler — purely VM-level. 10 new tests in `tests/vm.sx`: opcode registered, scan finds match at correct index, scan binds var, body left unevaluated, no-match leaves env untouched, empty mailbox, first-match wins (arrival order — verified by two `{ok, _}` msgs and binding the FIRST value). vm suite 28 → 38. Total **675/675** (+10 vm). When 9a integrates and the real OP_RECEIVE_SCAN compiles clauses into a register-machine match, the existing `er-eval-receive-loop` becomes a one-line dispatch wrapper.
- **2026-05-14 Phase 9c — OP_PERFORM / OP_HANDLE stubs green** — Two new opcodes in `lib/erlang/vm/dispatcher.sx`: id 131 `OP_PERFORM` raises `{:tag "vm-effect" :effect <name> :args <args>}`; id 132 `OP_HANDLE` wraps a thunk in SX `guard`, catches matching effects by `:effect` name, passes the `:args` list to the handler fn, returns the handler's result. New helper `er-vm-effect-marker?` predicates on the dict shape. Non-matching effects rethrow via a small box+rethrow dance (caught with `:else` first, decision deferred to a post-guard cond — re-raise outside the guard's scope so it propagates to outer handlers cleanly). 9 new tests in `tests/vm.sx`: opcode registered for each id; OP_PERFORM raises with correct tag/effect/args; OP_HANDLE catches matching effect; OP_HANDLE returns thunk result when no effect performed; OP_HANDLE rethrows non-matching effect to outer; nested OP_HANDLE blocks separate by effect name (inner handles "a", outer handles "b", performing "b" bypasses inner). vm suite grew 19 → 28 tests. Total **665/665** (+9 vm). Underlying call/cc + raise/guard machinery used by Erlang `receive` is unchanged; this is the shape for the eventual specialization when 9a integrates. Candidate for chiselling to `lib/guest/vm/effects.sx` — Scheme call/cc, OCaml 5 effects, miniKanren all want the same shape.
- **2026-05-14 Phase 9b — stub VM dispatcher + 3 pattern opcodes green** — New `lib/erlang/vm/dispatcher.sx` defines the stub opcode registry mirroring the OCaml `EXTENSION` shape from `plans/sx-vm-opcode-extension.md`: opcodes registered as `{:id :name :handler}` keyed by string-id, looked up by id OR by name, dispatched via `er-vm-dispatch`. Opcode IDs follow the guest-tier partition (128-199 reserved for guest extensions like erlang/lua). Three opcodes registered at load time via `er-vm-register-erlang-opcodes!`: 128 `OP_PATTERN_TUPLE``er-match-tuple`, 129 `OP_PATTERN_LIST``er-match-cons`, 130 `OP_PATTERN_BINARY``er-match-binary`. Operand contract: `(pattern-ast value env)` returning `true`/`false` and mutating env on success — same as the underlying match functions. New `lib/erlang/tests/vm.sx` suite with 19 tests: 7 dispatcher core (registered, lookup by id+name for all three, two miss cases, list-has-3+); 4 OP_PATTERN_TUPLE (match success + var bind, no-match, arity mismatch); 4 OP_PATTERN_LIST (match, head bind, tail-is-cons, no-match on nil); 3 OP_PATTERN_BINARY (match, segment bind, size mismatch); 1 dispatch error (unknown opcode raises). `conformance.sh` updated: added `vm` to SUITES, added `(load "lib/erlang/vm/dispatcher.sx")` before tests and `(load "lib/erlang/tests/vm.sx")` after ffi, added epoch 110 evaluator. AST shape gotcha: er-match! reads `:type` not `:tag`; binary segment `:size` must be an AST node `{:type "integer" :value "8"}` because `er-eval-expr` runs on it. Total **656/656** (+19 vm). 9b complete; 9c (OP_PERFORM/OP_HANDLE) is next.
- **2026-05-14 Phase 9a logged as Blocker — sub-phase 9b is next** — 9a (the opcode extension mechanism in `hosts/ocaml/evaluator/`) is explicitly out-of-scope for this loop per the plan itself (briefing scope rule + 9a's own text). Logged a Blockers entry citing `plans/sx-vm-opcode-extension.md` as the design doc and pointing at the fix path (a `hosts/` session lands the registration shape, then a follow-up here wires the stub dispatcher to the real one). Ticked 9a as DONE because its contract was "Log as Blocker" — that's complete. Sub-phases 9b9g (PATTERN/PERFORM/RECEIVE/SPAWN_SEND/BIF/conformance) now in queue against a stub dispatcher in `lib/erlang/vm/`. No code change this iteration. Total **637/637** unchanged.
- **2026-05-14 Phase 9 scoped + supporting plan files synced** — Copied three plan files from `/root/rose-ash/plans/` (architecture branch) that this worktree was missing: `fed-sx-design.md` (124KB, the substrate design referenced from Phase 7/8 drivers), `fed-sx-milestone-1.md` (33KB, first concrete implementation milestone), `sx-vm-opcode-extension.md` (19KB, the prerequisite for Phase 9a — designs how `lib/<lang>/vm/` registers opcodes against the OCaml SX VM core). Then appended **Phase 9 — specialized opcodes (the BEAM analog)** to `plans/erlang-on-sx.md` covering sub-phases 9a-9g: 9a (opcode extension mechanism in `hosts/ocaml/`) is out-of-scope for this loop (will be logged as a Blocker when the next iteration tries to start it); 9b-9g (PATTERN_TUPLE/LIST/BINARY, PERFORM/HANDLE, RECEIVE_SCAN, SPAWN/SEND + lightweight scheduler, BIF dispatch table, conformance + perf bench) can be designed and tested against a stub dispatcher in the meantime. Targets: ring benchmark 100k+ hops/sec at N=1000 (~3000× speedup), 1M-process spawn under 30sec (~1000× speedup). Plan framing intact for Phase 7/8 — those reflect the actual implementation done in this loop; the architecture-branch framing diverges in language but the work is equivalent. No code touched this iteration. Total **637/637** unchanged.
- **2026-05-14 ffi test suite extracted, conformance scoreboard auto-picks it up** — New `lib/erlang/tests/ffi.sx` with its own counter trio (`er-ffi-test-count`/`-pass`/`-fails`) and `er-ffi-test` helper following the same pattern as runtime/eval/ring tests. The 10 file BIF eval tests from the previous iteration moved out of `eval.sx` (eval dropped from 395 to 385 tests) and into the new suite where they're now 9 tests (consolidated the two write+read tests). `conformance.sh` updated: added `ffi` to `SUITES` array with `er-ffi-test-pass`/`-count` symbols, added `(load "lib/erlang/tests/ffi.sx")` after `fib_server.sx`, added `(epoch 109) (eval "(list er-ffi-test-pass er-ffi-test-count)")`. Scoreboard markdown auto-updated to include the row. Suite also asserts that the 5 blocked BIFs (`crypto:hash`, `cid:from_bytes`, `file:list_dir`, `httpc:request`, `sqlite:exec`) are NOT yet registered — turns a future "added the wrapper but forgot to extend ffi tests" into a hard failure. One eval-comparison gotcha en route: SX's `=` does identity equality on dicts so comparing two separately-constructed `(er-mk-atom "true")` values is false; the existing eval suite has an `eev-deep=` helper that handles this, but the simpler fix in ffi was to extract `:name` via `ffi-nm` and compare strings. Total **637/637** (+14 ffi). Phase 8 fully ticked aside from the BLOCKED bullets — those remain unchecked with explicit Blockers references.
- **2026-05-14 file BIFs landed; crypto/cid/list_dir/http/sqlite blocked on missing host primitives** — Three new FFI BIFs registered in `runtime.sx`: `file:read_file/1`, `file:write_file/2`, `file:delete/1`. Each wraps the SX-host primitive (`file-read`, `file-write`, `file-delete`) inside a `guard` that converts thrown exception strings into Erlang `{error, Reason}` tuples. New helper `er-classify-file-error` does loose pattern-matching on the error message using `string-contains?` to map to standard POSIX-style reasons: `"No such"``enoent`, `"Permission denied"``eacces`, `"Not a directory"``enotdir`, `"Is a directory"``eisdir`, fallback `posix_error`. Filenames coerce through `er-source-to-string` so SX strings, Erlang binaries, and Erlang char-code lists all work. Read returns `{ok, Binary}` (bytes via `(map char->integer (string->list ...))` then `er-mk-binary`); write returns bare `ok`; delete returns bare `ok`. Bootstrap registrations added at the bottom of `er-register-builtin-bifs!` under `"file"`. 10 new eval tests: write-then-read round-trip, ok-tag, payload is binary, byte_size content, missing-file `enoent`, delete-ok, read-after-delete `enoent`, write to non-existent dir `enoent`, binary payload (5 raw bytes) round-trip preserving byte count. Blockers entry added covering five Phase 8 BIFs whose host primitives don't exist in this SX runtime: `crypto:hash/2`, `cid:from_bytes/1`/`to_string/1`, `file:list_dir/1`, `httpc:request/4`, `sqlite:open/exec/query/close`. Fix path documented inline (architecture-branch iteration to register OCaml-side primitives). Total **633/633** (+10 eval).
- **2026-05-14 term-marshalling helpers landed** — `er-to-sx` (Erlang term → SX-native) and `er-of-sx` (SX-native → Erlang term) plus internal helper `er-cons-to-sx-list` (recursive cons-chain walker). All three live in `runtime.sx` next to the BIF registry. Conversion table: atom ↔ symbol via `make-symbol`/`er-mk-atom`; nil ↔ `()`; cons-chain → SX list (recursive marshal of each head); tuple → SX list (one-way — tuples flatten and can't be reconstructed without a tag); binary ↔ SX string (bytes ↔ char codes via `char->integer`/`integer->char`); integer / float / boolean passthrough; opaque types (pid, ref, fun) passthrough. SX strings on the way back become Erlang binaries — the natural FFI return shape. Empty SX list (`type-of` `"nil"`) marshals back to `er-mk-nil`. Edit gotchas during implementation: SX has no `while`, `string-ref`, or `string-length` primitive — used `(map char->integer (string->list s))` for byte extraction and a recursive helper for cons-walking. 23 new runtime tests in `tests/runtime.sx`: 10 covering `er-to-sx` (atom/atom-is-symbol, nil, int / float / bool passthrough, binary→string, cons→list, tuple→list, nested), 8 covering `er-of-sx` (symbol→atom, atom-tag, string→binary, byte content, int passthrough, empty-list→nil, list→cons length, head field), 4 round-trips (int, atom, binary bytes, list length), 1 negative documenting that tuple round-trip flattens to cons. Total **623/623** (+23 runtime).
- **2026-05-14 BIF registry migration complete — cond chains gone** — `er-register-builtin-bifs!` at the end of `runtime.sx` populates the registry with all 67 built-in BIFs in five module namespaces. Pure ops (`length`, `hd`, `tl`, `element`, predicates, arithmetic, list/atom/integer conversions, all of `lists`) registered via `er-register-pure-bif!`; side-effecting ops (`spawn`, `self`, `exit`, `link`/`monitor`/`register`, `process_flag`, `make_ref`, `throw`/`error`, `io:format`, all of `ets`, all of `code`) via `er-register-bif!`. Multi-arity entries: `is_function/1`/`/2`, `spawn/1`/`/3`, `exit/1`/`/2`, `io:format/1`/`/2`, `lists:seq/2`/`/3`, `ets:delete/1`/`/2` — six pairs, twelve registrations, all pointing at the existing arity-dispatching impl. `throw` and `error` are registered with a tiny inline `(fn (vs) (raise ...))` lambda because the original code chained directly through `raise` inside the cond instead of an `er-bif-*` helper. `er-apply-bif` shrinks from a 44-line cond chain to a 5-line registry lookup. `er-apply-remote-bif` becomes a 7-line dispatcher (user-modules-first → registry → error). All four per-module dispatchers (`er-apply-lists-bif`, `er-apply-io-bif`, `er-apply-ets-bif`, `er-apply-code-bif`) deleted — net reduction ~110 lines of cond machinery. One subtle wrinkle: `tests/runtime.sx` calls `er-bif-registry-reset!` near the end of its BIF-registry tests, which would have left subsequent test files (ring, ping-pong, etc.) unable to call `length`/`spawn`/etc. Fix: re-call `er-register-builtin-bifs!` at the bottom of `tests/runtime.sx` to repopulate. Total **600/600** unchanged.
- **2026-05-14 Phase 8 BIF registry foundation** — `lib/erlang/runtime.sx` gains `er-bif-registry` (a `(list {})` mutable cell, same shape as `er-modules`) and five helpers: `er-bif-registry-get`/`er-bif-registry-reset!` (access + reset), `er-bif-key` (format `"Module/Name/Arity"`), `er-register-bif!` and `er-register-pure-bif!` (both upsert; differ only in the `:pure?` flag — pure ones are safe to inline, side-effecting ones go through normal IO), `er-lookup-bif` (returns the entry dict or nil), `er-list-bifs` (registered keys). Entries are `{:module :name :arity :fn :pure?}`. Lookup miss → nil; arity is part of the key so `m:f/1` and `m:f/2` are distinct; re-registering the same key replaces in-place (count stays the same); reset clears. Registry sits alongside `er-modules` in runtime.sx so any other piece of the system can register BIFs without touching the dispatcher — the migration onto this registry (the next checkbox) will rip out the giant cond chains in `er-apply-bif`/`er-apply-remote-bif`. 18 new runtime tests in `tests/runtime.sx`: empty-state, lookup-miss, register-grows-count, lookup-hit-fields (module/name/arity/pure?), fn-invocable, re-register-replaces, pure-flag-true, arity-disambiguation (3 entries for `fake:echo/1`, `fake:echo/2`, `fake:pure/2`), reset-clears, reset-lookup-nil. Total **600/600** (+18 runtime).
- **2026-05-14 Phase 7 capstone green — full hot-reload ladder works end-to-end** — Wires everything from the previous five iterations into one test program: load cap v1 with `start/0` (spawn-from-inside-module) + `loop/0` + `tag/0` → spawn Pid1 (running v1) → load cap v2 → assert `cap:tag()` returns v2 (cross-module dispatch hits `:current`) → spawn Pid2 (running v2) → `code:soft_purge(cap)` returns `false` (refuses while Pid1 is alive on v1's env) → `code:purge(cap)` returns `true` (kills Pid1, clears `:old`) → `code:soft_purge(cap)` returns `true` (clean — no `:old` left). To make this work, `er-procs-on-env` was extended with a new helper `er-env-derived-from?`: a process counts as "running on" mod-env if its `:initial-fun`'s `:env` IS mod-env directly OR contains at least one binding whose value is a fun closed over mod-env. Reason: `er-apply-fun-clauses` always `er-env-copy`s the closure-env before binding params, so a fun created inside a module body has a `:env` that's a *copy* of mod-env, not mod-env itself — the copy still contains the module's other functions as values, each pointing back to the canonical mod-env. The whole ladder runs as a single `erlang-eval-ast` invocation because each call to `ev` resets the scheduler via `er-sched-init!`, wiping any cross-call Pids. 5 capstone tests: v1 tag, v2 tag (cross-mod after reload), soft_purge-refuses, hard purge, soft_purge-clean-after-hard. Total **582/582** (+5 eval). Phase 7 fully ticked.
- **2026-05-14 hot-reload call-dispatch semantics verified** — Tests-only iteration: no implementation change, just six new eval tests that nail down the Erlang semantics already implicit in the current code. (1) `M:F()` after reload returns v2's value (cross-module call hits `:current`). (2) Inside a freshly-loaded body, a bare local call resolves through the new mod-env so a chain `a() -> b()` reflects v2's `b/0`. (3) Calling a fun captured BEFORE reload, whose body uses a local call, returns the v1 value (closure pinned to old mod-env via `er-mk-fun`'s `:env` reference). (4) Calling a fun captured BEFORE reload, whose body uses a cross-module call `M:b()`, returns v2's value (cross-module always wins over closed-over env). (5) Two captured funs from two distinct vintages stay independent — F1() + F2() = 10 + 20 = 30. (6) The slot version counter still bumps even while old captured funs are alive, demonstrating the closure-pinning doesn't block reloads. The "running process finishes its current function with the version it started with" property falls out of fun-as-closure semantics for free — there's no special bookkeeping. Total **577/577** (+6 eval).
- **2026-05-14 code introspection BIFs green** — `code:which/1`, `code:is_loaded/1`, `code:all_loaded/0` added to `er-apply-code-bif` dispatch with three small implementations in `transpile.sx`. `which` and `is_loaded` are dict-lookups on the module registry returning the loaded-marker (atom `loaded`) or the missing-marker (atom `non_existing` for which, atom `false` for is_loaded). Since we don't have a filesystem path representation, the standard `{file, Path}` shape for `is_loaded` becomes `{file, loaded}` — same tuple arity so destructuring code stays portable. `all_loaded` iterates `(keys (er-modules-get))` in reverse (so the result list preserves insertion order after the cons-prepend loop), wrapping each name in a `{Module, loaded}` tuple. **10 new eval tests**: non_existing for absent / loaded after load for which; missing / file-tag / loaded-value for is_loaded; empty / count-after-2-loads / first-entry-tag for all_loaded; badarg for both single-arg BIFs. Two of the all_loaded tests needed an explicit `(er-modules-reset!)` before the measurement because prior tests in the suite leave modules registered (the registry is process-global across the whole epoch session). Total **571/571** (+10 eval).
- **2026-05-14 code:purge/1 + code:soft_purge/1 green** — Two new BIFs in `transpile.sx`: `er-bif-code-purge` and `er-bif-code-soft-purge`, both dispatched through the existing `er-apply-code-bif` cond chain. Shared helper `er-procs-on-env` walks `(er-sched-processes)` and collects pids whose `:initial-fun` is a fun whose `:env` is identical (dict-identity, not structural) to a given env, filtering out already-dead procs. `er-bif-code-purge` looks up the module slot, returns `false` if either the module isn't registered or `:old` is nil; otherwise calls `er-cascade-exit!` on every matching pid with reason `killed`, replaces the slot with a fresh `er-mk-module-slot` that has `:old nil` (current + version preserved), returns `true`. `er-bif-code-soft-purge` returns `true` (treating "no module" / "no old version" as already-purged), else checks for lingering procs and returns `false` (leaving the slot untouched) if any, else clears `:old` and returns `true`. Non-atom Mod raises `error:badarg` from both. **10 new eval tests**: unknown / no-old / after-reload / idempotent for purge; unknown / no-old / clean for soft_purge; badarg for both; one "purge after spawn" test verifying return value (does NOT exercise the kill path — see caveat in plan). Total **561/561** (+10 eval). Implementation cost: 1 dispatch entry, 3 small BIFs, no scheduler changes.
- **2026-05-14 code:load_binary/3 green** — Canonical hot-reload entry point. Adds a `"code"` module branch to `er-apply-remote-bif`'s dispatch; new helpers `er-source-walk-bytes!` and `er-source-to-string` coerce any of {SX string, Erlang binary `<<...>>`, Erlang char-code cons list} to an SX source string before parsing. `er-bif-code-load-binary` is the BIF itself: validates `Mod` is an atom (`{error, badarg}` else), coerces source (`{error, badarg}` on unrecognised shape), wraps `erlang-load-module` in `guard` to convert parse failures into `{error, badfile}`, checks the parsed `-module(Name).` matches the BIF's first arg (`{error, module_name_mismatch}` else), returns `{module, Mod}`. Reload reuses the Phase-7 slot logic from the previous iteration so calling `code:load_binary(m, _, v2_source)` after `code:load_binary(m, _, v1_source)` bumps the slot to version 2 with v1 sitting in `:old`. 8 new eval tests: ok-tag/ok-name on first load, immediate cross-module call hits new env, reload-and-call returns v2 result, name-mismatch errors with both tag and reason, garbage source yields badfile, non-atom Mod is badarg. Total **551/551** (+8 eval). `code:load_file/1` deferred until `file:read_file/1` lands in Phase 8 (it's just a wrapper that reads bytes from disk then calls `load_binary`).
- **2026-05-14 Phase 7 module-version slot landed** — `er-modules` entries are now `{:current MOD-ENV :old MOD-ENV-or-nil :version INT :tag "module"}` instead of bare mod-env dicts. New helpers in `runtime.sx`: `er-mk-module-slot`, `er-module-current-env`, `er-module-old-env`, `er-module-version`. `erlang-load-module` updated: first load creates a slot with `:version 1` and `:old nil`; subsequent loads of the same module name copy `:current` into `:old` and increment `:version` (bump-and-shift, single-old-version retention as per OTP semantics). `er-apply-user-module` now reads via `er-module-current-env` so cross-module calls always hit the latest version. 13 new runtime tests (mostly in `tests/runtime.sx`): slot constructor + accessors, registry-after-first-load (v1, old nil), registry-after-second-load (v2, old = previous current env identity, current = new env), v3 on triple-load, registry-reset clears. Total **543/543** (was 530/530). Note: sx-tree path-based MCP tools (`sx_replace_node`, `sx_read_subtree`) are broken in this worktree's `mcp_tree.exe` (every path returns/replaces form 0); edits applied via a Python script then `sx_validate`d. Pattern-based tools (`sx_find_all`, `sx_rename_symbol`) still work fine.
- **2026-05-14 Phase 7 + Phase 8 scoped** — Plan extended with two new phases driven by fed-sx (see `plans/fed-sx-design.md` §17.5). Phase 7 brings hot code reload back in scope (was previously listed as out-of-scope): module versioning slot, `code:load_file/1`/`purge/1`/`soft_purge/1`/`which/1`/`is_loaded/1`, cross-module calls hitting current, local calls keeping start-time semantics until function returns. Phase 8 introduces a runtime-extensible **FFI BIF registry** that replaces today's hardcoded `er-apply-bif`/`er-apply-remote-bif` cond chains, plus a term-marshalling layer and concrete BIFs for `crypto:hash`, `cid:from_bytes`/`to_string`, `file:read_file`/`write_file`/`list_dir`/`delete`, `httpc:request`, `sqlite:open`/`exec`/`query`. Scope decisions header updated accordingly. Baseline 530/530 unchanged; no code touched this iteration.
- **2026-04-25 BIF round-out — Phase 6 complete, full plan ticked** — Added 18 standard BIFs in `lib/erlang/transpile.sx`. **erlang module:** `abs/1` (negates negative numbers), `min/2`/`max/2` (use `er-lt?` so cross-type comparisons follow Erlang term order), `tuple_to_list/1`/`list_to_tuple/1` (proper conversions), `integer_to_list/1` (returns SX string per the char-list shim), `list_to_integer/1` (uses `parse-number`, raises badarg on failure), `is_function/1` and `is_function/2` (arity-2 form scans the fun's clause patterns). **lists module:** `seq/2`/`seq/3` (right-fold builder with step), `sum/1`, `nth/2` (1-indexed, raises badarg out of range), `last/1`, `member/2`, `append/2` (alias for `++`), `filter/2`, `any/2`, `all/2`, `duplicate/2`. 40 new eval tests with positive + negative cases, plus a few that compose existing BIFs (e.g. `lists:sum(lists:seq(1, 100)) = 5050`). Total suite **530/530** — every checkbox in `plans/erlang-on-sx.md` is now ticked.
- **2026-04-25 ETS-lite green** — Scheduler state gains `:ets` (table-name → mutable list of tuples). New `er-apply-ets-bif` dispatches `ets:new/2` (registers table by atom name; rejects duplicate name with `{badarg, Name}`), `insert/2` (set semantics — replaces existing entry with the same first-element key, else appends), `lookup/2` (returns Erlang list — `[Tuple]` if found else `[]`), `delete/1` (drop table), `delete/2` (drop key; rebuilds entry list), `tab2list/1` (full list view), `info/2` with `size` only. Keys are full Erlang terms compared via `er-equal?`. 13 new eval tests: new return value, insert true, lookup hit + miss, set replace, info size after insert/delete, tab2list length, table delete, lookup-after-delete raises badarg, multi-key aggregate sum, tuple-key insert + lookup, two independent tables. Total suite 490/490.
- **2026-04-25 binary pattern matching green** — Parser additions: `<<...>>` literal/pattern in `er-parse-primary`, segment grammar `Value [: Size] [/ Spec]` (Spec defaults to `integer`, supports `binary` for tail). Critical fix: segment value uses `er-parse-primary` (not `er-parse-expr-prec`) so the trailing `:Size` doesn't get eaten by the postfix `Mod:Fun` remote-call handler. Runtime value: `{:tag "binary" :bytes (list of int 0-255)}`. Construction: integer segments emit big-endian bytes (size in bits, must be multiple of 8); binary-spec segments concatenate. Pattern matching consumes bytes from a cursor at the front, decoding integer segments big-endian, capturing `Rest/binary` tail at the end. Whole-binary length must consume exactly. New BIFs: `is_binary/1`, `byte_size/1`. Binaries participate in `er-equal?` (byte-wise) and format as `<<b1,b2,...>>`. 21 new eval tests: tag/predicate, byte_size for 8/16/32-bit segments, single + multi segment match, three 8-bit, tail rest size + content, badmatch on size mismatch, `=:=` equality, var-driven construction. Total suite 477/477.
@@ -131,4 +251,24 @@ _Newest first._
## Blockers
- _(none yet)_
- **Phase 10a — opcode emission requires `lib/compiler.sx` (out of scope)** (2026-05-15). Architecture fully traced this iteration: the OCaml JIT (`sx_vm.ml` `jit_compile_lambda`, ref-set at line 1206) invokes the SX-level `compile` from **`lib/compiler.sx`** via the CEK machine; that is the sole SX→bytecode producer. Erlang's hot helpers (`er-match-tuple`, `er-bif-*`, …) are SX functions in `transpile.sx` that get JIT-compiled through this path. To emit `erlang.OP_*` they must be recognized as intrinsics inside `compiler.sx`'s `compile-call` (the file's own docstring already anticipates this: "Compilers call `extension-opcode-id` to emit extension opcodes" — designed, not yet implemented). `lib/compiler.sx` is **lib-root**, excluded by the ground rules ("Don't edit lib/ root") and absent from the widened `lib/erlang/** + hosts/ocaml/** (extension only)` scope — editing it changes every guest language's JIT, so it must be owned by a shared-compiler session, not this loop. **Fix path:** that session implements 10a.1 (intrinsic registry in `compiler.sx`) + 10a.2 (`compile-call` emits the opcode when registered & `extension-opcode-id` non-nil, else generic CALL). Erlang's BIF handlers (10b, ids 230-239, all real) light up the instant emission exists — zero further work here. The control opcodes (222-229) additionally need 10a.3 (operand contract) + OCaml↔SX runtime-state bridging (Erlang scheduler/mailbox live in `lib/erlang/runtime.sx`, not OCaml).
- **Phase 9g — Perf bench gated on 9a** (2026-05-14). The conformance half of 9g (709/709 with stub VM loaded) is satisfied; the perf-bench half requires 9a's bytecode compiler to actually emit the new opcodes at hot call sites. Until then a benchmark would measure today's `er-bif-*` / `er-match-*` numbers unchanged (since the stub handlers wrap them 1-to-1). Re-fire 9g after 9a lands.
- **Phase 9a — Opcode extension mechanism** — **RESOLVED 2026-05-15.** User widened scope to include hosts/ (merging back anyhow). Cherry-picked vm-ext phases A-E + force-linked `Sx_vm_extensions` into sx_server.exe. `extension-opcode-id` live; conformance 709/709. Remaining integration work (erlang_ext.ml + wiring the SX stub dispatcher to consult real ids) tracked as ordinary in-scope checkboxes now, not blockers.
- **RESOLVED (2026-05-18) — SX runtime now exposes the platform
primitives Phase 8 BIFs need.** Delivered by `loops/fed-prims`
(see `plans/fed-sx-host-primitives.md` Handoff). Pure-OCaml,
WASM-safe except `http-listen` (native only). Wire Phase 8 BIFs:
- `crypto:hash/2``crypto-sha256` / `crypto-sha512` /
`crypto-sha3-256` (each `(bytes) -> hex-string`).
- `cid:from_bytes/1``cid-from-bytes` `(codec mh-bytes)`;
`cid:to_string/1` / canonical CID → `cid-from-sx` `(value)`;
dag-cbor via `cbor-encode` / `cbor-decode`.
- signature verify → `ed25519-verify` `(pk msg sig)` and
`rsa-sha256-verify` `(spki msg sig)` — both total (→ false).
- `file:list_dir/1``file-list-dir` `(path) -> (list string)`.
- fed-sx transport → `http-listen` `(port handler)` (native only).
Still deferred (leave blocked): `httpc` (HTTP client, v2) and
`sqlite-*` (v2 indexes) — not provided by fed-prims.

2638
plans/fed-sx-design.md Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,290 @@
# fed-sx host primitives — `hosts/ocaml/`
The single blocker between Erlang Phase 8 (FFI mechanism — done) and starting
fed-sx Milestone 1: the SX OCaml runtime exposes no crypto / CID / HTTP host
primitives for the Phase 8 BIF wrappers to call. This plan adds exactly that
surface, pure-OCaml where it must stay WASM-safe, native-only where it can't.
Reference: `plans/fed-sx-milestone-1.md` (build steps 1-8),
`plans/erlang-on-sx.md` Blockers ("SX runtime lacks platform primitives …").
## The hard constraint — WASM boundary
`hosts/ocaml/lib/` is the `sx` library. `hosts/ocaml/browser/dune` links it
with `(modes byte js wasm)`. **Anything added to `lib/sx_primitives.ml` must
compile under `js_of_ocaml` AND `wasm_of_ocaml`.** Therefore:
- **Pure OCaml only** for hash / CBOR / CID / Ed25519 / RSA. No `digestif`,
no `mirage-crypto`, no C stubs, no `Unix` dependency in these primitives.
(None of those libs are even installed — the switch has only
re/unix/yojson/otfm/js_of_ocaml. Pure OCaml is both required and hermetic.)
- **HTTP server is native-only**: it needs sockets/threads. Register it in
`bin/sx_server.ml` via `Sx_primitives.register` (precedent: `eval-in-env` at
`bin/sx_server.ml:721`), **not** in the shared lib. It must never enter the
WASM build.
- **`file-list-dir`** uses `Sys.readdir` (stdlib, WASM-stubbed) — safe in lib,
but the fed-sx server is native anyway; native registration is acceptable too.
**Every phase must prove the WASM build still links** (`sx_build target="wasm"`
or `bash hosts/ocaml/browser/test_boot.sh`) before its commit. A broken WASM
browser kernel is a hard regression and fails the phase.
## Primitive surface (what fed-sx Milestone 1 actually needs)
Mapped to `plans/fed-sx-milestone-1.md` build steps:
| Primitive (SX name) | Signature | fed-sx step | Host |
|---|---|---|---|
| `crypto-sha256` | `(bytes) -> hex-string` | 1, 2 | lib (pure) |
| `crypto-sha512` | `(bytes) -> hex-string` | 2 | lib (pure) |
| `crypto-sha3-256` | `(bytes) -> hex-string` | 1 (CID default) | lib (pure) |
| `cbor-encode` | `(sx-value) -> bytes` (dag-cbor, deterministic) | 1 | lib (pure) |
| `cbor-decode` | `(bytes) -> sx-value` | 1 (round-trip tests) | lib (pure) |
| `cid-from-bytes` | `(codec multihash-bytes) -> cid-string` | 1 | lib (pure) |
| `cid-from-sx` | `(sx-value) -> cid-string` (canonicalize→cbor→sha→mh→cidv1) | 1 | lib (pure) |
| `ed25519-verify` | `(pubkey-32 msg sig-64) -> bool` | 2 | lib (pure) |
| `rsa-sha256-verify` | `(der-spki msg sig) -> bool` (PKCS#1 v1.5) | 2 | lib (pure) |
| `file-list-dir` | `(path) -> (list string)` | 3 | lib/native |
| `http-listen` | `(port handler-fn) -> never` (handler: req-dict→resp-dict) | 8 | **native only** |
Deferred (not Milestone 1): `httpc-request` (HTTP client — federation is v2),
`sqlite-*` (Milestone 1 is file-on-disk; sqlite is v2 indexes).
## Registration pattern (established)
`lib/sx_primitives.ml`:
```ocaml
register "crypto-sha256" (fun args ->
match args with
| [String s] -> String (Sha2.sha256_hex s)
| _ -> raise (Eval_error "crypto-sha256: (bytes)"))
```
Errors: `raise (Eval_error "name: shape")`. Byte strings are OCaml `string`
(SX `String`). Lists are `Pair`/`Nil` per `sx_types.ml`. Native-only prims go in
`bin/sx_server.ml` the same way.
## Phasing — one feature per loop iteration
Dependency order. Each phase: implement → `dune build` (ocaml) → **WASM build
check** → tests → commit → tick box → Progress-log line → push.
### Phase A — SHA-2 (sha256 + sha512), pure OCaml ✅ DONE
- New `lib/sx_sha2.ml` (or inline in primitives if small): SHA-256 + SHA-512.
- Primitives `crypto-sha256`, `crypto-sha512` → lowercase hex string.
- Tests (`bin/run_tests.ml` or a dedicated `bin/test_crypto.ml`): NIST vectors —
`""`, `"abc"`, the 896-bit message, a 1MB "a" repetition.
- sha256("") = `e3b0c442…b7852b855`; sha256("abc") = `ba7816bf…f20015ad`
- sha512("abc") = `ddaf35a1…2a9ac94f…`
- **Acceptance:** vectors pass; WASM build links; OCaml conformance unchanged.
### Phase B — SHA-3 / Keccak-256, pure OCaml ✅ DONE
- Keccak-f[1600] + SHA3-256 padding. Primitive `crypto-sha3-256`.
- Tests: sha3-256("") = `a7ffc6f8…0f8434a`; sha3-256("abc") = `3a985da7…11431532`.
- **Acceptance:** NIST SHA-3 vectors pass; WASM links.
### Phase C — dag-cbor encoder + decoder, pure OCaml ✅ DONE
- RFC 8949 deterministic subset (RFC 8742 dag-cbor): unsigned/negative ints,
byte strings, text strings, arrays, maps with **keys sorted by
length-then-bytewise**, bool, null, tag 42 (CID link). No floats unless a
fed-sx shape needs them (defer; document).
- SX↔CBOR mapping: `Integer`→int, `String`→text str, `Bool`, `Nil`→null,
`Pair/Nil`→array, `Dict`→map (sorted keys), keyword/symbol→text str.
- Primitives `cbor-encode`, `cbor-decode`. Round-trip property tests + RFC 8949
appendix-A vectors + a "reordered dict keys → identical bytes" determinism test.
- **Acceptance:** vectors + round-trip + determinism pass; WASM links.
### Phase D — CID computation, pure OCaml ✅ DONE
- Multihash (sha2-256 = 0x12, sha3-256 = 0x16; varint code + varint len + digest).
- CIDv1 = `0x01 || codec-varint || multihash`. Codecs: dag-cbor 0x71, raw 0x55.
- Multibase base32 lower (`b` prefix, RFC 4648 no-pad).
- Primitives `cid-from-bytes` (codec, raw mh bytes), `cid-from-sx`
(canonicalize → cbor-encode → sha2-256 → multihash → cidv1 → base32).
- Tests: known IPFS CIDs — cross-check against `ipfs` CLI if present, else the
fixed vectors for `{}` dag-cbor and `"abc"` raw (hardcode expected strings).
Determinism: same SX value (whitespace/comment/key-order variants) → same CID.
- **Acceptance:** matches reference CIDs; determinism holds; WASM links. Satisfies
fed-sx Milestone 1 Step 1.
### Phase E — Ed25519 verify, pure OCaml ✅ DONE
- Curve25519/edwards25519 field arith (mod 2^255-19), point decompress,
SHA-512-based verify per RFC 8032 §5.1.7. (Reuse Phase A sha512.)
- Primitive `ed25519-verify (pubkey msg sig) -> bool`. Bad-length args → false,
not exception (verify is total).
- Tests: RFC 8032 §7.1 vectors (TEST 1-4 + the 1024-byte one). Tampered msg/sig
→ false. Wrong-length key → false.
- **Acceptance:** all RFC 8032 vectors pass; WASM links. Satisfies fed-sx Step 2
(Ed25519 sig-suite).
### Phase F — RSA-SHA256 verify (PKCS#1 v1.5), pure OCaml ✅ DONE
- Minimal pure-OCaml bignum (only need modexp + DER parse). Parse SPKI DER →
(n, e). RSASSA-PKCS1-v1_5 verify with SHA-256 (Phase A).
- Primitive `rsa-sha256-verify (der-spki msg sig) -> bool`.
- Tests: a generated 2048-bit keypair's signature (vectors hardcoded in the test
from a one-off openssl run, documented in a comment), tamper → false.
- **Acceptance:** vector verifies; tamper fails; WASM links. Satisfies fed-sx
Step 2 (rsa-sha256-2018 sig-suite). **Lower priority** than E — Ed25519 is the
modern default; RSA can land after the HTTP phase if time-boxed.
### Phase G — `file-list-dir`, native-safe ✅ DONE
- `Sys.readdir` → sorted SX list of names (no `.`/`..`). Errors → `enoent`/
`enotdir` classified like the existing `file-read` error mapping.
- Tests: list a known dir, missing dir → error, file-not-dir → error.
- **Acceptance:** passes; WASM build still links (Sys.readdir is stubbed there).
Satisfies fed-sx Step 3 segment replay.
### Phase H — HTTP/1.1 server, **native-only** (`bin/sx_server.ml`) ✅ DONE
- Minimal threaded HTTP/1.1: accept loop (`Unix` + `Thread`), parse request
line + headers + body (Content-Length), build an SX request dict
`{:method :path :query :headers :body}`, call the SX handler callable, take an
SX response dict `{:status :headers :body}`, write it. Connection: close
(keep-alive optional, defer). Bind `127.0.0.1:<port>`.
- Primitive `http-listen (port handler) -> never-returns` registered ONLY in
`bin/sx_server.ml`. Document that it is absent from the WASM kernel.
- Tests: `bin/test_http.sh` — start a server on a port with a tiny SX echo
handler in a subprocess, `curl` GET/POST/404/headers, assert responses, kill.
- **Acceptance:** curl test script green; WASM build untouched (prim not in lib).
Satisfies fed-sx Step 8 transport.
### Phase I — handoff ✅ DONE
- Flip the `plans/erlang-on-sx.md` Blockers entry "SX runtime lacks platform
primitives …" to **RESOLVED**, listing the exact SX primitive names so the
Erlang loop can one-line-wire its blocked Phase 8 BIFs (`crypto:hash/2`,
`cid:from_bytes/1`, `cid:to_string/1`, `file:list_dir/1`, plus note
`httpc`/`sqlite` still deferred). **Do not edit `lib/erlang/`** — that wiring
is the Erlang loop's job; this phase only updates the blocker text + this
plan's "Handoff" section with the primitive→BIF mapping.
- **Acceptance:** blocker text updated; fed-sx Milestone 1 Steps 1-3 + 8
prerequisites all green.
## Scope (hard)
- **Edit only:** `hosts/ocaml/lib/**`, `hosts/ocaml/bin/**`, this plan file.
- **Do NOT edit:** `lib/erlang/**` (Erlang loop owns BIF wiring), `spec/`,
`lib/` root, other `lib/<lang>/`, `plans/erlang-on-sx.md` *except* the one
Blockers entry in Phase I.
- **Pure OCaml for lib primitives.** No new opam deps. If a phase seems to need
one, stop and add a Blockers entry instead.
- **Prove WASM every phase.** No commit without `test_boot.sh` (or wasm build)
green.
- **Never push to `main` or `architecture`.** Branch `loops/fed-prims`, push
`origin/loops/fed-prims`.
- One feature per commit. Short factual messages: `fed-prims: SHA-256 + 4 NIST
vectors`. Tick the box, append a dated Progress-log line (newest first).
- **Never call `sx_build` with no timeout-awareness** — OCaml builds are slow;
use the MCP `sx_build target="ocaml"` / `target="wasm"` tools or
`dune build` with a generous timeout. If the build hangs >10min, Blockers +
stop.
## Build & test reference
```bash
cd hosts/ocaml && dune build bin/sx_server.exe 2>&1 | tail # native
bash hosts/ocaml/browser/test_boot.sh # WASM links + boots
cd hosts/ocaml && dune exec bin/run_tests.exe 2>&1 | tail # OCaml unit tests
SX_SERVER=hosts/ocaml/_build/default/bin/sx_server.exe \
timeout 400 bash lib/erlang/conformance.sh 2>&1 | tail -3 # no-regression gate
```
A primitive is reachable from SX via the epoch protocol:
```bash
printf '(epoch 1)\n(crypto-sha256 "abc")\n' | \
hosts/ocaml/_build/default/bin/sx_server.exe
```
## Handoff (Phase I fills this in)
| SX primitive | Erlang Phase 8 BIF it unblocks |
|---|---|
| `crypto-sha256` / `crypto-sha512` / `crypto-sha3-256` | `crypto:hash/2` |
| `cid-from-bytes` / `cid-from-sx` | `cid:from_bytes/1`, `cid:to_string/1` |
| `ed25519-verify` / `rsa-sha256-verify` | `crypto:verify` / sig-suites |
| `file-list-dir` | `file:list_dir/1` |
| `http-listen` | fed-sx kernel `http:listen/2` (Milestone 1 Step 8) |
**Status: DELIVERED (Phases AH, 2026-05-18).** All primitives are
registered and reachable from SX (`(eval "(crypto-sha256 \"abc\")")`
via the epoch protocol). Signatures the Erlang loop can one-line-wire:
- `(crypto-sha256 bytes) -> hex-string` — also `crypto-sha512`,
`crypto-sha3-256`. lib (`Sx_sha2`/`Sx_sha3`), WASM-safe.
- `(cbor-encode value) -> bytes` / `(cbor-decode bytes) -> value` —
deterministic dag-cbor, lib (`Sx_cbor`), WASM-safe.
- `(cid-from-bytes codec mh-bytes) -> cid-string` /
`(cid-from-sx value) -> cid-string` — lib (`Sx_cid`), WASM-safe.
- `(ed25519-verify pk msg sig) -> bool` /
`(rsa-sha256-verify spki msg sig) -> bool` — total (bad input →
false), lib (`Sx_ed25519`/`Sx_rsa`), WASM-safe.
- `(file-list-dir path) -> (list string)` — sorted, lib, WASM-stubbed.
- `(http-listen port handler) -> never` — **NATIVE ONLY**
(`bin/sx_server.ml`); absent from the WASM kernel by design.
Still **deferred** (not Milestone 1, not provided here): `httpc-request`
(HTTP client / federation v2), `sqlite-*` (v2 indexes). The Erlang loop
should leave `httpc`/`sqlite` BIFs blocked with that note.
## Progress log
_Newest first._
- 2026-05-18 — Phase I: handoff. `erlang-on-sx.md` Blockers gained one
RESOLVED entry (no "SX runtime lacks…" entry pre-existed; it read
"_(none yet)_") mapping every delivered primitive → its Phase 8 BIF,
with httpc/sqlite explicitly left deferred. Handoff section here
filled with signatures + native/WASM notes. Doc-only (no lib/erlang/
edits); Erlang 530/530 unchanged. **fed-sx Milestone 1 Steps 1-3 + 8
prerequisites all green — plan complete (Phases AI done).**
- 2026-05-18 — Phase H: `http-listen` primitive in `bin/sx_server.ml`
(NATIVE ONLY — Unix sockets + Thread per connection, Mutex around
the shared-runtime handler call; HTTP/1.1, Connection: close;
req {:method :path :query :headers :body} → resp {:status :headers
:body}). Test `bin/test_http.sh`: curl GET+query / POST+body / 404
/ custom header — 6/6. NOT in lib, so WASM kernel untouched (boot
green); run_tests 4897 unchanged; Erlang 530/530. Satisfies fed-sx
Milestone 1 Step 8 transport.
- 2026-05-18 — Phase G: `file-list-dir` primitive in
`lib/sx_primitives.ml` (Sys.readdir → sorted names, no "."/"..";
Sys_error prefixed like file-read, msg carries enoent/enotdir).
4 tests: sorted listing, missing dir, not-a-dir, arity. WASM boot
green (Sys.readdir stubbed there); Erlang 530/530; run_tests +4.
Satisfies fed-sx Step 3 segment replay.
- 2026-05-18 — Phase F: pure-OCaml `lib/sx_rsa.ml` (self-contained
bignum modexp, minimal DER SPKI reader, RFC 8017 §8.2.2 PKCS#1
v1.5 verify with SHA-256 DigestInfo prefix). Primitive
`rsa-sha256-verify` total. 5 tests on a fixed RSA-2048 vector
(one-off python-cryptography keygen, hardcoded): valid, tampered
msg/sig, garbage SPKI, non-string. WASM boot green with new lib
module; Erlang 530/530; run_tests +5. Satisfies fed-sx Step 2
(rsa-sha256-2018 sig-suite).
- 2026-05-18 — Phase E: pure-OCaml `lib/sx_ed25519.ml` (minimal
base-2^26 bignum, edwards25519 extended-coord points, RFC 8032
§5.1.7 cofactorless verify reusing Phase-A sha512). Primitive
`ed25519-verify` is total (bad/short/non-string args → false).
8 tests: RFC 8032 §7.1 TEST 1-3 (re-derived independently via
python-cryptography), tampered msg/sig, wrong-length, non-string.
WASM boot green with new lib module; Erlang 530/530; run_tests +8.
Satisfies fed-sx Milestone 1 Step 2 (Ed25519 sig-suite).
- 2026-05-18 — Phase D: pure-OCaml `lib/sx_cid.ml` (unsigned-varint,
multihash, CIDv1, multibase base32-lower), primitives `cid-from-bytes`
/ `cid-from-sx` (cbor→sha2-256→mh→cidv1, dag-cbor codec 0x71). 5 tests:
raw "abc"=bafkreif2pall7d…, raw ""=bafkreihdwdcefg…, dag-cbor {}=
bafyreigbtj4x7i… (all match canonical IPFS CIDs; no `ipfs` CLI so
vectors independently derived in Python), key-order determinism. WASM
boot green with new lib module; Erlang 530/530; run_tests +5.
- 2026-05-18 — Phase C: pure-OCaml `lib/sx_cbor.ml` (dag-cbor encode/
decode), primitives `cbor-encode`/`cbor-decode`. RFC 8949 Appendix-A
vectors, length-then-bytewise key sort + order-invariance determinism,
decode∘encode round-trip (30 tests). Floats unsupported (raise, no
fed-sx shape needs them); tag-42 decode = inner-item passthrough.
WASM boot green with new lib module; Erlang 530/530; run_tests +30.
- 2026-05-18 — Phase B: pure-OCaml `lib/sx_sha3.ml` (Keccak-f[1600] +
SHA-3 pad, domain 0x06), primitive `crypto-sha3-256`. 4 NIST FIPS 202
vectors pass (empty/abc/896-bit + 1600-bit 0xa3 multi-block). WASM boot
green with new lib module; Erlang conformance 530/530; run_tests +4.
- 2026-05-18 — Phase A: pure-OCaml `lib/sx_sha2.ml` (SHA-256 + SHA-512),
primitives `crypto-sha256`/`crypto-sha512`. 7 NIST FIPS 180-4 vectors pass
(empty/abc/896-bit/1M-'a' for sha256; empty/abc/896-bit for sha512). WASM
boot green with new lib module; Erlang conformance 530/530 unchanged.
## Blockers
- _(none yet)_

922
plans/fed-sx-milestone-1.md Normal file
View File

@@ -0,0 +1,922 @@
# fed-sx Milestone 1 — Kernel + Registries + Pin Smoke Test
Concrete implementation plan for the smallest fed-sx that proves the architecture
works end-to-end. Reference: `plans/fed-sx-design.md`. Prerequisite: Erlang-on-SX
Phases 7 (hot reload) + 8 (FFI BIFs).
## Goal
Ship a single-instance, single-actor fed-sx server that:
1. Boots from a verified genesis bundle.
2. Accepts and durably appends signed activities via `POST /activity`.
3. Folds them into projections in real time.
4. Serves AP-standard endpoints (actor, outbox, artifacts, capabilities).
5. Demonstrates **two extensibility proof-points** end-to-end with zero kernel
code changes between definition and use:
- **Verb extensibility** (§5 meta-level): publish `DefineActivity{Pin}` +
`DefineProjection{pin-state}`, then publish a `Pin` activity, observe it
validated and projected.
- **Reactive application extensibility** (§§18-19): publish
`DefineSubscription{Topic}` + `Subscribe{topic: smoketest}` +
`DefineTrigger{when: that subscription, then: publish TestEcho}`, then
publish a tagged Note, observe the subscription match, the trigger fire,
and the derived activity appear in the outbox.
Federation, multi-actor, advanced verbs, IPFS, browser UI, operator dashboard
are **explicitly v2**.
## Non-goals (what milestone 1 deliberately does NOT do)
- **Federation.** No `POST /inbox` from peers, no `Follow`, no delivery queue, no
webfinger discovery flow. Single instance only.
- **Multi-actor.** Single domain actor (`acct:next@next.rose-ash.com`).
- **IPFS / S3 storage backends.** Files on disk only.
- **Advanced verbs.** No `Endorse`, `Supersede`, `Test`, `Build`, `Compose`,
`Note`, `Announce`. Only the four bootstrap verbs (`Create`, `Update`, `Delete`)
plus a defined-from-the-log `Pin` for the smoke test. (`Announce` deferred —
no use case until federation exists.)
- **Browser UI.** Curl-shaped API only.
- **Operator dashboard, quarantine UX.** Logs only.
- **Performance work.** Functional correctness first; perf when measured.
- **Cross-host conformance test corpus.** Only the OCaml/Erlang-on-SX host runs
fed-sx in v1; conformance suite for other hosts is v2.
## Architecture summary
```
POST /activity
┌──────────────────────────┐
│ HTTP server (Erlang-on-SX)│
└─────────────┬─────────────┘
┌─────────────▼──────────────┐
│ Validation pipeline driver │
│ (envelope→sig→schema→...) │
└─────────────┬──────────────┘
┌─────────────▼──────────────┐
│ Log append (JSONL segment) │ ← canonical
└─────────────┬──────────────┘
┌─────────────▼──────────────┐
│ Projection workers │ ← gen_server per
│ (fold scheduler) │ projection
└─────────────────────────────┘
Projection state
(queryable via HTTP)
Native primitives (Erlang-on-SX BIFs from Phase 8):
crypto:* cid:* fs:* http:* sqlite:*
Genesis bundle (binary-embedded SX):
activity-types object-types projections
validators codecs sig-suites
```
## Build order
Eight steps in dependency order. Each step has concrete deliverables, testable
in isolation, and a clear acceptance check.
| Step | Title | Depends on |
|------|-------|------------|
| **1** | Repo skeleton + canonical CID computation | Phase 8 (cid BIFs) |
| **2** | Activity envelope + signature verify | Phase 8 (crypto BIFs) |
| **3** | JSONL log + sequence numbers | Phase 8 (fs BIFs) |
| **4** | Genesis bundle (SX sources + bundling + CID verification) | Step 1 |
| **5** | Registry mechanism + bootstrap-projection dispatch | Steps 2, 4 |
| **6** | Validation pipeline driver + `POST /activity` | Steps 2, 3, 5 |
| **7** | Projection scheduler (gen_server per projection) | Steps 5, 6 |
| **8** | HTTP server, AP endpoints, projection queries | Steps 6, 7 |
| **9** | Smoke tests (Pin verb + reactive application) | Steps 1-8 |
---
## Step 1 — Repo skeleton + canonical CID
**Deliverables:**
```
next/
├── README.md # what this is
├── kernel/ # Erlang-on-SX
│ └── (empty for now)
├── genesis/ # core SX bootstrap definitions
│ └── (empty for now)
├── tests/ # smoke test scripts
│ └── (empty for now)
└── data/ # gitignored runtime state
├── log/
├── objects/
├── snapshots/
├── indexes/
└── keys/
```
Plus one Erlang-on-SX module:
```erlang
% next/kernel/cid.erl
-module(cid).
-export([from_sx/1, to_string/1, from_string/1, equals/2]).
from_sx(SxValue) ->
Cbor = cid:cbor_encode(canonicalize_sx(SxValue)),
Hash = crypto:sha2_256(Cbor),
cid:from_bytes(<<"raw">>, Hash). % defaults to dag-cbor codec
canonicalize_sx(V) -> ... % sorts dict keys, normalizes strings
```
**Tests:**
- Same SX value → same CID across multiple invocations.
- Different SX values → different CIDs.
- Whitespace/comment differences in source → identical CIDs (parsed AST identical).
- Reordered dict keys → identical CIDs (sorted-key canonicalization).
- Cross-host parity (just OCaml host for v1, but write the test so adding hosts is mechanical).
**Acceptance:** `bash next/tests/cid.sh` passes 10+ cases.
---
## Step 2 — Activity envelope + signature verify
**Deliverables:**
```erlang
% next/kernel/envelope.erl
-module(envelope).
-export([validate_shape/1, canonical_bytes/1, verify_signature/2]).
% Envelope shape per design §3.1:
% #{id, type, actor, published, to, cc, audience_extras,
% object | target | origin | result,
% capabilities_required, proofs, signature}
validate_shape(Activity) -> ok | {error, Reason}.
canonical_bytes(Activity) ->
% Strip signature, canonicalize via dag-cbor, return bytes for sig coverage
Stripped = maps:remove(signature, Activity),
cid:cbor_encode(canonicalize_for_sig(Stripped)).
verify_signature(Activity, ActorState) ->
% Time-aware: find key with id == sig.key_id that was active at published
% Per design §9.6
...
```
**Tests:**
- Envelope shape: required fields present (id, type, actor, published, signature)
- Envelope shape: type is a known activity-type or unknown-but-string
- Envelope shape: signature has key_id, algorithm, value
- Sig verify: valid RSA-SHA256 signature against published key → ok
- Sig verify: valid Ed25519 signature → ok
- Sig verify: tampered envelope → fail
- Sig verify: key superseded before activity timestamp → fail
- Sig verify: key superseded after activity timestamp → ok (historical valid)
**Acceptance:** `bash next/tests/envelope.sh` passes 15+ cases.
---
## Step 3 — JSONL log + sequence numbers
**Deliverables:**
```erlang
% next/kernel/log.erl
-module(log).
-export([open/1, append/2, read_segment/2, tip/1, replay/3]).
% Per design §15.2: per-actor outbox, segments cap ~64MB,
% format = JSONL (one canonical JSON-LD activity per line)
open(ActorId) ->
BasePath = log_path_for_actor(ActorId),
fs:mkdir_p(BasePath),
{ok, #{base => BasePath, current => current_segment(BasePath), seq => next_seq(BasePath)}}.
append(LogState, Activity) ->
Json = jsonld:encode(Activity),
Path = current_segment_path(LogState),
Line = <<Json/binary, "\n">>,
fs:append_file(Path, Line),
NewSeq = LogState#{seq := LogState.seq + 1},
rotate_if_needed(NewSeq).
% replay/3 calls Fun(Activity, Acc) for every activity in chronological order
replay(LogState, InitAcc, Fun) -> ...
```
**Tests:**
- Append + read back gives identical activity (round-trip).
- Sequence numbers monotonic and gap-free per actor.
- Segment rotation at size threshold.
- Replay visits all activities in append order across multiple segments.
- Restart preserves tip pointer (seq number resumes correctly).
- Concurrent appends (using gen_server-mediated access) are serialized correctly.
**Acceptance:** `bash next/tests/log.sh` passes 10+ cases.
---
## Step 4 — Genesis bundle
**Deliverables:**
Genesis bundle SX sources (per design §12.2). Each is a small SX file authored
by hand for the bootstrap set:
```
next/genesis/
├── manifest.sx # bundle root: lists all definitions
├── activity-types/
│ ├── create.sx # DefineActivity{name: "Create", ...}
│ ├── update.sx
│ └── delete.sx
├── object-types/
│ ├── sx-artifact.sx
│ ├── note.sx
│ ├── tombstone.sx
│ ├── define-activity.sx # DefineObject for the Define* meta types
│ ├── define-object.sx
│ ├── define-projection.sx
│ ├── define-validator.sx
│ ├── define-codec.sx
│ ├── define-sig-suite.sx
│ └── snapshot.sx
├── projections/
│ ├── activity-log.sx # identity projection
│ ├── by-type.sx
│ ├── by-actor.sx
│ ├── by-object.sx
│ ├── actor-state.sx
│ ├── define-registry.sx # the chicken-and-egg projection
│ └── audience-graph.sx
├── validators/
│ ├── envelope-shape.sx
│ ├── signature.sx
│ └── type-schema.sx
├── codecs/
│ ├── dag-cbor.sx # delegates to cid:cbor_encode/decode BIFs
│ ├── raw.sx
│ └── dag-json.sx
├── sig-suites/
│ ├── rsa-sha256-2018.sx
│ └── ed25519-2020.sx
└── audience/
├── public.sx
├── followers.sx
└── direct.sx
```
Plus a build-time bundler:
```erlang
% next/kernel/bootstrap.erl
-module(bootstrap).
-export([build_genesis/1, verify_genesis/1, load_genesis/1]).
build_genesis(SourceDir) ->
% Walk SourceDir, parse each .sx file, build a single dag-cbor bundle,
% compute its CID, write bundle.cbor + CID to data/genesis/
...
verify_genesis(BundlePath) ->
% Compute CID of the bundle as loaded; compare to expected (hardcoded
% in the kernel binary). Mismatch → halt.
...
load_genesis(BundlePath) ->
% Parse the bundle, register all definitions in the in-memory registry
...
```
**Tests:**
- All genesis SX files parse cleanly.
- Bundle CID is deterministic (rebuild same sources → same CID).
- Bundle reload reproduces the exact same registry state.
- Tampered bundle → `verify_genesis` returns `{error, cid_mismatch}`.
**Acceptance:** `bash next/tests/bootstrap.sh` passes; `next/data/genesis/bundle.cbor`
created with a known stable CID.
---
## Step 5 — Registry mechanism + bootstrap dispatch
**Deliverables:**
Registries are gen_servers, one per kind, each holding the active version map:
```erlang
% next/kernel/registry.erl
-module(registry).
-behaviour(gen_server).
-export([start_link/0, lookup/2, register/3, list/1]).
% Internal state:
% #{activity_types => #{Name => #{cid, schema_fn, semantics_fn, supersedes}},
% object_types => ...,
% projections => ...,
% validators => ...,
% codecs => ...,
% sig_suites => ...,
% ...}
lookup(Kind, Name) -> {ok, Entry} | {error, not_found}.
register(Kind, Name, Entry) -> ok | {error, Reason}.
list(Kind) -> [#{name, cid}].
```
The `define-registry` projection's fold updates this gen_server's state when
new `Define*` activities arrive. (Bootstrapping circle resolved: at startup,
`bootstrap:load_genesis/1` populates the registry directly; from then on, the
projection fold maintains it.)
**Tests:**
- After genesis load, `registry:list(activity_types)` returns Create/Update/Delete.
- `registry:lookup(activity_types, "Create")` returns the schema and semantics.
- A new `DefineActivity{name: "Pin"}` activity (synthesised, hand-signed for the
test) routes through the projection fold, ends up in the registry.
- Lookup never caches across activities (verified by introducing a new definition
mid-test and confirming the next lookup sees it).
**Acceptance:** `bash next/tests/registry.sh` passes 10+ cases.
---
## Step 6 — Validation pipeline + POST /activity
**Deliverables:**
```erlang
% next/kernel/pipeline.erl
-module(pipeline).
-export([validate_inbound/1, validate_outbound/1]).
% Per design §14, run stages in order, halt on first failure.
validate_inbound(Activity) ->
Stages = [
fun stage_envelope/1,
fun stage_signature/1,
fun stage_replay/1,
fun stage_audience/1,
fun stage_activity_schema/1,
fun stage_object_schema/1,
fun stage_content_validators/1,
fun stage_capabilities/1,
fun stage_trust/1
],
run_stages(Activity, Stages).
validate_outbound(Activity) ->
% Subset of inbound stages (no replay, no trust check; auth done at HTTP layer)
...
```
```erlang
% next/kernel/outbox.erl
-module(outbox).
-export([publish/2]).
publish(ActorId, ActivityRequest) ->
Activity = construct_envelope(ActorId, ActivityRequest),
Signed = sig:sign(Activity, ActorId),
case pipeline:validate_outbound(Signed) of
ok ->
log:append(actor_log(ActorId), Signed),
projection:async_fold(Signed),
{ok, #{cid => cid:from_sx(Signed),
ap_id => maps:get(id, Signed)}};
{error, Reason} ->
{error, Reason}
end.
```
**Tests:**
- Valid activity through full pipeline → appended to log.
- Bad envelope → 400, not in log.
- Bad signature → 401, not in log.
- Replayed activity → 200 duplicate, not re-appended.
- Schema violation (e.g. Create with no object) → 422.
- Activity logged before projection completes (async).
**Acceptance:** `bash next/tests/pipeline.sh` passes 15+ cases.
---
## Step 7 — Projection scheduler
**Deliverables:**
```erlang
% next/kernel/projection.erl
-module(projection).
-export([start_link/1, async_fold/1, query/2, snapshot/1]).
-behaviour(gen_server).
% One gen_server per active projection. State:
% #{cid, name, fold_fn, current_state, log_tip,
% snapshot_dir, last_snapshot_at}
% async_fold/1 broadcasts a new activity to every projection gen_server;
% each folds it into its own state. Failures (gas, sandbox violation)
% tag the activity but don't affect log durability.
% query/2 returns current state (or state-as-of)
% snapshot/1 forces a snapshot now (also runs periodically)
```
```erlang
% next/kernel/sandbox.erl
-module(sandbox).
-export([eval_pure/2, eval_crypto/2, eval_effectful/3]).
% eval_pure runs an SX function in pure mode: no IO platform, gas budget,
% deterministic. Used by projection folds, validators, audience predicates.
% Wrapper over the SX runtime evaluator with a stripped platform.
```
**Tests:**
- New activity → all projections fold it concurrently.
- Projection fold completes within gas budget.
- Gas-exhausting fold → activity tagged, projection state unchanged, no kernel crash.
- Sandbox violation (fold tries IO) → same handling.
- Snapshot create + reload → state matches.
- Snapshot CID stable across kernel restarts.
**Acceptance:** `bash next/tests/projection.sh` passes 15+ cases.
---
## Step 8 — HTTP server + endpoints
**Deliverables:**
Core endpoints (per design §16.1):
```
GET /actors/<id> # actor doc
GET /actors/<id>/outbox # OrderedCollection
GET /actors/<id>/outbox?page=true # OrderedCollectionPage
POST /activity # publish (auth: bearer token)
GET /artifacts/<cid> # CID-addressed artifact
GET /artifacts/<cid>/raw
GET /projections # list of projections
GET /projections/<name> # full state
GET /projections/<name>?at=<ts> # time-travel
GET /projections/<name>/<key> # indexed lookup
GET /define-registry
GET /.well-known/sx-capabilities
GET /.well-known/webfinger
```
```erlang
% next/kernel/http_server.erl
-module(http_server).
-export([start/1, route/1]).
start(Port) ->
http:listen(Port, fun ?MODULE:route/1).
route(Request) -> {Status, Headers, Body}.
```
Content negotiation per `Accept`:
- `application/activity+json` (default)
- `application/cbor` (dag-cbor)
- `application/json` (compact, no @context expansion)
- `application/sx`
Auth on `POST /activity`: bearer token from env var `NEXT_PUBLISH_TOKEN`.
**Tests:**
- Each endpoint returns expected shape for known artifact.
- Content negotiation: same artifact in 4 representations.
- 404 for unknown artifact CID.
- 401 for `POST /activity` without token.
- Pagination: outbox with > 50 activities returns OrderedCollectionPage.
**Acceptance:** `bash next/tests/http.sh` passes 20+ cases.
---
## Step 9 — Smoke tests
**The proof points.** Two end-to-end smoke tests demonstrate, between them, that
fed-sx is genuinely a substrate for distributed reactive applications expressed
as data — not a system you extend by writing kernel code.
- **9a — Pin smoke test (`next/tests/smoke_pin.sh`)** — verb extensibility:
defining a new activity type and projection at runtime via `Define*`
artifacts. Verifies the meta-level (§5).
- **9b — Reactive application smoke test (`next/tests/smoke_app.sh`)** —
application extensibility: defining a new subscription type, subscribing,
registering a trigger, and observing the full reactive loop fire end-to-end
without kernel code changes. Verifies §§18-19.
Both must pass for milestone 1 acceptance.
### Step 9a — Pin smoke test
**Test script:** `next/tests/smoke_pin.sh`
```bash
#!/usr/bin/env bash
set -euo pipefail
# 0. Start a fresh fed-sx kernel (background)
./next/scripts/start.sh fresh
sleep 2
TOKEN=$(cat next/data/keys/publish.token)
# 1. Verify actor exists
curl -s http://localhost:9999/actors/next | jq -e '.type == "Person"'
# 2. Verify outbox has actor's first Create{Person}
curl -s http://localhost:9999/actors/next/outbox?page=true \
| jq -e '.orderedItems | length == 1 and .[0].type == "Create"'
# 3. Verify Pin is NOT a known activity type
curl -s http://localhost:9999/define-registry?kind=activity_types \
| jq -e '.[] | select(.name == "Pin") | length == 0' || exit 1
# 4. Publish DefineActivity{name: "Pin", schema: ..., semantics: ...}
PIN_DEF=$(cat <<'JSON'
{
"type": "Create",
"object": {
"type": "DefineActivity",
"name": "Pin",
"schema": "(fn (act) (and (string? (-> act :object :path)) (cid? (-> act :object :cid))))",
"semantics": "(fn (state act) (assoc-in state [:pins (-> act :object :path)] (-> act :object :cid)))"
}
}
JSON
)
curl -s -X POST http://localhost:9999/activity \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/activity+json" \
-d "$PIN_DEF" | jq -e '.cid' > /dev/null
# 5. Verify Pin IS now a known activity type
curl -s http://localhost:9999/define-registry?kind=activity_types \
| jq -e '.[] | select(.name == "Pin") | length == 1'
# 6. Also publish a DefineProjection{name: "pin-state"} that folds Pin into state
PIN_PROJ=$(cat <<'JSON'
{
"type": "Create",
"object": {
"type": "DefineProjection",
"name": "pin-state",
"initial-state": "{}",
"fold": "(fn (state act) (if (= (:type act) \"Pin\") (assoc state (-> act :object :path) (-> act :object :cid)) state))"
}
}
JSON
)
curl -s -X POST http://localhost:9999/activity \
-H "Authorization: Bearer $TOKEN" \
-d "$PIN_PROJ" | jq -e '.cid'
# 7. Now publish a Pin activity
PIN=$(cat <<'JSON'
{
"type": "Pin",
"object": {
"type": "PinSpec",
"path": "/docs/intro",
"cid": "bafyreigh2akiscaildc3xqxx4xqxx4xqxx4xqxx4xqxx4xqxx4xqxx4xqxxe"
}
}
JSON
)
curl -s -X POST http://localhost:9999/activity \
-H "Authorization: Bearer $TOKEN" \
-d "$PIN" | jq -e '.cid'
# 8. Verify Pin appears in outbox
curl -s http://localhost:9999/actors/next/outbox?page=true \
| jq -e '.orderedItems | map(select(.type == "Pin")) | length == 1'
# 9. Verify pin-state projection has the entry
sleep 1 # allow async projection
curl -s http://localhost:9999/projections/pin-state \
| jq -e '."/docs/intro" == "bafyreigh2akiscaildc3xqxx4xqxx4xqxx4xqxx4xqxx4xqxx4xqxx4xqxxe"'
# 10. Negative test: publish a malformed Pin (missing path) → expect 422
BAD_PIN='{"type": "Pin", "object": {"cid": "bafy..."}}'
HTTP_STATUS=$(curl -s -o /dev/null -w "%{http_code}" -X POST http://localhost:9999/activity \
-H "Authorization: Bearer $TOKEN" -d "$BAD_PIN")
[[ "$HTTP_STATUS" == "422" ]] || { echo "expected 422, got $HTTP_STATUS"; exit 1; }
# 11. Restart kernel; verify state recovers
./next/scripts/stop.sh
./next/scripts/start.sh
sleep 2
curl -s http://localhost:9999/projections/pin-state \
| jq -e '."/docs/intro" == "bafyreigh2akiscaildc3xqxx4xqxx4xqxx4xqxx4xqxx4xqxx4xqxxe"'
echo "✓ Pin smoke test passed — verb extensibility demonstrated end-to-end"
```
**Acceptance for 9a:** smoke test exits 0. The whole flow happens with **zero
fed-sx kernel code changes** between defining the verb and using it.
### Step 9b — Reactive application smoke test
**The bigger proof point.** Demonstrates that fed-sx supports distributed
reactive applications composed of `DefineSubscription` + `DefineTrigger` +
`DefineProjection` — the application model from §§18-19.
The test runs on a single instance (federation is v2), so the "subscriber" and
"publisher" are the same actor. That's intentional — milestone 1 proves the
mechanism; milestone 2 spreads it across instances.
**Test script:** `next/tests/smoke_app.sh`
```bash
#!/usr/bin/env bash
set -euo pipefail
# Assumes 9a has already run (fresh kernel optional; can run alongside).
TOKEN=$(cat next/data/keys/publish.token)
BASE=http://localhost:9999
# 1. Verify "Topic" subscription type and "Subscribe" verb are NOT yet defined.
curl -s "$BASE/define-registry?kind=subscription_types" \
| jq -e 'map(select(.name == "Topic")) | length == 0'
# 2. Publish DefineSubscription{name: "Topic", ...}
TOPIC_DEF=$(cat <<'JSON'
{
"type": "Create",
"object": {
"type": "DefineSubscription",
"name": "Topic",
"schema": "(fn (sub) (string? (-> sub :tag)))",
"match": "(fn (sub act) (and (= (:type act) \"Note\") (member? (-> sub :tag) (or (-> act :object :tags) (list)))))",
"delivery": "{:default :push :modes (list :push :pull)}"
}
}
JSON
)
curl -s -X POST "$BASE/activity" \
-H "Authorization: Bearer $TOKEN" -d "$TOPIC_DEF" | jq -e '.cid'
# 3. Verify Topic IS now a known subscription type.
curl -s "$BASE/define-registry?kind=subscription_types" \
| jq -e 'map(select(.name == "Topic")) | length == 1'
# 4. Subscribe to the "smoketest" topic.
SUBSCRIBE=$(cat <<'JSON'
{
"type": "Subscribe",
"object": {"type": "Topic", "tag": "smoketest"}
}
JSON
)
SUB_CID=$(curl -s -X POST "$BASE/activity" \
-H "Authorization: Bearer $TOKEN" -d "$SUBSCRIBE" | jq -r '.cid')
# 5. Verify subscriptions projection has the new entry.
sleep 1
curl -s "$BASE/projections/subscriptions" \
| jq -e '.["https://next.rose-ash.com/actors/next"] | map(select(.type == "Topic")) | length == 1'
# 6. Define a projection that records matched activities (per-application
# namespace would happen via DefineApplication in v1.x; for v1 the
# projection is global to the actor).
TOPIC_PROJ=$(cat <<'JSON'
{
"type": "Create",
"object": {
"type": "DefineProjection",
"name": "topic-events",
"initial-state": "{}",
"fold": "(fn (state act) (if (and (= (:type act) \"Note\") (member? \"smoketest\" (or (-> act :object :tags) (list)))) (assoc-in state [(:cid act)] act) state))"
}
}
JSON
)
curl -s -X POST "$BASE/activity" \
-H "Authorization: Bearer $TOKEN" -d "$TOPIC_PROJ" | jq -e '.cid'
# 7. Define a trigger: when a Topic{smoketest} subscription matches, publish
# a TestEcho activity. We need an "Echo" activity type first.
ECHO_DEF=$(cat <<'JSON'
{
"type": "Create",
"object": {
"type": "DefineActivity",
"name": "TestEcho",
"schema": "(fn (act) (cid? (-> act :object :echoes)))",
"semantics": "(fn (state act) state)"
}
}
JSON
)
curl -s -X POST "$BASE/activity" \
-H "Authorization: Bearer $TOKEN" -d "$ECHO_DEF" | jq -e '.cid'
TRIGGER=$(cat <<JSON
{
"type": "Create",
"object": {
"type": "DefineTrigger",
"name": "echo-on-smoketest",
"when-subscription": "$SUB_CID",
"cascade-limit": 1,
"then": "(fn (act sub env) {:publish (list {:type \"TestEcho\" :object {:echoes (:cid act)}})})"
}
}
JSON
)
curl -s -X POST "$BASE/activity" \
-H "Authorization: Bearer $TOKEN" -d "$TRIGGER" | jq -e '.cid'
# 8. Capture outbox length so we can detect new entries.
BEFORE=$(curl -s "$BASE/actors/next/outbox?page=true" \
| jq -r '.orderedItems | length')
# 9. Publish a Note tagged "smoketest" — should match subscription, fire trigger,
# cause TestEcho to be published.
NOTE=$(cat <<'JSON'
{
"type": "Create",
"object": {
"type": "Note",
"content": "hello reactive world",
"tags": ["smoketest"]
}
}
JSON
)
NOTE_CID=$(curl -s -X POST "$BASE/activity" \
-H "Authorization: Bearer $TOKEN" -d "$NOTE" | jq -r '.cid')
# 10. Wait for projection + trigger.
sleep 2
# 11. Verify topic-events projection captured the Note.
curl -s "$BASE/projections/topic-events" \
| jq -e ". | to_entries | length == 1"
# 12. Verify outbox grew by exactly TWO activities (the Note + the trigger's TestEcho).
AFTER=$(curl -s "$BASE/actors/next/outbox?page=true" \
| jq -r '.orderedItems | length')
[[ $((AFTER - BEFORE)) == 2 ]] || { echo "expected +2 activities, got $((AFTER - BEFORE))"; exit 1; }
# 13. Verify the latest activity is a TestEcho referencing the original Note's CID.
curl -s "$BASE/actors/next/outbox?page=true" \
| jq -e ".orderedItems[0] | .type == \"TestEcho\" and .object.echoes == \"$NOTE_CID\""
# 14. Negative case: publish a Note WITHOUT the "smoketest" tag — must NOT
# trigger, must NOT echo.
BEFORE2=$(curl -s "$BASE/actors/next/outbox?page=true" | jq -r '.orderedItems | length')
NOTE_OTHER=$(cat <<'JSON'
{"type": "Create", "object": {"type": "Note", "content": "no match", "tags": ["other"]}}
JSON
)
curl -s -X POST "$BASE/activity" \
-H "Authorization: Bearer $TOKEN" -d "$NOTE_OTHER" | jq -e '.cid'
sleep 2
AFTER2=$(curl -s "$BASE/actors/next/outbox?page=true" | jq -r '.orderedItems | length')
[[ $((AFTER2 - BEFORE2)) == 1 ]] || { echo "expected +1 activity (no echo), got $((AFTER2 - BEFORE2))"; exit 1; }
# 15. Cascade limit check: prove the trigger doesn't recursively echo TestEcho.
# The TestEcho activity itself should NOT match the Topic{smoketest}
# subscription (it's not a Note), so no cascade, but verify cascade-depth
# was set to 1 on the echo so a future trigger on TestEcho would refuse.
LATEST_ECHO=$(curl -s "$BASE/actors/next/outbox?page=true" \
| jq -r '.orderedItems | map(select(.type == "TestEcho")) | .[0]')
echo "$LATEST_ECHO" | jq -e '."cascade-depth" == 1'
# 16. Restart kernel; verify subscription, trigger, projection all survive.
./next/scripts/stop.sh
./next/scripts/start.sh
sleep 2
curl -s "$BASE/projections/subscriptions" \
| jq -e '.["https://next.rose-ash.com/actors/next"] | map(select(.type == "Topic")) | length == 1'
curl -s "$BASE/projections/topic-events" | jq -e ". | to_entries | length >= 1"
curl -s "$BASE/define-registry?kind=triggers" \
| jq -e 'map(select(.name == "echo-on-smoketest")) | length == 1'
echo "✓ Reactive application smoke test passed — Subscribe + Trigger + Projection demonstrated end-to-end"
```
**What this proves (and what it doesn't):**
Proves:
- `DefineSubscription` + `Subscribe` mechanism works end-to-end.
- Subscription's `match-fn` evaluates correctly in pure mode against inbound
activities.
- `DefineTrigger` fires on subscription matches.
- Trigger's `then-sx` can publish derived activities (the `:publish` result).
- Cascade-depth metadata propagates correctly.
- Subscription state, trigger registration, and projection state all survive
kernel restart (snapshot + log replay).
- The full reactive application loop works without any kernel code changes
between defining the components and exercising them.
Does NOT prove (deferred to milestone 2+):
- Cross-instance subscriptions (federation).
- Trigger `:effect` results calling effectful primitives.
- `DefineApplication` bundle install/update/fork.
- Per-application namespace isolation.
- Cascade prevention against malicious cascading from peer instances.
**Acceptance for 9b:** smoke test exits 0. Like 9a, **zero fed-sx kernel code
changes** between defining the application components and observing them
operate.
---
## Acceptance criteria for milestone 1
All of:
1. **Each step's test suite passes** (`bash next/tests/<step>.sh`).
2. **Both smoke tests pass** (`bash next/tests/smoke_pin.sh` and
`bash next/tests/smoke_app.sh`).
3. **Erlang-on-SX baseline preserved** — adding fed-sx kernel modules in
`next/kernel/*.erl` doesn't break Phase 1-8 conformance.
4. **Restart durability** — kill the kernel mid-write, restart, projections
resume from snapshot, no log corruption.
5. **Manual Mastodon poke** — point a Mastodon account at
`https://next.rose-ash.com/actors/next` and verify the actor doc fetches and
webfinger discovery works (read-only AP interop, no follow).
## What lands when
This is the work-order an agent (or human) follows. Steps 1-3 can be done in
parallel after the Erlang Phase 8 BIFs land. Steps 4-7 are sequential. Step 8
can start in parallel with step 7. Step 9 is the integration test.
```
Phase 7+8 (loops/erlang) ───┐
┌─── Step 1 ──┬─── Step 2 ──┬─── Step 3
│ │ │
└─────────────┼─── Step 4 ──┴────┐
│ │
└─── Step 5 ───────┤
Step 6 ─────┤
Step 7 ─────┤
Step 8 ─────┤
Step 9 ─────┘
```
Estimated effort if done by a focused agent loop, one feature per iteration:
~30-50 commits across all 9 steps. Could plausibly be a `loops/fed-sx` workstream
once Phase 7+8 are done.
## What's deferred to milestone 2
- **Federation** (the second-biggest piece). `POST /inbox`, Follow lifecycle,
delivery queue, backfill, capability negotiation between peers. Whole of
design §13.
- **Multi-actor** with per-user OAuth and capability tokens. Design §9.5.
- **IPFS storage backend** as a `DefineStorage` entry. Design §15.3.
- **Browser client + operator dashboard** (probably in Elm-on-SX or similar).
- **Rich verbs**: `Endorse`, `Supersede`, `Test`, `Build`, `Compose`, `Note`,
`Announce`. All defined as `DefineActivity` artifacts, federated.
- **Cross-host conformance** — Python/JS/Haskell hosts running fed-sx. Design
§11.8.
- **OpenTimestamps proofs** as a `DefineProof` entry.
- **Performance work** — JIT-compiled folds, snapshot acceleration, federation
batching.
Milestone 2 unlocks "real federation between two fed-sx instances." Milestone 3
is the rose-ash port (blog, market, events, federation, account, orders) as
fed-sx applications.
---
## Appendix A: open questions for milestone 1
A few things still under-specified; resolve as work begins.
1. **HTTP server library.** Does the Phase 8 `http:listen/2` BIF wrap an
existing OCaml HTTP server (the sx.rose-ash.com one) or something simpler?
Implementation choice deferred to Phase 8.
2. **JSON-LD library.** AP wire format requires JSON-LD canonicalization for
signature coverage. Either pull a library or write a minimal subset for the
shapes we actually use. Probably the latter — our envelope is well-defined.
3. **Bearer token rotation.** v1 uses a single env-var token. Token rotation
without restart needs registry-style mgmt; can wait.
4. **Snapshot rate limits.** Default in design is "every 1000 activities or
60 seconds." Tunable per-projection later; v1 uses the default.
5. **Genesis bundle format.** Dag-cbor map per §12.2; concrete schema needs
one round of refinement once we author the actual definitions in step 4.

View File

@@ -1,24 +1,64 @@
# Go-on-SX: Go on the CEK/VM
# Go-on-SX Go as an SX guest language
Compile Go source to SX AST; the existing CEK evaluator runs it. The unique angle: Go's
goroutines and channels map cleanly onto SX's IO suspension machinery (`perform`/`cek-resume`)
— a goroutine is a `cek-step-loop` running in a cooperative scheduler, a channel send/receive
is a `perform` that suspends until the other end is ready.
Port Go to SX as the **first static-typed, bidirectional-checked guest** in
the rose-ash language family. Goal isn't a production Go compiler; it's to
prove the substrate from a paradigm angle the existing eleven guests don't
cover, and to chisel out the lib/guest kits that statically-typed guests N+1
and N+2 will need.
End-state goal: **core Go programs running**, including goroutines, channels, defer/panic/recover,
interfaces, and structs. Not a full Go compiler — no generics, no CGo, no full stdlib — but
a faithful runtime for idiomatic Go concurrent programs.
Reference:
- `plans/lib-guest.md` — parent, chiselling discipline, two-language rule.
- `plans/lib-guest-scheduler.md` — sister kit; Go's scheduler pairs with
Erlang's. Extraction gated on this loop reaching Phase 5.
- `plans/lib-guest-static-types-bidirectional.md` — sister kit; Go's
checker pairs with a TBD second consumer. Extraction gated on this loop
reaching Phase 3.
- `plans/erlang-on-sx.md` — reference implementation for paradigm-port:
process model, BIF registry, hot reload, VM bytecode opcodes.
## Ground rules
**Branch:** `loops/go` (loop-style workstream once kicked off). SX files via
`sx-tree` MCP only.
- **Scope:** only touch `lib/go/**` and `plans/go-on-sx.md`. Do **not** edit `spec/`,
`hosts/`, `shared/`, or other `lib/<lang>/`.
- **Shared-file issues** go under "Blockers" below with a minimal repro; do not fix here.
- **SX files:** use `sx-tree` MCP tools only.
- **Architecture:** Go source → Go AST → SX AST. No standalone Go evaluator.
- **Concurrency model:** cooperative, not preemptive. Goroutines yield at channel ops and
`time.Sleep`. A round-robin scheduler in SX drives them.
- **Commits:** one feature per commit. Keep `## Progress log` updated and tick boxes.
## Thesis — why Go
Eleven guests already live in `lib/`: apl, common-lisp, datalog, erlang,
forth, haskell, hyperscript, js, kernel, lua, minikanren, ocaml, prolog,
ruby, scheme, smalltalk, tcl. Every one is either **dynamically typed**
(most) or **HM-inferred** (haskell, ocaml). None exercise:
1. **Bidirectional static type checking** — annotation-driven, locally-
inferred, the dominant paradigm of modern statically-typed languages.
2. **Anonymous-channel concurrency** — Go's `chan` and `select`. Erlang has
addressed processes + mailboxes; Go has anonymous values + structural
pairing. Two different vocabularies for the same underlying scheduler
machinery.
3. **Structural interfaces**`io.Reader` is "anything with this method
signature", not a declared subtype relationship. Different from Haskell
typeclasses (nominal), different from Lua duck typing (no declaration).
These three together make Go an unusually high-value port for proving SX.
If SX can host Go cleanly, it can host the next decade of mainstream
statically-typed languages (Rust, TS, Swift, Kotlin, Scala 3, Hack) because
they share these three properties.
Like Erlang-on-SX validated the actor model on the substrate, Go-on-SX
validates the goroutine model + bidirectional types.
## Non-goals (deliberate)
Out of scope. Reject feature requests for these without further consideration:
- **`unsafe` package.** Memory mucking. Skip entirely.
- **CGo.** C interop. Out of scope at every level.
- **Full `reflect`.** Provide enough for `fmt.Println` to render values;
reject the rest.
- **Build tags, modules, vendoring.** Treat source as monolithic. One
package per file, no real import resolution.
- **Production performance.** Conformance tests pass; benchmarks don't.
- **Garbage collection tuning.** SX's GC is what you get.
- **Race detector, escape analysis, inlining.** Out of scope.
- **`os`, `net/http`, full stdlib.** Provide a deliberately small slice
(Phase 8 below).
## Architecture sketch
@@ -26,113 +66,335 @@ a faithful runtime for idiomatic Go concurrent programs.
Go source text
lib/go/tokenizer.sx — Go tokens: keywords, idents, string/rune/number literals,
operators, semicolon insertion rules
lib/go/lex.sx — tokens; ASI; literals; operators
(consumes lib/guest/core/lex.sx)
lib/go/parser.sx — Go AST: package, import, var, const, type, func, struct,
interface, goroutine, channel ops, defer, select, for range
lib/go/parse.sx AST: package/import/var/const/type/func/struct/
│ interface; expressions; statements
│ (consumes lib/guest/core/pratt.sx + ast.sx)
lib/go/transpile.sx — Go AST → SX AST
lib/go/types.sx bidirectional type checker. Synth + check judgments;
structural interface satisfaction; pluggable subtype
│ (INDEPENDENT — no lib/guest/static-types-bidirectional
│ yet; this loop builds the first consumer)
lib/go/runtime.sx — goroutine scheduler, channel primitives, defer stack,
panic/recover, interface dispatch, slice/map ops
lib/go/eval.sx — tree-walk evaluator on CEK. Variables as mutable cells;
slices = (length, capacity, backing-vector); maps =
│ SX dict; defer stack per frame.
CEK / VM
lib/go/sched.sx — goroutine scheduler + channels + select
│ (INDEPENDENT — no lib/guest/scheduler yet; this loop
│ builds the first consumer)
lib/go/std/ — minimal stdlib slice (fmt, strings, strconv, sync,
time, errors)
```
Key semantic mappings:
- `go fn()`spawn new coroutine (SX coroutine primitive, Phase 4 of primitives)
- `ch <- v` (send) → `perform` that suspends until receiver ready; scheduler picks next goroutine
- `v := <-ch` (receive) → `perform` that suspends until sender ready
- `select { case ... }`scheduler checks all channel readiness, picks first ready
- `defer fn()` → push onto a per-goroutine defer stack; run on return/panic
- `panic(v)``raise` the value; `recover()` catches it in deferred function
- `interface{}` → any SX value (duck typed)
- `struct { ... }` → SX hash table with field names as keys
- `slice` → SX vector with length + capacity metadata
- `map[K]V` → SX mutable hash table (Phase 10 of primitives)
Semantic mappings (operational):
- `go fn(args)``task-spawn` on the local scheduler.
- `ch <- v` `task-block` with predicate "receiver waiting on ch".
- `v := <-ch` `task-block` with predicate "sender waiting on ch".
- `select { case ... }``task-block` with predicate "any case ready".
- `defer fn()` → push thunk onto per-frame defer stack; runs LIFO on
return or panic.
- `panic(v)` → raise SX exception; deferred fns run while unwinding.
- `recover()` → CEK exception capture inside a deferred fn.
- `interface{T}` → type-check matches structurally against T's method
set; at runtime, the value carries its concrete-type metadata.
- `struct{...}` → SX dict + type tag; methods are functions in the type's
method table.
- `*T` (pointer) → mutable cell (Common Lisp port did the same).
- `[]T` (slice) → triple (length, capacity, backing-vector).
- `map[K]V` → SX dict; iteration order spec-undefined (v1 = sorted for
determinism — programs that depend on indeterminism fail loudly, which
is a feature not a bug).
## Roadmap
## Conformance scoreboard
### Phase 1 — tokenizer + parser
- [ ] Tokenizer: keywords (`package`, `import`, `func`, `var`, `const`, `type`, `struct`,
`interface`, `go`, `chan`, `select`, `defer`, `return`, `if`, `else`, `for`, `range`,
`switch`, `case`, `default`, `break`, `continue`, `goto`, `fallthrough`, `map`,
`make`, `new`, `nil`, `true`, `false`), automatic semicolon insertion, string literals
(interpreted + raw `` `...` ``), rune literals `'a'`, number literals (int, float, hex,
octal, binary, complex), operators, slices `[:]`
- [ ] Parser: package clause, imports, top-level `func`/`var`/`const`/`type`; function
bodies: short variable decl `:=`, assignments, `if`/`else`, `for`/`range`, `switch`,
`return`, struct literals, slice literals, map literals, composite literals, type
assertions `v.(T)`, method calls `v.Method(args)`, goroutine `go`, channel ops
`<-ch`, `ch <- v`, `defer`, `select`
- [ ] Tests in `lib/go/tests/parse.sx`
Following `lib/erlang/scoreboard.json` precedent. Add
`lib/go/scoreboard.json` on first iteration; populate as suites land.
Suites planned:
### Phase 2 — transpile: basic Go (no goroutines)
- [ ] `go-eval-ast` entry
- [ ] Arithmetic, string ops, comparison, boolean
- [ ] Variables, short decl, assignment, multiple assignment
- [ ] `if`/`else if`/`else`
- [ ] `for` (C-style), `for range` over slice/map/string
- [ ] Functions: named + anonymous, multiple return values (SX multiple values, Phase 8)
- [ ] Structs → SX hash tables; field access `.field`; struct literals `T{f: v}`
- [ ] Slices → SX vectors; `len`, `cap`, `append`, `copy`, slice expressions `s[a:b]`
- [ ] Maps → SX hash tables; `make(map[K]V)`, `m[k]`, `m[k] = v`, `delete(m, k)`,
comma-ok `v, ok := m[k]`
- [ ] Pointers — modelled as single-element mutable vectors; `&x` creates wrapper, `*p` dereferences
- [ ] `fmt.Println`/`fmt.Printf`/`fmt.Sprintf` → SX IO perform (print)
- [ ] 40+ eval tests in `lib/go/tests/eval.sx`
| Suite | Tests target | What it covers |
|---|---|---|
| `lex` | 50+ | Keywords, operators, literals, ASI |
| `parse` | 80+ | All statement & expression shapes |
| `types` | 90+ | Synth, check, interface satisfaction, generics |
| `eval` | 100+ | Tree-walk over typed AST |
| `runtime` | 60+ | Goroutines, channels, select, close |
| `stdlib` | 40+ | fmt, strings, strconv, sync, time, errors |
| `e2e` | 10+ | Complete representative programs |
### Phase 3 — defer / panic / recover
- [ ] Defer stack per function frame — SX list of thunks, run LIFO on return
- [ ] `defer` statement pushes thunk; transpiler wraps function body in try/finally equivalent
- [ ] `panic(v)` → `raise` with Go panic wrapper
- [ ] `recover()` → catches panic value inside a deferred function; returns nil otherwise
- [ ] Panic propagation across call stack until recovered or fatal
- [ ] Tests: defer ordering, panic/recover, panic in goroutine without recover
## Phasing — one feature per commit
### Phase 4 — goroutines + channels
- [ ] Coroutine-based goroutine type using SX coroutine primitive (Phase 4 of primitives)
- [ ] Round-robin scheduler in `lib/go/runtime.sx`: maintains run queue, steps each
goroutine one turn at a time, suspends at channel ops
- [ ] Unbuffered channels: `make(chan T)` → rendezvous point; send suspends until receive
and vice versa. Implemented as a pair of waiting queues + `cek-resume`.
- [ ] Buffered channels: `make(chan T, n)` → circular buffer; send only blocks when full,
receive only blocks when empty
- [ ] `close(ch)` — mark channel closed; receivers drain then get zero value + `false`
- [ ] `select` — scheduler inspects all cases, picks a ready one (random if multiple),
blocks if none ready until at least one becomes ready
- [ ] `go fn(args)` — spawns new goroutine on run queue
- [ ] `time.Sleep(d)` — yields current goroutine, re-queues after d milliseconds
(simulated with IO perform timer)
- [ ] Tests: ping-pong, fan-out, fan-in, select with default, range over channel
Loop-style. Each phase: implement → test → commit → tick `[ ]` → append
Progress-log line → push `origin/loops/go`.
### Phase 5interfaces
- [ ] Interface type → SX dict `{:type "T" :methods {...}}` dispatch table
- [ ] `interface{}` / `any` → any SX value (already implicit)
- [ ] Type assertion `v.(T)` → check `:type` field, panic if mismatch
- [ ] Type switch `switch v.(type) { case T: ... }` → dispatches on `:type`
- [ ] Method sets — structs implement interfaces implicitly if they have the right methods
- [ ] Value vs pointer receivers — pointer receiver gets the mutable vector wrapper
- [ ] Built-in interfaces: `error` (`Error() string`), `Stringer` (`String() string`)
- [ ] Tests: interface satisfaction, type assertion, type switch, error interface
### Phase 1Tokenizer (`lib/go/lex.sx`) ⬜
- [x] Scaffold + scoreboard + conformance runner (consumes lib/guest/lex.sx)
- [x] Identifiers + 25 keywords
- [x] Decimal integer literals
- [x] Interpreted string literals `"..."` with `\n \t \r \\ \" \'` escapes
- [x] Rune literals `'x'` (single char + simple escapes)
- [x] Line + block comments (block w/ newline triggers ASI)
- [x] Common operator/punct set incl. `:= <- ++ -- == != <= >= && || ...`
- [x] **Automatic semicolon insertion** (Go spec § Semicolons) — newline,
EOF, and block-comment-with-newline trigger `;` after
ident/int/string/rune/{break,continue,fallthrough,return}/{++,--,),],}}.
- [ ] Float / imaginary literals
- [ ] Raw string literals `` `...` ``
- [ ] Hex/octal/binary integer literals (0x… 0o… 0b…) + underscores
- [ ] Full operator set audit (47 distinct per Go spec)
- **Acceptance:** lex/ suite at 50+ tests. Current: 78/78.
### Phase 6standard library subset
- [ ] `fmt` — `Println`, `Printf`, `Sprintf`, `Fprintf`, `Errorf`, `Stringer` dispatch
- [ ] `strings` — `Contains`, `HasPrefix`, `HasSuffix`, `Split`, `Join`, `TrimSpace`,
`ToUpper`, `ToLower`, `Replace`, `Index`, `Count`, `Repeat`
- [ ] `strconv` — `Itoa`, `Atoi`, `FormatFloat`, `ParseFloat`, `ParseInt`, `FormatInt`
- [ ] `math` — full surface via SX math primitives (Phase 15)
- [ ] `sort` — `sort.Slice`, `sort.Ints`, `sort.Strings`
- [ ] `errors` — `errors.New`, `errors.Is`, `errors.As`
- [ ] `sync` — `sync.Mutex` (cooperative — just a boolean flag + goroutine queue),
`sync.WaitGroup`, `sync.Once`
- [ ] `io` — `io.Reader`/`io.Writer` interfaces; `io.ReadAll`; `strings.NewReader`
### Phase 2Parser (`lib/go/parse.sx`) ⬜
- Consume `lib/guest/core/pratt.sx` + `lib/guest/core/ast.sx`. Chisel notes
`consumes-pratt consumes-ast`.
- Grammar coverage:
- Declarations: `package`, `import`, `var`, `const`, `type`, `func`
- Types: basic, slice `[]T`, array `[N]T`, map `map[K]V`, chan `chan T`,
func `func(...)...`, struct, interface, pointer `*T`
- Expressions: literals, identifier, call, index `[]`, slice `[a:b]`,
type assertion `v.(T)`, operators
- Statements: `if`/`else`, `for` (C-style + range), `switch`, `select`,
`return`, `defer`, `go`, `break`/`continue`, assign, short-decl `:=`,
send `ch <- v`, recv `<-ch`
- Output: SX-shaped AST per `lib/guest/core/ast.sx` conventions.
- Tests: round-trip parse of hello world, fibonacci, FizzBuzz, goroutine
ping-pong, struct + method.
- **Acceptance:** parse/ suite at 80+ tests.
### Phase 7full conformance target
- [ ] Vendor a Go test suite or hand-build 100+ program tests in `lib/go/tests/programs/`
- [ ] Drive scoreboard
### Phase 3Bidirectional type checker, MVP (`lib/go/types.sx`) ⬜
- **Independent implementation.** Do NOT use lib/guest/static-types-
bidirectional/ — that kit doesn't exist yet and depends on this work
for its design. See `plans/lib-guest-static-types-bidirectional.md`.
- Synth + check judgments. Context as a value (per-block scope).
- Coverage MVP: declared-type variables, function signatures (params +
returns), call type-checking, simple composite types (slice, map, chan
element), interface satisfaction (structural match against method sets),
short variable declaration `:=` (synth from RHS).
- **Untyped constants.** `42` has type `untyped int` until contextualised;
this is the canonical pitfall (see Gotchas below).
- Defer: generics (Phase 7), full conversion rules.
- Tests: positive (type-correct programs check) + negative (mismatched
types fail with informative errors carrying AST paths).
- **Acceptance:** types/ suite at 60+ tests. Chisel note `shapes-static-
types-bidirectional` — append a paragraph to the sister plan's design
diary describing what synth/check shape emerged.
### Phase 4 — Tree-walk evaluator (`lib/go/eval.sx`) ⬜
- AST-walking interpreter over CEK. Each Go statement maps to one step
function (precedent: `step-sf-if` etc. in spec/evaluator.sx).
- Variables: mutable cells. Pointer semantics: `&x` returns the cell,
`*p` dereferences.
- Slices: triple (length, capacity, backing-vector). `append` honours
capacity-grow per spec.
- Maps: SX dict + key-type metadata.
- Structs: SX dict + type tag. Methods looked up via type's method table.
- Functions: closures over enclosing scope; multiple return values.
- Channels: stub (Phase 5 wires them).
- Tests: arithmetic, control flow, recursion, closures, slices, maps,
structs, methods, pointer semantics, multiple-return.
- **Acceptance:** eval/ suite at 80+ tests. No concurrency yet.
### Phase 5 — Goroutines + channels + select (`lib/go/sched.sx`) ⬜
- **Independent implementation.** Do NOT use lib/guest/scheduler/ — that
kit doesn't exist yet and depends on this work for its design. See
`plans/lib-guest-scheduler.md`.
- `go expr` — spawn a goroutine; returns nothing.
- `chan T` — `make(chan T)` creates an unbuffered channel; `make(chan T,n)`
creates a buffered channel (Phase 5b — defer buffer to a sub-phase).
- `<-ch` — receive (blocks until sender ready).
- `ch <- v` — send (blocks until receiver ready for unbuffered, or buffer
has room for buffered).
- `select { case ... }` — non-deterministic multiplexing; `default` makes
it non-blocking.
- `close(ch)` — closes channel. Receive on closed → zero value + ok=false.
- Tests: ping-pong, fan-out/fan-in, work queue, select with default,
select with timeout (via a `time.After`-like stub), close semantics,
range over channel.
- **Acceptance:** runtime/ suite at 40+ tests. Chisel note `shapes-
scheduler` — append a paragraph to the sister plan's design diary
describing what task-spawn/block/wake/yield shape emerged.
### Phase 5b — Buffered channels + select fairness ⬜
- Buffered: send blocks only when buffer full; recv only when empty.
- `select` random case ordering (spec mandates pseudo-random; v1 uses a
fixed seed for determinism with a `runtime`-package knob to randomise).
- Tests: buffer-full blocking, buffer-empty blocking, select fairness
over many iterations.
- **Acceptance:** runtime/ +20 tests.
### Phase 6 — `defer` + panic/recover ⬜
- Defer stack per function frame; runs LIFO on return (normal or panic).
- `panic(v)` unwinds frames running deferreds; `recover()` inside a
deferred fn captures the panic value and stops unwinding.
- Goroutine panic propagation: a panicking goroutine that doesn't recover
crashes the whole program (honour Go spec, or document divergence).
- Tests: defer order (LIFO), defer + named-return mutation, panic/recover,
panic across goroutines, defer in a loop (push per iter, run on fn
return — common bug).
- **Acceptance:** eval/ +20 tests.
### Phase 7 — Generics (Go 1.18+) ⬜
- Type parameters with constraints (type sets: `interface{ int | float64
}`, `comparable`, `any`).
- Type inference at call sites — basic; the full Go inference algorithm
is notoriously complex. Implement enough for common cases; document
limitations in a Blockers section below.
- Tests: generic function (`func Map[T, U any](xs []T, f func(T) U) []U`),
generic data structure (linked list), constrained type param.
- **Acceptance:** types/ +30 tests.
### Phase 8 — Minimal stdlib (`lib/go/std/`) ⬜
- Implement just what's needed for representative programs:
- `fmt` — `Println`, `Printf`, `Sprintf`, `Fprintf`, `Errorf`,
`Stringer` dispatch. Verbs: `%d %s %v %t %f %T %+v`.
- `strings` — `Contains`, `HasPrefix`, `HasSuffix`, `Split`, `Join`,
`TrimSpace`, `ToUpper`, `ToLower`, `Replace`, `Index`, `Count`,
`Repeat`, `NewReader`.
- `strconv` — `Itoa`, `Atoi`, `FormatFloat`, `ParseFloat`, `ParseInt`,
`FormatInt`.
- `errors` — `New`, `Is`, `As`, `Unwrap`.
- `sync` — `Mutex` (cooperative — flag + waiter queue), `WaitGroup`,
`Once`, `RWMutex`.
- `time` — `Now`, `Since`, `After` (channel-returning timer), `Sleep`,
`Duration`, `Time`.
- `io` — `Reader`/`Writer` interfaces; `ReadAll`; `Copy`.
- `sort` — `Slice`, `Ints`, `Strings`.
- Tests: round-trip Itoa/Atoi, fmt verb coverage, sync.WaitGroup with
goroutines, time.After in a select, sort.Slice with custom less fn.
- **Acceptance:** stdlib/ suite at 40+ tests.
### Phase 9 — End-to-end programs ⬜
- Complete programs from canonical sources (gopl.io, "concurrency
patterns" talk examples) running end-to-end:
- Concurrent prime sieve
- HTTP-ish ping-pong over stubbed transport
- Word frequency counter
- Pipeline (channel chain)
- Producer/consumer with sync.WaitGroup
- "Bounded parallelism" pattern (worker pool over a job channel)
- **Acceptance:** e2e/ suite at 10+ tests, all passing.
### Phase 10 — lib/guest extraction enabler ⬜
- Now that Go has lex+parse+types+eval+sched, sister plans are unblocked
on the Go side. This phase is **doc-only** in `loops/go`:
- Cross-reference `plans/lib-guest-scheduler.md` — mark its Phase 1
(Go scheduler independent) as complete from Go's side.
- Cross-reference `plans/lib-guest-static-types-bidirectional.md` —
mark its Phase 1 as complete from Go's side.
- Update the chiselling diary in each sister plan with the actual
Go-side surface that emerged.
- **Acceptance:** sister plans cross-referenced + diaries updated. No
new Go code.
### Phase 11 — VM bytecode opcodes (deferred, optional) ⬜
- Following Erlang-on-SX Phase 10 precedent: identify hot paths in the
tree-walk evaluator, define Go-specific bytecode opcodes, compile hot
fns through them. Substantial work; only justified if Go programs
exercise enough volume that performance starts mattering.
- **Acceptance:** TBD on demand.
## Ground rules (loop-style)
- **Scope:** only `lib/go/**` and this plan. Do not touch `spec/`,
`hosts/`, `shared/`, `lib/guest/**` (read-only consumer at this phase),
or other `lib/<lang>/`.
- **Consume `lib/guest/core/`** for lex/parse/ast/match/layout. Hand-
rolling defeats the chiselling goal.
- **Do NOT extract into `lib/guest/scheduler/` or `lib/guest/static-
types-bidirectional/` from this loop.** Those extractions are gated on
two consumers AND the discipline of writing each consumer
independently. Extraction is its own workstream after Go and the
second consumer both exist.
- **Substrate gaps** → Blockers entry with minimal repro. Don't fix the
substrate from this loop. Belongs to `sx-improvements.md`.
- **NEVER call `sx_build` without timeout awareness** — 600s watchdog.
- **SX files:** `sx-tree` MCP tools ONLY. `sx_validate` after every edit.
- **Worktree:** branch `loops/go`, push `origin/loops/go`. Never `main`,
never `architecture`.
- **Commit granularity:** one feature per commit. Short factual messages:
`go: parse short-decl + 6 tests [consumes-pratt]`. Chisel note at end
in brackets.
- **Plan file:** update Progress log + tick boxes every commit.
- **If blocked** for two iterations on the same issue, add to Blockers
and move on. Phases 1-4 are sequential; Phases 5-8 are largely
independent once 4 lands.
## Chisel discipline (per parent lib-guest plan)
Every commit ends its message with a chisel note in brackets:
- `[consumes-X]` — used `lib/guest/X` kit.
- `[shapes-scheduler]` / `[shapes-static-types-bidirectional]` — revealed
something about what the sister lib-guest kits should look like. Add a
paragraph to the relevant sister plan's design diary.
- `[proposes-Y]` — revealed a gap in another existing kit. Blockers entry
in the kit's plan.
- `[nothing]` — pure Go work that didn't touch substrate or lib/guest
story. Acceptable; if it shows up twice in a row, stop and reflect.
## Go-specific gotchas
- **ASI (automatic semicolon insertion).** Newline becomes `;` after
identifier/literal/`)`/`]`/`}`. Build into the tokenizer; the Go spec's
"Semicolons" section is unusually precise — follow it literally.
- **Untyped constants.** `42` has type `untyped int` until used in a
context that forces a type. The canonical example: `var x float64 = 42
/ 7` — must compute as `untyped int / untyped int = 6` then convert to
`float64 = 6.0`. Wrong: float-coercing eagerly gives 6.0 prematurely.
Wrong: integer-truncating after coercion gives `5.something`. Test it.
- **Methods vs functions.** `func (r Receiver) Method()` is a method
bound to a type; `func Function(r Receiver)` is just a function.
Methods on pointer-receivers vs value-receivers have asymmetric
satisfaction in interfaces — pointer-receiver methods are NOT in the
value's method set for interface satisfaction.
- **Interface satisfaction is structural and silent.** Type satisfies an
interface if its method set contains all the interface's methods.
Lazy check: at every point a value flows into an interface-typed slot.
- **Channels are first-class values.** Pass them, store them, send them
through other channels. Each channel has identity.
- **`select` with `default`** = non-blocking. Without `default`, blocks
until a case is ready.
- **`nil` is typed.** `var x *int` makes x a `(*int)(nil)`. Comparison
`x == nil` works on typed nil; but `var i interface{} = (*int)(nil); i
== nil` is `false` — i holds a typed-nil-of-type-`*int`, not untyped
nil. The classic Go footgun. Test it.
- **Goroutine panic propagation.** A panicking goroutine that doesn't
recover crashes the whole program. Implement faithfully or document
divergence.
- **`defer` in a loop.** Each iteration pushes; they all run on function
return. Common bug; tests should cover.
- **Iteration order of maps.** Spec: unspecified. v1 = sorted by SX-
canonical key order for determinism; document that programs depending
on iteration order are not Go-conformant. Add a `runtime`-package knob
to enable randomisation later.
## Style
- No comments in `.sx` unless non-obvious. Cite Go spec sections inline
for non-obvious decisions (Go's spec is rigorous; citations work).
- No new planning docs — update this plan inline.
- One feature per iteration. Commit. Log. Push. Next.
## Open questions
1. **Module/import model.** Go has packages and import paths. Probably
model "package" as one or more `.sx` files in a directory, no real
import resolution against a remote module graph. Decide in Phase 2.
2. **Goroutine identity.** Spec says goroutines have no identity; the
scheduler does internally. Expose to user code? No (not Go). Expose
for debugging? Yes via a `runtime`-package stub.
3. **Error handling: panic-as-exception vs explicit error returns.** Go
strongly prefers explicit errors. Stdlib stubs follow that: `strconv.
Atoi("x")` returns `(0, err)`, not panic.
4. **Memory model.** Go has a happens-before model for atomics + channel
ops. SX runtime is single-threaded under the scheduler — every channel
op is a synchronization point automatically. Don't model relaxed
memory; document the simplification.
5. **Iteration order of maps.** Already addressed in Gotchas; flagged
here as a known divergence from spec.
## Blockers
@@ -140,6 +402,16 @@ _(none yet)_
## Progress log
_Newest first._
_Newest first. Append one dated entry per commit._
_(awaiting phase 1)_
- 2026-05-26 — Phase 1 first slice: `lib/go/lex.sx` tokenizer consuming
`lib/guest/lex.sx` predicates. 25 keywords, ident/int/string/rune lits,
line+block comments, common operators, automatic semicolon insertion per
Go spec § Semicolons (newline / EOF / block-comment-with-newline triggers).
Scoreboard + conformance.sh wired. 78/78 tests. `[consumes-lex]`.
- 2026-05-26 — Plan rewritten to integrate the lib/guest framework
(chiselling discipline, sister plans for scheduler + bidirectional
types, type-checker phase added, conformance scoreboard model adopted).
Original 2026-04-26 draft preserved in git history. Loop not yet
kicked off; Phase 1 (tokenizer) is the first iteration when this loop
spins up.

View File

@@ -0,0 +1,235 @@
# lib/guest/scheduler — extraction plan
Two distinct concurrency models — Erlang's addressed processes + mailboxes, and
Go's anonymous channels + goroutines — sit on the same underlying machinery:
a fork/yield/block/resume scheduler over CEK io-suspended continuations. This
plan captures that machinery as `lib/guest/scheduler/` so language N+1 with a
new concurrency model costs ~200 lines of model-specific code instead of
re-inventing the scheduler.
Reference: `plans/lib-guest.md` (parent — two-language rule, stratification),
`plans/erlang-on-sx.md` (first consumer, in production), Go-on-SX (second
consumer, see `plans/go-on-sx.md` once that lands).
**Branch:** `architecture`. SX files via `sx-tree` MCP only.
## Thesis
The substrate already provides what a scheduler needs: CEK io-suspension
(`make-cek-suspended`, `cek-resume`) gives suspendable execution; first-class
environments give each unit of execution its own scope; the trampolined
evaluator means we never blow the host stack. What every guest with concurrency
*re-implements* on top of this is the **fork/yield/block/resume protocol**
the bookkeeping that decides which suspended computation runs next.
Two concrete consumers, two different concurrency vocabularies, sharing one
underlying scheduler, is the proof. If only Erlang lives on it, "scheduler kit"
is a euphemism for "Erlang scheduler with a Go skin." The two-language rule
is the gate.
## Current state (2026-05-26)
- **Erlang-on-SX** has the full pattern in production: 729/729 conformance,
spawn/send/receive, selective receive, monitor/link, hot reload. The
scheduler logic is currently coupled to Erlang-shaped concepts (PIDs,
mailboxes, links) — extraction-blocking but not extraction-defeating.
- **Go-on-SX** does not exist yet. `plans/go-on-sx.md` is the umbrella plan
(TBD); this scheduler plan is a sibling/dependency.
- **lib/guest/scheduler/** does not exist. The two-language rule blocks
extraction until Go-on-SX independently implements its scheduler.
**Status: Phase 0 (Erlang shape capture).** No code change in this plan yet.
## Why the two models actually share a kit
The non-obvious claim is that Erlang processes and Go goroutines really do
share machinery beneath their different vocabularies. The mapping:
| Concept | Erlang | Go | Common kit name |
|---|---|---|---|
| Unit of execution | process (PID-addressed) | goroutine (anonymous) | **task** |
| Spawn | `spawn(Fun)` → PID | `go expr` → nothing | `task-spawn` |
| Block target | mailbox match | channel send/recv | `task-block` |
| Wake condition | message arrives | counterpart ready | `task-resume` predicate |
| Yield | `receive` with no match | channel blocked | scheduler hands off |
| Termination | exit reason → linked tasks | panic / return | task lifecycle |
| Selection | selective `receive` | `select` statement | both = "wait for any of N predicates" |
What the kit owns:
- The **task table** (token → suspended CEK continuation + status).
- The **runnable queue** + scheduling policy (round-robin v1; pluggable).
- The **block→resume protocol**: a blocked task registers a predicate; when
any task changes state, blocked tasks are re-polled; first whose predicate
fires becomes runnable.
- The **fairness/preemption budget** — gas per step before forced yield.
What each language owns:
- The semantics layer on top: Erlang's PID→task map + mailbox per task +
selective-receive predicates; Go's channel value → blocked-task list per
channel + send/recv pairing + select multiplexing.
- The language-visible API (`spawn`/`!`/`receive` vs `go`/`<-`/`select`).
This is exactly the lib/guest pattern: extract the dispatch skeleton, keep
the rules in the language layer.
## API surface (proposed — design only, not yet implemented)
```
(make-scheduler &key gas-per-step ;; default 1000
policy) ;; :round-robin | :fifo
-> scheduler-handle
(task-spawn sched body-thunk) -> task-token
;; body-thunk is a 0-arg fn whose body runs as the task.
;; Returns immediately; task is enqueued runnable.
(task-current sched) -> task-token
;; Inside a task, the token of the running task. Useful for self-reference.
(task-yield sched) -> nil
;; Voluntary yield. Caller is re-enqueued at the tail of runnable.
(task-block sched resume-predicate) -> any
;; Caller suspends. Predicate is (fn () -> resume-value-or-#f).
;; When predicate returns non-#f, caller resumes with that value.
;; Predicate is polled on every scheduler tick when there's nothing
;; obviously runnable. (Optimisation: language layer can wake explicitly —
;; see task-wake.)
(task-wake sched task) -> nil
;; Hint to the scheduler: re-poll this task's resume-predicate now.
;; Used by sender-side when a receiver might unblock.
(task-status sched task) -> :runnable | :blocked | :finished | :crashed
(task-result sched task) -> value | {:error reason}
;; After :finished or :crashed.
(scheduler-step sched) -> :ran | :idle | :all-done
;; Run at most gas-per-step instructions of one task. Caller drives the
;; loop.
(scheduler-run sched) -> nil
;; Run until :all-done. Equivalent to (until (= :all-done (scheduler-step
;; sched))).
```
Notes on the design:
- `task-block` with a resume-predicate is the universal blocking primitive.
Erlang's `receive` is `(task-block sched (fn () (mailbox-match self pat)))`.
Go's `<-ch` is `(task-block sched (fn () (channel-recv-ready ch)))`.
- `task-wake` is the optimisation: instead of polling every blocked task
every step, the language layer wakes the specific task whose predicate
is now likely true. v1 can omit it; performance work later.
- `gas-per-step` gives fairness without true preemption. Tasks that don't
yield within their gas budget are force-yielded by the CEK loop. (CEK
io-suspension already does this for IO; gas budget extends to plain
instructions.)
- No priority/affinity in v1. Both Erlang and Go default to non-priority
scheduling; specialised cases (Erlang's high-priority processes) are
language-layer concerns.
## Build order — phases
This is a long-running plan paced against Go-on-SX. Phases are not loop-style
"one commit per phase" — they're milestone gates.
### Phase 0 — Erlang shape capture (doc-only) ⬜
- Read `lib/erlang/runtime.sx` scheduler code (currently coupled to Erlang
vocabulary).
- Write a 1-page summary of what's actually a scheduler and what's actually
Erlang. Identify the boundary.
- **Acceptance:** summary committed to this plan as a new section "Erlang
scheduler shape (captured 2026-MM-DD)". No code change.
- **Output:** clear-eyed mental model. Without this, we'll merge Erlang's
scheduler shape into the kit and pretend it generalises.
### Phase 1 — Go scheduler independent implementation ⬜
- During Go-on-SX, implement `lib/go/sched.sx` from scratch. Do NOT look at
Erlang's scheduler while doing this. (Or read it once, then close it.)
- Pass Go's channel + goroutine + select conformance tests.
- **Acceptance:** Go scheduler green, lib/go/scoreboard.json includes scheduler
tests, two-consumer rule now passable.
- **Output:** two independent, working implementations of the same idea.
### Phase 2 — Diff and proposed kit ⬜
- Side-by-side diff: Erlang's scheduler vs Go's scheduler. Where do they
agree? Where does each have language-specific bookkeeping?
- The diff is the kit. Things in *both* go in `lib/guest/scheduler/`; things
in only one stay in `lib/erlang/` or `lib/go/`.
- Draft `lib/guest/scheduler/api.sx` (signatures only, no body) reflecting the
proposed surface.
- **Acceptance:** API draft circulated for review; agreement that the surface
covers both consumers; no merge yet.
### Phase 3 — Implement `lib/guest/scheduler/` ⬜
- Implement the kit per the agreed API. New file(s) in `lib/guest/scheduler/`.
- The kit has its own tests in `lib/guest/scheduler/tests/` — agnostic of any
particular language vocabulary.
- **Acceptance:** kit tests pass. Erlang and Go conformance scoreboards
unchanged (the language implementations still use their own scheduler —
we haven't refactored yet).
### Phase 4 — Refactor Erlang to use the kit ⬜
- `lib/erlang/runtime.sx` scheduler logic deleted; replaced with calls into
`lib/guest/scheduler/`. Erlang's PID table, mailbox-per-PID, selective
receive stay in `lib/erlang/`.
- **No-regression gate:** Erlang conformance holds at current pass count
(currently 729/729). Hard requirement.
- **Acceptance:** Erlang scoreboard unchanged; `lib/erlang/runtime.sx`
meaningfully smaller (the scheduler code is gone).
### Phase 5 — Refactor Go to use the kit ⬜
- Same exercise for Go. `lib/go/sched.sx` shrinks to channel/goroutine
bookkeeping + delegation.
- **No-regression gate:** Go conformance scoreboard at its current pass
count.
- **Acceptance:** Go scoreboard unchanged; `lib/go/sched.sx` meaningfully
smaller.
### Phase 6 — Documentation + design-diary close ⬜
- Document `lib/guest/scheduler/` API in `lib/guest/README.md` (or wherever
the lib/guest API index lives).
- Capture the chiselling diary: what *almost* went in the kit but ended up
language-specific, and why. This is the load-bearing knowledge for the
third consumer when it arrives.
- **Acceptance:** API documented; diary section added to this plan.
## Two-language rule — gating
**The rule is hard.** No code in `lib/guest/scheduler/` lands until BOTH
Phase 1 (Go independent) AND Phase 0 (Erlang capture) are complete AND a
review confirms the two implementations actually share machinery in a way
the kit captures.
If, during Phase 2 diff, we discover that the agreement is shallow (e.g.,
both have a runnable queue but the policies are fundamentally incompatible),
the **right outcome is to NOT extract**. Add a "rejected extraction" note to
this plan documenting what we learned and close it. That outcome is fine —
it tells us the two concurrency models aren't actually sister, which is a
real result.
## Open questions
- **Preemption.** v1 is cooperative; gas-per-step gives fairness but not
hard preemption. Erlang BEAM does true preemption (reduction counting).
Go uses async preemption (signal-driven since 1.14). Neither extreme fits
cooperatively over CEK. Is gas-per-step + voluntary yield enough? Probably
for v1; revisit if a guest needs hard real-time.
- **Priority/affinity.** Both Erlang and Go can run without it. Defer.
- **Distribution.** Erlang nodes, Go's distributed channels — both are
language-specific layers on top of the local scheduler. Out of scope.
- **Cancellation.** Go has `context.Context`; Erlang has `exit/2`. Both
bottom out at "deliver an exception to a task." Worth modelling? Probably
as a kit primitive `(task-cancel sched task reason)` that delivers an
exception via CEK exception machinery, language layer wraps it.
- **Third consumer.** If/when JS-on-SX gets a proper async/await + Promise
scheduler, that'd be a great third consumer to validate the kit didn't
over-fit to Erlang+Go.
## Progress log
_Newest first. Append one dated entry per milestone landed._
- 2026-05-26 — Plan drafted. Phase 0 unstarted. Awaiting Go-on-SX to begin
Phase 1.

View File

@@ -0,0 +1,287 @@
# lib/guest/static-types-bidirectional — design-diary plan
Capture the dispatch skeleton of bidirectional type checking
(synthesis/checking judgments, context as a value, pluggable subtyping and
unification) as `lib/guest/static-types-bidirectional/`, so static-typed
guest languages that aren't Hindley-Milner-inferred cost ~300 lines of
language-specific rules instead of re-inventing the checker plumbing.
Reference: `plans/lib-guest.md` (parent — two-language rule, stratification),
`lib/guest/hm.sx` (sister module — full Hindley-Milner for inference-heavy
languages like Haskell-on-SX), Go-on-SX (planned first consumer), TBD second
consumer.
**Branch:** `architecture`. SX files via `sx-tree` MCP only.
## Thesis
`lib/guest/hm.sx` covers languages where the user writes few type annotations
and the checker infers the rest globally (Haskell-on-SX, an eventual ML port,
a typed-Scheme-with-Damas-Milner). But most modern statically-typed languages
in actual production — Go, Rust, Swift, TypeScript, Kotlin, Scala 3, Hack —
do **bidirectional checking instead**: declarations carry annotations, locals
are inferred from immediate context, return types thread inwards from call
sites. This isn't a weaker form of HM; it's a different design that scales
better to mutation, subtyping, ad-hoc polymorphism, and gradual typing —
none of which HM handles cleanly.
If `lib/guest/` is going to credibly host the next decade of statically-typed
languages, it needs a bidirectional kit alongside `hm.sx`. They're sisters,
not rivals.
**This plan is a design diary, not an implementation queue.** The two-language
rule blocks extraction until two consumers exist. Go-on-SX is the first; the
second is TBD. Until then, this plan documents what the API surface *should*
be based on a single consumer, openly acknowledging that the second consumer
will revise it.
## Current state (2026-05-26)
- `lib/guest/hm.sx` exists, used by Haskell-on-SX. 180 lines. The HM kit is
the sister extraction this plan complements.
- No bidirectional kit anywhere in `lib/guest/`.
- Go-on-SX does not exist yet. When it does, `lib/go/types.sx` will be the
first consumer.
- Second consumer is unidentified. Most likely candidates, in order:
1. **TypeScript-on-SX** — purely structural, gradual typing, the most-
popular bidirectional language alive. Natural pair.
2. **Rust-on-SX** — bidirectional with substantial extras (lifetimes,
traits, borrow checking). Heavyweight; lifetimes don't go in this kit.
3. **Typed Racket subset** — if anyone ports it. Bidirectional + gradual.
4. **Hack / Flow / Python-with-types** — same shape.
**Status: Phase 0 (literature survey).** No code in this plan yet.
## Why bidirectional, not HM (for the languages that need it)
Five reasons HM doesn't fit these languages:
1. **Subtyping.** HM unification requires equality of types; subtyping
requires a different judgment (`t ⊑ u`). Go's `interface{}` accepts any
concrete type that satisfies it — subtyping, not unification.
2. **Mutation.** HM's let-polymorphism interacts pathologically with
mutable references (the value restriction). Go, Rust, TS all have
first-class mutation and need rules that handle it directly.
3. **Annotations as ground truth.** Bidirectional treats declared types as
*given*, then propagates them. HM treats every type as a variable to be
solved. For languages where annotations are expected, bidirectional is
the natural shape.
4. **Generics with constraints.** Go's type parameters carry constraints
(`type T comparable`); Rust has trait bounds. HM has typeclasses but
they're orthogonal to its constraint solver. Bidirectional weaves
constraints into the checking rules naturally.
5. **Gradual typing.** TS's `any`, Hack's pessimistic mode, Python's
`Any` — gradual checking is built on bidirectional's "check or skip"
distinction. HM either checks or it doesn't.
These languages collectively are the majority of new statically-typed code.
Hosting them on lib/guest at all requires the bidirectional shape.
## API surface (proposed — design only, will revise with second consumer)
```
;; --- judgments ---
(synth ctx expr) -> {:type T} | {:error msg}
;; "expr synthesises type T in context ctx."
;; Used at function calls (arg types known), let bindings, literals.
(check ctx expr expected-type) -> :ok | {:error msg}
;; "expr checks against expected-type in context ctx."
;; Used in function bodies (return type known), arguments (param type known),
;; assignments (LHS type known).
;; --- context ---
(make-ctx) -> ctx
(ctx-extend ctx name type) -> ctx ;; functional update
(ctx-lookup ctx name) -> type | nil
;; --- pluggable rules ---
(register-synth-rule! kit ast-tag synth-fn) -> nil
;; ast-tag: a keyword identifying the AST node shape (eg. :call :let :lit-int)
;; synth-fn: (ctx node) -> {:type T} | {:error msg}
(register-check-rule! kit ast-tag check-fn) -> nil
;; check-fn: (ctx node expected-type) -> :ok | {:error msg}
(register-type-equiv! kit pred) -> nil
;; pred: (t1 t2) -> bool. The "are these types compatible" predicate.
;; For Go: structural-interface-match-or-equal.
;; For TS: structural-equality-with-any-bidirectional-bottom.
;; For Rust: nominal equality + trait obligations.
(register-subtype! kit pred) -> nil
;; pred: (sub super) -> bool. Optional; defaults to type-equiv.
;; Go has no subtyping between concrete types but interface satisfaction
;; is morally subtyping. TS has structural subtyping properly.
(register-unify! kit unifier) -> nil
;; Optional; for type-variable resolution (generics).
;; unifier: (t1 t2 subst) -> {:subst s'} | {:error msg}
;; --- driver ---
(make-kit) -> kit
(check-program kit ctx program) -> {:ok ctx'} | {:error msg path-to-error}
```
Design notes:
- **The kit dispatches on AST tags**, which is what makes it pluggable. Each
language registers rules for its node types. There's no hardcoded set of
expression shapes in the kit.
- **Synth and check are mutually recursive.** Inside a synth-rule for `call`,
the rule synthesises the function's type, then `check`s each argument
against the corresponding parameter type. Inside a check-rule for `lambda`,
the rule pulls argument types from the expected function type and
`synth`s the body. This pingponging is the bidirectional core.
- **Pluggable type-equiv + subtype + unify** is the three-knob shape. Pierce
& Turner ("Local Type Inference") and Dunfield & Krishnaswami ("Sound and
Complete Bidirectional Typechecking") both factor it this way.
- **No type variables in the core API.** Generics handling is a kit
*extension*: when a language registers a `unify` predicate, the kit
threads a substitution through synth/check. Languages without generics
(early Go) leave it null.
- **Errors carry a path.** `{:error msg path}` where path is a list of AST
tags leading to the failure. Good error messages are why bidirectional is
practical; the kit must support them.
## What's NOT in the kit (language-layer concerns)
Per the chiselling discipline, the kit is the dispatch skeleton; rules stay
in the language. Specifically:
- **The literal type table.** Go's `42` is `untyped int` until contextualised;
TS's `42` is the literal type `42`. Each language ships its own.
- **Specific subtyping rules.** Go's interface satisfaction is recursive
structural matching against method sets. TS's depends on object property
satisfaction. Each language ships its own predicate.
- **Generics constraint solving.** Go's type-set-based constraints, Rust's
trait bounds, TS's conditional types — each is non-trivial and language-
specific. The kit threads a substitution; the language defines what's in
it.
- **Effects, lifetimes, ownership.** Rust's borrow checker is not a type
checker in the bidirectional-kit sense — it's a separate dataflow pass.
Out of scope.
- **Gradual fallback.** TS's `any` lets unchecked code coexist with checked
code. The kit supports this via "check returns :ok on a sentinel any-type"
but the sentinel is registered by the language.
## Build order — phases
### Phase 0 — Literature survey + Go's type system specifics ⬜
- Read: Pierce & Turner "Local Type Inference" (2000); Dunfield & Krishnaswami
"Sound and Complete Bidirectional Typechecking for Higher-Rank Polymorphism"
(2013, 2019 revision); the Go language spec § "Types" + "Expressions".
- Survey how Rust / TS / Kotlin / Scala 3 implement bidirectional in practice
(their compilers are open source). Note where they diverge.
- Output: a short summary section "Bidirectional design space (captured
2026-MM-DD)" appended to this plan. Specifically: list every place
language implementations diverge, so we can predict which divergences will
show up between Go and the second consumer.
- **Acceptance:** survey committed to this plan. No code.
### Phase 1 — Go independent implementation ⬜
- During Go-on-SX, implement `lib/go/types.sx` from scratch. Do not write
with extraction in mind — write the simplest Go-specific bidirectional
checker.
- Hit Go's distinctive type-system features: untyped constants, interface
satisfaction (structural), generics (Go 1.18 type parameters with type-set
constraints — defer this if scope explodes).
- Pass Go's type-checker conformance tests.
- **Acceptance:** Go conformance scoreboard includes type-checker tests, all
passing.
- **Output:** one consumer. Two-language rule still not met; no extraction.
### Phase 2 — Pick + start the second consumer ⬜
- Decide between TS, Rust-subset, or typed-Scheme-subset. Recommendation:
**TypeScript** — most-different from Go (gradual, structural everywhere),
testing the kit's range maximally. Rust's lifetime/borrow machinery isn't
part of this kit, so a Rust port wouldn't actually exercise the kit very
hard.
- Implement just enough of the second language to type-check a non-trivial
function. Don't port the whole language; port the type checker.
- **Acceptance:** second consumer's type checker green on its small slice.
### Phase 3 — Diff and proposed kit ⬜
- Side-by-side: Go's checker vs the second consumer's checker. Where do they
agree (the kit). Where does each diverge (the language).
- Draft `lib/guest/static-types-bidirectional/api.sx` (signatures only).
- Compare against the API sketch in this plan. The API WILL change at this
step; that's the whole point of having two consumers.
- **Acceptance:** revised API committed to this plan; agreement that both
consumers can adopt it.
### Phase 4 — Implement the kit ⬜
- `lib/guest/static-types-bidirectional/` with the agreed API. Kit tests in
`lib/guest/static-types-bidirectional/tests/` — using a minimal "toy"
language (synth-rule for `:int`, check-rule for `:lambda`) to verify the
dispatch skeleton works.
- **Acceptance:** kit tests pass; both consumers' scoreboards still green
with their own implementations.
### Phase 5 — Refactor both consumers to use the kit ⬜
- Go: `lib/go/types.sx` becomes a thin layer over the kit — registers Go's
synth/check/equiv rules, calls `check-program`. Lifecycle code shrinks.
- Second consumer: same exercise.
- **No-regression gate:** both consumers' conformance scoreboards unchanged.
- **Acceptance:** both `lib/<lang>/types.sx` files meaningfully smaller; kit
is doing real work.
### Phase 6 — Documentation + chiselling diary ⬜
- Document the API in lib/guest's README index.
- Diary section in this plan: what we considered putting in the kit but
ended up keeping language-specific, and why.
- **Acceptance:** documentation present; diary captured.
## Two-language rule — gating
Same as `lib-guest-scheduler.md`. The kit does not exist until both consumers
independently work AND we've reviewed the diff AND we believe the shared
skeleton is real. Rejected-extraction is a valid outcome.
## Relationship to `lib/guest/hm.sx`
Sister modules, not rivals. Some languages will use HM (full inference,
let-polymorphism); some will use bidirectional (annotation-driven, subtyping-
friendly). Some might use both — Scala-on-SX, hypothetically, has local-type-
inference in expressions and global-HM-style constraint solving in implicit
resolution. The kit boundaries are:
- `hm.sx` — unification-based, whole-expression inference. Damas-Milner core.
Best for: ML family, Haskell, OCaml subset, Standard ML.
- `static-types-bidirectional/` — synth/check judgments, pluggable equiv +
subtype. Best for: Go, Rust, TS, Kotlin, Swift, Scala 3, Hack.
A language can call into both: bidirectional for the surface, HM-style
unification inside generics resolution. That's actually how Scala 3 works.
The kits compose; design accordingly.
## Open questions
- **Variance.** Go has none; TS has covariant/contravariant/bivariant; Rust
has variance markers per type parameter. Does the kit need a variance
predicate as a fourth pluggable knob? Probably yes, but defer until the
second consumer forces the question.
- **Effect tracking.** Some bidirectional checkers (Koka, Eff, certain
capability-effect TS variants) track effects in types. Out of scope for
v1; the kit must not actively prevent it though.
- **Refinement types.** TS has narrowing (`typeof x === "string"` refines
`x` to `string`); Hack and Flow are similar. These layer above the kit
(the kit's `check` returns a refined context as part of `:ok`). Sketch
this in Phase 3 if TS is the second consumer.
- **Error recovery.** Real-world type checkers don't halt on first error;
they recover and continue to surface as many errors as possible. The kit
needs an error-accumulation mode. Design it in Phase 4.
- **Performance.** For toy languages, naive synth/check is fine. For Go-
sized programs, the checker has to be memoised on synthesised types of
subexpressions. Not a v1 concern; flag if it bites.
## Progress log
_Newest first. Append one dated entry per milestone landed._
- 2026-05-26 — Plan drafted as design diary. Phase 0 unstarted. Gated on
Go-on-SX (first consumer) and a TBD second consumer (recommendation:
TypeScript). No code yet — kit cannot exist before two consumers do.

View File

@@ -0,0 +1,555 @@
# SX VM Opcode Extension Mechanism
Mechanism in `hosts/ocaml/lib/` that lets language ports register specialized
bytecode opcodes without modifying the SX VM core. Direct prerequisite for
**erlang-on-sx Phase 9** (the BEAM analog) and a structural enabler for any
future language port that wants performance-critical opcodes.
Reference: `plans/erlang-on-sx.md` Phase 9, `plans/fed-sx-design.md` §17.5,
`hosts/ocaml/lib/sx_vm.ml` (current VM).
Status: **complete** on `loops/sx-vm-extensions` (Phases A-E landed
2026-05-14 / 2026-05-15). Ready for first real consumer
(`hosts/ocaml/lib/extensions/erlang.ml`, replacing the Phase 9b stub
dispatcher in `lib/erlang/vm/dispatcher.sx`).
---
## Goal
Allow language ports to register custom bytecode opcodes in the SX VM, with:
- **Zero overhead for core opcodes.** Existing opcodes (current ceiling 175,
see `sx_vm.ml`) must dispatch identically. No regression for any existing
language port or the core SX runtime.
- **One additional dispatch step for extension opcodes.** Acceptable cost; the
win comes from avoiding the general CEK machinery.
- **Per-extension state slot.** Erlang's process scheduler, Haskell's thunk
cache, etc. need somewhere to hang state alongside the VM.
- **Compiler awareness.** The bytecode compiler (`lib/compiler.sx`) must be
able to emit extension opcodes by name, looked up against the registered
set.
- **JIT compatibility.** Existing JIT (lazy lambda compilation) continues to
work for code paths using only core opcodes. Extension opcodes are
interpreted in v1; JITing them is a follow-up.
## Non-goals
- **Hot opcode reload.** Adding/replacing opcodes mid-runtime is not in
scope. Extensions are compile-time additions to the OCaml binary. (If
needed, that's a separate project.)
- **Per-instance opcode sets.** All running instances of the SX VM share
the same opcode set determined at build time. Selective opcode loading
per instance is out of scope.
- **Opcode hot-swap or supersession.** Once registered, opcodes are stable
for the lifetime of the binary.
- **Language-port isolation at the dispatch layer.** Two language ports can
see each other's opcodes (they share the dispatch table). Isolation is a
build-time concern — don't compile in extensions you don't trust.
---
## Why now
The Erlang-on-SX Phase 9 work needs this. Without it, Phase 9b-9g (the actual
opcode implementations) have nowhere to plug in. The Erlang loop hit this
dependency as a Blocker (`0abf05ed`); this design is what unblocks it.
It also enables the **shared opcode pattern** discussed in `plans/fed-sx-
design.md` §17.5: opcodes Erlang Phase 9 produces that other ports could
plausibly use (pattern match, perform/handle, record access) get chiselled
out to `lib/guest/vm/` when a second port has an actual second use. Without
the extension mechanism, each port would have to fork the SX VM core or
modify shared dispatch — neither acceptable.
---
## Architectural overview
```
┌──────────────────────────────────────────┐
│ SX VM core (hosts/ocaml/lib/sx_vm.ml) │
│ │
│ ┌────────────────────────────────────┐ │
│ │ Bytecode dispatch loop │ │
│ │ │ │
│ │ match op with │ │
│ │ | 1 (OP_CONST) -> ... │ │
│ │ | 2 (OP_NIL) -> ... │ │
│ │ | ... │ │
│ │ | 175 -> ... (last core opcode) │ │
│ │ | op when op >= 200 -> │ │
│ │ !extension_dispatch_ref op │ │ ◄── new
│ │ vm frame │ │
│ └────────────────────────────────────┘ │
│ │
│ ┌────────────────────────────────────┐ │
│ │ Extension registry │ │
│ │ opcode_id -> handler │ │ ◄── Phase B
│ │ opcode_name -> opcode_id │ │
│ │ extension_state per extension │ │
│ └────────────────────────────────────┘ │
└──────────────────────────────────────────┘
│ register at startup
┌──────────────────┴──────────────────────┐
│ Extension modules │
│ hosts/ocaml/lib/extensions/erlang.ml │
│ hosts/ocaml/lib/extensions/haskell.ml │
│ hosts/ocaml/lib/extensions/datalog.ml │
│ hosts/ocaml/lib/extensions/guest_vm.ml │ ◄── shared opcodes
└─────────────────────────────────────────┘
```
### Opcode ID space partition
Current SX VM uses opcode IDs from 1 to 175 (per inspection of `sx_vm.ml`,
ceiling at OP_DEC = 175). We partition the 0-255 space:
| Range | Use |
|---------|------------------------------------------------------------------|
| 0 | reserved / NOP |
| 1-199 | **core opcodes** — owned by the SX VM, locked schema |
| 200-247 | **extension opcodes** — registered by extensions (ports + shared) |
| 248-255 | reserved for future expansion / multi-byte opcodes |
This gives the core 24 free slots above the current 175 ceiling for future
core additions, and 48 slots for extensions. Erlang Phase 9 expects to need
fewer than 30 specialized opcodes, so this is comfortable headroom.
The plan originally proposed a finer split (`128-199` for `lib/guest/vm/`
shared, `200-247` for ports). That distinction is preserved at the **naming
level** (`guest_vm.OP_X` vs `erlang.OP_Y`) and policed by the registry
(duplicate IDs fail at startup), without consuming separate ID ranges. The
chiselling discipline (move an opcode to `guest_vm` when a second port uses
it) operates at the source level.
If we need more than 256 opcodes total, multi-byte opcodes (a leading 248-255
byte plus a second byte) extend the space without breaking the schema.
### Extension module signature
```ocaml
(* hosts/ocaml/lib/sx_vm_extension.ml *)
(** A handler for an extension opcode. Reads operands from bytecode,
manipulates the VM stack, updates the frame's instruction pointer.
May raise exceptions (which propagate via the existing VM error path). *)
type handler = vm -> frame -> unit
(** State an extension carries alongside the VM. Opaque to the VM core;
extensions cast as needed. *)
type extension_state = ..
module type EXTENSION = sig
(** Stable name for this extension (e.g. "erlang", "guest_vm"). *)
val name : string
(** Initialize per-instance state. Called once when the VM starts and the
extension is loaded. *)
val init : unit -> extension_state
(** Opcodes this extension provides. Each is (opcode_id, opcode_name, handler).
opcode_id must be in 200-247. Conflicts cause startup failure. *)
val opcodes : extension_state -> (int * string * handler) list
end
```
### Registration and dispatch
```ocaml
(* hosts/ocaml/lib/sx_vm_extensions.ml *)
let extensions : (module EXTENSION) list ref = ref []
let states : (string, extension_state) Hashtbl.t = Hashtbl.create 8
let by_id : (int, handler) Hashtbl.t = Hashtbl.create 64
let by_name : (string, int) Hashtbl.t = Hashtbl.create 64
let register (m : (module EXTENSION)) =
let module M = (val m) in
let st = M.init () in
Hashtbl.add states M.name st;
List.iter (fun (id, name, h) ->
if Hashtbl.mem by_id id then
failwith (Printf.sprintf "Opcode %d (%s) already registered" id name);
Hashtbl.add by_id id h;
Hashtbl.add by_name name id
) (M.opcodes st);
extensions := m :: !extensions
let dispatch op vm frame =
match Hashtbl.find_opt by_id op with
| Some handler -> handler vm frame
| None -> raise (Invalid_opcode op)
let id_of_name name = Hashtbl.find_opt by_name name
let state_of_extension name = Hashtbl.find_opt states name
```
Phase B installs this dispatcher into `Sx_vm.extension_dispatch_ref` at
module init. Until then, the ref's default raises `Invalid_opcode op` for
any opcode ≥ 200, which is the Phase A test condition.
The dispatch path adds **one hashtable lookup per extension opcode**.
Acceptable cost — and Erlang's specialized opcodes win >100× over going
through the general CEK machine, so the overhead is negligible by comparison.
### Bytecode compiler integration
The compiler (`lib/compiler.sx`) needs to know extension opcode IDs to emit
them. New SX primitive exposed to the compiler:
```sx
(extension-opcode-id "erlang.OP_PATTERN_TUPLE_2") ; → 200, or nil if not loaded
```
When the compiler wants to emit a specialized opcode, it queries by name. If
the extension isn't loaded, the compiler falls back to the general path
(emit a `CALL_PRIM` or general SX `case`). This means a language port's
optimization is opt-in per build, and missing extensions degrade to slower
correct execution rather than failure.
Naming convention: `<extension-name>.OP_<NAME>`. So `erlang.OP_PATTERN_TUPLE_2`,
`guest_vm.OP_PERFORM`, etc.
### Per-extension state access
Some opcodes need state beyond the VM stack (Erlang's scheduler, mailbox
state, etc.). Extensions store state in their `init`-returned value, accessed
via `state_of_extension`:
```ocaml
let op_spawn vm frame =
let st = Sx_vm_extensions.state_of_extension "erlang"
|> Option.get
|> Obj.magic in (* extension casts to its known type *)
let body = pop vm in
let pid = Erlang_scheduler.spawn st body in
push vm (pid_value pid);
frame.ip <- frame.ip + 1
```
Shared scheduler state lives in the Erlang extension's state value. Other
extensions don't see it.
---
## Phase plan
Five sub-phases in dependency order. Each is testable in isolation.
### Phase A — Opcode ID partition + dispatch fallthrough
- [x] Define `exception Invalid_opcode of int` in `sx_vm.ml`.
- [x] Add `extension_dispatch_ref : (int -> vm -> frame -> unit) ref`
whose default handler raises `Invalid_opcode op`. Forward-declared in
the same style as the existing `jit_compile_ref`.
- [x] Add `| op when op >= 200 -> !extension_dispatch_ref op vm frame` arm
in the dispatch loop, immediately before the catch-all.
- [x] Document the partition in a comment near the top of the opcode list.
**Tests:**
- All existing OCaml VM/CEK tests pass unchanged (zero regression for core).
- Constructed bytecode using opcode 200 raises `Invalid_opcode 200` when no
extension is registered.
**Effort:** small. ~50 lines + tests.
### Phase B — Extension registry module
`hosts/ocaml/lib/sx_vm_extensions.ml` per the sketch above. Pure plumbing, no
opcodes yet. Phase B's module init installs the real `dispatch` into
`Sx_vm.extension_dispatch_ref`, replacing Phase A's stub.
- [x] `Sx_vm_extension` interface module (handler type, EXTENSION sig).
- [x] `Sx_vm_extensions` registry module (`register`, `dispatch`,
`id_of_name`, `state_of_extension`).
- [x] Wire the registry's `dispatch` into `Sx_vm.extension_dispatch_ref` at
module init.
**Tests:**
- Register a test extension with one opcode; dispatch finds it.
- Duplicate opcode-id registration fails at startup.
- `id_of_name` and `state_of_extension` lookups work.
**Effort:** small. ~150 lines + tests.
### Phase C — Compiler-side opcode lookup primitive
Expose `extension-opcode-id` as an SX primitive in `hosts/ocaml/lib/`. The
compiler in `lib/compiler.sx` can call it to emit extension opcodes by name.
Does not require any extension to actually exist — the primitive returns
`nil` for unknown names, and the compiler falls back.
- [x] Register `extension-opcode-id` in `sx_primitives.ml`.
- [x] Returns `Integer id` when registered, `Nil` otherwise.
**Tests:**
- Primitive returns nil for unknown name.
- After registering a test extension, primitive returns the registered ID.
**Effort:** small. Single primitive registration + compiler-side use docs.
### Phase D — Test extension demonstrating end-to-end flow
A dummy extension at `hosts/ocaml/lib/extensions/test_ext.ml` registering
one or two trivial opcodes (e.g. `OP_TEST_PUSH_42`, `OP_TEST_DOUBLE_TOS`).
Wired into the build, available when running tests.
Compiler test: write SX that triggers the test compiler-extension to emit
`OP_TEST_PUSH_42`, then verify the VM executes it correctly via
`bytecode-inspect` and `vm-trace`.
- [x] `test_ext.ml` registers two opcodes.
- [x] Wired into the build (extensions registered at startup).
- [x] Bytecode emission via name lookup produces the right ID.
- [x] `bytecode-inspect` shows the opcode by name.
**Tests:**
- Bytecode emission via name lookup produces the right ID.
- Execution produces the expected stack effect.
- `bytecode-inspect` shows the opcode by name.
- `vm-trace` correctly reports the extension opcode.
**Effort:** small. ~100 lines including build wiring.
### Phase E — JIT awareness (interpreted-only for v1)
The JIT (lazy lambda compilation) currently compiles based on opcode ranges.
Extension opcodes (≥200) should fall through to interpretation, not be
JIT-compiled in v1.
- [x] Mark extension opcodes as "interpret only" in the JIT pre-analysis.
- [x] Lambda containing only core opcodes JIT-compiles as before.
- [x] Lambda containing any extension opcode runs interpreted.
JITing extension opcodes is a follow-up project; v1 keeps the JIT scope
unchanged and just makes it correctly route mixed bytecode.
**Tests:**
- Lambda with only core opcodes: JIT-compiled, fast path.
- Lambda with extension opcode: interpreted, correct result.
- Mixed lambda: interpreted, correct result.
**Effort:** small-medium. Requires understanding the JIT's pre-analysis
(per `project_jit_compilation.md` memory: "Lazy JIT implemented: lambda
bodies compiled on first VM call, cached, failures sentinel-marked").
Extension-opcode detection becomes another reason to mark a lambda
"interpret-only."
---
## Acceptance criteria
1. **Phase A-D pass their test suites.**
2. **Zero regression on existing SX VM tests.** All language-port test
suites currently passing on the architecture branch (Erlang 530+, Haskell
285+, Datalog 276+, Smalltalk 625+, the SX core test suite, etc.) still
pass.
3. **Test extension demonstrates the flow end-to-end.** SX source compiles
via the compiler with a registered extension opcode, executes through the
VM via the dispatch fallthrough, returns correct result.
4. **Documentation:** README in `hosts/ocaml/lib/extensions/` explaining the
pattern, with a worked example (the test extension is the canonical one).
After acceptance, the Erlang-on-SX Phase 9 work in `lib/erlang/vm/` can use
this mechanism. The Erlang loop's Blocker for 9a is resolved.
---
## Risk and mitigation
**Risk: regression in core opcode dispatch.** A misplaced `match` arm could
break something. *Mitigation:* run every existing language-port conformance
suite before merging.
**Risk: opcode ID conflicts as more extensions land.** If Erlang Phase 9
claims IDs 200-220 and Haskell wants 215-235, we have a problem.
*Mitigation:* maintain a registry document at `hosts/ocaml/lib/extensions/
README.md` listing claimed ID ranges per extension. Convention: each
extension claims a contiguous block at first registration; collisions caught
at startup with a clear error.
**Risk: extension state types leak through `Obj.magic`.** The extension state
is type-erased in the registry. *Mitigation:* extensions cast in their own
opcode handlers, never expose state to other extensions or the VM core.
First-class modules / GADTs could add more type safety; deferred unless
this becomes a concrete pain point.
**Risk: extensions become a back door for kernel mutation.** An extension
opcode handler has full access to the VM. *Mitigation:* extensions are
build-time additions, not runtime; they're as trusted as the rest of the
binary. Operators audit at build time, not runtime. Same trust model as
any other compiled-in code.
**Risk: shared `lib/guest/vm/` opcodes evolve under different language
ports' needs.** *Mitigation:* the chiselling discipline (move to guest only
on second use) ensures the shared opcodes are tested against at least two
ports' actual usage before being considered stable.
---
## Open questions
To be resolved during implementation, not blocking design approval:
1. **Multi-byte opcode encoding.** If we need >256 opcodes total, the
leading-byte 248-255 schema accommodates it. Do we need multi-byte at
v1? Probably not — 48 extension opcodes is more than any single port
should reasonably want.
2. **Extension ordering matters?** If two extensions register opcodes that
read the same VM state, ordering of registration could matter for
initialization. Probably not in practice; flag if it bites.
3. **Hot-reload of extensions.** Out of scope for v1 (per non-goals). If
wanted later, the registry would need teardown + re-registration; the
`gen_server` `code_change/3` model from Erlang Phase 7 is a precedent.
4. **Cross-extension opcode composition.** Can `guest_vm.OP_PERFORM` invoke
`erlang.OP_RECEIVE_SCAN`? In principle yes — handlers can do anything.
The interface is clean; the question is whether we want any conventions
to keep ergonomics tractable. Defer until composition appears in
practice.
---
## Implementation roadmap and sequencing
This is a sister workstream to `loops/erlang`. Driven by Erlang Phase 9.
Single bounded loop on `loops/sx-vm-extensions`, ~1-2 weeks.
Recommended sequencing (one phase per loop fire):
1. **Phase A** — dispatch fallthrough. Smallest viable change to `sx_vm.ml`.
2. **Phase B** — extension registry module.
3. **Phase C** — compiler-side opcode lookup primitive.
4. **Phase D** — test extension demonstrating end-to-end flow.
5. **Phase E** — JIT awareness (interpret-only routing).
After acceptance:
- **`hosts/ocaml/lib/extensions/erlang.ml`** becomes the *first real
consumer* — written by whoever takes over from the Erlang loop's stub
dispatcher in `lib/erlang/vm/dispatcher.sx`. That's the integration
moment that closes the loop.
Estimated total effort: 1-2 weeks for one focused engineer with OCaml SX VM
familiarity.
---
## Relationship to other plans
- **`plans/erlang-on-sx.md` Phase 9:** unblocked by this work. Erlang loop
develops opcodes against a stub dispatcher in `lib/erlang/vm/`; once this
mechanism lands, swap stub for real registration via
`hosts/ocaml/lib/extensions/erlang.ml`.
- **`plans/fed-sx-design.md` §17.5:** documents this as Layer-1 prerequisite.
The shared-opcode discipline (lib/guest/vm/) is designed on top of this
mechanism's namespace allocation.
- **Future language ports (Haskell, Datalog, Smalltalk perf phases):** will
use the same mechanism. Each adds an extension module, claims an opcode
range, registers handlers. The `lib/guest/vm/` opcodes get
cross-referenced when the second port's needs justify chiselling.
- **JIT roadmap (per `project_jit_architecture.md` memory):** extension
opcodes are interpreted in v1. JITing them is a logical follow-up but
a separate project.
---
## Progress log
Newest first.
- **2026-05-15** — Phase E done. Loop complete (acceptance criteria
1-4 all met). New `Sx_vm.bytecode_uses_extension_opcodes` walks
bytecode operand-aware (CONST u16 indices, CALL_PRIM u16+u8,
CLOSURE u16+dynamic upvalue descriptors) so values that happen to
be ≥200 don't false-positive as extension opcodes. Wired into
`jit_compile_lambda`: when the inner closure's bytecode contains
any extension opcode, JIT returns None and the lambda runs
interpreted via CEK (the dispatch fallthrough still routes
extension opcodes through the registry — this just prevents the
JIT from claiming ownership of code it can't optimise). 7 new
foundation tests (`jit extension-opcode awareness` suite): pure
core eligible, head/middle/post-CLOSURE detection, CONST + CALL_PRIM
+ CLOSURE-descriptor false-positive avoidance. +7 pass vs Phase D
baseline (4833 vs 4826), 1111 pre-existing failures unchanged.
Conformance suites green: erlang 530/530, haskell 285/285, datalog
276/276, prolog 590/590, smalltalk 847/847, common-lisp 487/487,
apl 562/562, js 148/148, forth 632/638 (pre-existing), tcl 3/4
(pre-existing), ocaml-on-sx unit 607/607.
Loop done. Hand-off: the Erlang loop's Phase 9b stub dispatcher in
`lib/erlang/vm/dispatcher.sx` can now be replaced with a real
`hosts/ocaml/lib/extensions/erlang.ml` consumer.
- **2026-05-15** — Phase D done. New `hosts/ocaml/lib/extensions/` subtree
wired into the `sx` library via `(include_subdirs unqualified)`.
`extensions/test_ext.ml` is the canonical worked example: two
operand-less opcodes (`test_ext.OP_TEST_PUSH_42` = 220,
`test_ext.OP_TEST_DOUBLE_TOS` = 221) carrying `TestExtState` (an
invocation counter that exercises the per-extension state slot).
`extensions/README.md` documents the registration pattern, opcode-ID
range conventions, and naming rules.
`Sx_vm.opcode_name` now consults `extension_opcode_name_ref` (forward
ref) so disassembly shows extension opcodes by name instead of
`UNKNOWN_n`. Registry maintains `name_of_id_table` (reverse of
`by_name`) and installs the lookup at module init alongside the
dispatch ref. 5 new foundation tests (`extensions/test_ext` suite):
`extension-opcode-id` finds OP_TEST_PUSH_42, end-to-end bytecode runs
to 84, disassemble shows opcode names, unregistered ext opcodes still
fall back to UNKNOWN_n, per-extension state counter increments.
+5 pass vs Phase C baseline (4826 vs 4821), 1111 pre-existing failures
unchanged. Conformance suites green: erlang 530/530, haskell 285/285,
datalog 276/276, prolog 590/590, smalltalk 847/847, common-lisp
487/487, apl 562/562, js 148/148, forth 632/638 (pre-existing), tcl
3/4 (pre-existing), ocaml-on-sx unit 607/607.
- **2026-05-15** — Phase C done. `extension-opcode-id` SX primitive
registered from `sx_vm_extensions.ml` module init (avoids the
`sx_primitives ↔ sx_vm` cycle by registering downstream of both).
Accepts a string or symbol; returns `Integer id` for registered
opcode names, `Nil` for unknown — so a missing extension at compile
time degrades to a fallback rather than failure. 5 new foundation
tests (`extension-opcode-id primitive` suite): registered lookup,
unknown → nil, symbol arg, zero-arg rejection, integer-arg
rejection. +5 pass vs Phase B baseline (4821 vs 4816), 1111
pre-existing failures unchanged. Conformance suites green: erlang
530/530, haskell 285/285, datalog 276/276, prolog 590/590, smalltalk
847/847, common-lisp 487/487, apl 562/562, js 148/148, forth 632/638
(pre-existing), tcl 3/4 (pre-existing), ocaml-on-sx unit 607/607.
- **2026-05-14** — Phase B done. Added `hosts/ocaml/lib/sx_vm_extension.ml`
(interface: `handler` type, `extension_state` extensible variant,
`EXTENSION` module type) and `sx_vm_extensions.ml` (registry: `register`,
`dispatch`, `id_of_name`, `state_of_extension`, `_reset_for_tests`).
`let () = install_dispatch ()` at module init replaces Phase A's stub
with the real registry dispatch — Phase A behavior preserved (empty
registry still raises `Invalid_opcode` for unregistered ops). Registry
rejects opcode IDs outside 200-247, duplicate IDs, duplicate names, and
duplicate extension names. 9 new foundation tests (`vm-extension-registry`
suite): id_of_name resolve+miss, state_of_extension resolve+miss,
end-to-end VM dispatch (push 42), opcode composition (push 42 → double
→ 84), duplicate-id / out-of-range / duplicate-name rejection. +9 pass
vs Phase A baseline (4816 vs 4807), 1111 pre-existing failures unchanged.
Conformance suites green: erlang 530/530, haskell 285/285, datalog
276/276, prolog 590/590, smalltalk 847/847, common-lisp 487/487, apl
562/562, js 148/148, forth 632/638 (pre-existing), tcl 3/4 (pre-existing),
ocaml-on-sx unit 607/607.
- **2026-05-14** — Phase A done. Added `Invalid_opcode of int` exception,
`extension_dispatch_ref` (default raises `Invalid_opcode op`), and the
`| op when op >= 200 -> !extension_dispatch_ref op vm frame` arm before the
catch-all in `sx_vm.ml`. Partition comment documents 1-199 core / 200-247
extensions / 248-255 reserved (current core ceiling is OP_DEC = 175).
4 new foundation tests (3 × Invalid_opcode for opcodes 200/224/247, 1 ×
Eval_error for opcode 199 to pin the threshold). Foundation 64/64;
full OCaml test suite +4 pass vs baseline (4807 vs 4803), 1111 pre-existing
failures unchanged. Conformance suites green: erlang 530/530, haskell
285/285, datalog 276/276, prolog 590/590, smalltalk 847/847, common-lisp
305/305, apl 562/562, js 148/148, forth 632/638 (pre-existing), tcl 3/4
(pre-existing), ocaml-on-sx unit 607/607. (Lua 0/16 and ocaml-conformance
baseline programs not exercised — pre-existing scoreboard state and
multi-hour runtime respectively.)