Compare commits

...

19 Commits

Author SHA1 Message Date
46e0653911 fed-prims: Phase J — http-request + 6 tests
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 2m48s
NATIVE-ONLY http-request primitive (bin/sx_server.ml). HTTP/1.1 over
Unix sockets + gethostbyname; inline http:// URL parsing (full
url-parse deferred to Phase K); Connection: close + Host +
Content-Length headers auto-supplied; reads response via
Content-Length or read-to-EOF; chunked transfer-encoding rejected.
Test bin/test_http_client.sh spins a Phase-H echo server and drives
a second sx_server: GET+query, POST+body, 404, custom request
header reflected, non-http scheme rejected, integer status — 6/6.
WASM boot green (prim not in lib); Erlang conformance 530/530.
2026-05-26 19:53:58 +00:00
4548461bfc fed-prims: Phase I — handoff (RESOLVED blocker + primitive->BIF mapping)
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 2m50s
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 18:48:35 +00:00
7d9dddcc80 fed-prims: Phase H — native-only http-listen HTTP/1.1 server + curl test
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 2m53s
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 18:25:24 +00:00
36be6bf44b fed-prims: Phase G — file-list-dir (Sys.readdir, sorted, native-safe)
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 2m52s
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 17:57:20 +00:00
f8fc04840a fed-prims: Phase F — RSA-SHA256 PKCS#1 v1.5 verify, pure OCaml, RSA-2048 vector
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 3m9s
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 17:32:35 +00:00
76d1e9f53a fed-prims: Phase E — Ed25519 verify (RFC 8032), pure-OCaml bignum + edwards25519
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 3m2s
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 17:05:59 +00:00
d8b57784fe fed-prims: Phase D — CIDv1 (multihash + base32 multibase), pure OCaml, canonical IPFS vectors
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 3m2s
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 16:36:42 +00:00
bcaaa11916 fed-prims: Phase C — dag-cbor encode/decode, pure OCaml, RFC 8949 vectors + determinism
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 3m8s
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 16:10:36 +00:00
451bd4be62 fed-prims: Phase B — SHA3-256 (Keccak-f[1600]), pure OCaml, 4 NIST vectors
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 2m41s
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 15:43:51 +00:00
19932a42a9 fed-prims: Phase A — SHA-256 + SHA-512, pure OCaml, 7 NIST vectors
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 3m33s
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 15:17:35 +00:00
3629dd96a9 fed-prims: bootstrap plan + loop briefing
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 3m53s
Pure-OCaml crypto/CBOR/CID/Ed25519/RSA + native HTTP server in
hosts/ocaml/, the host-primitive surface Erlang Phase 8 BIFs and
fed-sx Milestone 1 are blocked on. WASM-safe lib boundary enforced.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 15:00:33 +00:00
a341041627 datalog: scoreboard bump (preserve before loops/erlang merge) 2026-05-18 14:48:00 +00:00
715fab86d2 Merge loops/sx-vm-extensions into architecture: hosts/ocaml VM opcode extension mechanism
5 phases (A-E) per plans/sx-vm-opcode-extension.md:

- A: Sx_vm dispatch fallthrough for opcodes ≥200 + Invalid_opcode + extension_dispatch_ref
- B: Sx_vm_extension interface + Sx_vm_extensions registry (register / dispatch /
     id_of_name / state_of_extension), installs into the dispatch_ref at module init
- C: extension-opcode-id SX primitive for compiler-side lookup
- D: lib/extensions/ subtree wired via include_subdirs, test_ext.ml as the canonical
     worked example, opcode_name forward-ref so disassemble shows ext opcodes by name
- E: bytecode_uses_extension_opcodes scanner + JIT skip path so lambdas containing
     extension opcodes run interpreted via CEK

26 new foundation tests across 5 suites, all green. Zero regressions across 11
language-port conformance suites (erlang 530, haskell 285, datalog 276, prolog 590,
smalltalk 847, common-lisp 487, apl 562, js 148, forth 632, tcl 3, ocaml-on-sx unit 607).

Hand-off: lib/erlang/vm/dispatcher.sx (Phase 9b stub) can now be replaced with a real
hosts/ocaml/lib/extensions/erlang.ml consumer.
2026-05-15 07:22:29 +00:00
f026177e63 vm-ext: phase E — JIT skips lambdas containing extension opcodes
Adds Sx_vm.bytecode_uses_extension_opcodes — an operand-aware
bytecode scanner that walks past CONST u16, CALL_PRIM u16+u8, and
CLOSURE u16+dynamic upvalue descriptors so operand bytes that happen
to be ≥200 don't false-positive as extension opcodes.

jit_compile_lambda calls the scanner on the inner closure's bytecode.
On hit it returns None — the lambda then runs through CEK
interpretation. The VM's dispatch fallthrough still routes the
extension opcodes themselves through the registry; this change just
prevents the JIT from claiming code it has no plan for.

Tests: 7 new foundation cases — pure core eligible, head/middle/
post-CLOSURE detection, CONST + CALL_PRIM + CLOSURE-descriptor false-
positive avoidance. +7 pass vs Phase D baseline, no regressions
across 11 conformance suites.

Loop complete: acceptance criteria 1-4 met. Hand-off to the Erlang
loop — lib/erlang/vm/dispatcher.sx's Phase 9b stub can now be
replaced with a real hosts/ocaml/lib/extensions/erlang.ml consumer.
2026-05-15 01:53:39 +00:00
f3192f7fda vm-ext: phase D — extensions/ subtree + test_ext + opcode_name lookup
lib/extensions/ becomes the new home for VM extensions, wired in via
(include_subdirs unqualified). README documents the registration
pattern, opcode-ID range conventions (200-209 guest_vm, 210-219
inline test, 220-229 test_ext, 230-247 ports), and naming rules.

extensions/test_ext.ml is the canonical worked example — two
operand-less opcodes (220 push 42, 221 double TOS) carrying a per-
extension state slot (TestExtState invocation counter). Test_ext.register
called from run_tests.ml at the start of the Phase D suite, on top of
the inline test_reg from earlier suites (disjoint opcode IDs).

Sx_vm.opcode_name now consults extension_opcode_name_ref (forward ref
in the same style as extension_dispatch_ref), so disassemble shows
extension opcodes by name instead of UNKNOWN_n. Registry maintains
name_of_id_table and installs the lookup at module init.

Tests: 5 new foundation cases — primitive resolves test_ext name,
end-to-end bytecode (push + double + return → 84), disassemble shows
"test_ext.OP_TEST_PUSH_42" / "test_ext.OP_TEST_DOUBLE_TOS",
unregistered ext opcodes still fall back to UNKNOWN_n, invocation
counter records the two dispatches. +5 pass vs Phase C baseline, no
regressions across 11 conformance suites.
2026-05-15 01:05:30 +00:00
57af0f386f vm-ext: phase C — extension-opcode-id SX primitive
Registers extension-opcode-id from sx_vm_extensions.ml module init.
Lives downstream of both sx_primitives and sx_vm to avoid a build
cycle. Accepts a string or symbol; returns Integer id when the opcode
is registered, Nil otherwise.

Compilers (lib/compiler.sx) call this to emit extension opcodes by
name. Returning Nil rather than failing on unknown names lets a port's
optimization opt in per-build — missing extensions degrade to slower
correct execution.

Tests: 5 new foundation cases — registered lookup, unknown → nil,
symbol arg, zero-arg + integer-arg rejection. +5 pass vs Phase B
baseline, no regressions across 11 conformance suites.
2026-05-15 00:16:03 +00:00
8c33a6f8d5 vm-ext: phase B — extension registry module
sx_vm_extension.ml: handler type, extensible extension_state variant,
EXTENSION first-class module signature.

sx_vm_extensions.ml: register / dispatch / id_of_name /
state_of_extension. install_dispatch () runs at module init,
swapping Phase A's stub for the real registry. Rejects out-of-range
opcode IDs (must be 200-247), duplicate IDs, duplicate names, and
duplicate extension names.

Tests: 9 new foundation cases — lookup hits/misses, end-to-end VM
dispatch including opcode composition, all four rejection paths.
+9 pass vs Phase A baseline, no regressions across 11 conformance
suites.
2026-05-14 23:28:24 +00:00
cf597f1b5f vm-ext: phase A — extension dispatch fallthrough in sx_vm.ml
Adds Invalid_opcode of int exception and extension_dispatch_ref forward
ref (default raises Invalid_opcode op), plus the |op when op >= 200 arm
before the catch-all in the bytecode dispatch loop. Partition comment
documents 1-199 core / 200-247 extensions / 248-255 reserved.

Phase B will install the real registry's dispatch into the ref at module
init, replacing this stub.

Tests: 4 new foundation cases (Invalid_opcode for 200/224/247, Eval_error
for 199 to pin the threshold). +4 pass vs baseline, no regressions.
2026-05-14 22:29:50 +00:00
183bfeebe1 vm-ext: bootstrap loops/sx-vm-extensions plan + loop briefing
plans/sx-vm-opcode-extension.md ports over from loops/erlang (f6a68656)
with the opcode partition adjusted to match real VM usage: 1-199 core
(current ceiling 175 = OP_DEC), 200-247 extensions, 248-255 reserved.

plans/agent-briefings/sx-vm-extensions-loop.md captures the per-fire
workflow and ground rules.
2026-05-14 22:29:15 +00:00
23 changed files with 3672 additions and 7 deletions

View File

@@ -67,6 +67,14 @@ let rec deep_equal a b =
| NativeFn _, NativeFn _ -> a == b | NativeFn _, NativeFn _ -> a == b
| _ -> false | _ -> false
(* ====================================================================== *)
(* Test extensions for the VM extension registry suite (Phase B) *)
(* ====================================================================== *)
(* Extend the extensible variant from sx_vm_extension.ml so the test
extensions below can carry their own private state. *)
type Sx_vm_extension.extension_state += TestRegState of int ref
(* ====================================================================== *) (* ====================================================================== *)
(* Build evaluator environment with test platform functions *) (* Build evaluator environment with test platform functions *)
(* ====================================================================== *) (* ====================================================================== *)
@@ -1282,7 +1290,620 @@ let run_foundation_tests () =
let l = { l_params = ["x"]; l_body = Symbol "x"; l_closure = Sx_types.make_env (); l_name = None; l_compiled = None; l_call_count = 0; l_uid = Sx_types.next_lambda_uid () } in let l = { l_params = ["x"]; l_body = Symbol "x"; l_closure = Sx_types.make_env (); l_name = None; l_compiled = None; l_call_count = 0; l_uid = Sx_types.next_lambda_uid () } in
assert_true "is_lambda" (Bool (Sx_types.is_lambda (Lambda l))); assert_true "is_lambda" (Bool (Sx_types.is_lambda (Lambda l)));
ignore (Sx_types.set_lambda_name (Lambda l) "my-fn"); ignore (Sx_types.set_lambda_name (Lambda l) "my-fn");
assert_eq "lambda name mutated" (String "my-fn") (lambda_name (Lambda l)) assert_eq "lambda name mutated" (String "my-fn") (lambda_name (Lambda l));
Printf.printf "\nSuite: crypto-sha2\n";
(* NIST FIPS 180-4 published vectors. *)
assert_eq "sha256 empty"
(String "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855")
(call "crypto-sha256" [String ""]);
assert_eq "sha256 abc"
(String "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad")
(call "crypto-sha256" [String "abc"]);
assert_eq "sha256 896-bit"
(String "248d6a61d20638b8e5c026930c3e6039a33ce45964ff2167f6ecedd419db06c1")
(call "crypto-sha256"
[String "abcdbcdecdefdefgefghfghighijhijkijkljklmklmnlmnomnopnopq"]);
assert_eq "sha256 1M 'a'"
(String "cdc76e5c9914fb9281a1c7e284d73e67f1809a48a497200e046d39ccc7112cd0")
(call "crypto-sha256" [String (String.make 1000000 'a')]);
assert_eq "sha512 empty"
(String "cf83e1357eefb8bdf1542850d66d8007d620e4050b5715dc83f4a921d36ce9ce47d0d13c5d85f2b0ff8318d2877eec2f63b931bd47417a81a538327af927da3e")
(call "crypto-sha512" [String ""]);
assert_eq "sha512 abc"
(String "ddaf35a193617abacc417349ae20413112e6fa4e89a97ea20a9eeee64b55d39a2192992a274fc1a836ba3c23a3feebbd454d4423643ce80e2a9ac94fa54ca49f")
(call "crypto-sha512" [String "abc"]);
assert_eq "sha512 896-bit"
(String "8e959b75dae313da8cf4f72814fc143f8f7779c6eb9f7fa17299aeadb6889018501d289e4900f7e4331b99dec4b5433ac7d329eeb6dd26545e96e55b874be909")
(call "crypto-sha512"
[String ("abcdefghbcdefghicdefghijdefghijkefghijklfghijklmghijklmn"
^ "hijklmnoijklmnopjklmnopqklmnopqrlmnopqrsmnopqrstnopqrstu")]);
Printf.printf "\nSuite: crypto-sha3\n";
(* NIST FIPS 202 published vectors. *)
assert_eq "sha3-256 empty"
(String "a7ffc6f8bf1ed76651c14756a061d662f580ff4de43b49fa82d80a4b80f8434a")
(call "crypto-sha3-256" [String ""]);
assert_eq "sha3-256 abc"
(String "3a985da74fe225b2045c172d6bd390bd855f086e3e9d525b46bfe24511431532")
(call "crypto-sha3-256" [String "abc"]);
assert_eq "sha3-256 896-bit"
(String "41c0dba2a9d6240849100376a8235e2c82e1b9998a999e21db32dd97496d3376")
(call "crypto-sha3-256"
[String "abcdbcdecdefdefgefghfghighijhijkijkljklmklmnlmnomnopnopq"]);
(* 1600-bit message: 0xa3 * 200 — exercises multi-block absorb (>136B). *)
assert_eq "sha3-256 1600-bit 0xa3"
(String "79f38adec5c20307a98ef76e8324afbfd46cfd81b22e3973c65fa1bd9de31787")
(call "crypto-sha3-256" [String (String.make 200 '\xa3')]);
Printf.printf "\nSuite: dag-cbor\n";
let mkdict pairs =
let d = Sx_types.make_dict () in
List.iter (fun (k, v) -> Hashtbl.replace d k v) pairs;
Dict d
in
let enc v = call "cbor-encode" [v] in
(* RFC 8949 Appendix A — minimal-length deterministic encoding. *)
assert_eq "cbor 0" (String "\x00") (enc (Integer 0));
assert_eq "cbor 23" (String "\x17") (enc (Integer 23));
assert_eq "cbor 24" (String "\x18\x18") (enc (Integer 24));
assert_eq "cbor 100" (String "\x18\x64") (enc (Integer 100));
assert_eq "cbor 1000" (String "\x19\x03\xe8") (enc (Integer 1000));
assert_eq "cbor 1000000"
(String "\x1a\x00\x0f\x42\x40") (enc (Integer 1000000));
assert_eq "cbor -1" (String "\x20") (enc (Integer (-1)));
assert_eq "cbor -100" (String "\x38\x63") (enc (Integer (-100)));
assert_eq "cbor -1000" (String "\x39\x03\xe7") (enc (Integer (-1000)));
assert_eq "cbor false" (String "\xf4") (enc (Bool false));
assert_eq "cbor true" (String "\xf5") (enc (Bool true));
assert_eq "cbor null" (String "\xf6") (enc Nil);
assert_eq "cbor \"\"" (String "\x60") (enc (String ""));
assert_eq "cbor \"a\"" (String "\x61\x61") (enc (String "a"));
assert_eq "cbor \"IETF\"" (String "\x64IETF") (enc (String "IETF"));
assert_eq "cbor []" (String "\x80") (enc (List []));
assert_eq "cbor [1,2,3]"
(String "\x83\x01\x02\x03")
(enc (List [Integer 1; Integer 2; Integer 3]));
assert_eq "cbor [1,[2,3],[4,5]]"
(String "\x83\x01\x82\x02\x03\x82\x04\x05")
(enc (List [Integer 1;
List [Integer 2; Integer 3];
List [Integer 4; Integer 5]]));
assert_eq "cbor {}" (String "\xa0") (enc (mkdict []));
assert_eq "cbor {a:1,b:[2,3]}"
(String "\xa2\x61\x61\x01\x61\x62\x82\x02\x03")
(enc (mkdict ["a", Integer 1; "b", List [Integer 2; Integer 3]]));
assert_eq "cbor {a..e:A..E}"
(String "\xa5\x61\x61\x61\x41\x61\x62\x61\x42\x61\x63\x61\x43\x61\x64\x61\x44\x61\x65\x61\x45")
(enc (mkdict ["a", String "A"; "b", String "B"; "c", String "C";
"d", String "D"; "e", String "E"]));
(* Determinism: insertion order + key length must not change bytes.
Sort is length-then-bytewise → a, c, bb. *)
let d1 = mkdict ["bb", Integer 2; "a", Integer 1; "c", Integer 3] in
let d2 = mkdict ["c", Integer 3; "bb", Integer 2; "a", Integer 1] in
assert_eq "cbor det order-invariant" (enc d1) (enc d2);
assert_eq "cbor det length-then-bytewise"
(String "\xa3\x61\x61\x01\x61\x63\x03\x62\x62\x62\x02")
(enc d1);
(* Round-trip: decode . encode = identity (structural). *)
let roundtrip name v =
assert_eq ("cbor rt " ^ name) v (call "cbor-decode" [enc v])
in
roundtrip "int" (Integer 42);
roundtrip "neg" (Integer (-99999));
roundtrip "str" (String "hello world");
roundtrip "bool" (Bool true);
roundtrip "nil" Nil;
roundtrip "nested"
(List [Integer 1; String "x"; List [Bool false; Nil]]);
roundtrip "dict"
(mkdict ["k", List [Integer 7]; "name", String "z"]);
Printf.printf "\nSuite: cid\n";
let mh_sha256 s = Sx_cid.multihash 0x12 (Sx_cid.unhex (Sx_sha2.sha256_hex s)) in
(* Authoritative vectors (independently derived; match well-known
IPFS CIDs). raw "abc" and raw "" — codec 0x55. *)
assert_eq "cid raw abc"
(String "bafkreif2pall7dybz7vecqka3zo24irdwabwdi4wc55jznaq75q7eaavvu")
(call "cid-from-bytes" [Integer 0x55; String (mh_sha256 "abc")]);
assert_eq "cid raw empty"
(String "bafkreihdwdcefgh4dqkjv67uzcmw7ojee6xedzdetojuzjevtenxquvyku")
(call "cid-from-bytes" [Integer 0x55; String (mh_sha256 "")]);
(* dag-cbor {} — canonical empty-map CID (sha2-256, codec 0x71). *)
assert_eq "cid dag-cbor {}"
(String "bafyreigbtj4x7ip5legnfznufuopl4sg4knzc2cof6duas4b3q2fy6swua")
(call "cid-from-sx" [mkdict []]);
(* Determinism: dict key insertion order must not change the CID. *)
let cda = call "cid-from-sx" [mkdict ["b", Integer 2; "a", Integer 1]] in
let cdb = call "cid-from-sx" [mkdict ["a", Integer 1; "b", Integer 2]] in
assert_eq "cid det order-invariant" cda cdb;
assert_true "cid multibase 'b' prefix"
(Bool (match call "cid-from-sx" [mkdict []] with
| String s -> String.length s > 1 && s.[0] = 'b'
| _ -> false));
Printf.printf "\nSuite: ed25519\n";
let hx = Sx_ed25519.unhex in
let edv pk msg sg = call "ed25519-verify"
[String (hx pk); String (hx msg); String (hx sg)] in
(* RFC 8032 §7.1 TEST 1-3 (deterministic; re-derived independently). *)
assert_eq "ed25519 RFC T1"
(Bool true)
(edv "d75a980182b10ab7d54bfed3c964073a0ee172f3daa62325af021a68f707511a"
""
"e5564300c360ac729086e2cc806e828a84877f1eb8e5d974d873e065224901555fb8821590a33bacc61e39701cf9b46bd25bf5f0595bbe24655141438e7a100b");
assert_eq "ed25519 RFC T2"
(Bool true)
(edv "3d4017c3e843895a92b70aa74d1b7ebc9c982ccf2ec4968cc0cd55f12af4660c"
"72"
"92a009a9f0d4cab8720e820b5f642540a2b27b5416503f8fb3762223ebdb69da085ac1e43e15996e458f3613d0f11d8c387b2eaeb4302aeeb00d291612bb0c00");
assert_eq "ed25519 RFC T3"
(Bool true)
(edv "fc51cd8e6218a1a38da47ed00230f0580816ed13ba3303ac5deb911548908025"
"af82"
"6291d657deec24024827e69c3abe01a30ce548a284743a445e3680d7db5ac3ac18ff9b538d16f290ae67f760984dc6594a7c15e9716ed28dc027beceea1ec40a");
(* Tampered message -> false. *)
assert_eq "ed25519 tampered msg"
(Bool false)
(edv "fc51cd8e6218a1a38da47ed00230f0580816ed13ba3303ac5deb911548908025"
"af83"
"6291d657deec24024827e69c3abe01a30ce548a284743a445e3680d7db5ac3ac18ff9b538d16f290ae67f760984dc6594a7c15e9716ed28dc027beceea1ec40a");
(* Tampered signature -> false. *)
assert_eq "ed25519 tampered sig"
(Bool false)
(edv "d75a980182b10ab7d54bfed3c964073a0ee172f3daa62325af021a68f707511a"
""
"f5564300c360ac729086e2cc806e828a84877f1eb8e5d974d873e065224901555fb8821590a33bacc61e39701cf9b46bd25bf5f0595bbe24655141438e7a100b");
(* Total: wrong-length pubkey / sig -> false, no exception. *)
assert_eq "ed25519 short pubkey"
(Bool false)
(call "ed25519-verify" [String "abc"; String ""; String (String.make 64 '\000')]);
assert_eq "ed25519 short sig"
(Bool false)
(call "ed25519-verify"
[String (hx "d75a980182b10ab7d54bfed3c964073a0ee172f3daa62325af021a68f707511a");
String ""; String "short"]);
assert_eq "ed25519 non-string args"
(Bool false)
(call "ed25519-verify" [Integer 1; Integer 2; Integer 3]);
Printf.printf "\nSuite: rsa-sha256\n";
(* Fixed RSA-2048 vector: one-off python-cryptography keygen +
PKCS1v15/SHA-256 sign of "fed-sx phase F rsa test". *)
let rhx = Sx_rsa.unhex in
let spki = rhx "30820122300d06092a864886f70d01010105000382010f003082010a0282010100a117b573480bce5a08b54a98384001df26d062e9173caaee2e3a2d0045c6d16f99b2a1e7fb60763f65f95f8c39ff82c18b8590338042914331db3440a06d2dbe65a2f82c82f37d293f67a8b57a1f9014b55150a093cfee90257ef3b4a215d5ab002579bd92b6fcb3536777d51b639347d01e307ddafb209073dd9b8d6a507157c44c624a19b3b9275931472462870ae02132630159132a85c1c889adfb358b6bbd3760ce3fffe6285964833a10ee436d5bc33dfab7f9ed630a74e9a32e5688f5a7797f7cc839ad2494dd1c4c4a8fab844cd26208794bf2602c16b9d12bde434066d8c0dd2d20489f4070f883bae2b4508ead4a1b80b44c576e9e37bdb5df69f10203010001" in
let rmsg = rhx "6665642d73782070686173652046207273612074657374" in
let rsig = rhx "5e1593d674ed15c0172546d38efdf1aebd252f4b0c0dfbe1f7996fd569d0bfd9f3e8689ea2b14aa45b5fc3f0a05d4f23c6b02b8820d71f6998ea3b5b0d071bb33142236e388b1226ece3ec447d33b38999f189c37564cf052cf038de94c67b2ddf9a97d5a73554bb88818f615824517209a4083258965adace55658f344104eaa0d5f2f44ea00cfac8754674aade87b40d955cccd1ccd9b7649a08b66ce3bc5dba2de96b3e859488ded3ef9fb3744a1e3495fd14841d8319b3cc08054c729d1c02739ee314eba2b20fac46e463f47eb67183d8455583eca73ba37448164612dd9cd77877135d30d12084c2843f986a5b8ad59c6600f9855b91d7cbdf7c6c4b0e" in
let rsav s m g = call "rsa-sha256-verify" [String s; String m; String g] in
assert_eq "rsa valid" (Bool true) (rsav spki rmsg rsig);
assert_eq "rsa tampered msg" (Bool false)
(rsav spki (rmsg ^ "x") rsig);
assert_eq "rsa tampered sig" (Bool false)
(rsav spki rmsg
(rhx "5f1593d674ed15c0172546d38efdf1aebd252f4b0c0dfbe1f7996fd569d0bfd9f3e8689ea2b14aa45b5fc3f0a05d4f23c6b02b8820d71f6998ea3b5b0d071bb33142236e388b1226ece3ec447d33b38999f189c37564cf052cf038de94c67b2ddf9a97d5a73554bb88818f615824517209a4083258965adace55658f344104eaa0d5f2f44ea00cfac8754674aade87b40d955cccd1ccd9b7649a08b66ce3bc5dba2de96b3e859488ded3ef9fb3744a1e3495fd14841d8319b3cc08054c729d1c02739ee314eba2b20fac46e463f47eb67183d8455583eca73ba37448164612dd9cd77877135d30d12084c2843f986a5b8ad59c6600f9855b91d7cbdf7c6c4b0e"));
assert_eq "rsa garbage spki" (Bool false)
(rsav "not der" rmsg rsig);
assert_eq "rsa non-string args" (Bool false)
(call "rsa-sha256-verify" [Integer 1; Integer 2; Integer 3]);
Printf.printf "\nSuite: file-list-dir\n";
let expect_err nm f =
(try ignore (f ());
incr fail_count; Printf.printf " FAIL: %s — no error\n" nm
with Eval_error _ ->
incr pass_count; Printf.printf " PASS: %s\n" nm
| _ ->
incr fail_count; Printf.printf " FAIL: %s — wrong exn\n" nm)
in
let tmp = Filename.temp_file "fld" "" in
Sys.remove tmp; Unix.mkdir tmp 0o755;
let touch n = let oc = open_out (Filename.concat tmp n) in close_out oc in
touch "b.txt"; touch "a.txt"; touch "c.txt";
assert_eq "file-list-dir sorted"
(List [String "a.txt"; String "b.txt"; String "c.txt"])
(call "file-list-dir" [String tmp]);
expect_err "file-list-dir missing"
(fun () -> call "file-list-dir" [String (Filename.concat tmp "nope")]);
expect_err "file-list-dir not-a-dir"
(fun () -> call "file-list-dir" [String (Filename.concat tmp "a.txt")]);
expect_err "file-list-dir arity"
(fun () -> call "file-list-dir" []);
(* best-effort cleanup *)
(try List.iter (fun n -> Sys.remove (Filename.concat tmp n))
["a.txt"; "b.txt"; "c.txt"]; Unix.rmdir tmp
with _ -> ());
Printf.printf "\nSuite: vm-extension-dispatch\n";
let make_bc op = ({
vc_arity = 0; vc_rest_arity = -1; vc_locals = 0;
vc_bytecode = [| op |]; vc_constants = [||];
vc_bytecode_list = None; vc_constants_list = None;
} : Sx_types.vm_code) in
let expect_invalid_opcode label op =
let globals = Hashtbl.create 1 in
try
let _ = Sx_vm.execute_module (make_bc op) globals in
incr fail_count;
Printf.printf " FAIL: %s — expected Invalid_opcode, got a result\n" label
with
| Sx_vm.Invalid_opcode n when n = op ->
incr pass_count;
Printf.printf " PASS: %s\n" label
| exn ->
incr fail_count;
Printf.printf " FAIL: %s — unexpected: %s\n" label (Printexc.to_string exn)
in
expect_invalid_opcode "opcode 200 raises Invalid_opcode 200" 200;
expect_invalid_opcode "opcode 224 raises Invalid_opcode 224" 224;
expect_invalid_opcode "opcode 247 raises Invalid_opcode 247" 247;
(* Opcode 199 sits just below the extension threshold — should fall to the
catch-all (Eval_error), proving the threshold is at 200, not 199. *)
let globals = Hashtbl.create 1 in
(try
let _ = Sx_vm.execute_module (make_bc 199) globals in
incr fail_count;
Printf.printf " FAIL: opcode 199 — expected Eval_error, got a result\n"
with
| Sx_vm.Invalid_opcode _ ->
incr fail_count;
Printf.printf " FAIL: opcode 199 routed to extension dispatch (threshold wrong)\n"
| Sx_types.Eval_error _ ->
incr pass_count;
Printf.printf " PASS: opcode 199 stays in core (catch-all)\n"
| exn ->
incr fail_count;
Printf.printf " FAIL: opcode 199 — unexpected: %s\n" (Printexc.to_string exn));
Printf.printf "\nSuite: vm-extension-registry\n";
(* Sx_vm_extensions self-installs its dispatcher at module init. Reset
the registry so prior loaded extensions don't interfere with this
test. *)
Sx_vm_extensions._reset_for_tests ();
let module TestExt : Sx_vm_extension.EXTENSION = struct
let name = "test_reg"
let init () = TestRegState (ref 0)
let opcodes _st = [
(210, "test_reg.OP_PUSH_42", (fun vm _frame ->
Sx_vm.push vm (Sx_types.Integer 42)));
(211, "test_reg.OP_DOUBLE_TOS", (fun vm _frame ->
let v = Sx_vm.pop vm in
match v with
| Sx_types.Integer n -> Sx_vm.push vm (Sx_types.Integer (n * 2))
| _ -> failwith "OP_DOUBLE_TOS: not an integer"));
]
end in
Sx_vm_extensions.register (module TestExt);
(match Sx_vm_extensions.id_of_name "test_reg.OP_PUSH_42" with
| Some 210 ->
incr pass_count;
Printf.printf " PASS: id_of_name resolves opcode\n"
| other ->
incr fail_count;
Printf.printf " FAIL: id_of_name: got %s\n"
(match other with Some n -> string_of_int n | None -> "None"));
(match Sx_vm_extensions.id_of_name "nonexistent.OP" with
| None ->
incr pass_count;
Printf.printf " PASS: id_of_name returns None for unknown\n"
| Some _ ->
incr fail_count;
Printf.printf " FAIL: id_of_name should return None for unknown\n");
(match Sx_vm_extensions.state_of_extension "test_reg" with
| Some (TestRegState _) ->
incr pass_count;
Printf.printf " PASS: state_of_extension returns extension state\n"
| _ ->
incr fail_count;
Printf.printf " FAIL: state_of_extension lookup\n");
(match Sx_vm_extensions.state_of_extension "nonexistent" with
| None ->
incr pass_count;
Printf.printf " PASS: state_of_extension None for unknown\n"
| Some _ ->
incr fail_count;
Printf.printf " FAIL: state_of_extension should be None\n");
(* End-to-end dispatch through the VM. Bytecode runs OP_PUSH_42 then
OP_RETURN (50); execute_module pops the result. *)
let make_bc_seq bytes = ({
vc_arity = 0; vc_rest_arity = -1; vc_locals = 0;
vc_bytecode = bytes; vc_constants = [||];
vc_bytecode_list = None; vc_constants_list = None;
} : Sx_types.vm_code) in
(let globals = Hashtbl.create 1 in
try
match Sx_vm.execute_module (make_bc_seq [| 210; 50 |]) globals with
| Integer 42 ->
incr pass_count;
Printf.printf " PASS: dispatch routes opcode 210 -> push 42\n"
| other ->
incr fail_count;
Printf.printf " FAIL: dispatch opcode 210: got %s\n"
(Sx_types.inspect other)
with exn ->
incr fail_count;
Printf.printf " FAIL: dispatch opcode 210 raised: %s\n"
(Printexc.to_string exn));
(* Compose two extension opcodes: PUSH_42 then DOUBLE_TOS then RETURN.
Verifies that successive extension dispatches share VM state. *)
(let globals = Hashtbl.create 1 in
try
match Sx_vm.execute_module (make_bc_seq [| 210; 211; 50 |]) globals with
| Integer 84 ->
incr pass_count;
Printf.printf " PASS: extension opcodes compose (42 -> 84)\n"
| other ->
incr fail_count;
Printf.printf " FAIL: composed opcodes: got %s\n"
(Sx_types.inspect other)
with exn ->
incr fail_count;
Printf.printf " FAIL: composed opcodes raised: %s\n"
(Printexc.to_string exn));
(* Duplicate opcode-id detection. *)
let module DupExt : Sx_vm_extension.EXTENSION = struct
let name = "dup_check"
let init () = TestRegState (ref 0)
let opcodes _st = [
(210, "dup_check.OP_X", (fun _vm _frame -> ()));
]
end in
(try
Sx_vm_extensions.register (module DupExt);
incr fail_count;
Printf.printf " FAIL: duplicate opcode id should have raised\n"
with Failure _ ->
incr pass_count;
Printf.printf " PASS: duplicate opcode id rejected\n");
(* Out-of-range opcode-id detection. *)
let module OutExt : Sx_vm_extension.EXTENSION = struct
let name = "out_of_range"
let init () = TestRegState (ref 0)
let opcodes _st = [
(300, "out_of_range.OP_X", (fun _vm _frame -> ()));
]
end in
(try
Sx_vm_extensions.register (module OutExt);
incr fail_count;
Printf.printf " FAIL: out-of-range opcode should have raised\n"
with Failure _ ->
incr pass_count;
Printf.printf " PASS: out-of-range opcode rejected\n");
(* Duplicate extension-name detection. *)
let module SameNameExt : Sx_vm_extension.EXTENSION = struct
let name = "test_reg" (* same as TestExt above *)
let init () = TestRegState (ref 0)
let opcodes _st = []
end in
(try
Sx_vm_extensions.register (module SameNameExt);
incr fail_count;
Printf.printf " FAIL: duplicate extension name should have raised\n"
with Failure _ ->
incr pass_count;
Printf.printf " PASS: duplicate extension name rejected\n");
Printf.printf "\nSuite: extension-opcode-id primitive\n";
let prim = Hashtbl.find Sx_primitives.primitives "extension-opcode-id" in
(* Known opcode (registered by TestExt above). *)
(match prim [String "test_reg.OP_PUSH_42"] with
| Integer 210 ->
incr pass_count;
Printf.printf " PASS: primitive returns Integer for registered opcode\n"
| other ->
incr fail_count;
Printf.printf " FAIL: registered opcode lookup: got %s\n"
(Sx_types.inspect other));
(* Unknown opcode → Nil. *)
(match prim [String "nonexistent.OP_X"] with
| Nil ->
incr pass_count;
Printf.printf " PASS: primitive returns nil for unknown opcode\n"
| other ->
incr fail_count;
Printf.printf " FAIL: unknown opcode lookup: got %s\n"
(Sx_types.inspect other));
(* Symbol arg also accepted (compilers may pass quoted symbols). *)
(match prim [Symbol "test_reg.OP_DOUBLE_TOS"] with
| Integer 211 ->
incr pass_count;
Printf.printf " PASS: primitive accepts Symbol args\n"
| other ->
incr fail_count;
Printf.printf " FAIL: symbol arg: got %s\n" (Sx_types.inspect other));
(* Wrong arity / type raises Eval_error. *)
(try
let _ = prim [] in
incr fail_count;
Printf.printf " FAIL: zero args should have raised\n"
with Sx_types.Eval_error _ ->
incr pass_count;
Printf.printf " PASS: zero args rejected\n");
(try
let _ = prim [Integer 42] in
incr fail_count;
Printf.printf " FAIL: integer arg should have raised\n"
with Sx_types.Eval_error _ ->
incr pass_count;
Printf.printf " PASS: integer arg rejected\n");
Printf.printf "\nSuite: extensions/test_ext (canonical extension)\n";
(* Phase D: the real test extension lives at lib/extensions/test_ext.ml.
Register it on top of the inline test_reg from earlier suites — the
two use disjoint opcode IDs (210/211 vs 220/221) so they coexist. *)
Test_ext.register ();
(* Lookup via the public primitive should now find OP_TEST_PUSH_42. *)
(match prim [String "test_ext.OP_TEST_PUSH_42"] with
| Integer 220 ->
incr pass_count;
Printf.printf " PASS: extension-opcode-id finds test_ext.OP_TEST_PUSH_42\n"
| other ->
incr fail_count;
Printf.printf " FAIL: opcode lookup: got %s\n" (Sx_types.inspect other));
(* End-to-end: PUSH_42 + DOUBLE_TOS + RETURN. *)
(let globals = Hashtbl.create 1 in
try
match Sx_vm.execute_module (make_bc_seq [| 220; 221; 50 |]) globals with
| Integer 84 ->
incr pass_count;
Printf.printf " PASS: extensions/test_ext bytecode executes (84)\n"
| other ->
incr fail_count;
Printf.printf " FAIL: test_ext bytecode result: got %s\n"
(Sx_types.inspect other)
with exn ->
incr fail_count;
Printf.printf " FAIL: test_ext bytecode raised: %s\n"
(Printexc.to_string exn));
(* Disassembly: opcode_name should resolve 220/221 via the registry,
not fall back to UNKNOWN_220 / UNKNOWN_221. disassemble returns a
Dict; the instruction list lives at key "bytecode". *)
(let code = make_bc_seq [| 220; 221; 50 |] in
let dis = Sx_vm.disassemble code in
let entries = match dis with
| Dict d -> (match Hashtbl.find_opt d "bytecode" with
| Some (List es) -> es
| _ -> [])
| _ -> []
in
let names = List.filter_map (fun entry -> match entry with
| Dict d ->
(match Hashtbl.find_opt d "opcode" with
| Some (String name) -> Some name
| _ -> None)
| _ -> None) entries
in
let has name = List.mem name names in
if has "test_ext.OP_TEST_PUSH_42" && has "test_ext.OP_TEST_DOUBLE_TOS" then begin
incr pass_count;
Printf.printf " PASS: disassemble shows extension opcode names\n"
end else begin
incr fail_count;
Printf.printf " FAIL: disassemble names: [%s]\n" (String.concat ", " names)
end);
(* Sanity: opcode_name on an unregistered extension opcode still
returns UNKNOWN_n. Pick 230 — out of test_ext's range. *)
(match Sx_vm.opcode_name 230 with
| "UNKNOWN_230" ->
incr pass_count;
Printf.printf " PASS: unregistered ext opcode falls back to UNKNOWN_n\n"
| other ->
incr fail_count;
Printf.printf " FAIL: opcode_name 230: got %s\n" other);
(* Per-extension state: invocation_count should reflect the two opcodes
that ran in the dispatch test above. *)
(match Test_ext.invocation_count () with
| Some n when n >= 2 ->
incr pass_count;
Printf.printf " PASS: extension state recorded %d invocations\n" n
| other ->
incr fail_count;
Printf.printf " FAIL: invocation_count: %s\n"
(match other with Some n -> string_of_int n | None -> "None"));
Printf.printf "\nSuite: jit extension-opcode awareness\n";
let scan = Sx_vm.bytecode_uses_extension_opcodes in
let no_consts = [||] in
(* Pure core ops: scan reports false. *)
(* OP_TRUE OP_RETURN *)
if not (scan [| 3; 50 |] no_consts) then begin
incr pass_count;
Printf.printf " PASS: pure core bytecode is JIT-eligible\n"
end else begin
incr fail_count;
Printf.printf " FAIL: pure core bytecode flagged as extension\n"
end;
(* Extension opcode anywhere → true. *)
if scan [| 220; 50 |] no_consts then begin
incr pass_count;
Printf.printf " PASS: extension opcode detected at head\n"
end else begin
incr fail_count;
Printf.printf " FAIL: extension opcode at head missed\n"
end;
(* Mixed: core + extension → true. *)
if scan [| 3; 220; 50 |] no_consts then begin
incr pass_count;
Printf.printf " PASS: extension opcode detected after core ops\n"
end else begin
incr fail_count;
Printf.printf " FAIL: extension opcode after core ops missed\n"
end;
(* Operand bytes ≥200 must NOT trigger. CONST u16 with index 220
into a synthetic constant pool — the operand is 220 (lo) 0 (hi),
not an opcode. The pool entry at 220 is irrelevant for the scan. *)
let big_consts = Array.make 256 Nil in
if not (scan [| 1; 220; 0; 50 |] big_consts) then begin
incr pass_count;
Printf.printf " PASS: CONST operand ≥200 not a false positive\n"
end else begin
incr fail_count;
Printf.printf " FAIL: CONST operand ≥200 false-positives as ext op\n"
end;
(* CALL_PRIM has 3 operand bytes (u16 + u8); all ≥200 should not
trigger. *)
if not (scan [| 52; 220; 200; 200; 50 |] big_consts) then begin
incr pass_count;
Printf.printf " PASS: CALL_PRIM operands ≥200 not a false positive\n"
end else begin
incr fail_count;
Printf.printf " FAIL: CALL_PRIM operands ≥200 false-positive\n"
end;
(* CLOSURE with upvalue descriptors: scan must skip the 2 + 2*n
dynamic operand bytes. Build a synthetic constant pool with a
Dict at index 0 declaring upvalue-count 1, descriptors that are
≥200 — the scan should skip them and not trigger.
Bytecode layout: CLOSURE 0 0 desc_is_local desc_index RETURN
op lo hi 210 220 50
With upvalue-count = 1, scan must advance past the 2-byte CLOSURE
operand AND the 2 descriptor bytes (210, 220), landing on RETURN. *)
let cl_consts = Array.make 1 Nil in
let dict = Hashtbl.create 1 in
Hashtbl.replace dict "upvalue-count" (Integer 1);
cl_consts.(0) <- Dict dict;
if not (scan [| 51; 0; 0; 210; 220; 50 |] cl_consts) then begin
incr pass_count;
Printf.printf " PASS: CLOSURE upvalue descriptors ≥200 skipped\n"
end else begin
incr fail_count;
Printf.printf " FAIL: CLOSURE upvalue descriptors false-positive\n"
end;
(* Sanity: opcode after CLOSURE+descriptors that IS an extension
opcode triggers correctly. *)
if scan [| 51; 0; 0; 210; 220; 221; 50 |] cl_consts then begin
incr pass_count;
Printf.printf " PASS: extension opcode after CLOSURE detected\n"
end else begin
incr fail_count;
Printf.printf " FAIL: extension opcode after CLOSURE missed\n"
end
(* ====================================================================== *) (* ====================================================================== *)

View File

@@ -708,6 +708,297 @@ let setup_evaluator_bridge env =
match args with match args with
| [e; expr] -> Sx_ref.eval_expr expr e | [e; expr] -> Sx_ref.eval_expr expr e
| _ -> raise (Eval_error "eval-in-env: (env expr)")); | _ -> raise (Eval_error "eval-in-env: (env expr)"));
(* fed-sx Milestone 1 Step 8 transport. NATIVE ONLY — sockets +
threads; deliberately absent from the WASM kernel (registered
here in bin/, never in lib/sx_primitives.ml). Minimal HTTP/1.1,
Connection: close. handler : req-dict -> resp-dict where
req = {:method :path :query :headers :body},
resp = {:status :headers :body}. Never returns. *)
Sx_primitives.register "http-listen" (fun args ->
let strip_cr s =
let n = String.length s in
if n > 0 && s.[n - 1] = '\r' then String.sub s 0 (n - 1) else s
in
match args with
| [port_v; handler] ->
let port = match port_v with
| Integer n -> n
| Number f -> int_of_float f
| _ -> raise (Eval_error "http-listen: (port handler)") in
let sock = Unix.socket Unix.PF_INET Unix.SOCK_STREAM 0 in
Unix.setsockopt sock Unix.SO_REUSEADDR true;
Unix.bind sock
(Unix.ADDR_INET (Unix.inet_addr_loopback, port));
Unix.listen sock 64;
(* SX runtime is shared across threads — serialize handler calls. *)
let mtx = Mutex.create () in
let reason = function
| 200 -> "OK" | 201 -> "Created" | 204 -> "No Content"
| 301 -> "Moved Permanently" | 302 -> "Found"
| 400 -> "Bad Request" | 401 -> "Unauthorized"
| 403 -> "Forbidden" | 404 -> "Not Found"
| 405 -> "Method Not Allowed" | 500 -> "Internal Server Error"
| _ -> "OK" in
let handle fd =
(try
let ic = Unix.in_channel_of_descr fd in
let oc = Unix.out_channel_of_descr fd in
let reqline = strip_cr (input_line ic) in
(match String.split_on_char ' ' reqline with
| meth :: target :: _ ->
let path, query =
match String.index_opt target '?' with
| Some i ->
String.sub target 0 i,
String.sub target (i + 1)
(String.length target - i - 1)
| None -> target, "" in
let headers = Sx_types.make_dict () in
let clen = ref 0 in
let rec rdh () =
let h = strip_cr (input_line ic) in
if h = "" then ()
else begin
(match String.index_opt h ':' with
| Some i ->
let name =
String.lowercase_ascii
(String.trim (String.sub h 0 i)) in
let value =
String.trim
(String.sub h (i + 1)
(String.length h - i - 1)) in
Hashtbl.replace headers name (String value);
if name = "content-length" then
(try clen := int_of_string value with _ -> ())
| None -> ());
rdh ()
end in
rdh ();
let body =
if !clen > 0 then begin
let b = Bytes.create !clen in
really_input ic b 0 !clen;
Bytes.unsafe_to_string b
end else "" in
let req = Sx_types.make_dict () in
Hashtbl.replace req "method" (String meth);
Hashtbl.replace req "path" (String path);
Hashtbl.replace req "query" (String query);
Hashtbl.replace req "headers" (Dict headers);
Hashtbl.replace req "body" (String body);
Mutex.lock mtx;
let resp =
(try Sx_runtime.sx_call handler [Dict req]
with e -> Mutex.unlock mtx; raise e) in
Mutex.unlock mtx;
let getk k = match resp with
| Dict h -> Hashtbl.find_opt h k | _ -> None in
let status = match getk "status" with
| Some (Integer n) -> n
| Some (Number f) -> int_of_float f
| _ -> 200 in
let rbody = match getk "body" with
| Some (String s) -> s
| Some v -> Sx_types.value_to_string v
| None -> "" in
let rhdrs = match getk "headers" with
| Some (Dict h) ->
Hashtbl.fold (fun k v acc ->
(k, (match v with
| String s -> s
| v -> Sx_types.value_to_string v)) :: acc)
h []
| _ -> [] in
let buf = Buffer.create 256 in
Buffer.add_string buf
(Printf.sprintf "HTTP/1.1 %d %s\r\n" status
(reason status));
List.iter (fun (k, v) ->
Buffer.add_string buf
(Printf.sprintf "%s: %s\r\n" k v)) rhdrs;
if not (List.exists
(fun (k, _) ->
String.lowercase_ascii k = "content-type")
rhdrs)
then Buffer.add_string buf
"Content-Type: text/plain\r\n";
Buffer.add_string buf
(Printf.sprintf "Content-Length: %d\r\n"
(String.length rbody));
Buffer.add_string buf "Connection: close\r\n\r\n";
Buffer.add_string buf rbody;
output_string oc (Buffer.contents buf);
flush oc
| _ -> ())
with _ -> ());
(try Unix.close fd with _ -> ())
in
while true do
let fd, _ = Unix.accept sock in
ignore (Thread.create handle fd)
done;
Nil
| _ -> raise (Eval_error "http-listen: (port handler)"));
(* fed-sx Milestone 1 client direction (Phase J). NATIVE ONLY —
Unix sockets + DNS; absent from the WASM kernel. HTTP/1.1
request: TCP connect, write request line + headers + body,
read status + headers + body, return {:status :headers :body}.
URL must be http://...; HTTPS is a later phase (needs TLS).
Body read: Content-Length first, else read to EOF (we send
Connection: close). Transfer-Encoding: chunked is rejected —
fed-sx Phase 8 wires this for inter-server POSTs which will
all carry Content-Length. *)
Sx_primitives.register "http-request" (fun args ->
let strip_cr s =
let n = String.length s in
if n > 0 && s.[n - 1] = '\r' then String.sub s 0 (n - 1) else s
in
match args with
| [String meth; String url; headers_v; body_v] ->
let body = match body_v with
| String s -> s
| Nil -> ""
| v -> Sx_types.value_to_string v in
let prefix = "http://" in
let plen = String.length prefix in
let ulen = String.length url in
if ulen < plen || String.sub url 0 plen <> prefix
then raise (Eval_error "http-request: URL must start with http://");
let rest = String.sub url plen (ulen - plen) in
let host_port, path =
match String.index_opt rest '/' with
| Some i ->
String.sub rest 0 i,
String.sub rest i (String.length rest - i)
| None -> rest, "/" in
if host_port = "" then
raise (Eval_error "http-request: missing host");
let host, port =
match String.index_opt host_port ':' with
| Some i ->
let h = String.sub host_port 0 i in
let ps = String.sub host_port (i + 1)
(String.length host_port - i - 1) in
(h,
(try int_of_string ps with _ ->
raise (Eval_error "http-request: bad port")))
| None -> host_port, 80 in
let addr =
(try (Unix.gethostbyname host).h_addr_list.(0)
with Not_found ->
raise (Eval_error ("http-request: dns: " ^ host))) in
let sock = Unix.socket Unix.PF_INET Unix.SOCK_STREAM 0 in
let cleanup () = try Unix.close sock with _ -> () in
let result =
(try
(try Unix.connect sock (Unix.ADDR_INET (addr, port))
with Unix.Unix_error (e, _, _) ->
raise (Eval_error
("http-request: connect: " ^ Unix.error_message e)));
let oc = Unix.out_channel_of_descr sock in
let ic = Unix.in_channel_of_descr sock in
let buf = Buffer.create 256 in
Buffer.add_string buf
(Printf.sprintf "%s %s HTTP/1.1\r\n" meth path);
let host_hdr_sent = ref false in
let clen_sent = ref false in
let conn_sent = ref false in
(match headers_v with
| Dict h ->
Hashtbl.iter (fun k v ->
let kl = String.lowercase_ascii k in
if kl = "host" then host_hdr_sent := true;
if kl = "content-length" then clen_sent := true;
if kl = "connection" then conn_sent := true;
let vs = match v with
| String s -> s
| x -> Sx_types.value_to_string x in
Buffer.add_string buf
(Printf.sprintf "%s: %s\r\n" k vs)) h
| Nil -> ()
| _ -> raise (Eval_error "http-request: headers must be dict"));
if not !host_hdr_sent then
Buffer.add_string buf
(Printf.sprintf "Host: %s\r\n" host_port);
if not !clen_sent then
Buffer.add_string buf
(Printf.sprintf "Content-Length: %d\r\n"
(String.length body));
if not !conn_sent then
Buffer.add_string buf "Connection: close\r\n";
Buffer.add_string buf "\r\n";
Buffer.add_string buf body;
output_string oc (Buffer.contents buf);
flush oc;
let sl =
(try strip_cr (input_line ic)
with End_of_file ->
raise (Eval_error
"http-request: connection closed before status")) in
let status =
match String.split_on_char ' ' sl with
| _ver :: code :: _ ->
(try int_of_string code with _ ->
raise (Eval_error "http-request: bad status code"))
| _ -> raise (Eval_error "http-request: bad status line") in
let rhdrs = Sx_types.make_dict () in
let clen = ref (-1) in
let chunked = ref false in
let rec rdh () =
let h =
(try strip_cr (input_line ic)
with End_of_file -> "") in
if h = "" then ()
else begin
(match String.index_opt h ':' with
| Some i ->
let name =
String.lowercase_ascii
(String.trim (String.sub h 0 i)) in
let value =
String.trim
(String.sub h (i + 1)
(String.length h - i - 1)) in
Hashtbl.replace rhdrs name (String value);
if name = "content-length" then
(try clen := int_of_string value with _ -> ())
else if name = "transfer-encoding" &&
String.lowercase_ascii value = "chunked"
then chunked := true
| None -> ());
rdh ()
end in
rdh ();
if !chunked then
raise (Eval_error
"http-request: chunked transfer-encoding not supported");
let rbody =
if !clen >= 0 then begin
let b = Bytes.create !clen in
really_input ic b 0 !clen;
Bytes.unsafe_to_string b
end else begin
let b = Buffer.create 256 in
(try
while true do
Buffer.add_channel b ic 4096
done; assert false
with End_of_file -> ());
Buffer.contents b
end in
let resp = Sx_types.make_dict () in
Hashtbl.replace resp "status" (Integer status);
Hashtbl.replace resp "headers" (Dict rhdrs);
Hashtbl.replace resp "body" (String rbody);
Dict resp
with e -> cleanup (); raise e) in
cleanup ();
result
| _ -> raise (Eval_error "http-request: (method url headers body)"));
bind "trampoline" (fun args -> bind "trampoline" (fun args ->
match args with match args with
| [v] -> | [v] ->

49
hosts/ocaml/bin/test_http.sh Executable file
View File

@@ -0,0 +1,49 @@
#!/usr/bin/env bash
# Phase H test — native-only http-listen primitive.
# Starts sx_server with a tiny SX echo handler, drives it with curl
# (GET / POST / 404 / custom header), asserts, then kills it.
set -u
cd "$(dirname "$0")/.."
SRV=_build/default/bin/sx_server.exe
PORT=${HTTP_TEST_PORT:-8911}
PASS=0
FAIL=0
ok() { echo " PASS: $1"; PASS=$((PASS+1)); }
bad() { echo " FAIL: $1$2"; FAIL=$((FAIL+1)); }
if [ ! -x "$SRV" ]; then
echo "build sx_server.exe first (dune build bin/sx_server.exe)"; exit 1
fi
H='(begin (define (h req) (if (= (get req "path") "/echo") {:status 200 :headers {"X-Echo" (get req "method")} :body (str "M=" (get req "method") " P=" (get req "path") " Q=" (get req "query") " B=" (get req "body"))} {:status 404 :body "nope"})) (http-listen '"$PORT"' h))'
ESC=${H//\"/\\\"}
{ printf '(epoch 1)\n(eval "%s")\n' "$ESC"; sleep 30; } | "$SRV" >/tmp/test_http_srv.out 2>&1 &
SVPID=$!
trap 'kill $SVPID 2>/dev/null; wait 2>/dev/null' EXIT
up=0
for _ in $(seq 1 50); do
curl -s -o /dev/null "http://127.0.0.1:$PORT/echo" 2>/dev/null && { up=1; break; }
sleep 0.2
done
[ "$up" = 1 ] || { echo " FAIL: server did not start"; cat /tmp/test_http_srv.out; exit 1; }
# GET with query + custom response header.
g=$(curl -s -i "http://127.0.0.1:$PORT/echo?x=1" | tr -d '\r')
echo "$g" | grep -q '^HTTP/1.1 200 OK' && ok "GET status 200" || bad "GET status" "$g"
echo "$g" | grep -q '^X-Echo: GET' && ok "GET custom header" || bad "GET header" "$g"
echo "$g" | grep -q '^M=GET P=/echo Q=x=1 B=$' && ok "GET echo body" || bad "GET body" "$g"
# POST with body.
p=$(curl -s -X POST --data 'hello' "http://127.0.0.1:$PORT/echo")
[ "$p" = 'M=POST P=/echo Q= B=hello' ] && ok "POST body echoed" || bad "POST body" "$p"
# 404 path.
n=$(curl -s -i "http://127.0.0.1:$PORT/missing" | tr -d '\r')
echo "$n" | grep -q '^HTTP/1.1 404 Not Found' && ok "404 status" || bad "404 status" "$n"
echo "$n" | grep -q '^nope$' && ok "404 body" || bad "404 body" "$n"
echo "Results: $PASS passed, $FAIL failed"
[ "$FAIL" = 0 ]

View File

@@ -0,0 +1,80 @@
#!/usr/bin/env bash
# Phase J test — native-only http-request client primitive.
# Reuses Phase H's http-listen to spin up an echo server, then drives
# a separate sx_server via the epoch protocol to issue http-request
# calls and assert response shape + headers + body.
set -u
cd "$(dirname "$0")/.."
SRV=_build/default/bin/sx_server.exe
PORT=${HTTP_CLIENT_TEST_PORT:-8921}
PASS=0
FAIL=0
ok() { echo " PASS: $1"; PASS=$((PASS+1)); }
bad() { echo " FAIL: $1$2"; FAIL=$((FAIL+1)); }
if [ ! -x "$SRV" ]; then
echo "build sx_server.exe first (dune build bin/sx_server.exe)"; exit 1
fi
# /echo echoes method/path/query/body and reflects request X-Custom
# back as response X-Got; /missing-test → 404.
H='(begin (define (h req) (if (= (get req "path") "/echo") {:status 200 :headers {"X-Echo" (get req "method") "X-Got" (get (get req "headers") "x-custom")} :body (str "M=" (get req "method") " P=" (get req "path") " Q=" (get req "query") " B=" (get req "body"))} (if (= (get req "path") "/missing-test") {:status 404 :body "nope"} {:status 500 :body "err"}))) (http-listen '"$PORT"' h))'
ESC=${H//\"/\\\"}
{ printf '(epoch 1)\n(eval "%s")\n' "$ESC"; sleep 60; } | "$SRV" >/tmp/test_http_client_srv.out 2>&1 &
SVPID=$!
trap 'kill $SVPID 2>/dev/null; wait 2>/dev/null' EXIT
up=0
for _ in $(seq 1 50); do
curl -s -o /dev/null "http://127.0.0.1:$PORT/echo" 2>/dev/null && { up=1; break; }
sleep 0.2
done
[ "$up" = 1 ] || { echo " FAIL: server did not start"; cat /tmp/test_http_client_srv.out; exit 1; }
emit() {
# $1 = epoch num, $2 = raw SX form. Wraps in (eval "...") with quotes escaped.
local esc=${2//\"/\\\"}
printf '(epoch %s)\n(eval "%s")\n' "$1" "$esc"
}
DRV_OUT=/tmp/test_http_client_drv.out
{
emit 1 '(let ((r (http-request "GET" "http://127.0.0.1:'"$PORT"'/echo?x=1" {} ""))) (str "S=" (get r "status") " E=" (get (get r "headers") "x-echo") " B=" (get r "body")))'
emit 2 '(let ((r (http-request "POST" "http://127.0.0.1:'"$PORT"'/echo" {} "hello"))) (str "S=" (get r "status") " B=" (get r "body")))'
emit 3 '(let ((r (http-request "GET" "http://127.0.0.1:'"$PORT"'/missing-test" {} ""))) (str "S=" (get r "status") " B=" (get r "body")))'
emit 4 '(let ((r (http-request "GET" "http://127.0.0.1:'"$PORT"'/echo" {"X-Custom" "myval"} ""))) (get (get r "headers") "x-got"))'
emit 5 '(http-request "GET" "ftp://nope" {} "")'
emit 6 '(let ((r (http-request "GET" "http://127.0.0.1:'"$PORT"'/echo" {} ""))) (get r "status"))'
} | "$SRV" >"$DRV_OUT" 2>&1
# eval results come back as (ok-len N L)\n<body>\n — grep the body content.
grep -q '^"S=200 E=GET B=M=GET P=/echo Q=x=1 B="$' "$DRV_OUT" \
&& ok "GET status + echo header + body" \
|| bad "GET" "$(grep -A1 '^(ok-len 1 ' "$DRV_OUT" | tail -1)"
grep -q '^"S=200 B=M=POST P=/echo Q= B=hello"$' "$DRV_OUT" \
&& ok "POST body roundtrip" \
|| bad "POST" "$(grep -A1 '^(ok-len 2 ' "$DRV_OUT" | tail -1)"
grep -q '^"S=404 B=nope"$' "$DRV_OUT" \
&& ok "404 status + body" \
|| bad "404" "$(grep -A1 '^(ok-len 3 ' "$DRV_OUT" | tail -1)"
grep -q '^"myval"$' "$DRV_OUT" \
&& ok "custom request header reaches server" \
|| bad "custom-header" "$(grep -A1 '^(ok-len 4 ' "$DRV_OUT" | tail -1)"
R5=$(grep '^(error 5 ' "$DRV_OUT" | head -1)
echo "$R5" | grep -q 'URL must start with http' \
&& ok "non-http scheme rejected" \
|| bad "bad-url" "$R5"
# Status is an Integer (200), serialized bare without quotes.
grep -q '^200$' "$DRV_OUT" \
&& ok "response status is integer 200" \
|| bad "status-integer" "$(grep -A1 '^(ok-len 6 ' "$DRV_OUT" | tail -1)"
echo "Results: $PASS passed, $FAIL failed"
[ "$FAIL" = 0 ]

View File

@@ -2,3 +2,7 @@
(name sx) (name sx)
(wrapped false) (wrapped false)
(libraries re re.pcre unix)) (libraries re re.pcre unix))
; Pull in extension modules from lib/extensions/ (test_ext.ml, etc).
; See plans/sx-vm-opcode-extension.md.
(include_subdirs unqualified)

View File

@@ -0,0 +1,71 @@
# SX VM extensions
Each `*.ml` file here is a VM extension — a first-class OCaml module that
registers specialized bytecode opcodes with `Sx_vm_extensions`. See
[`plans/sx-vm-opcode-extension.md`](../../../../plans/sx-vm-opcode-extension.md)
for the design.
## Pattern
```ocaml
(* lib/extensions/myport.ml *)
open Sx_types
type Sx_vm_extension.extension_state += MyportState of { ... }
module M : Sx_vm_extension.EXTENSION = struct
let name = "myport"
let init () = MyportState { ... }
let opcodes _st = [
(id, "myport.OP_NAME", handler);
...
]
end
let register () = Sx_vm_extensions.register (module M)
```
Then call `Myport.register ()` once at startup from any binary that
should have the extension loaded.
## Opcode-ID allocation
Range 200-247 (per `Sx_vm_extensions.extension_min` /
`extension_max`). Conventions:
| Range | Use |
|---------|-------------------------------------------------------------------------|
| 200-209 | reserved for `lib/guest/vm/` shared opcodes (chiselled out on 2nd use) |
| 210-219 | inline test extensions defined in `bin/run_tests.ml` |
| 220-229 | this directory's `test_ext` (the canonical template) |
| 230-247 | first-come-first-served by language ports (Erlang first) |
When a port claims a contiguous block, document it in the table above.
The registry rejects collisions at startup with a loud error — there is
no silent shadowing.
## Naming
Always prefix opcode names with the extension name plus a dot:
`myport.OP_<NAME>`. The prefix is a hard convention so that multiple
extensions can share the global opcode-name namespace cleanly.
## State
`extension_state` is an extensible variant. Add your case (e.g.
`MyportState of { ... }`) at the top of your file, return it from
`init`, and pattern-match it inside your handlers. Other extensions
cannot see your state — the variant case is private to your module.
## Testing
`test_ext.ml` is the canonical worked example. `bin/run_tests.ml`
calls `Test_ext.register ()`, then drives bytecode that exercises the
opcodes end-to-end (push, double, dispatch, disassemble, invocation
counter). Mirror this shape when adding a real port's extension.
## Build wiring
`lib/dune` has `(include_subdirs unqualified)`, so any `.ml` you drop
in here is automatically part of the `sx` library. Module name follows
the filename verbatim (`test_ext.ml``Test_ext`).

View File

@@ -0,0 +1,67 @@
(** {1 [test_ext] — canonical example VM extension}
A minimal extension demonstrating the registration pattern from
[plans/sx-vm-opcode-extension.md]. The opcode IDs (220, 221) sit at
the top of the extension range, well clear of anything a real
language port would claim.
Two operand-less opcodes:
- [test_ext.OP_TEST_PUSH_42] (220) — pushes the integer 42.
- [test_ext.OP_TEST_DOUBLE_TOS] (221) — pops the integer on TOS,
pushes 2× it.
These are the smallest stack manipulations that prove the extension
mechanism wires through end-to-end (registry → dispatch → human-
readable disassembly). Real ports (Erlang Phase 9, future Haskell
perf phases) replace this template with their own opcode set.
Loading: [Test_ext.register ()] adds the extension to
[Sx_vm_extensions]. Run-time binaries that want the test opcodes
available call this once at startup. Unit tests in
[bin/run_tests.ml] do exactly that. *)
open Sx_types
(** Per-instance state for [test_ext]. Counts how many times the
handlers ran — purely so the extension has *some* state, exercising
the [extension_state] machinery. *)
type Sx_vm_extension.extension_state += TestExtState of {
mutable invocations : int;
}
module M : Sx_vm_extension.EXTENSION = struct
let name = "test_ext"
let init () = TestExtState { invocations = 0 }
let opcodes st =
let bump () = match st with
| TestExtState s -> s.invocations <- s.invocations + 1
| _ -> ()
in
[
(220, "test_ext.OP_TEST_PUSH_42",
(fun vm _frame -> bump (); Sx_vm.push vm (Integer 42)));
(221, "test_ext.OP_TEST_DOUBLE_TOS",
(fun vm _frame ->
bump ();
let v = Sx_vm.pop vm in
match v with
| Integer n -> Sx_vm.push vm (Integer (n * 2))
| _ -> raise (Eval_error
"test_ext.OP_TEST_DOUBLE_TOS: TOS is not an integer")));
]
end
(** Register [test_ext] in [Sx_vm_extensions]. Idempotent only by
failing loudly — calling twice raises [Failure]. Binaries call this
once at startup; tests may [_reset_for_tests] then re-register. *)
let register () = Sx_vm_extensions.register (module M : Sx_vm_extension.EXTENSION)
(** Read the invocation counter from the live registry state. Returns
[None] if [register] hasn't been called yet. *)
let invocation_count () =
match Sx_vm_extensions.state_of_extension "test_ext" with
| Some (TestExtState s) -> Some s.invocations
| _ -> None

142
hosts/ocaml/lib/sx_cbor.ml Normal file
View File

@@ -0,0 +1,142 @@
(** dag-cbor encode / decode — pure OCaml, WASM-safe.
RFC 8949 deterministic subset as constrained by IPLD dag-cbor
(RFC 8742): unsigned/negative ints, text strings, arrays, maps
with keys sorted by **length-then-bytewise**, bool, null, and
tag 42 (CID link, decode-side passthrough). Floats are not
supported (no fed-sx shape needs them yet) — encoding a [Number]
or decoding a float head raises. Reference: RFC 8949 §3, §4.2. *)
open Sx_types
exception Cbor_error of string
(* ---- Encoder ---- *)
let write_head buf major v =
let m = major lsl 5 in
if v < 24 then
Buffer.add_char buf (Char.chr (m lor v))
else if v < 0x100 then begin
Buffer.add_char buf (Char.chr (m lor 24));
Buffer.add_char buf (Char.chr v)
end else if v < 0x10000 then begin
Buffer.add_char buf (Char.chr (m lor 25));
Buffer.add_char buf (Char.chr ((v lsr 8) land 0xFF));
Buffer.add_char buf (Char.chr (v land 0xFF))
end else if v < 0x100000000 then begin
Buffer.add_char buf (Char.chr (m lor 26));
for i = 3 downto 0 do
Buffer.add_char buf (Char.chr ((v lsr (8 * i)) land 0xFF))
done
end else begin
Buffer.add_char buf (Char.chr (m lor 27));
for i = 7 downto 0 do
Buffer.add_char buf (Char.chr ((v lsr (8 * i)) land 0xFF))
done
end
(* dag-cbor map key order: shorter key first, then bytewise. *)
let key_order a b =
let la = String.length a and lb = String.length b in
if la <> lb then compare la lb else compare a b
let rec encode_into buf (v : value) : unit =
match v with
| Integer n ->
if n >= 0 then write_head buf 0 n
else write_head buf 1 (-1 - n)
| String s ->
write_head buf 3 (String.length s);
Buffer.add_string buf s
| Symbol s | Keyword s ->
write_head buf 3 (String.length s);
Buffer.add_string buf s
| Bool false -> Buffer.add_char buf '\xf4'
| Bool true -> Buffer.add_char buf '\xf5'
| Nil -> Buffer.add_char buf '\xf6'
| List items ->
write_head buf 4 (List.length items);
List.iter (encode_into buf) items
| Dict d ->
let keys = Hashtbl.fold (fun k _ acc -> k :: acc) d [] in
let keys = List.sort_uniq key_order keys in
write_head buf 5 (List.length keys);
List.iter (fun k ->
write_head buf 3 (String.length k);
Buffer.add_string buf k;
encode_into buf (Hashtbl.find d k)) keys
| Number _ ->
raise (Cbor_error "cbor-encode: floats unsupported (dag-cbor subset)")
| _ ->
raise (Cbor_error
("cbor-encode: unencodable value " ^ type_of v))
let encode (v : value) : string =
let buf = Buffer.create 64 in
encode_into buf v;
Buffer.contents buf
(* ---- Decoder ---- *)
let decode (s : string) : value =
let pos = ref 0 in
let len = String.length s in
let byte () =
if !pos >= len then raise (Cbor_error "cbor-decode: truncated");
let c = Char.code s.[!pos] in incr pos; c
in
let read_uint ai =
if ai < 24 then ai
else if ai = 24 then byte ()
else if ai = 25 then let a = byte () in let b = byte () in (a lsl 8) lor b
else if ai = 26 then begin
let v = ref 0 in
for _ = 0 to 3 do v := (!v lsl 8) lor byte () done; !v
end else if ai = 27 then begin
let v = ref 0 in
for _ = 0 to 7 do v := (!v lsl 8) lor byte () done; !v
end else raise (Cbor_error "cbor-decode: bad additional info")
in
let read_bytes n =
if !pos + n > len then raise (Cbor_error "cbor-decode: truncated");
let r = String.sub s !pos n in pos := !pos + n; r
in
let rec item () =
let b = byte () in
let major = b lsr 5 and ai = b land 0x1f in
match major with
| 0 -> Integer (read_uint ai)
| 1 -> Integer (-1 - read_uint ai)
| 2 -> String (read_bytes (read_uint ai))
| 3 -> String (read_bytes (read_uint ai))
| 4 ->
let n = read_uint ai in
List (List.init n (fun _ -> item ()))
| 5 ->
let n = read_uint ai in
let d = make_dict () in
for _ = 1 to n do
let k = match item () with
| String k -> k
| _ -> raise (Cbor_error "cbor-decode: non-string map key")
in
Hashtbl.replace d k (item ())
done;
Dict d
| 6 ->
(* Tag: tag-42 CID link → pass the inner item through. *)
ignore (read_uint ai); item ()
| 7 ->
(match ai with
| 20 -> Bool false
| 21 -> Bool true
| 22 -> Nil
| 23 -> Nil
| _ ->
raise (Cbor_error
"cbor-decode: floats/simple unsupported (dag-cbor subset)"))
| _ -> raise (Cbor_error "cbor-decode: bad major type")
in
let v = item () in
v

66
hosts/ocaml/lib/sx_cid.ml Normal file
View File

@@ -0,0 +1,66 @@
(** CIDv1 computation — pure OCaml, WASM-safe.
Multihash + CIDv1 + multibase base32-lower (RFC 4648, no pad,
multibase prefix 'b'). Codecs: dag-cbor 0x71, raw 0x55. Hash
codes: sha2-256 0x12, sha3-256 0x16. Reference: the multiformats
specs (unsigned-varint, multihash, cid, multibase). No deps. *)
open Sx_types
(* Unsigned LEB128 (multiformats unsigned-varint). *)
let varint (n : int) : string =
let buf = Buffer.create 4 in
let n = ref n in
let cont = ref true in
while !cont do
let b = !n land 0x7f in
n := !n lsr 7;
if !n = 0 then (Buffer.add_char buf (Char.chr b); cont := false)
else Buffer.add_char buf (Char.chr (b lor 0x80))
done;
Buffer.contents buf
(* RFC 4648 base32 lowercase, no padding. *)
let b32_alpha = "abcdefghijklmnopqrstuvwxyz234567"
let base32_lower (s : string) : string =
let buf = Buffer.create ((String.length s * 8 + 4) / 5) in
let acc = ref 0 and bits = ref 0 in
String.iter (fun c ->
acc := (!acc lsl 8) lor (Char.code c);
bits := !bits + 8;
while !bits >= 5 do
bits := !bits - 5;
Buffer.add_char buf b32_alpha.[(!acc lsr !bits) land 0x1f]
done) s;
if !bits > 0 then
Buffer.add_char buf b32_alpha.[(!acc lsl (5 - !bits)) land 0x1f];
Buffer.contents buf
(* "abef" -> the 2 raw bytes. *)
let unhex (h : string) : string =
let n = String.length h / 2 in
let b = Bytes.create n in
for i = 0 to n - 1 do
Bytes.set b i
(Char.chr (int_of_string ("0x" ^ String.sub h (2 * i) 2)))
done;
Bytes.unsafe_to_string b
(* multihash = varint(code) || varint(len) || digest *)
let multihash (code : int) (digest : string) : string =
varint code ^ varint (String.length digest) ^ digest
(* CIDv1 = 0x01 || varint(codec) || multihash ; multibase 'b' base32. *)
let cidv1 (codec : int) (mh : string) : string =
"b" ^ base32_lower ("\x01" ^ varint codec ^ mh)
let codec_dag_cbor = 0x71
let mh_sha2_256 = 0x12
(* Canonicalize an SX value: dag-cbor encode -> sha2-256 ->
multihash -> CIDv1 (dag-cbor codec). *)
let cid_from_sx (v : value) : string =
let cbor = Sx_cbor.encode v in
let digest = unhex (Sx_sha2.sha256_hex cbor) in
cidv1 codec_dag_cbor (multihash mh_sha2_256 digest)

View File

@@ -0,0 +1,289 @@
(** Ed25519 signature verification — pure OCaml, WASM-safe.
RFC 8032 §5.1.7 cofactorless verify over edwards25519. Includes a
minimal arbitrary-precision unsigned bignum (no Zarith / no deps)
and twisted-Edwards extended-coordinate point arithmetic. Verify
is total: malformed inputs return [false], never raise. SHA-512
is reused from {!Sx_sha2}. Reference: RFC 8032, RFC 7748. *)
(* ---- Minimal bignum: int array, little-endian, base 2^26. ---- *)
let bits = 26
let base = 1 lsl bits
let mask = base - 1
type bn = int array (* normalized: no high zero limbs, length >= 1 *)
let norm (a : bn) : bn =
let n = ref (Array.length a) in
while !n > 1 && a.(!n - 1) = 0 do decr n done;
if !n = Array.length a then a else Array.sub a 0 !n
let bzero : bn = [| 0 |]
let of_int n : bn =
if n = 0 then bzero
else begin
let r = ref [] and n = ref n in
while !n > 0 do r := (!n land mask) :: !r; n := !n lsr bits done;
norm (Array.of_list (List.rev !r))
end
let is_zero (a : bn) = Array.length a = 1 && a.(0) = 0
let cmp (a : bn) (b : bn) : int =
let a = norm a and b = norm b in
let la = Array.length a and lb = Array.length b in
if la <> lb then compare la lb
else begin
let r = ref 0 and i = ref (la - 1) in
while !r = 0 && !i >= 0 do
if a.(!i) <> b.(!i) then r := compare a.(!i) b.(!i);
decr i
done; !r
end
let add (a : bn) (b : bn) : bn =
let la = Array.length a and lb = Array.length b in
let n = (max la lb) + 1 in
let r = Array.make n 0 in
let carry = ref 0 in
for i = 0 to n - 1 do
let s = !carry
+ (if i < la then a.(i) else 0)
+ (if i < lb then b.(i) else 0) in
r.(i) <- s land mask; carry := s lsr bits
done;
norm r
(* a - b, requires a >= b *)
let sub (a : bn) (b : bn) : bn =
let la = Array.length a and lb = Array.length b in
let r = Array.make la 0 in
let borrow = ref 0 in
for i = 0 to la - 1 do
let s = a.(i) - !borrow - (if i < lb then b.(i) else 0) in
if s < 0 then (r.(i) <- s + base; borrow := 1)
else (r.(i) <- s; borrow := 0)
done;
norm r
let mul (a : bn) (b : bn) : bn =
let la = Array.length a and lb = Array.length b in
let r = Array.make (la + lb) 0 in
for i = 0 to la - 1 do
let carry = ref 0 in
for j = 0 to lb - 1 do
let s = r.(i + j) + a.(i) * b.(j) + !carry in
r.(i + j) <- s land mask; carry := s lsr bits
done;
r.(i + lb) <- r.(i + lb) + !carry
done;
norm r
let numbits (a : bn) : int =
let a = norm a in
let hi = Array.length a - 1 in
if hi = 0 && a.(0) = 0 then 0
else begin
let b = ref 0 and v = ref a.(hi) in
while !v > 0 do incr b; v := !v lsr 1 done;
hi * bits + !b
end
let bit (a : bn) (i : int) : int =
let limb = i / bits and off = i mod bits in
if limb >= Array.length a then 0 else (a.(limb) lsr off) land 1
(* r = a mod m (m > 0), binary long division. *)
let bn_mod (a : bn) (m : bn) : bn =
if cmp a m < 0 then norm a
else begin
let r = ref bzero in
for i = numbits a - 1 downto 0 do
(* r = r*2 + bit *)
r := add !r !r;
if bit a i = 1 then r := add !r [| 1 |];
if cmp !r m >= 0 then r := sub !r m
done;
!r
end
let div_small (a : bn) (d : int) : bn =
let la = Array.length a in
let q = Array.make la 0 in
let rem = ref 0 in
for i = la - 1 downto 0 do
let cur = (!rem lsl bits) lor a.(i) in
q.(i) <- cur / d; rem := cur mod d
done;
norm q
let powmod (b0 : bn) (e : bn) (m : bn) : bn =
let result = ref [| 1 |] and b = ref (bn_mod b0 m) in
let nb = numbits e in
for i = 0 to nb - 1 do
if bit e i = 1 then result := bn_mod (mul !result !b) m;
b := bn_mod (mul !b !b) m
done;
!result
let of_bytes_le (s : string) : bn =
let acc = ref bzero in
for i = String.length s - 1 downto 0 do
acc := add (mul !acc (of_int 256)) (of_int (Char.code s.[i]))
done;
!acc
let to_bytes_le (a : bn) (n : int) : string =
let b = Bytes.make n '\000' in
let cur = ref (norm a) in
for i = 0 to n - 1 do
let q = div_small !cur 256 in
let r =
let qm = mul q (of_int 256) in
let d = sub !cur qm in
if is_zero d then 0 else d.(0)
in
Bytes.set b i (Char.chr r);
cur := q
done;
Bytes.unsafe_to_string b
(* ---- Field GF(p), p = 2^255 - 19 ---- *)
let p =
let twop255 = Array.make 11 0 in (* 11*26 = 286 > 255 *)
let limb = 255 / bits and off = 255 mod bits in
twop255.(limb) <- 1 lsl off;
sub (norm twop255) (of_int 19)
let fmod a = bn_mod a p
let fadd a b = fmod (add a b)
let fsub a b = fmod (add a (sub p (fmod b)))
let fmul a b = fmod (mul a b)
let fpow a e = powmod a e p
let finv a = fpow a (sub p (of_int 2)) (* Fermat: a^(p-2) *)
(* group order L = 2^252 + 27742317777372353535851937790883648493 *)
let ell =
of_bytes_le
"\xed\xd3\xf5\x5c\x1a\x63\x12\x58\xd6\x9c\xf7\xa2\xde\xf9\xde\x14\
\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x10"
(* d = -121665 / 121666 mod p *)
let dconst =
let inv666 = finv (of_int 121666) in
fmod (mul (fsub (of_int 0) (of_int 121665)) inv666)
(* sqrt(-1) = 2^((p-1)/4) mod p *)
let sqrtm1 = fpow (of_int 2) (div_small (sub p (of_int 1)) 4)
(* ---- edwards25519 points in extended coords (X,Y,Z,T) ---- *)
type pt = { x : bn; y : bn; z : bn; t : bn }
let identity = { x = bzero; y = of_int 1; z = of_int 1; t = bzero }
(* add-2008-hwcd-3, complete for a = -1 on ed25519 *)
let padd (p1 : pt) (p2 : pt) : pt =
let a = fmul (fsub p1.y p1.x) (fsub p2.y p2.x) in
let b = fmul (fadd p1.y p1.x) (fadd p2.y p2.x) in
let c = fmul (fmul p1.t (fmul (of_int 2) dconst)) p2.t in
let dd = fmul (fmul p1.z (of_int 2)) p2.z in
let e = fsub b a in
let f = fsub dd c in
let g = fadd dd c in
let h = fadd b a in
{ x = fmul e f; y = fmul g h; t = fmul e h; z = fmul f g }
let scalar_mul (n : bn) (q : pt) : pt =
let r = ref identity in
for i = numbits n - 1 downto 0 do
r := padd !r !r;
if bit n i = 1 then r := padd !r q
done;
!r
let pnegate (q : pt) : pt =
{ q with x = fsub (of_int 0) q.x; t = fsub (of_int 0) q.t }
(* Decompress a 32-byte little-endian point encoding. *)
let decompress (s : string) : pt option =
if String.length s <> 32 then None
else begin
let sign = (Char.code s.[31] lsr 7) land 1 in
let s' = Bytes.of_string s in
Bytes.set s' 31 (Char.chr (Char.code s.[31] land 0x7f));
let y = of_bytes_le (Bytes.unsafe_to_string s') in
if cmp y p >= 0 then None
else begin
let y2 = fmul y y in
let u = fsub y2 (of_int 1) in
let v = fadd (fmul dconst y2) (of_int 1) in
(* x = u v^3 (u v^7)^((p-5)/8) *)
let v3 = fmul (fmul v v) v in
let v7 = fmul (fmul v3 v3) v in
let exp = div_small (sub p (of_int 5)) 8 in
let x0 = fmul (fmul u v3) (fpow (fmul u v7) exp) in
let vx2 = fmul v (fmul x0 x0) in
let x =
if cmp vx2 u = 0 then Some x0
else if cmp vx2 (fsub (of_int 0) u) = 0 then Some (fmul x0 sqrtm1)
else None
in
match x with
| None -> None
| Some x ->
if is_zero x && sign = 1 then None
else begin
let x = if (bit x 0) <> sign then fsub (of_int 0) x else x in
Some { x; y; z = of_int 1; t = fmul x y }
end
end
end
(* Encode a point to 32-byte little-endian (y with x-parity bit). *)
let encode (q : pt) : string =
let zi = finv q.z in
let x = fmul q.x zi and y = fmul q.y zi in
let b = Bytes.of_string (to_bytes_le y 32) in
let last = Char.code (Bytes.get b 31) lor ((bit x 0) lsl 7) in
Bytes.set b 31 (Char.chr last);
Bytes.unsafe_to_string b
(* base point: y = 4/5 mod p, x even (sign 0). *)
let base_point =
let by = fmul (of_int 4) (finv (of_int 5)) in
match decompress (to_bytes_le by 32) with
| Some pt -> pt
| None -> failwith "ed25519: base point decompress failed"
let unhex (h : string) : string =
let n = String.length h / 2 in
let b = Bytes.create n in
for i = 0 to n - 1 do
Bytes.set b i
(Char.chr (int_of_string ("0x" ^ String.sub h (2 * i) 2)))
done;
Bytes.unsafe_to_string b
let sha512_bytes s = unhex (Sx_sha2.sha512_hex s)
(* RFC 8032 §5.1.7 cofactorless: encode([S]B - [k]A) == R. *)
let verify ~pubkey ~msg ~sig_ : bool =
if String.length pubkey <> 32 || String.length sig_ <> 64 then false
else
let rb = String.sub sig_ 0 32 in
let sb = String.sub sig_ 32 32 in
let s = of_bytes_le sb in
if cmp s ell >= 0 then false
else
match decompress pubkey with
| None -> false
| Some a ->
let h = sha512_bytes (rb ^ pubkey ^ msg) in
let k = bn_mod (of_bytes_le h) ell in
let sb_pt = scalar_mul s base_point in
let ka = scalar_mul k a in
let chk = padd sb_pt (pnegate ka) in
(try encode chk = rb with _ -> false)

View File

@@ -3237,6 +3237,21 @@ let () =
with Sys_error msg -> raise (Eval_error ("file-read: " ^ msg))) with Sys_error msg -> raise (Eval_error ("file-read: " ^ msg)))
| _ -> raise (Eval_error "file-read: (path)")); | _ -> raise (Eval_error "file-read: (path)"));
(* fed-sx Step 3 segment replay. Sorted names, no "."/".." ;
errors prefixed like file-read (msg carries enoent/enotdir). *)
register "file-list-dir" (fun args ->
match args with
| [String path] ->
(try
let names = Sys.readdir path in
let names =
Array.to_list names
|> List.filter (fun n -> n <> "." && n <> "..") in
let names = List.sort compare names in
List (List.map (fun n -> String n) names)
with Sys_error msg -> raise (Eval_error ("file-list-dir: " ^ msg)))
| _ -> raise (Eval_error "file-list-dir: (path)"));
register "file-write" (fun args -> register "file-write" (fun args ->
match args with match args with
| [String path; String content] -> | [String path; String content] ->
@@ -4158,4 +4173,61 @@ let () =
Sx_types.jit_skipped_count := 0; Sx_types.jit_skipped_count := 0;
Sx_types.jit_threshold_skipped_count := 0; Sx_types.jit_threshold_skipped_count := 0;
Sx_types.jit_evicted_count := 0; Sx_types.jit_evicted_count := 0;
Nil) Nil);
(* fed-sx host primitives — pure-OCaml crypto (WASM-safe). *)
register "crypto-sha256" (fun args ->
match args with
| [String s] -> String (Sx_sha2.sha256_hex s)
| _ -> raise (Eval_error "crypto-sha256: (bytes)"));
register "crypto-sha512" (fun args ->
match args with
| [String s] -> String (Sx_sha2.sha512_hex s)
| _ -> raise (Eval_error "crypto-sha512: (bytes)"));
register "crypto-sha3-256" (fun args ->
match args with
| [String s] -> String (Sx_sha3.sha3_256_hex s)
| _ -> raise (Eval_error "crypto-sha3-256: (bytes)"));
register "cbor-encode" (fun args ->
match args with
| [v] ->
(try String (Sx_cbor.encode v)
with Sx_cbor.Cbor_error m -> raise (Eval_error m))
| _ -> raise (Eval_error "cbor-encode: (value)"));
register "cbor-decode" (fun args ->
match args with
| [String s] ->
(try Sx_cbor.decode s
with Sx_cbor.Cbor_error m -> raise (Eval_error m))
| _ -> raise (Eval_error "cbor-decode: (bytes)"));
register "cid-from-bytes" (fun args ->
match args with
| [Integer codec; String mh] ->
String (Sx_cid.cidv1 codec mh)
| _ -> raise (Eval_error "cid-from-bytes: (codec multihash-bytes)"));
register "cid-from-sx" (fun args ->
match args with
| [v] ->
(try String (Sx_cid.cid_from_sx v)
with Sx_cbor.Cbor_error m -> raise (Eval_error m))
| _ -> raise (Eval_error "cid-from-sx: (value)"));
(* Verify is total: any malformed input -> false, never raises. *)
register "ed25519-verify" (fun args ->
match args with
| [String pk; String msg; String sg] ->
Bool (try Sx_ed25519.verify ~pubkey:pk ~msg ~sig_:sg
with _ -> false)
| _ -> Bool false);
register "rsa-sha256-verify" (fun args ->
match args with
| [String spki; String msg; String sg] ->
Bool (try Sx_rsa.verify ~spki ~msg ~sig_:sg with _ -> false)
| _ -> Bool false)

220
hosts/ocaml/lib/sx_rsa.ml Normal file
View File

@@ -0,0 +1,220 @@
(** RSASSA-PKCS1-v1_5 verification with SHA-256 — pure OCaml,
WASM-safe. Self-contained minimal bignum (modexp only), a tiny
DER reader for SubjectPublicKeyInfo, and the fixed SHA-256
DigestInfo prefix. Verify only on public data — constant time
not required. Reference: RFC 8017 §8.2.2, §9.2. No deps. *)
(* ---- Minimal unsigned bignum: int array, little-endian, base 2^26 ---- *)
let bits = 26
let base = 1 lsl bits
let mask = base - 1
type bn = int array
let norm a =
let n = ref (Array.length a) in
while !n > 1 && a.(!n - 1) = 0 do decr n done;
if !n = Array.length a then a else Array.sub a 0 !n
let bzero : bn = [| 0 |]
let is_zero a = Array.length a = 1 && a.(0) = 0
let cmp a b =
let a = norm a and b = norm b in
let la = Array.length a and lb = Array.length b in
if la <> lb then compare la lb
else begin
let r = ref 0 and i = ref (la - 1) in
while !r = 0 && !i >= 0 do
if a.(!i) <> b.(!i) then r := compare a.(!i) b.(!i);
decr i
done; !r
end
let add a b =
let la = Array.length a and lb = Array.length b in
let n = (max la lb) + 1 in
let r = Array.make n 0 and carry = ref 0 in
for i = 0 to n - 1 do
let s = !carry + (if i < la then a.(i) else 0)
+ (if i < lb then b.(i) else 0) in
r.(i) <- s land mask; carry := s lsr bits
done;
norm r
let sub a b = (* requires a >= b *)
let la = Array.length a and lb = Array.length b in
let r = Array.make la 0 and borrow = ref 0 in
for i = 0 to la - 1 do
let s = a.(i) - !borrow - (if i < lb then b.(i) else 0) in
if s < 0 then (r.(i) <- s + base; borrow := 1)
else (r.(i) <- s; borrow := 0)
done;
norm r
let mul a b =
let la = Array.length a and lb = Array.length b in
let r = Array.make (la + lb) 0 in
for i = 0 to la - 1 do
let carry = ref 0 in
for j = 0 to lb - 1 do
let s = r.(i + j) + a.(i) * b.(j) + !carry in
r.(i + j) <- s land mask; carry := s lsr bits
done;
r.(i + lb) <- r.(i + lb) + !carry
done;
norm r
let numbits a =
let a = norm a in
let hi = Array.length a - 1 in
if hi = 0 && a.(0) = 0 then 0
else begin
let b = ref 0 and v = ref a.(hi) in
while !v > 0 do incr b; v := !v lsr 1 done;
hi * bits + !b
end
let bit a i =
let limb = i / bits and off = i mod bits in
if limb >= Array.length a then 0 else (a.(limb) lsr off) land 1
let bn_mod a m = (* binary long division, m > 0 *)
if cmp a m < 0 then norm a
else begin
let r = ref bzero in
for i = numbits a - 1 downto 0 do
r := add !r !r;
if bit a i = 1 then r := add !r [| 1 |];
if cmp !r m >= 0 then r := sub !r m
done;
!r
end
let powmod b0 e m =
let result = ref [| 1 |] and b = ref (bn_mod b0 m) in
for i = 0 to numbits e - 1 do
if bit e i = 1 then result := bn_mod (mul !result !b) m;
b := bn_mod (mul !b !b) m
done;
!result
let of_bytes_be (s : string) : bn =
let acc = ref bzero in
for i = 0 to String.length s - 1 do
acc := add (mul !acc [| 256 |]) [| Char.code s.[i] |]
done;
!acc
let div_small a d =
let la = Array.length a in
let q = Array.make la 0 and rem = ref 0 in
for i = la - 1 downto 0 do
let cur = (!rem lsl bits) lor a.(i) in
q.(i) <- cur / d; rem := cur mod d
done;
norm q
let to_bytes_be (a : bn) (n : int) : string =
let b = Bytes.make n '\000' in
let cur = ref (norm a) in
for i = n - 1 downto 0 do
let q = div_small !cur 256 in
let r =
let d = sub !cur (mul q [| 256 |]) in
if is_zero d then 0 else d.(0)
in
Bytes.set b i (Char.chr r);
cur := q
done;
Bytes.unsafe_to_string b
(* ---- Minimal DER reader (for SubjectPublicKeyInfo) ---- *)
exception Der of string
(* Returns (tag, content_start, content_len, next). *)
let der_tlv s pos =
if pos + 2 > String.length s then raise (Der "short");
let tag = Char.code s.[pos] in
let l0 = Char.code s.[pos + 1] in
let len, hdr =
if l0 < 0x80 then l0, 2
else begin
let nb = l0 land 0x7f in
if pos + 2 + nb > String.length s then raise (Der "short len");
let v = ref 0 in
for i = 0 to nb - 1 do
v := (!v lsl 8) lor Char.code s.[pos + 2 + i]
done;
!v, 2 + nb
end
in
(tag, pos + hdr, len, pos + hdr + len)
(* SPKI DER -> (n, e) as bignums. *)
let parse_spki (der : string) : bn * bn =
let tag, c, _l, _ = der_tlv der 0 in
if tag <> 0x30 then raise (Der "spki: outer not SEQUENCE");
(* AlgorithmIdentifier SEQUENCE — skip. *)
let _, _, _, after_alg = der_tlv der c in
(* BIT STRING. *)
let bt, bc, bl, _ = der_tlv der after_alg in
if bt <> 0x03 then raise (Der "spki: expected BIT STRING");
(* First content byte = unused bits (must be 0). *)
let rpk_start = bc + 1 in
ignore bl;
let st, sc, _, _ = der_tlv der rpk_start in
if st <> 0x30 then raise (Der "spki: RSAPublicKey not SEQUENCE");
let nt, nc, nl, after_n = der_tlv der sc in
if nt <> 0x02 then raise (Der "spki: modulus not INTEGER");
let et, ec, el, _ = der_tlv der after_n in
if et <> 0x02 then raise (Der "spki: exponent not INTEGER");
let n = of_bytes_be (String.sub der nc nl) in
let e = of_bytes_be (String.sub der ec el) in
(n, e)
(* SHA-256 DigestInfo DER prefix (RFC 8017 §9.2 note 1). *)
let sha256_digestinfo_prefix =
"\x30\x31\x30\x0d\x06\x09\x60\x86\x48\x01\x65\x03\x04\x02\x01\x05\x00\x04\x20"
let unhex h =
let n = String.length h / 2 in
let b = Bytes.create n in
for i = 0 to n - 1 do
Bytes.set b i (Char.chr (int_of_string ("0x" ^ String.sub h (2 * i) 2)))
done;
Bytes.unsafe_to_string b
(* RSASSA-PKCS1-v1_5 verify with SHA-256. Total: any malformed
input yields false (caller wraps, but be defensive here too). *)
let verify ~spki ~msg ~sig_ : bool =
try
let n, e = parse_spki spki in
let k = (numbits n + 7) / 8 in
if String.length sig_ <> k then false
else begin
let s = of_bytes_be sig_ in
if cmp s n >= 0 then false
else begin
let m = powmod s e n in
let em = to_bytes_be m k in
(* EM = 0x00 01 FF..FF 00 || DigestInfo || H *)
let h = unhex (Sx_sha2.sha256_hex msg) in
let t = sha256_digestinfo_prefix ^ h in
let tlen = String.length t in
if k < tlen + 11 then false
else begin
let ok = ref (em.[0] = '\x00' && em.[1] = '\x01') in
let ps_end = k - tlen - 1 in
for i = 2 to ps_end - 1 do
if em.[i] <> '\xff' then ok := false
done;
if em.[ps_end] <> '\x00' then ok := false;
if String.sub em (ps_end + 1) tlen <> t then ok := false;
!ok
end
end
end
with _ -> false

212
hosts/ocaml/lib/sx_sha2.ml Normal file
View File

@@ -0,0 +1,212 @@
(** SHA-2 (SHA-256, SHA-512) — pure OCaml, WASM-safe.
No C stubs, no external deps. Used by the fed-sx host primitives
[crypto-sha256] / [crypto-sha512]. Reference: FIPS 180-4. *)
(* ---- SHA-256 (FIPS 180-4 §6.2). 32-bit words held in native int,
masked to 32 bits after every arithmetic op. ---- *)
let mask32 = 0xFFFFFFFF
let k256 = [|
0x428a2f98; 0x71374491; 0xb5c0fbcf; 0xe9b5dba5;
0x3956c25b; 0x59f111f1; 0x923f82a4; 0xab1c5ed5;
0xd807aa98; 0x12835b01; 0x243185be; 0x550c7dc3;
0x72be5d74; 0x80deb1fe; 0x9bdc06a7; 0xc19bf174;
0xe49b69c1; 0xefbe4786; 0x0fc19dc6; 0x240ca1cc;
0x2de92c6f; 0x4a7484aa; 0x5cb0a9dc; 0x76f988da;
0x983e5152; 0xa831c66d; 0xb00327c8; 0xbf597fc7;
0xc6e00bf3; 0xd5a79147; 0x06ca6351; 0x14292967;
0x27b70a85; 0x2e1b2138; 0x4d2c6dfc; 0x53380d13;
0x650a7354; 0x766a0abb; 0x81c2c92e; 0x92722c85;
0xa2bfe8a1; 0xa81a664b; 0xc24b8b70; 0xc76c51a3;
0xd192e819; 0xd6990624; 0xf40e3585; 0x106aa070;
0x19a4c116; 0x1e376c08; 0x2748774c; 0x34b0bcb5;
0x391c0cb3; 0x4ed8aa4a; 0x5b9cca4f; 0x682e6ff3;
0x748f82ee; 0x78a5636f; 0x84c87814; 0x8cc70208;
0x90befffa; 0xa4506ceb; 0xbef9a3f7; 0xc67178f2 |]
let rotr32 x n = ((x lsr n) lor (x lsl (32 - n))) land mask32
let sha256_hex (msg : string) : string =
let h = [| 0x6a09e667; 0xbb67ae85; 0x3c6ef372; 0xa54ff53a;
0x510e527f; 0x9b05688c; 0x1f83d9ab; 0x5be0cd19 |] in
let len = String.length msg in
(* Padded length: multiple of 64 bytes. *)
let bitlen = len * 8 in
let padlen =
let r = (len + 1) mod 64 in
if r <= 56 then 56 - r else 120 - r
in
let total = len + 1 + padlen + 8 in
let buf = Bytes.make total '\000' in
Bytes.blit_string msg 0 buf 0 len;
Bytes.set buf len '\x80';
(* 64-bit big-endian bit length (we cap at OCaml int range). *)
for i = 0 to 7 do
Bytes.set buf (total - 1 - i)
(Char.chr ((bitlen lsr (8 * i)) land 0xFF))
done;
let w = Array.make 64 0 in
let nblocks = total / 64 in
for b = 0 to nblocks - 1 do
let base = b * 64 in
for t = 0 to 15 do
let o = base + t * 4 in
w.(t) <-
(Char.code (Bytes.get buf o) lsl 24)
lor (Char.code (Bytes.get buf (o + 1)) lsl 16)
lor (Char.code (Bytes.get buf (o + 2)) lsl 8)
lor (Char.code (Bytes.get buf (o + 3)))
done;
for t = 16 to 63 do
let s0 =
(rotr32 w.(t - 15) 7) lxor (rotr32 w.(t - 15) 18)
lxor (w.(t - 15) lsr 3) in
let s1 =
(rotr32 w.(t - 2) 17) lxor (rotr32 w.(t - 2) 19)
lxor (w.(t - 2) lsr 10) in
w.(t) <- (w.(t - 16) + s0 + w.(t - 7) + s1) land mask32
done;
let a = ref h.(0) and bb = ref h.(1) and c = ref h.(2)
and d = ref h.(3) and e = ref h.(4) and f = ref h.(5)
and g = ref h.(6) and hh = ref h.(7) in
for t = 0 to 63 do
let s1 =
(rotr32 !e 6) lxor (rotr32 !e 11) lxor (rotr32 !e 25) in
let ch = (!e land !f) lxor ((lnot !e land mask32) land !g) in
let t1 = (!hh + s1 + ch + k256.(t) + w.(t)) land mask32 in
let s0 =
(rotr32 !a 2) lxor (rotr32 !a 13) lxor (rotr32 !a 22) in
let maj = (!a land !bb) lxor (!a land !c) lxor (!bb land !c) in
let t2 = (s0 + maj) land mask32 in
hh := !g; g := !f; f := !e;
e := (!d + t1) land mask32;
d := !c; c := !bb; bb := !a;
a := (t1 + t2) land mask32
done;
h.(0) <- (h.(0) + !a) land mask32;
h.(1) <- (h.(1) + !bb) land mask32;
h.(2) <- (h.(2) + !c) land mask32;
h.(3) <- (h.(3) + !d) land mask32;
h.(4) <- (h.(4) + !e) land mask32;
h.(5) <- (h.(5) + !f) land mask32;
h.(6) <- (h.(6) + !g) land mask32;
h.(7) <- (h.(7) + !hh) land mask32
done;
let out = Buffer.create 64 in
Array.iter (fun x -> Buffer.add_string out (Printf.sprintf "%08x" x)) h;
Buffer.contents out
(* ---- SHA-512 (FIPS 180-4 §6.4). 64-bit words via Int64.
128-bit length append; we only support messages whose bit length
fits in 64 bits (high word is always zero). ---- *)
let k512 = [|
0x428a2f98d728ae22L; 0x7137449123ef65cdL; 0xb5c0fbcfec4d3b2fL;
0xe9b5dba58189dbbcL; 0x3956c25bf348b538L; 0x59f111f1b605d019L;
0x923f82a4af194f9bL; 0xab1c5ed5da6d8118L; 0xd807aa98a3030242L;
0x12835b0145706fbeL; 0x243185be4ee4b28cL; 0x550c7dc3d5ffb4e2L;
0x72be5d74f27b896fL; 0x80deb1fe3b1696b1L; 0x9bdc06a725c71235L;
0xc19bf174cf692694L; 0xe49b69c19ef14ad2L; 0xefbe4786384f25e3L;
0x0fc19dc68b8cd5b5L; 0x240ca1cc77ac9c65L; 0x2de92c6f592b0275L;
0x4a7484aa6ea6e483L; 0x5cb0a9dcbd41fbd4L; 0x76f988da831153b5L;
0x983e5152ee66dfabL; 0xa831c66d2db43210L; 0xb00327c898fb213fL;
0xbf597fc7beef0ee4L; 0xc6e00bf33da88fc2L; 0xd5a79147930aa725L;
0x06ca6351e003826fL; 0x142929670a0e6e70L; 0x27b70a8546d22ffcL;
0x2e1b21385c26c926L; 0x4d2c6dfc5ac42aedL; 0x53380d139d95b3dfL;
0x650a73548baf63deL; 0x766a0abb3c77b2a8L; 0x81c2c92e47edaee6L;
0x92722c851482353bL; 0xa2bfe8a14cf10364L; 0xa81a664bbc423001L;
0xc24b8b70d0f89791L; 0xc76c51a30654be30L; 0xd192e819d6ef5218L;
0xd69906245565a910L; 0xf40e35855771202aL; 0x106aa07032bbd1b8L;
0x19a4c116b8d2d0c8L; 0x1e376c085141ab53L; 0x2748774cdf8eeb99L;
0x34b0bcb5e19b48a8L; 0x391c0cb3c5c95a63L; 0x4ed8aa4ae3418acbL;
0x5b9cca4f7763e373L; 0x682e6ff3d6b2b8a3L; 0x748f82ee5defb2fcL;
0x78a5636f43172f60L; 0x84c87814a1f0ab72L; 0x8cc702081a6439ecL;
0x90befffa23631e28L; 0xa4506cebde82bde9L; 0xbef9a3f7b2c67915L;
0xc67178f2e372532bL; 0xca273eceea26619cL; 0xd186b8c721c0c207L;
0xeada7dd6cde0eb1eL; 0xf57d4f7fee6ed178L; 0x06f067aa72176fbaL;
0x0a637dc5a2c898a6L; 0x113f9804bef90daeL; 0x1b710b35131c471bL;
0x28db77f523047d84L; 0x32caab7b40c72493L; 0x3c9ebe0a15c9bebcL;
0x431d67c49c100d4cL; 0x4cc5d4becb3e42b6L; 0x597f299cfc657e2aL;
0x5fcb6fab3ad6faecL; 0x6c44198c4a475817L |]
let ( &: ) = Int64.logand
let ( |: ) = Int64.logor
let ( ^: ) = Int64.logxor
let ( +: ) = Int64.add
let lnot64 = Int64.lognot
let rotr64 x n =
(Int64.shift_right_logical x n) |: (Int64.shift_left x (64 - n))
let sha512_hex (msg : string) : string =
let h = [| 0x6a09e667f3bcc908L; 0xbb67ae8584caa73bL;
0x3c6ef372fe94f82bL; 0xa54ff53a5f1d36f1L;
0x510e527fade682d1L; 0x9b05688c2b3e6c1fL;
0x1f83d9abfb41bd6bL; 0x5be0cd19137e2179L |] in
let len = String.length msg in
let bitlen = len * 8 in
(* Pad to a multiple of 128 bytes; 16-byte big-endian length. *)
let padlen =
let r = (len + 1) mod 128 in
if r <= 112 then 112 - r else 240 - r
in
let total = len + 1 + padlen + 16 in
let buf = Bytes.make total '\000' in
Bytes.blit_string msg 0 buf 0 len;
Bytes.set buf len '\x80';
for i = 0 to 7 do
Bytes.set buf (total - 1 - i)
(Char.chr ((bitlen lsr (8 * i)) land 0xFF))
done;
let w = Array.make 80 0L in
let nblocks = total / 128 in
for b = 0 to nblocks - 1 do
let base = b * 128 in
for t = 0 to 15 do
let o = base + t * 8 in
let v = ref 0L in
for j = 0 to 7 do
v := Int64.logor (Int64.shift_left !v 8)
(Int64.of_int (Char.code (Bytes.get buf (o + j))))
done;
w.(t) <- !v
done;
for t = 16 to 79 do
let s0 =
(rotr64 w.(t - 15) 1) ^: (rotr64 w.(t - 15) 8)
^: (Int64.shift_right_logical w.(t - 15) 7) in
let s1 =
(rotr64 w.(t - 2) 19) ^: (rotr64 w.(t - 2) 61)
^: (Int64.shift_right_logical w.(t - 2) 6) in
w.(t) <- w.(t - 16) +: s0 +: w.(t - 7) +: s1
done;
let a = ref h.(0) and bb = ref h.(1) and c = ref h.(2)
and d = ref h.(3) and e = ref h.(4) and f = ref h.(5)
and g = ref h.(6) and hh = ref h.(7) in
for t = 0 to 79 do
let s1 = (rotr64 !e 14) ^: (rotr64 !e 18) ^: (rotr64 !e 41) in
let ch = (!e &: !f) ^: ((lnot64 !e) &: !g) in
let t1 = !hh +: s1 +: ch +: k512.(t) +: w.(t) in
let s0 = (rotr64 !a 28) ^: (rotr64 !a 34) ^: (rotr64 !a 39) in
let maj = (!a &: !bb) ^: (!a &: !c) ^: (!bb &: !c) in
let t2 = s0 +: maj in
hh := !g; g := !f; f := !e;
e := !d +: t1;
d := !c; c := !bb; bb := !a;
a := t1 +: t2
done;
h.(0) <- h.(0) +: !a;
h.(1) <- h.(1) +: !bb;
h.(2) <- h.(2) +: !c;
h.(3) <- h.(3) +: !d;
h.(4) <- h.(4) +: !e;
h.(5) <- h.(5) +: !f;
h.(6) <- h.(6) +: !g;
h.(7) <- h.(7) +: !hh
done;
let out = Buffer.create 128 in
Array.iter
(fun x -> Buffer.add_string out (Printf.sprintf "%016Lx" x)) h;
Buffer.contents out

107
hosts/ocaml/lib/sx_sha3.ml Normal file
View File

@@ -0,0 +1,107 @@
(** SHA-3 (SHA3-256) — pure OCaml, WASM-safe.
Keccak-f[1600] permutation + SHA-3 multi-rate padding (domain byte
0x06, NOT the legacy Keccak 0x01). Reference: FIPS 202. No deps. *)
let ( ^: ) = Int64.logxor
let ( &: ) = Int64.logand
let lnot64 = Int64.lognot
let rotl64 x n =
if n = 0 then x
else
Int64.logor (Int64.shift_left x n) (Int64.shift_right_logical x (64 - n))
(* FIPS 202 Table 2 — ρ rotation offsets, indexed lane = x + 5*y. *)
let rho = [|
0; 1; 62; 28; 27;
36; 44; 6; 55; 20;
3; 10; 43; 25; 39;
41; 45; 15; 21; 8;
18; 2; 61; 56; 14 |]
(* FIPS 202 §3.2.5 — round constants RC[0..23] for ι. *)
let rc = [|
0x0000000000000001L; 0x0000000000008082L; 0x800000000000808aL;
0x8000000080008000L; 0x000000000000808bL; 0x0000000080000001L;
0x8000000080008081L; 0x8000000000008009L; 0x000000000000008aL;
0x0000000000000088L; 0x0000000080008009L; 0x000000008000000aL;
0x000000008000808bL; 0x800000000000008bL; 0x8000000000008089L;
0x8000000000008003L; 0x8000000000008002L; 0x8000000000000080L;
0x000000000000800aL; 0x800000008000000aL; 0x8000000080008081L;
0x8000000000008080L; 0x0000000080000001L; 0x8000000080008008L |]
let keccak_f (a : int64 array) : unit =
let c = Array.make 5 0L and d = Array.make 5 0L in
let b = Array.make 25 0L in
for round = 0 to 23 do
(* θ *)
for x = 0 to 4 do
c.(x) <- a.(x) ^: a.(x + 5) ^: a.(x + 10)
^: a.(x + 15) ^: a.(x + 20)
done;
for x = 0 to 4 do
d.(x) <- c.((x + 4) mod 5) ^: (rotl64 c.((x + 1) mod 5) 1)
done;
for x = 0 to 4 do
for y = 0 to 4 do
a.(x + 5 * y) <- a.(x + 5 * y) ^: d.(x)
done
done;
(* ρ and π: B[y, 2x+3y] = rotl(A[x,y], rho[x,y]) *)
for x = 0 to 4 do
for y = 0 to 4 do
let nx = y and ny = (2 * x + 3 * y) mod 5 in
b.(nx + 5 * ny) <- rotl64 a.(x + 5 * y) rho.(x + 5 * y)
done
done;
(* χ *)
for y = 0 to 4 do
for x = 0 to 4 do
a.(x + 5 * y) <-
b.(x + 5 * y)
^: ((lnot64 b.((x + 1) mod 5 + 5 * y))
&: b.((x + 2) mod 5 + 5 * y))
done
done;
(* ι *)
a.(0) <- a.(0) ^: rc.(round)
done
let sha3_256_hex (msg : string) : string =
let rate = 136 (* bytes: (1600 - 2*256) / 8 *) in
let len = String.length msg in
(* pad10*1 with SHA-3 domain byte 0x06; last byte ORed with 0x80. *)
let q = rate - (len mod rate) in
let padded = Bytes.make (len + q) '\000' in
Bytes.blit_string msg 0 padded 0 len;
if q = 1 then
Bytes.set padded len '\x86'
else begin
Bytes.set padded len '\x06';
Bytes.set padded (len + q - 1) '\x80'
end;
let total = Bytes.length padded in
let a = Array.make 25 0L in
let nblocks = total / rate in
for blk = 0 to nblocks - 1 do
let base = blk * rate in
(* Absorb: XOR rate bytes into the state, little-endian lanes. *)
for j = 0 to rate - 1 do
let lane = j / 8 and sh = (j mod 8) * 8 in
let byte = Int64.of_int (Char.code (Bytes.get padded (base + j))) in
a.(lane) <- a.(lane) ^: (Int64.shift_left byte sh)
done;
keccak_f a
done;
(* Squeeze 32 bytes (fits in the first 4 lanes; rate > 32). *)
let out = Buffer.create 64 in
for j = 0 to 31 do
let lane = j / 8 and sh = (j mod 8) * 8 in
let byte =
Int64.to_int
(Int64.logand (Int64.shift_right_logical a.(lane) sh) 0xFFL)
in
Buffer.add_string out (Printf.sprintf "%02x" byte)
done;
Buffer.contents out

View File

@@ -44,6 +44,11 @@ type vm = {
ip past OP_PERFORM, stack ready for a result push). *) ip past OP_PERFORM, stack ready for a result push). *)
exception VmSuspended of value * vm exception VmSuspended of value * vm
(** Raised by the extension dispatch fallthrough when an opcode in the
extension range (≥ 200) is encountered with no handler registered.
Carries the offending opcode id. See plans/sx-vm-opcode-extension.md. *)
exception Invalid_opcode of int
(* Register the VM suspension converter so sx_runtime.sx_apply_cek can (* Register the VM suspension converter so sx_runtime.sx_apply_cek can
catch VmSuspended and convert it to CekPerformRequest without a catch VmSuspended and convert it to CekPerformRequest without a
direct dependency on this module. *) direct dependency on this module. *)
@@ -57,6 +62,21 @@ let () = Sx_types._convert_vm_suspension := (fun exn ->
let jit_compile_ref : (lambda -> (string, value) Hashtbl.t -> vm_closure option) ref = let jit_compile_ref : (lambda -> (string, value) Hashtbl.t -> vm_closure option) ref =
ref (fun _ _ -> None) ref (fun _ _ -> None)
(** Forward reference for extension opcode dispatch — Phase B installs the
real registry's dispatch function here at module init. Until then, any
opcode in the extension range raises [Invalid_opcode]. Same forward-ref
pattern as [jit_compile_ref] above; keeps [Sx_vm_extensions] free to
depend on [Sx_vm]'s [vm] / [frame] types without a cycle. *)
let extension_dispatch_ref : (int -> vm -> frame -> unit) ref =
ref (fun op _vm _frame -> raise (Invalid_opcode op))
(** Forward reference for extension opcode → name lookup, used by
[opcode_name] / [disassemble] for human-readable disassembly. The
registry installs a real lookup at module init; default returns
[None] (then [opcode_name] falls back to "UNKNOWN_n"). *)
let extension_opcode_name_ref : (int -> string option) ref =
ref (fun _ -> None)
(* JIT threshold and counters live in Sx_types so primitives can read them (* JIT threshold and counters live in Sx_types so primitives can read them
without creating a sx_primitives → sx_vm dependency cycle. *) without creating a sx_primitives → sx_vm dependency cycle. *)
@@ -875,6 +895,15 @@ and run vm =
let request = pop vm in let request = pop vm in
raise (VmSuspended (request, vm)) raise (VmSuspended (request, vm))
(* ---- Extension dispatch fallthrough ----
Opcode partition (see plans/sx-vm-opcode-extension.md):
0 reserved / NOP
1-199 core opcodes (current ceiling 175 = OP_DEC)
200-247 extension opcodes (registered via Sx_vm_extensions)
248-255 reserved for future expansion / multi-byte
Any opcode ≥ 200 routes through the extension registry. *)
| op when op >= 200 -> !extension_dispatch_ref op vm frame
| opcode -> | opcode ->
raise (Eval_error (Printf.sprintf "VM: unknown opcode %d at ip=%d" raise (Eval_error (Printf.sprintf "VM: unknown opcode %d at ip=%d"
opcode (frame.ip - 1))) opcode (frame.ip - 1)))
@@ -1027,6 +1056,62 @@ let _jit_is_broken_name n =
|| n = "hs-repeat-while" || n = "hs-repeat-until" || n = "hs-repeat-while" || n = "hs-repeat-until"
|| n = "hs-for-each" || n = "hs-put!" || n = "hs-for-each" || n = "hs-put!"
(** Scan bytecode for any extension opcode (≥ 200, the registry's
[Sx_vm_extensions.extension_min]). Walks operand bytes correctly
so values that happen to be ≥200 (e.g. a CONST u16 index pointing
into a large pool) do not trigger false positives. CLOSURE's
dynamic upvalue descriptors are read from the constant pool entry
at the same index it pushes.
Used by [jit_compile_lambda] (Phase E of the opcode-extension
plan): a lambda whose compiled body contains any extension opcode
is routed through interpretation rather than JIT. Extensions
interpret their opcodes via the registry; the JIT does not
currently know how to compile them.
Operand-size logic mirrors [opcode_operand_size] (which is defined
later, in the disassembly section); inlined here so this helper can
sit before [jit_compile_lambda] in the file. *)
let bytecode_uses_extension_opcodes (bc : int array) (consts : value array) =
let core_operand_size = function
| 1 | 20 | 21 | 64 | 65 | 128 -> 2 (* u16 *)
| 16 | 17 | 18 | 19 | 48 | 49 | 144 -> 1 (* u8 *)
| 32 | 33 | 34 | 35 -> 2 (* i16 *)
| 52 -> 3 (* CALL_PRIM: u16 + u8 *)
| _ -> 0
in
let len = Array.length bc in
let ip = ref 0 in
let found = ref false in
while not !found && !ip < len do
let op = bc.(!ip) in
if op >= 200 then found := true
else begin
ip := !ip + 1;
let extra = match op with
| 51 (* CLOSURE *) when !ip + 1 < len ->
let lo = bc.(!ip) in
let hi = bc.(!ip + 1) in
let idx = lo lor (hi lsl 8) in
let uv_count =
if idx < Array.length consts then
(match consts.(idx) with
| Dict d ->
(match Hashtbl.find_opt d "upvalue-count" with
| Some (Integer n) -> n
| Some (Number n) -> int_of_float n
| _ -> 0)
| _ -> 0)
else 0
in
2 + uv_count * 2
| _ -> core_operand_size op
in
ip := !ip + extra
end
done;
!found
let jit_compile_lambda (l : lambda) globals = let jit_compile_lambda (l : lambda) globals =
let fn_name = match l.l_name with Some n -> n | None -> "<anon>" in let fn_name = match l.l_name with Some n -> n | None -> "<anon>" in
if !_jit_compiling then ( if !_jit_compiling then (
@@ -1089,6 +1174,16 @@ let jit_compile_lambda (l : lambda) globals =
if idx < Array.length outer_code.vc_constants then if idx < Array.length outer_code.vc_constants then
let inner_val = outer_code.vc_constants.(idx) in let inner_val = outer_code.vc_constants.(idx) in
let code = code_from_value inner_val in let code = code_from_value inner_val in
(* Phase E: if the inner lambda's bytecode contains any
extension opcode (≥200), skip JIT and let the lambda run
interpreted via CEK. Extension opcodes dispatch correctly
through the VM's registry fallthrough, but the JIT has no
knowledge of them and shouldn't claim ownership. *)
if bytecode_uses_extension_opcodes code.vc_bytecode code.vc_constants then begin
Printf.eprintf "[jit] SKIP %s: bytecode uses extension opcodes (interpret-only in v1)\n%!"
fn_name;
None
end else
Some { vm_code = code; vm_upvalues = [||]; Some { vm_code = code; vm_upvalues = [||];
vm_name = l.l_name; vm_env_ref = effective_globals; vm_closure_env = Some l.l_closure } vm_name = l.l_name; vm_env_ref = effective_globals; vm_closure_env = Some l.l_closure }
else begin else begin
@@ -1200,7 +1295,12 @@ let opcode_name = function
| 164 -> "EQ" | 165 -> "LT" | 166 -> "GT" | 167 -> "NOT" | 164 -> "EQ" | 165 -> "LT" | 166 -> "GT" | 167 -> "NOT"
| 168 -> "LEN" | 169 -> "FIRST" | 170 -> "REST" | 171 -> "NTH" | 168 -> "LEN" | 169 -> "FIRST" | 170 -> "REST" | 171 -> "NTH"
| 172 -> "CONS" | 173 -> "NEG" | 174 -> "INC" | 175 -> "DEC" | 172 -> "CONS" | 173 -> "NEG" | 174 -> "INC" | 175 -> "DEC"
| n -> Printf.sprintf "UNKNOWN_%d" n | n ->
(* Extension opcodes (≥200) get their human-readable name from the
registry; defaults to UNKNOWN_n if the extension isn't loaded. *)
(match !extension_opcode_name_ref n with
| Some name -> name
| None -> Printf.sprintf "UNKNOWN_%d" n)
(** Number of extra operand bytes consumed by each opcode. (** Number of extra operand bytes consumed by each opcode.
Returns (format, total_bytes) where format describes the operand types. *) Returns (format, total_bytes) where format describes the operand types. *)

View File

@@ -0,0 +1,48 @@
(** {1 VM extension interface}
Type definitions for VM bytecode extensions. See
[plans/sx-vm-opcode-extension.md].
An extension is a first-class module of type [EXTENSION]: it has a
stable [name], an [init] that returns its private state, and an
[opcodes] function that lists the opcodes it provides.
Opcode handlers receive the live [vm] and the active [frame]. They
read operands via [Sx_vm.read_u8] / [read_u16], manipulate the stack
via [push] / [pop] / [peek], and update the frame's [ip] as needed. *)
(** A handler for an extension opcode. Reads operands from bytecode,
manipulates the VM stack, updates the frame's instruction pointer.
May raise exceptions (which propagate via the existing VM error path). *)
type handler = Sx_vm.vm -> Sx_vm.frame -> unit
(** State an extension carries alongside the VM. Opaque to the VM core;
extensions extend this with their own constructor and cast as needed.
Extensible variant — extensions add cases:
{[
type Sx_vm_extension.extension_state +=
| ErlangState of erlang_scheduler
]} *)
type extension_state = ..
(** An extension is a first-class module of this signature. *)
module type EXTENSION = sig
(** Stable name for this extension (e.g. ["erlang"], ["guest_vm"]).
Used as the lookup key in the registry and as the prefix for opcode
names ([erlang.OP_PATTERN_TUPLE_2] etc). *)
val name : string
(** Initialize per-instance state. Called once when [register] is
invoked on this extension. *)
val init : unit -> extension_state
(** Opcodes this extension provides. Each is
[(opcode_id, opcode_name, handler)].
[opcode_id] must be in the range 200-247 (the extension partition;
see the partition comment at the top of [Sx_vm]'s dispatch loop).
Conflicts with already-registered opcodes cause [register] to
fail. *)
val opcodes : extension_state -> (int * string * handler) list
end

View File

@@ -0,0 +1,120 @@
(** {1 VM extension registry}
Holds the live registry of extension opcodes and installs the
[dispatch] function into [Sx_vm.extension_dispatch_ref] at module
init time, replacing Phase A's stub.
See [plans/sx-vm-opcode-extension.md] and [Sx_vm_extension] for the
extension interface. *)
open Sx_vm_extension
(** The opcode range an extension is allowed to claim.
Mirrors the partition comment in [Sx_vm]. *)
let extension_min = 200
let extension_max = 247
(** opcode_id → handler *)
let by_id : (int, handler) Hashtbl.t = Hashtbl.create 64
(** opcode_name → opcode_id *)
let by_name : (string, int) Hashtbl.t = Hashtbl.create 64
(** opcode_id → opcode_name (reverse of [by_name]; used by
[Sx_vm.opcode_name] for disassembly). *)
let name_of_id_table : (int, string) Hashtbl.t = Hashtbl.create 64
(** extension_name → state *)
let states : (string, extension_state) Hashtbl.t = Hashtbl.create 8
(** Registered extension names, newest first. *)
let extensions : string list ref = ref []
(** Dispatch an extension opcode to its registered handler. Raises
[Sx_vm.Invalid_opcode] if no handler is registered for [op]. *)
let dispatch op vm frame =
match Hashtbl.find_opt by_id op with
| Some handler -> handler vm frame
| None -> raise (Sx_vm.Invalid_opcode op)
(** Register an extension. Fails if the extension name is already
registered, or if any opcode_id is outside the extension range or
collides with an already-registered opcode. *)
let register (m : (module EXTENSION)) =
let module M = (val m) in
if Hashtbl.mem states M.name then
failwith (Printf.sprintf
"Sx_vm_extensions: extension %S already registered" M.name);
let st = M.init () in
let ops = M.opcodes st in
List.iter (fun (id, opname, _h) ->
if id < extension_min || id > extension_max then
failwith (Printf.sprintf
"Sx_vm_extensions: opcode %d (%s) outside extension range %d-%d"
id opname extension_min extension_max);
if Hashtbl.mem by_id id then
failwith (Printf.sprintf
"Sx_vm_extensions: opcode %d (%s) already registered" id opname);
if Hashtbl.mem by_name opname then
failwith (Printf.sprintf
"Sx_vm_extensions: opcode name %S already registered" opname)
) ops;
Hashtbl.add states M.name st;
List.iter (fun (id, opname, h) ->
Hashtbl.add by_id id h;
Hashtbl.add by_name opname id;
Hashtbl.add name_of_id_table id opname
) ops;
extensions := M.name :: !extensions
(** Look up the opcode_id for an opcode_name. Returns [None] if no
extension provides that opcode. *)
let id_of_name name = Hashtbl.find_opt by_name name
(** Look up the opcode_name for an opcode_id. Returns [None] if no
extension provides that opcode. Used by disassembly. *)
let name_of_id id = Hashtbl.find_opt name_of_id_table id
(** Look up the state of an extension by name. Returns [None] if the
extension is not registered. *)
let state_of_extension name = Hashtbl.find_opt states name
(** Names of all registered extensions, newest first. *)
let registered_extensions () = !extensions
(** Test-only: clear the registry. Used by unit tests to isolate
extensions between test cases. The dispatch_ref is left in place. *)
let _reset_for_tests () =
Hashtbl.clear by_id;
Hashtbl.clear by_name;
Hashtbl.clear name_of_id_table;
Hashtbl.clear states;
extensions := []
(** Install our [dispatch] into [Sx_vm.extension_dispatch_ref] and our
[name_of_id] into [Sx_vm.extension_opcode_name_ref], replacing
the Phase A stubs. Idempotent. Called automatically at module init. *)
let install_dispatch () =
Sx_vm.extension_dispatch_ref := dispatch;
Sx_vm.extension_opcode_name_ref := name_of_id
let () = install_dispatch ()
(** Compiler-side opcode lookup: register the [extension-opcode-id]
primitive. Compilers ([lib/compiler.sx]) call this to emit
extension opcodes by name. Returns [Integer id] when registered,
[Nil] otherwise — so missing extensions degrade to a fallback
rather than failure. *)
let () =
Sx_primitives.register "extension-opcode-id" (fun args ->
match args with
| [Sx_types.String name] ->
(match id_of_name name with
| Some id -> Sx_types.Integer id
| None -> Sx_types.Nil)
| [Sx_types.Symbol name] ->
(match id_of_name name with
| Some id -> Sx_types.Integer id
| None -> Sx_types.Nil)
| _ -> raise (Sx_types.Eval_error
"extension-opcode-id: expected one string or symbol"))

View File

@@ -16,5 +16,5 @@
{"name":"magic","passed":37,"failed":0,"total":37}, {"name":"magic","passed":37,"failed":0,"total":37},
{"name":"demo","passed":21,"failed":0,"total":21} {"name":"demo","passed":21,"failed":0,"total":21}
], ],
"generated": "2026-05-11T09:40:12+00:00" "generated": "2026-05-14T20:30:05+00:00"
} }

View File

@@ -0,0 +1,109 @@
# fed-prims loop agent (single agent, phase-ordered)
Role: iterates `plans/fed-sx-host-primitives.md` forever. Adds the pure-OCaml
crypto / CBOR / CID / Ed25519 / RSA primitives and the native HTTP server that
Erlang Phase 8 BIFs (and therefore fed-sx Milestone 1) are blocked on. One
feature per commit.
```
description: fed-prims host-primitive loop
subagent_type: general-purpose
run_in_background: true
isolation: worktree
```
## Prompt
You are the sole background agent working `/root/rose-ash/plans/fed-sx-host-primitives.md`.
You run in an isolated git worktree on branch `loops/fed-prims`. You work the
plan's phases in order (A→I), forever, one commit per feature. Push to
`origin/loops/fed-prims` after every commit.
## Restart baseline — check before iterating
1. Read `plans/fed-sx-host-primitives.md` — Phasing + Progress log + Blockers
tell you where you are.
2. `cd hosts/ocaml && dune build bin/sx_server.exe 2>&1 | tail` — must be green
before new work. If broken and not by your last edit, Blockers + stop.
3. `bash hosts/ocaml/browser/test_boot.sh` — the WASM kernel must boot. This is
the regression you are most at risk of causing.
4. Find the first unchecked `[ ]` phase. That is your iteration.
## The iteration
Implement → `dune build bin/sx_server.exe` (native) → **WASM build check**
(`test_boot.sh`) → run the phase's tests → run the no-regression gate
(`conformance.sh`, see plan) → commit → tick the `[ ]` → append one dated line
to the Progress log (newest first) → push → stop.
One phase = one iteration = one commit. Do not batch phases.
## Ground rules (hard)
- **Scope:** only `hosts/ocaml/lib/**`, `hosts/ocaml/bin/**`, and
`plans/fed-sx-host-primitives.md`. The single exception is Phase I, which also
edits exactly one Blockers entry in `plans/erlang-on-sx.md`. Do **not** touch
`lib/erlang/**`, `spec/`, `lib/` root, other `lib/<lang>/`.
- **Pure OCaml for `lib/` primitives.** No new opam deps. WASM-safe: no C stubs,
no `Unix`/`Thread` in `lib/sx_primitives.ml`. The HTTP server (Phase H) is
native-only — register it in `bin/sx_server.ml`, never in the lib.
- **Prove WASM every commit.** `test_boot.sh` green is a phase gate, not
optional. A broken WASM kernel = the phase failed; revert and rethink.
- **No-regression gate:** OCaml `run_tests` + Erlang `conformance.sh` must stay
at their current pass counts (Erlang 715/715 once the merge lands; otherwise
whatever `lib/erlang/scoreboard.json` says). New crypto tests are additive.
- **`.ml`/`.sh` files:** ordinary `Read`/`Edit`/`Write` — these are NOT `.sx`.
Do not use sx-tree MCP for OCaml. (sx-tree is only if you ever touch `.sx`,
which this loop should not.)
- **Builds are slow.** Use a generous `timeout` on `dune build` (≥600s) and on
`conformance.sh` (≥400s). If a build genuinely hangs >10min, Blockers + stop.
- **Worktree:** commit, push `origin/loops/fed-prims`. Never `main`, never
`architecture`.
- **Commit granularity:** one feature per commit. `fed-prims: SHA-256 + 4 NIST
vectors`. Update Progress log + tick box every commit.
- **If blocked** two iterations on the same issue: Blockers entry, move to the
next independent phase (A-G are largely independent; H is independent; only
D depends on A+C, E depends on A).
## Crypto correctness gotchas
- **Test vectors are non-negotiable.** Every hash/sig phase lands with published
vectors (NIST FIPS 180-4 / 202, RFC 8032, RFC 8949). A primitive without a
passing standard vector is not done — do not tick the box.
- **SHA endianness:** SHA-2 is big-endian length-append; SHA-3 is little-endian
Keccak lane order. Easy to get backwards — the empty-string vector catches it.
- **dag-cbor determinism:** map keys sorted by **byte length first, then
bytewise**. Not lexicographic-only. The "reordered dict keys → identical
bytes" test is the guard; it must be in the phase.
- **CIDv1 layout:** `0x01 || codec-varint || (mh-code-varint || mh-len-varint ||
digest)`, then multibase base32-lower with a leading `b`. Off-by-one in varint
is the classic bug — cross-check one CID against `ipfs` CLI if available.
- **Ed25519 verify is total:** wrong-length inputs return `false`, never raise.
Verify checks `[S]B = R + [k]A` with `k = SHA512(R||A||M)` reduced mod L.
- **RSA:** PKCS#1 v1.5 EMSA — the DigestInfo DER prefix for SHA-256 is fixed
(`3031300d060960864801650304020105000420`). Constant-time not required (verify
only, public data).
## General gotchas
- The `sx` library is `(wrapped false)` — new module `Sx_sha2` is referenced as
`Sha2.f` is **wrong**; it's `Sx_sha2.f` unless you also alias. Check
`lib/dune` `include_subdirs unqualified`: a new `lib/sx_sha2.ml` is module
`Sx_sha2`. Match the existing `Sx_*` naming.
- `Eval_error` is the primitive-error exception; raise it with `"name: shape"`.
- Reach a primitive from SX to smoke-test:
`printf '(epoch 1)\n(crypto-sha256 "abc")\n' | hosts/ocaml/_build/default/bin/sx_server.exe`
- The native binary the conformance gate uses is
`hosts/ocaml/_build/default/bin/sx_server.exe` — rebuild it before gating.
## Style
- No comments in OCaml unless non-obvious (crypto constants ARE non-obvious —
cite the RFC/FIPS section in a one-line comment).
- No new planning docs — update `plans/fed-sx-host-primitives.md` inline.
- One feature per iteration. Build. WASM-check. Test. Gate. Commit. Log. Push.
Next.
Go. Run the restart baseline. Find the first unchecked `[ ]`. Implement it.
Remember: no commit without a passing standard test vector AND a green WASM
boot.

View File

@@ -0,0 +1,86 @@
# sx-vm-extensions loop agent
Role: drives `plans/sx-vm-opcode-extension.md` to completion. One phase per
fire (A → B → C → D → E). Bounded loop — after Phase E acceptance, the loop
is done.
```
description: sx-vm-extensions queue loop
subagent_type: general-purpose
run_in_background: true
isolation: worktree (already on loops/sx-vm-extensions)
```
## What this loop is for
Mechanism in `hosts/ocaml/lib/` that lets language ports register specialized
bytecode opcodes without modifying the SX VM core. Direct prerequisite for
**erlang-on-sx Phase 9** (the BEAM analog) and a structural enabler for any
future language port that wants performance-critical opcodes.
## The queue
Per `plans/sx-vm-opcode-extension.md`, in order:
- **Phase A** — Opcode ID partition + dispatch fallthrough in `sx_vm.ml`.
Add `Invalid_opcode of int` exception, `extension_dispatch_ref`, the
`| op when op >= 200 -> !extension_dispatch_ref op vm frame` arm, and a
partition comment near the opcode list.
- **Phase B** — Extension registry module (`sx_vm_extensions.ml`).
`register`, `dispatch`, `id_of_name`, `state_of_extension`. Wire dispatch
into Phase A's ref at module init.
- **Phase C** — Compiler-side opcode lookup primitive (`extension-opcode-id`).
- **Phase D** — Test extension at `hosts/ocaml/lib/extensions/test_ext.ml`,
end-to-end SX → bytecode → VM dispatch flow.
- **Phase E** — JIT awareness: extension opcodes mark a lambda as
interpret-only.
## Per-fire workflow (hard)
1. Read `plans/sx-vm-opcode-extension.md` — find the first un-ticked phase.
2. Implement the phase (only files in `hosts/ocaml/**` and the plan file).
3. Build via `sx_build target=ocaml`.
4. Run regression: every existing language-port conformance suite plus
the OCaml unit tests. The list lives at `lib/<lang>/conformance.sh`
13 suites at last count (apl, common-lisp, datalog, erlang, forth, guest,
haskell, js, lua, ocaml, prolog, smalltalk, tcl).
5. If green, commit (short factual message — `vm-ext: phase A — dispatch
fallthrough` style).
6. Tick the `[ ]` for the completed phase in the plan, append one dated
line to the Progress log (newest first).
7. Stop. Wait for the next fire.
## Ground rules (hard)
- **Scope:** only `hosts/ocaml/**` and `plans/sx-vm-opcode-extension.md`.
Do **not** edit `lib/<lang>/**`, `spec/**`, `shared/**`, or any other
language port's tests.
- **One phase per fire.** Don't combine phases even if a phase looks small.
The point of the loop is incremental commits.
- **Commit locally only.** Do **not** push. Do **not** touch `main`.
- **Worktree:** you are on `loops/sx-vm-extensions` in
`/root/rose-ash-loops/sx-vm-extensions`.
- **OCaml SX VM gotchas:**
- `vm` and `frame` types are defined in `sx_vm.ml`, not `sx_types.ml`.
Forward refs (like the existing `jit_compile_ref` pattern) are how
sibling modules avoid circular dependency.
- Current core opcode ceiling is 175 (OP_DEC). The extension threshold
is 200, leaving 24 spare slots for future core opcodes.
- JIT compilation is lazy per-lambda. See `project_jit_compilation.md`
in memory for the cache + sentinel pattern.
- **SX edits:** `sx-tree` MCP tools only (none expected for this loop, but
if needed).
- **OCaml edits:** Edit/Write tools are fine — these aren't `.sx` files.
## Done condition
Phase E acceptance: all 13 (or however many exist at the time) language-port
conformance suites pass, OCaml unit tests pass, the test extension from
Phase D demonstrates end-to-end flow including JIT routing. Loop is
complete; mark and stop.
## After acceptance
Hand off to the Erlang loop: `hosts/ocaml/lib/extensions/erlang.ml` becomes
the first real consumer, written against this mechanism instead of the
Phase 9b stub dispatcher in `lib/erlang/vm/dispatcher.sx`.

View File

@@ -131,4 +131,18 @@ _Newest first._
## Blockers ## Blockers
- _(none yet)_ - **RESOLVED (2026-05-18) — SX runtime now exposes the platform
primitives Phase 8 BIFs need.** Delivered by `loops/fed-prims`
(see `plans/fed-sx-host-primitives.md` Handoff). Pure-OCaml,
WASM-safe except `http-listen` (native only). Wire Phase 8 BIFs:
- `crypto:hash/2``crypto-sha256` / `crypto-sha512` /
`crypto-sha3-256` (each `(bytes) -> hex-string`).
- `cid:from_bytes/1``cid-from-bytes` `(codec mh-bytes)`;
`cid:to_string/1` / canonical CID → `cid-from-sx` `(value)`;
dag-cbor via `cbor-encode` / `cbor-decode`.
- signature verify → `ed25519-verify` `(pk msg sig)` and
`rsa-sha256-verify` `(spki msg sig)` — both total (→ false).
- `file:list_dir/1``file-list-dir` `(path) -> (list string)`.
- fed-sx transport → `http-listen` `(port handler)` (native only).
Still deferred (leave blocked): `httpc` (HTTP client, v2) and
`sqlite-*` (v2 indexes) — not provided by fed-prims.

View File

@@ -0,0 +1,342 @@
# fed-sx host primitives — `hosts/ocaml/`
The single blocker between Erlang Phase 8 (FFI mechanism — done) and starting
fed-sx Milestone 1: the SX OCaml runtime exposes no crypto / CID / HTTP host
primitives for the Phase 8 BIF wrappers to call. This plan adds exactly that
surface, pure-OCaml where it must stay WASM-safe, native-only where it can't.
Reference: `plans/fed-sx-milestone-1.md` (build steps 1-8),
`plans/erlang-on-sx.md` Blockers ("SX runtime lacks platform primitives …").
## The hard constraint — WASM boundary
`hosts/ocaml/lib/` is the `sx` library. `hosts/ocaml/browser/dune` links it
with `(modes byte js wasm)`. **Anything added to `lib/sx_primitives.ml` must
compile under `js_of_ocaml` AND `wasm_of_ocaml`.** Therefore:
- **Pure OCaml only** for hash / CBOR / CID / Ed25519 / RSA. No `digestif`,
no `mirage-crypto`, no C stubs, no `Unix` dependency in these primitives.
(None of those libs are even installed — the switch has only
re/unix/yojson/otfm/js_of_ocaml. Pure OCaml is both required and hermetic.)
- **HTTP server is native-only**: it needs sockets/threads. Register it in
`bin/sx_server.ml` via `Sx_primitives.register` (precedent: `eval-in-env` at
`bin/sx_server.ml:721`), **not** in the shared lib. It must never enter the
WASM build.
- **`file-list-dir`** uses `Sys.readdir` (stdlib, WASM-stubbed) — safe in lib,
but the fed-sx server is native anyway; native registration is acceptable too.
**Every phase must prove the WASM build still links** (`sx_build target="wasm"`
or `bash hosts/ocaml/browser/test_boot.sh`) before its commit. A broken WASM
browser kernel is a hard regression and fails the phase.
## Primitive surface (what fed-sx Milestone 1 actually needs)
Mapped to `plans/fed-sx-milestone-1.md` build steps:
| Primitive (SX name) | Signature | fed-sx step | Host |
|---|---|---|---|
| `crypto-sha256` | `(bytes) -> hex-string` | 1, 2 | lib (pure) |
| `crypto-sha512` | `(bytes) -> hex-string` | 2 | lib (pure) |
| `crypto-sha3-256` | `(bytes) -> hex-string` | 1 (CID default) | lib (pure) |
| `cbor-encode` | `(sx-value) -> bytes` (dag-cbor, deterministic) | 1 | lib (pure) |
| `cbor-decode` | `(bytes) -> sx-value` | 1 (round-trip tests) | lib (pure) |
| `cid-from-bytes` | `(codec multihash-bytes) -> cid-string` | 1 | lib (pure) |
| `cid-from-sx` | `(sx-value) -> cid-string` (canonicalize→cbor→sha→mh→cidv1) | 1 | lib (pure) |
| `ed25519-verify` | `(pubkey-32 msg sig-64) -> bool` | 2 | lib (pure) |
| `rsa-sha256-verify` | `(der-spki msg sig) -> bool` (PKCS#1 v1.5) | 2 | lib (pure) |
| `file-list-dir` | `(path) -> (list string)` | 3 | lib/native |
| `http-listen` | `(port handler-fn) -> never` (handler: req-dict→resp-dict) | 8 | **native only** |
Deferred (not Milestone 1): `httpc-request` (HTTP client — federation is v2),
`sqlite-*` (Milestone 1 is file-on-disk; sqlite is v2 indexes).
## Registration pattern (established)
`lib/sx_primitives.ml`:
```ocaml
register "crypto-sha256" (fun args ->
match args with
| [String s] -> String (Sha2.sha256_hex s)
| _ -> raise (Eval_error "crypto-sha256: (bytes)"))
```
Errors: `raise (Eval_error "name: shape")`. Byte strings are OCaml `string`
(SX `String`). Lists are `Pair`/`Nil` per `sx_types.ml`. Native-only prims go in
`bin/sx_server.ml` the same way.
## Phasing — one feature per loop iteration
Dependency order. Each phase: implement → `dune build` (ocaml) → **WASM build
check** → tests → commit → tick box → Progress-log line → push.
### Phase A — SHA-2 (sha256 + sha512), pure OCaml ✅ DONE
- New `lib/sx_sha2.ml` (or inline in primitives if small): SHA-256 + SHA-512.
- Primitives `crypto-sha256`, `crypto-sha512` → lowercase hex string.
- Tests (`bin/run_tests.ml` or a dedicated `bin/test_crypto.ml`): NIST vectors —
`""`, `"abc"`, the 896-bit message, a 1MB "a" repetition.
- sha256("") = `e3b0c442…b7852b855`; sha256("abc") = `ba7816bf…f20015ad`
- sha512("abc") = `ddaf35a1…2a9ac94f…`
- **Acceptance:** vectors pass; WASM build links; OCaml conformance unchanged.
### Phase B — SHA-3 / Keccak-256, pure OCaml ✅ DONE
- Keccak-f[1600] + SHA3-256 padding. Primitive `crypto-sha3-256`.
- Tests: sha3-256("") = `a7ffc6f8…0f8434a`; sha3-256("abc") = `3a985da7…11431532`.
- **Acceptance:** NIST SHA-3 vectors pass; WASM links.
### Phase C — dag-cbor encoder + decoder, pure OCaml ✅ DONE
- RFC 8949 deterministic subset (RFC 8742 dag-cbor): unsigned/negative ints,
byte strings, text strings, arrays, maps with **keys sorted by
length-then-bytewise**, bool, null, tag 42 (CID link). No floats unless a
fed-sx shape needs them (defer; document).
- SX↔CBOR mapping: `Integer`→int, `String`→text str, `Bool`, `Nil`→null,
`Pair/Nil`→array, `Dict`→map (sorted keys), keyword/symbol→text str.
- Primitives `cbor-encode`, `cbor-decode`. Round-trip property tests + RFC 8949
appendix-A vectors + a "reordered dict keys → identical bytes" determinism test.
- **Acceptance:** vectors + round-trip + determinism pass; WASM links.
### Phase D — CID computation, pure OCaml ✅ DONE
- Multihash (sha2-256 = 0x12, sha3-256 = 0x16; varint code + varint len + digest).
- CIDv1 = `0x01 || codec-varint || multihash`. Codecs: dag-cbor 0x71, raw 0x55.
- Multibase base32 lower (`b` prefix, RFC 4648 no-pad).
- Primitives `cid-from-bytes` (codec, raw mh bytes), `cid-from-sx`
(canonicalize → cbor-encode → sha2-256 → multihash → cidv1 → base32).
- Tests: known IPFS CIDs — cross-check against `ipfs` CLI if present, else the
fixed vectors for `{}` dag-cbor and `"abc"` raw (hardcode expected strings).
Determinism: same SX value (whitespace/comment/key-order variants) → same CID.
- **Acceptance:** matches reference CIDs; determinism holds; WASM links. Satisfies
fed-sx Milestone 1 Step 1.
### Phase E — Ed25519 verify, pure OCaml ✅ DONE
- Curve25519/edwards25519 field arith (mod 2^255-19), point decompress,
SHA-512-based verify per RFC 8032 §5.1.7. (Reuse Phase A sha512.)
- Primitive `ed25519-verify (pubkey msg sig) -> bool`. Bad-length args → false,
not exception (verify is total).
- Tests: RFC 8032 §7.1 vectors (TEST 1-4 + the 1024-byte one). Tampered msg/sig
→ false. Wrong-length key → false.
- **Acceptance:** all RFC 8032 vectors pass; WASM links. Satisfies fed-sx Step 2
(Ed25519 sig-suite).
### Phase F — RSA-SHA256 verify (PKCS#1 v1.5), pure OCaml ✅ DONE
- Minimal pure-OCaml bignum (only need modexp + DER parse). Parse SPKI DER →
(n, e). RSASSA-PKCS1-v1_5 verify with SHA-256 (Phase A).
- Primitive `rsa-sha256-verify (der-spki msg sig) -> bool`.
- Tests: a generated 2048-bit keypair's signature (vectors hardcoded in the test
from a one-off openssl run, documented in a comment), tamper → false.
- **Acceptance:** vector verifies; tamper fails; WASM links. Satisfies fed-sx
Step 2 (rsa-sha256-2018 sig-suite). **Lower priority** than E — Ed25519 is the
modern default; RSA can land after the HTTP phase if time-boxed.
### Phase G — `file-list-dir`, native-safe ✅ DONE
- `Sys.readdir` → sorted SX list of names (no `.`/`..`). Errors → `enoent`/
`enotdir` classified like the existing `file-read` error mapping.
- Tests: list a known dir, missing dir → error, file-not-dir → error.
- **Acceptance:** passes; WASM build still links (Sys.readdir is stubbed there).
Satisfies fed-sx Step 3 segment replay.
### Phase H — HTTP/1.1 server, **native-only** (`bin/sx_server.ml`) ✅ DONE
- Minimal threaded HTTP/1.1: accept loop (`Unix` + `Thread`), parse request
line + headers + body (Content-Length), build an SX request dict
`{:method :path :query :headers :body}`, call the SX handler callable, take an
SX response dict `{:status :headers :body}`, write it. Connection: close
(keep-alive optional, defer). Bind `127.0.0.1:<port>`.
- Primitive `http-listen (port handler) -> never-returns` registered ONLY in
`bin/sx_server.ml`. Document that it is absent from the WASM kernel.
- Tests: `bin/test_http.sh` — start a server on a port with a tiny SX echo
handler in a subprocess, `curl` GET/POST/404/headers, assert responses, kill.
- **Acceptance:** curl test script green; WASM build untouched (prim not in lib).
Satisfies fed-sx Step 8 transport.
### Phase J — HTTP/1.1 client, **native-only** (`bin/sx_server.ml`) ✅ DONE
- Mirror of Phase H, inverse direction. TCP connect via `Unix.gethostbyname` +
`Unix.socket`/`Unix.connect`. Write request line + headers + body, read
response status line + headers + body (Content-Length first; chunked
encoding optional v2 — flag as Blockers if a fed-sx need hits it).
- Primitive `(http-request method url headers body) -> response-dict`
registered ONLY in `bin/sx_server.ml`. Response dict shape:
`{:status :headers :body}` (mirror of server's request dict). URL must be
`http://...` for v1; HTTPS is a separate later phase (needs TLS lib).
- Tests: `bin/test_http_client.sh` — start a tiny python HTTP server in a
subprocess (or reuse Phase H's SX server), drive GET / POST / 404 /
custom-header roundtrip via `(http-request ...)` from the epoch protocol,
assert response dict shape + body, kill server.
- **Acceptance:** test script green; WASM build untouched (prim not in lib);
Erlang conformance unchanged. Unblocks Erlang Phase 8 `httpc:request/4` BIF
wiring and fed-sx Milestone 2 federation `POST /inbox` outbound.
### Phase K — URL parser, pure OCaml, WASM-safe (`lib/sx_url.ml`)
- `(url-parse "http://host:port/path?q=1") -> {:scheme :host :port :path :query}`
— small recursive-descent parser. No external deps. Port is integer when
present, absent key otherwise (or default per scheme: 80/443).
- `(url-encode-component string) -> string` /
`(url-decode-component string) -> string` — percent-encoding per RFC 3986
(reserved/unreserved sets).
- Tests: `bin/test_url.ml` — full URL, port-less, path-only, query string with
multiple pairs, empty path, percent-encoding round-trips, malformed inputs
(return error-shaped result, not exception).
- **Acceptance:** WASM boot green (pure lib); supports fed-sx kernel actor URL
parsing and Phase J HTTP-client url handling.
### Phase L — (open) further client prims as fed-sx kernel needs surface
- Add new phases here as the kernel loop or design conversations identify
needs: chunked HTTP transfer encoding, HTTPS / TLS verify (likely opam-dep
Blockers), webfinger HTTP shape, DNS (probably folded into `http-request`).
- Each new phase: define test vectors / contract → implement → WASM-check
(skip for native-only) → commit → Progress log. Same iteration discipline as
AI.
### Phase I — handoff ✅ DONE
- Flip the `plans/erlang-on-sx.md` Blockers entry "SX runtime lacks platform
primitives …" to **RESOLVED**, listing the exact SX primitive names so the
Erlang loop can one-line-wire its blocked Phase 8 BIFs (`crypto:hash/2`,
`cid:from_bytes/1`, `cid:to_string/1`, `file:list_dir/1`, plus note
`httpc`/`sqlite` still deferred). **Do not edit `lib/erlang/`** — that wiring
is the Erlang loop's job; this phase only updates the blocker text + this
plan's "Handoff" section with the primitive→BIF mapping.
- **Acceptance:** blocker text updated; fed-sx Milestone 1 Steps 1-3 + 8
prerequisites all green.
## Scope (hard)
- **Edit only:** `hosts/ocaml/lib/**`, `hosts/ocaml/bin/**`, this plan file.
- **Do NOT edit:** `lib/erlang/**` (Erlang loop owns BIF wiring), `spec/`,
`lib/` root, other `lib/<lang>/`, `plans/erlang-on-sx.md` *except* the one
Blockers entry in Phase I.
- **Pure OCaml for lib primitives.** No new opam deps. If a phase seems to need
one, stop and add a Blockers entry instead.
- **Prove WASM every phase.** No commit without `test_boot.sh` (or wasm build)
green.
- **Never push to `main` or `architecture`.** Branch `loops/fed-prims`, push
`origin/loops/fed-prims`.
- One feature per commit. Short factual messages: `fed-prims: SHA-256 + 4 NIST
vectors`. Tick the box, append a dated Progress-log line (newest first).
- **Never call `sx_build` with no timeout-awareness** — OCaml builds are slow;
use the MCP `sx_build target="ocaml"` / `target="wasm"` tools or
`dune build` with a generous timeout. If the build hangs >10min, Blockers +
stop.
## Build & test reference
```bash
cd hosts/ocaml && dune build bin/sx_server.exe 2>&1 | tail # native
bash hosts/ocaml/browser/test_boot.sh # WASM links + boots
cd hosts/ocaml && dune exec bin/run_tests.exe 2>&1 | tail # OCaml unit tests
SX_SERVER=hosts/ocaml/_build/default/bin/sx_server.exe \
timeout 400 bash lib/erlang/conformance.sh 2>&1 | tail -3 # no-regression gate
```
A primitive is reachable from SX via the epoch protocol:
```bash
printf '(epoch 1)\n(crypto-sha256 "abc")\n' | \
hosts/ocaml/_build/default/bin/sx_server.exe
```
## Handoff (Phase I fills this in)
| SX primitive | Erlang Phase 8 BIF it unblocks |
|---|---|
| `crypto-sha256` / `crypto-sha512` / `crypto-sha3-256` | `crypto:hash/2` |
| `cid-from-bytes` / `cid-from-sx` | `cid:from_bytes/1`, `cid:to_string/1` |
| `ed25519-verify` / `rsa-sha256-verify` | `crypto:verify` / sig-suites |
| `file-list-dir` | `file:list_dir/1` |
| `http-listen` | fed-sx kernel `http:listen/2` (Milestone 1 Step 8) |
**Status: DELIVERED (Phases AH, 2026-05-18).** All primitives are
registered and reachable from SX (`(eval "(crypto-sha256 \"abc\")")`
via the epoch protocol). Signatures the Erlang loop can one-line-wire:
- `(crypto-sha256 bytes) -> hex-string` — also `crypto-sha512`,
`crypto-sha3-256`. lib (`Sx_sha2`/`Sx_sha3`), WASM-safe.
- `(cbor-encode value) -> bytes` / `(cbor-decode bytes) -> value` —
deterministic dag-cbor, lib (`Sx_cbor`), WASM-safe.
- `(cid-from-bytes codec mh-bytes) -> cid-string` /
`(cid-from-sx value) -> cid-string` — lib (`Sx_cid`), WASM-safe.
- `(ed25519-verify pk msg sig) -> bool` /
`(rsa-sha256-verify spki msg sig) -> bool` — total (bad input →
false), lib (`Sx_ed25519`/`Sx_rsa`), WASM-safe.
- `(file-list-dir path) -> (list string)` — sorted, lib, WASM-stubbed.
- `(http-listen port handler) -> never` — **NATIVE ONLY**
(`bin/sx_server.ml`); absent from the WASM kernel by design.
Still **deferred** (not Milestone 1, not provided here): `httpc-request`
(HTTP client / federation v2), `sqlite-*` (v2 indexes). The Erlang loop
should leave `httpc`/`sqlite` BIFs blocked with that note.
## Progress log
_Newest first._
- 2026-05-26 — Phase J: `http-request` primitive in `bin/sx_server.ml`
(NATIVE ONLY — `Unix.gethostbyname` + `Unix.connect`; HTTP/1.1 with
inline `http://` URL parser; sends Connection: close + Host +
Content-Length unless caller supplies them; reads status line +
headers + body via Content-Length, falling back to read-to-EOF;
Transfer-Encoding: chunked rejected with explicit error per plan).
Test `bin/test_http_client.sh` spins up a Phase-H echo server in a
background sx_server and drives a second sx_server with epoch
`(eval …)` calls: GET+query, POST+body, 404, custom request
header reflected back, non-http scheme rejected (error path),
integer status — 6/6 pass. NOT in lib/ so WASM boot untouched
(green); Erlang conformance 530/530 unchanged; run_tests
unchanged. Unblocks Erlang Phase 8 `httpc:request/4` BIF wiring
and fed-sx Milestone 2 federation `POST /inbox` outbound.
- 2026-05-18 — Phase I: handoff. `erlang-on-sx.md` Blockers gained one
RESOLVED entry (no "SX runtime lacks…" entry pre-existed; it read
"_(none yet)_") mapping every delivered primitive → its Phase 8 BIF,
with httpc/sqlite explicitly left deferred. Handoff section here
filled with signatures + native/WASM notes. Doc-only (no lib/erlang/
edits); Erlang 530/530 unchanged. **fed-sx Milestone 1 Steps 1-3 + 8
prerequisites all green — plan complete (Phases AI done).**
- 2026-05-18 — Phase H: `http-listen` primitive in `bin/sx_server.ml`
(NATIVE ONLY — Unix sockets + Thread per connection, Mutex around
the shared-runtime handler call; HTTP/1.1, Connection: close;
req {:method :path :query :headers :body} → resp {:status :headers
:body}). Test `bin/test_http.sh`: curl GET+query / POST+body / 404
/ custom header — 6/6. NOT in lib, so WASM kernel untouched (boot
green); run_tests 4897 unchanged; Erlang 530/530. Satisfies fed-sx
Milestone 1 Step 8 transport.
- 2026-05-18 — Phase G: `file-list-dir` primitive in
`lib/sx_primitives.ml` (Sys.readdir → sorted names, no "."/"..";
Sys_error prefixed like file-read, msg carries enoent/enotdir).
4 tests: sorted listing, missing dir, not-a-dir, arity. WASM boot
green (Sys.readdir stubbed there); Erlang 530/530; run_tests +4.
Satisfies fed-sx Step 3 segment replay.
- 2026-05-18 — Phase F: pure-OCaml `lib/sx_rsa.ml` (self-contained
bignum modexp, minimal DER SPKI reader, RFC 8017 §8.2.2 PKCS#1
v1.5 verify with SHA-256 DigestInfo prefix). Primitive
`rsa-sha256-verify` total. 5 tests on a fixed RSA-2048 vector
(one-off python-cryptography keygen, hardcoded): valid, tampered
msg/sig, garbage SPKI, non-string. WASM boot green with new lib
module; Erlang 530/530; run_tests +5. Satisfies fed-sx Step 2
(rsa-sha256-2018 sig-suite).
- 2026-05-18 — Phase E: pure-OCaml `lib/sx_ed25519.ml` (minimal
base-2^26 bignum, edwards25519 extended-coord points, RFC 8032
§5.1.7 cofactorless verify reusing Phase-A sha512). Primitive
`ed25519-verify` is total (bad/short/non-string args → false).
8 tests: RFC 8032 §7.1 TEST 1-3 (re-derived independently via
python-cryptography), tampered msg/sig, wrong-length, non-string.
WASM boot green with new lib module; Erlang 530/530; run_tests +8.
Satisfies fed-sx Milestone 1 Step 2 (Ed25519 sig-suite).
- 2026-05-18 — Phase D: pure-OCaml `lib/sx_cid.ml` (unsigned-varint,
multihash, CIDv1, multibase base32-lower), primitives `cid-from-bytes`
/ `cid-from-sx` (cbor→sha2-256→mh→cidv1, dag-cbor codec 0x71). 5 tests:
raw "abc"=bafkreif2pall7d…, raw ""=bafkreihdwdcefg…, dag-cbor {}=
bafyreigbtj4x7i… (all match canonical IPFS CIDs; no `ipfs` CLI so
vectors independently derived in Python), key-order determinism. WASM
boot green with new lib module; Erlang 530/530; run_tests +5.
- 2026-05-18 — Phase C: pure-OCaml `lib/sx_cbor.ml` (dag-cbor encode/
decode), primitives `cbor-encode`/`cbor-decode`. RFC 8949 Appendix-A
vectors, length-then-bytewise key sort + order-invariance determinism,
decode∘encode round-trip (30 tests). Floats unsupported (raise, no
fed-sx shape needs them); tag-42 decode = inner-item passthrough.
WASM boot green with new lib module; Erlang 530/530; run_tests +30.
- 2026-05-18 — Phase B: pure-OCaml `lib/sx_sha3.ml` (Keccak-f[1600] +
SHA-3 pad, domain 0x06), primitive `crypto-sha3-256`. 4 NIST FIPS 202
vectors pass (empty/abc/896-bit + 1600-bit 0xa3 multi-block). WASM boot
green with new lib module; Erlang conformance 530/530; run_tests +4.
- 2026-05-18 — Phase A: pure-OCaml `lib/sx_sha2.ml` (SHA-256 + SHA-512),
primitives `crypto-sha256`/`crypto-sha512`. 7 NIST FIPS 180-4 vectors pass
(empty/abc/896-bit/1M-'a' for sha256; empty/abc/896-bit for sha512). WASM
boot green with new lib module; Erlang conformance 530/530 unchanged.
## Blockers
- _(none yet)_

View File

@@ -0,0 +1,555 @@
# SX VM Opcode Extension Mechanism
Mechanism in `hosts/ocaml/lib/` that lets language ports register specialized
bytecode opcodes without modifying the SX VM core. Direct prerequisite for
**erlang-on-sx Phase 9** (the BEAM analog) and a structural enabler for any
future language port that wants performance-critical opcodes.
Reference: `plans/erlang-on-sx.md` Phase 9, `plans/fed-sx-design.md` §17.5,
`hosts/ocaml/lib/sx_vm.ml` (current VM).
Status: **complete** on `loops/sx-vm-extensions` (Phases A-E landed
2026-05-14 / 2026-05-15). Ready for first real consumer
(`hosts/ocaml/lib/extensions/erlang.ml`, replacing the Phase 9b stub
dispatcher in `lib/erlang/vm/dispatcher.sx`).
---
## Goal
Allow language ports to register custom bytecode opcodes in the SX VM, with:
- **Zero overhead for core opcodes.** Existing opcodes (current ceiling 175,
see `sx_vm.ml`) must dispatch identically. No regression for any existing
language port or the core SX runtime.
- **One additional dispatch step for extension opcodes.** Acceptable cost; the
win comes from avoiding the general CEK machinery.
- **Per-extension state slot.** Erlang's process scheduler, Haskell's thunk
cache, etc. need somewhere to hang state alongside the VM.
- **Compiler awareness.** The bytecode compiler (`lib/compiler.sx`) must be
able to emit extension opcodes by name, looked up against the registered
set.
- **JIT compatibility.** Existing JIT (lazy lambda compilation) continues to
work for code paths using only core opcodes. Extension opcodes are
interpreted in v1; JITing them is a follow-up.
## Non-goals
- **Hot opcode reload.** Adding/replacing opcodes mid-runtime is not in
scope. Extensions are compile-time additions to the OCaml binary. (If
needed, that's a separate project.)
- **Per-instance opcode sets.** All running instances of the SX VM share
the same opcode set determined at build time. Selective opcode loading
per instance is out of scope.
- **Opcode hot-swap or supersession.** Once registered, opcodes are stable
for the lifetime of the binary.
- **Language-port isolation at the dispatch layer.** Two language ports can
see each other's opcodes (they share the dispatch table). Isolation is a
build-time concern — don't compile in extensions you don't trust.
---
## Why now
The Erlang-on-SX Phase 9 work needs this. Without it, Phase 9b-9g (the actual
opcode implementations) have nowhere to plug in. The Erlang loop hit this
dependency as a Blocker (`0abf05ed`); this design is what unblocks it.
It also enables the **shared opcode pattern** discussed in `plans/fed-sx-
design.md` §17.5: opcodes Erlang Phase 9 produces that other ports could
plausibly use (pattern match, perform/handle, record access) get chiselled
out to `lib/guest/vm/` when a second port has an actual second use. Without
the extension mechanism, each port would have to fork the SX VM core or
modify shared dispatch — neither acceptable.
---
## Architectural overview
```
┌──────────────────────────────────────────┐
│ SX VM core (hosts/ocaml/lib/sx_vm.ml) │
│ │
│ ┌────────────────────────────────────┐ │
│ │ Bytecode dispatch loop │ │
│ │ │ │
│ │ match op with │ │
│ │ | 1 (OP_CONST) -> ... │ │
│ │ | 2 (OP_NIL) -> ... │ │
│ │ | ... │ │
│ │ | 175 -> ... (last core opcode) │ │
│ │ | op when op >= 200 -> │ │
│ │ !extension_dispatch_ref op │ │ ◄── new
│ │ vm frame │ │
│ └────────────────────────────────────┘ │
│ │
│ ┌────────────────────────────────────┐ │
│ │ Extension registry │ │
│ │ opcode_id -> handler │ │ ◄── Phase B
│ │ opcode_name -> opcode_id │ │
│ │ extension_state per extension │ │
│ └────────────────────────────────────┘ │
└──────────────────────────────────────────┘
│ register at startup
┌──────────────────┴──────────────────────┐
│ Extension modules │
│ hosts/ocaml/lib/extensions/erlang.ml │
│ hosts/ocaml/lib/extensions/haskell.ml │
│ hosts/ocaml/lib/extensions/datalog.ml │
│ hosts/ocaml/lib/extensions/guest_vm.ml │ ◄── shared opcodes
└─────────────────────────────────────────┘
```
### Opcode ID space partition
Current SX VM uses opcode IDs from 1 to 175 (per inspection of `sx_vm.ml`,
ceiling at OP_DEC = 175). We partition the 0-255 space:
| Range | Use |
|---------|------------------------------------------------------------------|
| 0 | reserved / NOP |
| 1-199 | **core opcodes** — owned by the SX VM, locked schema |
| 200-247 | **extension opcodes** — registered by extensions (ports + shared) |
| 248-255 | reserved for future expansion / multi-byte opcodes |
This gives the core 24 free slots above the current 175 ceiling for future
core additions, and 48 slots for extensions. Erlang Phase 9 expects to need
fewer than 30 specialized opcodes, so this is comfortable headroom.
The plan originally proposed a finer split (`128-199` for `lib/guest/vm/`
shared, `200-247` for ports). That distinction is preserved at the **naming
level** (`guest_vm.OP_X` vs `erlang.OP_Y`) and policed by the registry
(duplicate IDs fail at startup), without consuming separate ID ranges. The
chiselling discipline (move an opcode to `guest_vm` when a second port uses
it) operates at the source level.
If we need more than 256 opcodes total, multi-byte opcodes (a leading 248-255
byte plus a second byte) extend the space without breaking the schema.
### Extension module signature
```ocaml
(* hosts/ocaml/lib/sx_vm_extension.ml *)
(** A handler for an extension opcode. Reads operands from bytecode,
manipulates the VM stack, updates the frame's instruction pointer.
May raise exceptions (which propagate via the existing VM error path). *)
type handler = vm -> frame -> unit
(** State an extension carries alongside the VM. Opaque to the VM core;
extensions cast as needed. *)
type extension_state = ..
module type EXTENSION = sig
(** Stable name for this extension (e.g. "erlang", "guest_vm"). *)
val name : string
(** Initialize per-instance state. Called once when the VM starts and the
extension is loaded. *)
val init : unit -> extension_state
(** Opcodes this extension provides. Each is (opcode_id, opcode_name, handler).
opcode_id must be in 200-247. Conflicts cause startup failure. *)
val opcodes : extension_state -> (int * string * handler) list
end
```
### Registration and dispatch
```ocaml
(* hosts/ocaml/lib/sx_vm_extensions.ml *)
let extensions : (module EXTENSION) list ref = ref []
let states : (string, extension_state) Hashtbl.t = Hashtbl.create 8
let by_id : (int, handler) Hashtbl.t = Hashtbl.create 64
let by_name : (string, int) Hashtbl.t = Hashtbl.create 64
let register (m : (module EXTENSION)) =
let module M = (val m) in
let st = M.init () in
Hashtbl.add states M.name st;
List.iter (fun (id, name, h) ->
if Hashtbl.mem by_id id then
failwith (Printf.sprintf "Opcode %d (%s) already registered" id name);
Hashtbl.add by_id id h;
Hashtbl.add by_name name id
) (M.opcodes st);
extensions := m :: !extensions
let dispatch op vm frame =
match Hashtbl.find_opt by_id op with
| Some handler -> handler vm frame
| None -> raise (Invalid_opcode op)
let id_of_name name = Hashtbl.find_opt by_name name
let state_of_extension name = Hashtbl.find_opt states name
```
Phase B installs this dispatcher into `Sx_vm.extension_dispatch_ref` at
module init. Until then, the ref's default raises `Invalid_opcode op` for
any opcode ≥ 200, which is the Phase A test condition.
The dispatch path adds **one hashtable lookup per extension opcode**.
Acceptable cost — and Erlang's specialized opcodes win >100× over going
through the general CEK machine, so the overhead is negligible by comparison.
### Bytecode compiler integration
The compiler (`lib/compiler.sx`) needs to know extension opcode IDs to emit
them. New SX primitive exposed to the compiler:
```sx
(extension-opcode-id "erlang.OP_PATTERN_TUPLE_2") ; → 200, or nil if not loaded
```
When the compiler wants to emit a specialized opcode, it queries by name. If
the extension isn't loaded, the compiler falls back to the general path
(emit a `CALL_PRIM` or general SX `case`). This means a language port's
optimization is opt-in per build, and missing extensions degrade to slower
correct execution rather than failure.
Naming convention: `<extension-name>.OP_<NAME>`. So `erlang.OP_PATTERN_TUPLE_2`,
`guest_vm.OP_PERFORM`, etc.
### Per-extension state access
Some opcodes need state beyond the VM stack (Erlang's scheduler, mailbox
state, etc.). Extensions store state in their `init`-returned value, accessed
via `state_of_extension`:
```ocaml
let op_spawn vm frame =
let st = Sx_vm_extensions.state_of_extension "erlang"
|> Option.get
|> Obj.magic in (* extension casts to its known type *)
let body = pop vm in
let pid = Erlang_scheduler.spawn st body in
push vm (pid_value pid);
frame.ip <- frame.ip + 1
```
Shared scheduler state lives in the Erlang extension's state value. Other
extensions don't see it.
---
## Phase plan
Five sub-phases in dependency order. Each is testable in isolation.
### Phase A — Opcode ID partition + dispatch fallthrough
- [x] Define `exception Invalid_opcode of int` in `sx_vm.ml`.
- [x] Add `extension_dispatch_ref : (int -> vm -> frame -> unit) ref`
whose default handler raises `Invalid_opcode op`. Forward-declared in
the same style as the existing `jit_compile_ref`.
- [x] Add `| op when op >= 200 -> !extension_dispatch_ref op vm frame` arm
in the dispatch loop, immediately before the catch-all.
- [x] Document the partition in a comment near the top of the opcode list.
**Tests:**
- All existing OCaml VM/CEK tests pass unchanged (zero regression for core).
- Constructed bytecode using opcode 200 raises `Invalid_opcode 200` when no
extension is registered.
**Effort:** small. ~50 lines + tests.
### Phase B — Extension registry module
`hosts/ocaml/lib/sx_vm_extensions.ml` per the sketch above. Pure plumbing, no
opcodes yet. Phase B's module init installs the real `dispatch` into
`Sx_vm.extension_dispatch_ref`, replacing Phase A's stub.
- [x] `Sx_vm_extension` interface module (handler type, EXTENSION sig).
- [x] `Sx_vm_extensions` registry module (`register`, `dispatch`,
`id_of_name`, `state_of_extension`).
- [x] Wire the registry's `dispatch` into `Sx_vm.extension_dispatch_ref` at
module init.
**Tests:**
- Register a test extension with one opcode; dispatch finds it.
- Duplicate opcode-id registration fails at startup.
- `id_of_name` and `state_of_extension` lookups work.
**Effort:** small. ~150 lines + tests.
### Phase C — Compiler-side opcode lookup primitive
Expose `extension-opcode-id` as an SX primitive in `hosts/ocaml/lib/`. The
compiler in `lib/compiler.sx` can call it to emit extension opcodes by name.
Does not require any extension to actually exist — the primitive returns
`nil` for unknown names, and the compiler falls back.
- [x] Register `extension-opcode-id` in `sx_primitives.ml`.
- [x] Returns `Integer id` when registered, `Nil` otherwise.
**Tests:**
- Primitive returns nil for unknown name.
- After registering a test extension, primitive returns the registered ID.
**Effort:** small. Single primitive registration + compiler-side use docs.
### Phase D — Test extension demonstrating end-to-end flow
A dummy extension at `hosts/ocaml/lib/extensions/test_ext.ml` registering
one or two trivial opcodes (e.g. `OP_TEST_PUSH_42`, `OP_TEST_DOUBLE_TOS`).
Wired into the build, available when running tests.
Compiler test: write SX that triggers the test compiler-extension to emit
`OP_TEST_PUSH_42`, then verify the VM executes it correctly via
`bytecode-inspect` and `vm-trace`.
- [x] `test_ext.ml` registers two opcodes.
- [x] Wired into the build (extensions registered at startup).
- [x] Bytecode emission via name lookup produces the right ID.
- [x] `bytecode-inspect` shows the opcode by name.
**Tests:**
- Bytecode emission via name lookup produces the right ID.
- Execution produces the expected stack effect.
- `bytecode-inspect` shows the opcode by name.
- `vm-trace` correctly reports the extension opcode.
**Effort:** small. ~100 lines including build wiring.
### Phase E — JIT awareness (interpreted-only for v1)
The JIT (lazy lambda compilation) currently compiles based on opcode ranges.
Extension opcodes (≥200) should fall through to interpretation, not be
JIT-compiled in v1.
- [x] Mark extension opcodes as "interpret only" in the JIT pre-analysis.
- [x] Lambda containing only core opcodes JIT-compiles as before.
- [x] Lambda containing any extension opcode runs interpreted.
JITing extension opcodes is a follow-up project; v1 keeps the JIT scope
unchanged and just makes it correctly route mixed bytecode.
**Tests:**
- Lambda with only core opcodes: JIT-compiled, fast path.
- Lambda with extension opcode: interpreted, correct result.
- Mixed lambda: interpreted, correct result.
**Effort:** small-medium. Requires understanding the JIT's pre-analysis
(per `project_jit_compilation.md` memory: "Lazy JIT implemented: lambda
bodies compiled on first VM call, cached, failures sentinel-marked").
Extension-opcode detection becomes another reason to mark a lambda
"interpret-only."
---
## Acceptance criteria
1. **Phase A-D pass their test suites.**
2. **Zero regression on existing SX VM tests.** All language-port test
suites currently passing on the architecture branch (Erlang 530+, Haskell
285+, Datalog 276+, Smalltalk 625+, the SX core test suite, etc.) still
pass.
3. **Test extension demonstrates the flow end-to-end.** SX source compiles
via the compiler with a registered extension opcode, executes through the
VM via the dispatch fallthrough, returns correct result.
4. **Documentation:** README in `hosts/ocaml/lib/extensions/` explaining the
pattern, with a worked example (the test extension is the canonical one).
After acceptance, the Erlang-on-SX Phase 9 work in `lib/erlang/vm/` can use
this mechanism. The Erlang loop's Blocker for 9a is resolved.
---
## Risk and mitigation
**Risk: regression in core opcode dispatch.** A misplaced `match` arm could
break something. *Mitigation:* run every existing language-port conformance
suite before merging.
**Risk: opcode ID conflicts as more extensions land.** If Erlang Phase 9
claims IDs 200-220 and Haskell wants 215-235, we have a problem.
*Mitigation:* maintain a registry document at `hosts/ocaml/lib/extensions/
README.md` listing claimed ID ranges per extension. Convention: each
extension claims a contiguous block at first registration; collisions caught
at startup with a clear error.
**Risk: extension state types leak through `Obj.magic`.** The extension state
is type-erased in the registry. *Mitigation:* extensions cast in their own
opcode handlers, never expose state to other extensions or the VM core.
First-class modules / GADTs could add more type safety; deferred unless
this becomes a concrete pain point.
**Risk: extensions become a back door for kernel mutation.** An extension
opcode handler has full access to the VM. *Mitigation:* extensions are
build-time additions, not runtime; they're as trusted as the rest of the
binary. Operators audit at build time, not runtime. Same trust model as
any other compiled-in code.
**Risk: shared `lib/guest/vm/` opcodes evolve under different language
ports' needs.** *Mitigation:* the chiselling discipline (move to guest only
on second use) ensures the shared opcodes are tested against at least two
ports' actual usage before being considered stable.
---
## Open questions
To be resolved during implementation, not blocking design approval:
1. **Multi-byte opcode encoding.** If we need >256 opcodes total, the
leading-byte 248-255 schema accommodates it. Do we need multi-byte at
v1? Probably not — 48 extension opcodes is more than any single port
should reasonably want.
2. **Extension ordering matters?** If two extensions register opcodes that
read the same VM state, ordering of registration could matter for
initialization. Probably not in practice; flag if it bites.
3. **Hot-reload of extensions.** Out of scope for v1 (per non-goals). If
wanted later, the registry would need teardown + re-registration; the
`gen_server` `code_change/3` model from Erlang Phase 7 is a precedent.
4. **Cross-extension opcode composition.** Can `guest_vm.OP_PERFORM` invoke
`erlang.OP_RECEIVE_SCAN`? In principle yes — handlers can do anything.
The interface is clean; the question is whether we want any conventions
to keep ergonomics tractable. Defer until composition appears in
practice.
---
## Implementation roadmap and sequencing
This is a sister workstream to `loops/erlang`. Driven by Erlang Phase 9.
Single bounded loop on `loops/sx-vm-extensions`, ~1-2 weeks.
Recommended sequencing (one phase per loop fire):
1. **Phase A** — dispatch fallthrough. Smallest viable change to `sx_vm.ml`.
2. **Phase B** — extension registry module.
3. **Phase C** — compiler-side opcode lookup primitive.
4. **Phase D** — test extension demonstrating end-to-end flow.
5. **Phase E** — JIT awareness (interpret-only routing).
After acceptance:
- **`hosts/ocaml/lib/extensions/erlang.ml`** becomes the *first real
consumer* — written by whoever takes over from the Erlang loop's stub
dispatcher in `lib/erlang/vm/dispatcher.sx`. That's the integration
moment that closes the loop.
Estimated total effort: 1-2 weeks for one focused engineer with OCaml SX VM
familiarity.
---
## Relationship to other plans
- **`plans/erlang-on-sx.md` Phase 9:** unblocked by this work. Erlang loop
develops opcodes against a stub dispatcher in `lib/erlang/vm/`; once this
mechanism lands, swap stub for real registration via
`hosts/ocaml/lib/extensions/erlang.ml`.
- **`plans/fed-sx-design.md` §17.5:** documents this as Layer-1 prerequisite.
The shared-opcode discipline (lib/guest/vm/) is designed on top of this
mechanism's namespace allocation.
- **Future language ports (Haskell, Datalog, Smalltalk perf phases):** will
use the same mechanism. Each adds an extension module, claims an opcode
range, registers handlers. The `lib/guest/vm/` opcodes get
cross-referenced when the second port's needs justify chiselling.
- **JIT roadmap (per `project_jit_architecture.md` memory):** extension
opcodes are interpreted in v1. JITing them is a logical follow-up but
a separate project.
---
## Progress log
Newest first.
- **2026-05-15** — Phase E done. Loop complete (acceptance criteria
1-4 all met). New `Sx_vm.bytecode_uses_extension_opcodes` walks
bytecode operand-aware (CONST u16 indices, CALL_PRIM u16+u8,
CLOSURE u16+dynamic upvalue descriptors) so values that happen to
be ≥200 don't false-positive as extension opcodes. Wired into
`jit_compile_lambda`: when the inner closure's bytecode contains
any extension opcode, JIT returns None and the lambda runs
interpreted via CEK (the dispatch fallthrough still routes
extension opcodes through the registry — this just prevents the
JIT from claiming ownership of code it can't optimise). 7 new
foundation tests (`jit extension-opcode awareness` suite): pure
core eligible, head/middle/post-CLOSURE detection, CONST + CALL_PRIM
+ CLOSURE-descriptor false-positive avoidance. +7 pass vs Phase D
baseline (4833 vs 4826), 1111 pre-existing failures unchanged.
Conformance suites green: erlang 530/530, haskell 285/285, datalog
276/276, prolog 590/590, smalltalk 847/847, common-lisp 487/487,
apl 562/562, js 148/148, forth 632/638 (pre-existing), tcl 3/4
(pre-existing), ocaml-on-sx unit 607/607.
Loop done. Hand-off: the Erlang loop's Phase 9b stub dispatcher in
`lib/erlang/vm/dispatcher.sx` can now be replaced with a real
`hosts/ocaml/lib/extensions/erlang.ml` consumer.
- **2026-05-15** — Phase D done. New `hosts/ocaml/lib/extensions/` subtree
wired into the `sx` library via `(include_subdirs unqualified)`.
`extensions/test_ext.ml` is the canonical worked example: two
operand-less opcodes (`test_ext.OP_TEST_PUSH_42` = 220,
`test_ext.OP_TEST_DOUBLE_TOS` = 221) carrying `TestExtState` (an
invocation counter that exercises the per-extension state slot).
`extensions/README.md` documents the registration pattern, opcode-ID
range conventions, and naming rules.
`Sx_vm.opcode_name` now consults `extension_opcode_name_ref` (forward
ref) so disassembly shows extension opcodes by name instead of
`UNKNOWN_n`. Registry maintains `name_of_id_table` (reverse of
`by_name`) and installs the lookup at module init alongside the
dispatch ref. 5 new foundation tests (`extensions/test_ext` suite):
`extension-opcode-id` finds OP_TEST_PUSH_42, end-to-end bytecode runs
to 84, disassemble shows opcode names, unregistered ext opcodes still
fall back to UNKNOWN_n, per-extension state counter increments.
+5 pass vs Phase C baseline (4826 vs 4821), 1111 pre-existing failures
unchanged. Conformance suites green: erlang 530/530, haskell 285/285,
datalog 276/276, prolog 590/590, smalltalk 847/847, common-lisp
487/487, apl 562/562, js 148/148, forth 632/638 (pre-existing), tcl
3/4 (pre-existing), ocaml-on-sx unit 607/607.
- **2026-05-15** — Phase C done. `extension-opcode-id` SX primitive
registered from `sx_vm_extensions.ml` module init (avoids the
`sx_primitives ↔ sx_vm` cycle by registering downstream of both).
Accepts a string or symbol; returns `Integer id` for registered
opcode names, `Nil` for unknown — so a missing extension at compile
time degrades to a fallback rather than failure. 5 new foundation
tests (`extension-opcode-id primitive` suite): registered lookup,
unknown → nil, symbol arg, zero-arg rejection, integer-arg
rejection. +5 pass vs Phase B baseline (4821 vs 4816), 1111
pre-existing failures unchanged. Conformance suites green: erlang
530/530, haskell 285/285, datalog 276/276, prolog 590/590, smalltalk
847/847, common-lisp 487/487, apl 562/562, js 148/148, forth 632/638
(pre-existing), tcl 3/4 (pre-existing), ocaml-on-sx unit 607/607.
- **2026-05-14** — Phase B done. Added `hosts/ocaml/lib/sx_vm_extension.ml`
(interface: `handler` type, `extension_state` extensible variant,
`EXTENSION` module type) and `sx_vm_extensions.ml` (registry: `register`,
`dispatch`, `id_of_name`, `state_of_extension`, `_reset_for_tests`).
`let () = install_dispatch ()` at module init replaces Phase A's stub
with the real registry dispatch — Phase A behavior preserved (empty
registry still raises `Invalid_opcode` for unregistered ops). Registry
rejects opcode IDs outside 200-247, duplicate IDs, duplicate names, and
duplicate extension names. 9 new foundation tests (`vm-extension-registry`
suite): id_of_name resolve+miss, state_of_extension resolve+miss,
end-to-end VM dispatch (push 42), opcode composition (push 42 → double
→ 84), duplicate-id / out-of-range / duplicate-name rejection. +9 pass
vs Phase A baseline (4816 vs 4807), 1111 pre-existing failures unchanged.
Conformance suites green: erlang 530/530, haskell 285/285, datalog
276/276, prolog 590/590, smalltalk 847/847, common-lisp 487/487, apl
562/562, js 148/148, forth 632/638 (pre-existing), tcl 3/4 (pre-existing),
ocaml-on-sx unit 607/607.
- **2026-05-14** — Phase A done. Added `Invalid_opcode of int` exception,
`extension_dispatch_ref` (default raises `Invalid_opcode op`), and the
`| op when op >= 200 -> !extension_dispatch_ref op vm frame` arm before the
catch-all in `sx_vm.ml`. Partition comment documents 1-199 core / 200-247
extensions / 248-255 reserved (current core ceiling is OP_DEC = 175).
4 new foundation tests (3 × Invalid_opcode for opcodes 200/224/247, 1 ×
Eval_error for opcode 199 to pin the threshold). Foundation 64/64;
full OCaml test suite +4 pass vs baseline (4807 vs 4803), 1111 pre-existing
failures unchanged. Conformance suites green: erlang 530/530, haskell
285/285, datalog 276/276, prolog 590/590, smalltalk 847/847, common-lisp
305/305, apl 562/562, js 148/148, forth 632/638 (pre-existing), tcl 3/4
(pre-existing), ocaml-on-sx unit 607/607. (Lua 0/16 and ocaml-conformance
baseline programs not exercised — pre-existing scoreboard state and
multi-hour runtime respectively.)