Three new files forming the bytecode compilation pipeline:
spec/bytecode.sx — opcode definitions (~65 ops):
- Stack/constant ops (CONST, NIL, TRUE, POP, DUP)
- Lexical variable access (LOCAL_GET/SET, UPVALUE_GET/SET, GLOBAL_GET/SET)
- Jump-based control flow (JUMP, JUMP_IF_FALSE/TRUE)
- Function ops (CALL, TAIL_CALL, RETURN, CLOSURE, CALL_PRIM)
- Higher-order (HO) form ops (ITER_INIT/NEXT, MAP_OPEN/APPEND/CLOSE)
- Scope/continuation ops (SCOPE_PUSH/POP, RESET, SHIFT)
- Aser specialization (ASER_TAG, ASER_FRAG)
spec/compiler.sx — SX-to-bytecode compiler (SX code, portable):
- Scope analysis: resolve variables to local/upvalue/global at compile time
- Tail position detection for TCO
- Code generation for: if, when, and, or, let, begin, lambda,
define, set!, quote, function calls, primitive calls
- Constant pool with deduplication
- Jump patching for forward references
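The last two compiler bullets (constant pool deduplication and forward-jump patching) can be sketched roughly as follows. This is an illustrative Python sketch, not code from compiler.sx: the `Emitter` class, its method names, the little-endian operand order, and the convention that a jump offset is counted from the byte after its operand are all assumptions. Only the opcode values and operand widths are taken from bytecode.sx.

```python
# Illustrative sketch only; Emitter and its methods are hypothetical names.
OP_CONST = 0x01          # u16 pool_idx
OP_JUMP = 0x20           # i16 offset
OP_JUMP_IF_FALSE = 0x21  # i16 offset


class Emitter:
    def __init__(self):
        self.code = bytearray()
        self.pool = []        # constant pool, in first-seen order
        self.pool_index = {}  # (type, value) -> pool index, used for deduplication

    def add_const(self, value):
        # Key on the type as well, so 1, 1.0 and True stay distinct entries.
        # Values must be hashable in this sketch.
        key = (type(value), value)
        if key not in self.pool_index:
            self.pool_index[key] = len(self.pool)
            self.pool.append(value)
        return self.pool_index[key]

    def emit_const(self, value):
        idx = self.add_const(value)
        self.code.append(OP_CONST)
        self.code += idx.to_bytes(2, "little")  # byte order is an assumption

    def emit_jump(self, opcode):
        # Emit a jump with a placeholder offset and return the operand position.
        self.code.append(opcode)
        pos = len(self.code)
        self.code += b"\x00\x00"
        return pos

    def patch_jump(self, pos):
        # Assumed convention: the offset is relative to the byte after the operand.
        offset = len(self.code) - (pos + 2)
        self.code[pos:pos + 2] = offset.to_bytes(2, "little", signed=True)


def compile_if(em, test, then, else_):
    """Compile (if test then else) where all three parts are plain constants."""
    em.emit_const(test)
    to_else = em.emit_jump(OP_JUMP_IF_FALSE)  # target not known yet
    em.emit_const(then)
    to_end = em.emit_jump(OP_JUMP)            # target not known yet
    em.patch_jump(to_else)                    # else branch starts here
    em.emit_const(else_)
    em.patch_jump(to_end)                     # end of the whole form
```

Calling `compile_if(em, "x", "x", "y")` on a fresh emitter stores `"x"` in the pool only once, and both jump operands are filled in after the code they skip over has been emitted.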
hosts/ocaml/lib/sx_vm.ml — bytecode interpreter (OCaml):
- Stack-based VM with array-backed operand stack
- Call frames with base pointer for locals
- Direct opcode dispatch via pattern match
- Zero allocation per step (unlike CEK machine's dict-per-step)
- Handles: constants, variables, jumps, calls, primitives,
collections, string concat, define
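To make the VM bullets more concrete, here is a minimal sketch of a stack-based dispatch loop with base-pointer locals. It is written in Python for brevity and is not a port of sx_vm.ml: the `run` function, the little-endian operands, the jump-offset convention (relative to the byte after the operand), the rule that only false and nil are falsy, and the choice that LOCAL_SET pops its value are all assumptions made for illustration. The opcode numbers are the ones defined in bytecode.sx.

```python
# Illustrative sketch only; run() is a hypothetical name, not the sx_vm.ml API.
OP_CONST, OP_NIL, OP_TRUE, OP_FALSE, OP_POP, OP_DUP = 0x01, 0x02, 0x03, 0x04, 0x05, 0x06
OP_LOCAL_GET, OP_LOCAL_SET = 0x10, 0x11
OP_JUMP, OP_JUMP_IF_FALSE = 0x20, 0x21
OP_RETURN = 0x32


def run(code, pool, n_locals=0):
    stack = [None] * n_locals  # locals occupy the bottom slots of the frame
    base = 0                   # base pointer of the (single) call frame
    ip = 0

    def u16(at):
        return int.from_bytes(code[at:at + 2], "little")

    def i16(at):
        return int.from_bytes(code[at:at + 2], "little", signed=True)

    while True:
        op = code[ip]
        ip += 1
        if op == OP_CONST:
            stack.append(pool[u16(ip)])
            ip += 2
        elif op == OP_NIL:
            stack.append(None)
        elif op == OP_TRUE:
            stack.append(True)
        elif op == OP_FALSE:
            stack.append(False)
        elif op == OP_POP:
            stack.pop()
        elif op == OP_DUP:
            stack.append(stack[-1])
        elif op == OP_LOCAL_GET:
            stack.append(stack[base + code[ip]])
            ip += 1
        elif op == OP_LOCAL_SET:
            stack[base + code[ip]] = stack.pop()
            ip += 1
        elif op == OP_JUMP:
            off = i16(ip)
            ip += 2 + off
        elif op == OP_JUMP_IF_FALSE:
            cond = stack.pop()
            off = i16(ip)
            ip += 2
            if cond is False or cond is None:  # assumed truthiness rule
                ip += off
        elif op == OP_RETURN:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode 0x{op:02X} at {ip - 1}")
```

The loop touches only the operand stack and the instruction pointer on each step, which is the property the "zero allocation per step" bullet refers to; calls, closures, and primitives are left out of the sketch.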
Architecture: compiler.sx is spec (SX, portable). VM is platform
(OCaml-native). Same bytecode runs on JS/WASM VMs.
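For one byte stream to be shared across hosts, every VM has to agree on the module container. bytecode.sx describes a module as starting with the magic string "SXBC", a u16 version, and a u32 pool_count. The sketch below packs and checks just that prefix. The function names and the little-endian byte order are assumptions, since the source does not state an endianness, and the encoding of pool entries and code objects is not covered.

```python
import struct

BYTECODE_MAGIC = b"SXBC"
BYTECODE_VERSION = 1

# "<" selects little-endian with no padding (byte order is an assumption):
# 4s = 4-byte magic, H = u16 version, I = u32 pool_count.
HEADER_FORMAT = "<4sHI"


def pack_header(pool_count):
    """Return the 10-byte module prefix (hypothetical helper)."""
    return struct.pack(HEADER_FORMAT, BYTECODE_MAGIC, BYTECODE_VERSION, pool_count)


def unpack_header(data):
    """Validate the prefix and return (version, pool_count)."""
    magic, version, pool_count = struct.unpack_from(HEADER_FORMAT, data, 0)
    if magic != BYTECODE_MAGIC:
        raise ValueError("not an SXBC module")
    if version != BYTECODE_VERSION:
        raise ValueError(f"unsupported bytecode version {version}")
    return version, pool_count
```

Fixing the byte order and field widths up front is also what makes the output deterministic, which the "content-addressable" design principle depends on.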
Also includes: CekFrame record optimization in transpiler.sx
(29 frame types as records instead of Hashtbl).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
147 lines
5.9 KiB
Plaintext
;; ==========================================================================
;; bytecode.sx — SX bytecode format definition
;;
;; Universal bytecode for SX evaluation. Produced by compiler.sx,
;; executed by platform-native VMs (OCaml, JS, WASM).
;;
;; Design principles:
;;   - One byte per opcode (~65 ops, fits in u8)
;;   - Variable-length encoding (1-5 bytes per instruction)
;;   - Lexical scope resolved at compile time (no hash lookups)
;;   - Tail calls detected statically (no thunks/trampoline)
;;   - Control flow via jumps (no continuation frames for if/when/etc.)
;;   - Content-addressable (deterministic binary for CID)
;; ==========================================================================


;; --------------------------------------------------------------------------
;; Opcode constants
;; --------------------------------------------------------------------------

;; Stack / Constants
(define OP_CONST         0x01)  ;; u16 pool_idx — push constant
(define OP_NIL           0x02)  ;; push nil
(define OP_TRUE          0x03)  ;; push true
(define OP_FALSE         0x04)  ;; push false
(define OP_POP           0x05)  ;; discard TOS
(define OP_DUP           0x06)  ;; duplicate TOS

;; Variable access (resolved at compile time)
(define OP_LOCAL_GET     0x10)  ;; u8 slot
(define OP_LOCAL_SET     0x11)  ;; u8 slot
(define OP_UPVALUE_GET   0x12)  ;; u8 idx
(define OP_UPVALUE_SET   0x13)  ;; u8 idx
(define OP_GLOBAL_GET    0x14)  ;; u16 name_idx
(define OP_GLOBAL_SET    0x15)  ;; u16 name_idx

;; Control flow (replaces if/when/cond/and/or frames)
(define OP_JUMP          0x20)  ;; i16 offset
(define OP_JUMP_IF_FALSE 0x21)  ;; i16 offset
(define OP_JUMP_IF_TRUE  0x22)  ;; i16 offset

;; Function operations
(define OP_CALL          0x30)  ;; u8 argc
(define OP_TAIL_CALL     0x31)  ;; u8 argc — reuse frame (TCO)
(define OP_RETURN        0x32)  ;; return TOS
(define OP_CLOSURE       0x33)  ;; u16 code_idx — create closure
(define OP_CALL_PRIM     0x34)  ;; u16 name_idx, u8 argc — direct primitive
(define OP_APPLY         0x35)  ;; (apply f args-list)

;; Collection construction
(define OP_LIST          0x40)  ;; u16 count — build list from stack
(define OP_DICT          0x41)  ;; u16 count — build dict from stack pairs
(define OP_APPEND_BANG   0x42)  ;; (append! TOS-1 TOS)

;; Higher-order forms (inlined loop)
(define OP_ITER_INIT     0x50)  ;; init iterator on TOS list
(define OP_ITER_NEXT     0x51)  ;; i16 end_offset — push next or jump
(define OP_MAP_OPEN      0x52)  ;; push empty accumulator
(define OP_MAP_APPEND    0x53)  ;; append TOS to accumulator
(define OP_MAP_CLOSE     0x54)  ;; pop accumulator as list
(define OP_FILTER_TEST   0x55)  ;; i16 skip — if falsy jump (skip append)

;; HO fallback (dynamic callback)
(define OP_HO_MAP        0x58)  ;; (map fn coll)
(define OP_HO_FILTER     0x59)  ;; (filter fn coll)
(define OP_HO_REDUCE     0x5A)  ;; (reduce fn init coll)
(define OP_HO_FOR_EACH   0x5B)  ;; (for-each fn coll)
(define OP_HO_SOME       0x5C)  ;; (some fn coll)
(define OP_HO_EVERY      0x5D)  ;; (every? fn coll)

;; Scope / dynamic binding
(define OP_SCOPE_PUSH    0x60)  ;; TOS = name
(define OP_SCOPE_POP     0x61)
(define OP_PROVIDE_PUSH  0x62)  ;; TOS-1 = name, TOS = value
(define OP_PROVIDE_POP   0x63)
(define OP_CONTEXT       0x64)  ;; TOS = name → push value
(define OP_EMIT          0x65)  ;; TOS-1 = name, TOS = value
(define OP_EMITTED       0x66)  ;; TOS = name → push collected

;; Continuations
(define OP_RESET         0x70)  ;; i16 body_len — push delimiter
(define OP_SHIFT         0x71)  ;; u8 k_slot, i16 body_len — capture k

;; Define / component
(define OP_DEFINE        0x80)  ;; u16 name_idx — bind TOS to name
(define OP_DEFCOMP       0x81)  ;; u16 template_idx
(define OP_DEFISLAND     0x82)  ;; u16 template_idx
(define OP_DEFMACRO      0x83)  ;; u16 template_idx
(define OP_EXPAND_MACRO  0x84)  ;; u8 argc — runtime macro expansion

;; String / serialize (hot path)
(define OP_STR_CONCAT    0x90)  ;; u8 count — concat N values as strings
(define OP_STR_JOIN      0x91)  ;; (join sep list)
(define OP_SERIALIZE     0x92)  ;; serialize TOS to SX string

;; Aser specialization (optional, 0xE0-0xEF reserved)
(define OP_ASER_TAG      0xE0)  ;; u16 tag_name_idx — serialize HTML tag
(define OP_ASER_FRAG     0xE1)  ;; u8 child_count — serialize fragment


;; --------------------------------------------------------------------------
;; Bytecode module structure
;; --------------------------------------------------------------------------

;; A module contains:
;;   magic:      "SXBC" (4 bytes)
;;   version:    u16
;;   pool_count: u32
;;   pool:       constant pool entries (self-describing tagged values)
;;   code_count: u32
;;   codes:      code objects
;;   entry:      u32 (index of entry-point code object)

(define BYTECODE_MAGIC "SXBC")
(define BYTECODE_VERSION 1)

;; Constant pool tags
(define CONST_NUMBER  0x01)
(define CONST_STRING  0x02)
(define CONST_BOOL    0x03)
(define CONST_NIL     0x04)
(define CONST_SYMBOL  0x05)
(define CONST_KEYWORD 0x06)
(define CONST_LIST    0x07)
(define CONST_DICT    0x08)
(define CONST_CODE    0x09)


;; --------------------------------------------------------------------------
;; Disassembler
;; --------------------------------------------------------------------------

(define opcode-names
  {:0x01 "CONST" :0x02 "NIL" :0x03 "TRUE" :0x04 "FALSE"
   :0x05 "POP" :0x06 "DUP"
   :0x10 "LOCAL_GET" :0x11 "LOCAL_SET"
   :0x12 "UPVALUE_GET" :0x13 "UPVALUE_SET"
   :0x14 "GLOBAL_GET" :0x15 "GLOBAL_SET"
   :0x20 "JUMP" :0x21 "JUMP_IF_FALSE" :0x22 "JUMP_IF_TRUE"
   :0x30 "CALL" :0x31 "TAIL_CALL" :0x32 "RETURN"
   :0x33 "CLOSURE" :0x34 "CALL_PRIM" :0x35 "APPLY"
   :0x40 "LIST" :0x41 "DICT" :0x42 "APPEND!"
   :0x50 "ITER_INIT" :0x51 "ITER_NEXT"
   :0x52 "MAP_OPEN" :0x53 "MAP_APPEND" :0x54 "MAP_CLOSE"
   :0x80 "DEFINE" :0x90 "STR_CONCAT" :0x92 "SERIALIZE"
   :0xE0 "ASER_TAG" :0xE1 "ASER_FRAG"})
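The name table above is only half of what a disassembler needs: to step through variable-length instructions it also has to know how many operand bytes follow each opcode. The sketch below shows one way that decoding could work, using the operand widths given in the opcode comments for a subset of the instruction set. It is written in Python as an illustration and is not the disassembler from bytecode.sx: the `OPCODES` table layout, the `disassemble` function, and the little-endian operand order are assumptions.

```python
import struct

# opcode -> (name, operand format). Format letters follow the struct module:
# B = u8, H = u16, h = i16. Widths are taken from the comments in bytecode.sx;
# only a subset of the opcodes is listed here.
OPCODES = {
    0x01: ("CONST", "H"), 0x02: ("NIL", ""), 0x03: ("TRUE", ""),
    0x04: ("FALSE", ""), 0x05: ("POP", ""), 0x06: ("DUP", ""),
    0x10: ("LOCAL_GET", "B"), 0x11: ("LOCAL_SET", "B"),
    0x14: ("GLOBAL_GET", "H"), 0x15: ("GLOBAL_SET", "H"),
    0x20: ("JUMP", "h"), 0x21: ("JUMP_IF_FALSE", "h"), 0x22: ("JUMP_IF_TRUE", "h"),
    0x30: ("CALL", "B"), 0x31: ("TAIL_CALL", "B"), 0x32: ("RETURN", ""),
    0x33: ("CLOSURE", "H"), 0x34: ("CALL_PRIM", "HB"),
}


def disassemble(code):
    """Return a list of (offset, name, *operands) tuples for a bytes object."""
    out = []
    ip = 0
    while ip < len(code):
        if code[ip] not in OPCODES:
            raise ValueError(f"unknown opcode 0x{code[ip]:02X} at offset {ip}")
        name, fmt = OPCODES[code[ip]]
        # "<" = little-endian, unpadded; the byte order is an assumption.
        operands = struct.unpack_from("<" + fmt, code, ip + 1) if fmt else ()
        out.append((ip, name) + operands)
        ip += 1 + struct.calcsize("<" + fmt)
    return out
```

Keeping the operand format next to the name means one table drives both the printed mnemonic and the step size, so the two cannot drift apart.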