Two fixes:
1. HO forms (map/filter/for-each/reduce): registered as Python
primitives so compiler emits OP_CALL_PRIM (direct dispatch to
OCaml primitive) instead of OP_CALL (which routed through CEK
HO special forms and failed on NativeFn closure args).
2. Mutable closures: locals captured by closures now share an
upvalue_cell. OP_LOCAL_GET/SET check frame.local_cells first —
if the slot has a shared cell, read/write through it. OP_CLOSURE
creates or reuses cells for is_local=1 captures. Both parent
and closure see the same mutations.
Frame type extended with local_cells hashtable for captured slots.
40/40 tests pass:
- 12 compiler output tests
- 18 VM execution tests (arithmetic, control flow, closures,
nested let, higher-order, cond, string ops)
- 10 auto-compile pattern tests (recursive, map, filter,
for-each, mutable closures, multiple closures, type dispatch)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Added vm-compile command: iterates env, compiles lambdas to bytecode,
replaces with NativeFn VM wrappers (with CEK fallback on error).
Tested: 3/109 compile, reduces CEK steps 23%.
Disabled auto-compile in production — the compiler doesn't handle
closures with upvalues yet, and compiled functions that reference
dynamic env vars crash. Infrastructure stays for when compiler
handles all SX features.
Also: added set-nth! and mutable-list primitives (needed by
compiler.sx for bytecode patching). Fixed compiler.sx to use
mutable lists on OCaml (ListRef for append!/set-nth! mutation).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
End-to-end pipeline working:
Python compiler.sx → bytecode → OCaml VM → result
Verified: (+ (* 3 4) 2) → 14 ✓
(+ 0 1 2 ... 49) → 1225 ✓
Benchmark (500 iterations, 50 additions each):
CEK machine: 327ms
Bytecode VM: 145ms
Speedup: 2.2x
VM handles: constants, local variables, global variables,
primitive calls, jumps, conditionals, closures (via NativeFn
wrapper), define, return.
Protocol: (vm-exec {:bytecode (...) :constants (...)})
- Compiler outputs clean format (no internal index dict)
- VM converts bytecode list to int array, constants to value array
- Stack-based execution with direct opcode dispatch
The 2.2x speedup is for pure arithmetic. For aser (the real
target), the speedup will be larger because aser involves:
- String building (no CEK frame allocation in VM)
- Map/filter iterations (no frame-per-iteration in VM)
- Closure calls (no thunk/trampoline in VM)
Next: compile and run the aser adapter on the VM.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three new files forming the bytecode compilation pipeline:
spec/bytecode.sx — opcode definitions (~65 ops):
- Stack/constant ops (CONST, NIL, TRUE, POP, DUP)
- Lexical variable access (LOCAL_GET/SET, UPVALUE_GET/SET, GLOBAL_GET/SET)
- Jump-based control flow (JUMP, JUMP_IF_FALSE/TRUE)
- Function ops (CALL, TAIL_CALL, RETURN, CLOSURE, CALL_PRIM)
- HO form ops (ITER_INIT/NEXT, MAP_OPEN/APPEND/CLOSE)
- Scope/continuation ops (SCOPE_PUSH/POP, RESET, SHIFT)
- Aser specialization (ASER_TAG, ASER_FRAG)
spec/compiler.sx — SX-to-bytecode compiler (SX code, portable):
- Scope analysis: resolve variables to local/upvalue/global at compile time
- Tail position detection for TCO
- Code generation for: if, when, and, or, let, begin, lambda,
define, set!, quote, function calls, primitive calls
- Constant pool with deduplication
- Jump patching for forward references
hosts/ocaml/lib/sx_vm.ml — bytecode interpreter (OCaml):
- Stack-based VM with array-backed operand stack
- Call frames with base pointer for locals
- Direct opcode dispatch via pattern match
- Zero allocation per step (unlike CEK machine's dict-per-step)
- Handles: constants, variables, jumps, calls, primitives,
collections, string concat, define
Architecture: compiler.sx is spec (SX, portable). VM is platform
(OCaml-native). Same bytecode runs on JS/WASM VMs.
Also includes: CekFrame record optimization in transpiler.sx
(29 frame types as records instead of Hashtbl).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Transpiler detects dict literals with a "type" string field and emits
CekFrame records instead of Dict(Hashtbl). Maps frame-specific fields
to generic record slots:
cf_type, cf_env, cf_name, cf_body, cf_remaining, cf_f,
cf_args (also evaled), cf_results (also raw-args),
cf_extra (ho-type/scheme/indexed/match-val/current-item/...),
cf_extra2 (emitted/effect-list/first-render)
Runtime get_val handles CekFrame with direct field match — O(1)
field access vs Hashtbl.find.
Bootstrapper: skip stdlib.sx entirely (already OCaml primitives).
Result: 29 CekFrame + 2 CekState = 31 record types, only 8
Hashtbl.create remaining (effect-annotations, empty dicts).
Benchmark (200 divs): 2.94s → 1.71s (1.7x speedup from baseline).
Real pages: ~same as CekState-only (frames are <20% of allocations;
states dominate at 199K/page).
Foundation for JIT: record-based value representation enables
typed compilation — JIT can emit direct field access instead of
hash table lookups.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Transpiler (transpiler.sx): detects CEK state dict literals (5 fields:
control/env/kont/phase/value) and emits CekState OCaml record instead
of Dict(Hashtbl). Eliminates 200K Hashtbl allocations per page.
Bootstrapper: skip stdlib.sx (functions already registered as OCaml
primitives). Only transpile evaluator.sx.
Runtime: get_val handles CekState with direct field access. type_of
returns "dict" for CekState (backward compat).
Profiling results (root cause of slowness):
Pure eval: OCaml 1.6x FASTER than Python (expected)
Aser: OCaml 28x SLOWER than Python (unexpected!)
Root cause: Python has a native optimized aser. OCaml runs the SX
adapter-sx.sx through the CEK machine — each aserCall is ~50 CEK
steps with closures, scope operations, string building.
Fix needed: native OCaml aser (like Python's), not SX adapter
through CEK machine.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
spec-introspect.sx: pure SX functions that read, parse, and analyze
spec files. No Python. The spec IS data — a macro transforms it into
explorer UI components.
- spec-explore: reads spec file via IO, parses with sx-parse, extracts
sections/defines/effects/params, produces explorer data dict
- spec-form-name/kind/effects/params/source: individual extractors
- spec-group-sections: groups defines into sections
- spec-compute-stats: aggregate effect/define counts
OCaml kernel fixes:
- nth handles strings (character indexing for parser)
- ident-start?, ident-char?, char-numeric?, parse-number: platform
primitives needed by spec/parser.sx when loaded at runtime
- _find_spec_file: searches spec/, web/, shared/sx/ref/ for spec files
83/84 Playwright tests pass. The 1 failure is client-side re-rendering
of the spec explorer (the client evaluates defpage content which calls
find-spec — unavailable on the client).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
SX-to-OCaml transpiler (transpiler.sx) generates sx_ref.ml (~90KB, ~135
mutually recursive functions) from the spec evaluator. Foundation tests
all pass: parser, primitives, env operations, type system.
Key design decisions:
- Env variant added to value type for CEK state dict storage
- Continuation carries optional data dict for captured frames
- Dynamic var tracking distinguishes OCaml fn calls from SX value dispatch
- Single let rec...and block for forward references between all defines
- Unused ref pre-declarations eliminated via let-bound name detection
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>