rose-ash/sx/sx/plans/wasm-bytecode-vm.sx
giles b0920a1121 Rename all 1,169 components to path-based names with namespace support
Component names now reflect filesystem location using / as path separator
and : as namespace separator for shared components:
  ~sx-header → ~layouts/header
  ~layout-app-body → ~shared:layout/app-body
  ~blog-admin-dashboard → ~admin/dashboard

209 files, 4,941 replacements across all services.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:00:12 +00:00

;; ---------------------------------------------------------------------------
;; WASM Bytecode VM — Compile SX to bytecode, run in Rust/WASM
;; ---------------------------------------------------------------------------
(defcomp ~plans/wasm-bytecode-vm/plan-wasm-bytecode-vm-content ()
(~docs/page :title "WASM Bytecode VM"
(~docs/section :title "The Idea" :id "idea"
(p "Currently the client-side SX runtime is a tree-walking interpreter bootstrapped to JavaScript. The server sends " (strong "SX source text") " — component definitions, page content — and the browser parses and evaluates it.")
(p "The alternative: compile SX to a " (strong "compact bytecode format") ", ship bytecode to the browser, and execute it in a " (strong "WebAssembly VM written in Rust") ". The VM calls out to JavaScript for DOM operations and I/O via standard WASM↔JS bindings.")
(p "This fits naturally into the SX host architecture. Rust becomes another bootstrapper target. The spec compiles to Rust the same way it compiles to Python and JavaScript. The WASM module is the client-side expression of that Rust target."))
;; -----------------------------------------------------------------------
;; Why
;; -----------------------------------------------------------------------
(~docs/section :title "Why" :id "why"
(ul :class "list-disc list-inside space-y-2"
(li (strong "Wire size") " — bytecode is far more compact than source text. No redundant whitespace, no comments, no repeated symbol names. A component bundle that's 40KB of SX source might be 8KB of bytecode.")
(li (strong "No parse overhead") " — the browser currently parses every SX source string (tokenize → AST → eval). Bytecode skips parsing entirely.")
(li (strong "Eval performance") " — a Rust VM with a tight dispatch loop should be significantly faster than tree-walking in JavaScript. This matters most for compute-heavy islands, large list rendering, and complex CSSX calculations.")
(li (strong "Rust as a host target") " — an architectural goal. The spec should compile to every host. Rust/WASM proves the architecture is truly portable.")
(li (strong "Content-addressed bytecode") " — bytecode modules have deterministic content hashes (CIDs). Fits perfectly with the content-addressed components plan — fetch bytecode by CID from anywhere.")))
;; -----------------------------------------------------------------------
;; Architecture
;; -----------------------------------------------------------------------
(~docs/section :title "Architecture" :id "architecture"
(p "Three new layers, all specced in " (code ".sx") " and bootstrapped:")
(h4 :class "font-semibold mt-4 mb-2" "1. Bytecode format — bytecode.sx")
(p "A spec for the bytecode instruction set. Stack-based VM (simpler than register-based, natural fit for s-expressions). Instructions:")
(~docs/code :code (highlight ";; Core instructions\nPUSH_CONST idx ;; push constant from pool\nPUSH_NIL ;; push nil\nPUSH_TRUE / PUSH_FALSE\nLOOKUP idx ;; look up symbol by index\nSET idx ;; define/set symbol\nCALL n ;; call top-of-stack with n args\nTAIL_CALL n ;; tail call (TCO)\nRETURN\nJUMP offset ;; unconditional jump\nJUMP_IF_FALSE offset ;; conditional jump\nMAKE_LAMBDA idx n_params ;; create closure\nMAKE_LIST n ;; collect n stack values into list\nMAKE_DICT n ;; collect 2n stack values into dict\nPOP ;; discard top\nDUP ;; duplicate top" "lisp"))
(p "Bytecode modules contain: a " (strong "constant pool") " (strings, numbers, symbols), a " (strong "code section") " (instruction bytes), and a " (strong "metadata section") " (source maps, component/island declarations for the host to register).")
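(p "As a rough illustration of how a VM could execute these instructions, the Rust sketch below runs a dispatch loop over a handful of them. The opcode numbers, the " (code "Value") " enum, the " (code "run") " function, single-byte operands, and absolute jump targets are all assumptions made for the example; the real encoding is left to " (code "bytecode.sx") ".")
(~docs/code :code (highlight "// Hypothetical opcode numbers; the real encoding belongs in bytecode.sx.\nconst PUSH_CONST: u8 = 0;\nconst PUSH_NIL: u8 = 1;\nconst JUMP: u8 = 2;\nconst JUMP_IF_FALSE: u8 = 3;\nconst POP: u8 = 4;\nconst DUP: u8 = 5;\nconst RETURN: u8 = 6;\n\n#[derive(Clone, Debug, PartialEq)]\nenum Value { Nil, Bool(bool), Num(f64) }\n\n// Assumes one-byte operands and absolute jump targets.\nfn run(code: &[u8], consts: &[Value]) -> Option<Value> {\n  let mut stack: Vec<Value> = Vec::new();\n  let mut pc = 0;\n  while pc < code.len() {\n    let op = code[pc];\n    pc += 1;\n    match op {\n      PUSH_CONST => { stack.push(consts[code[pc] as usize].clone()); pc += 1; }\n      PUSH_NIL => stack.push(Value::Nil),\n      JUMP => pc = code[pc] as usize,\n      JUMP_IF_FALSE => {\n        let target = code[pc] as usize;\n        pc += 1;\n        // Assumes nil and false are the only falsy values.\n        if matches!(stack.pop()?, Value::Nil | Value::Bool(false)) { pc = target; }\n      }\n      POP => { stack.pop()?; }\n      DUP => { let top = stack.last()?.clone(); stack.push(top); }\n      RETURN => return stack.pop(),\n      _ => return None, // unknown opcode\n    }\n  }\n  stack.pop()\n}\n\nfn main() {\n  // nil is falsy, so the jump skips the first PUSH_CONST and lands on index 6.\n  let code = [PUSH_NIL, JUMP_IF_FALSE, 6, PUSH_CONST, 0, RETURN, PUSH_CONST, 1, RETURN];\n  let consts = [Value::Num(1.0), Value::Num(2.0)];\n  assert_eq!(run(&code, &consts), Some(Value::Num(2.0)));\n}" "rust"))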
(h4 :class "font-semibold mt-4 mb-2" "2. Compiler — compile.sx")
(p "An SX-to-bytecode compiler, " (strong "written in SX") ". Takes parsed AST, emits bytecode modules. Handles:")
(ul :class "list-disc list-inside space-y-2 mt-2"
(li (strong "Macro expansion") " — all macros expanded at compile time. The VM never sees macros.")
(li (strong "Constant folding") " — pure expressions with known values computed at compile time.")
(li (strong "Closure analysis") " — determines free variables for each lambda, emits efficient capture instructions.")
(li (strong "Tail call detection") " — emits TAIL_CALL instead of CALL + RETURN for tail positions.")
(li (strong "Component metadata") " — defcomp/defisland declarations are extracted and stored in the module metadata, so the host can register them without evaluating the body."))
(p "Bootstrapped to Python (server-side compilation) and Rust (if self-compilation is needed).")
(h4 :class "font-semibold mt-4 mb-2" "3. VM — bootstrap_rs.py → Rust/WASM")
(p "A Rust implementation of the SX platform interface. The bootstrapper (" (code "bootstrap_rs.py") ") translates the spec to Rust source, which compiles to both:")
(ul :class "list-disc list-inside space-y-2 mt-2"
(li (strong "Native binary") " — for server-side evaluation (can stand in for the Python evaluators)")
(li (strong "WASM module") " — for browser-side evaluation (can stand in for sx-browser.js)")))
;; -----------------------------------------------------------------------
;; DOM interop
;; -----------------------------------------------------------------------
(~docs/section :title "DOM Interop" :id "dom-interop"
(p "The main engineering challenge. Every DOM operation crosses the WASM↔JS boundary. Two strategies:")
(h4 :class "font-semibold mt-4 mb-2" "Strategy A: Direct calls")
(p "Each DOM operation (" (code "createElement") ", " (code "setAttribute") ", " (code "appendChild") ") is a separate WASM→JS call. Simple, works, but ~50ns overhead per call. For a page with 1,000 DOM operations, that's ~50μs — negligible.")
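(~docs/code :code (highlight "// JS side: passed to the WASM module as the import dom_create_element.\n// readString and storeHandle are helper functions of the glue layer.\nfunction domCreateElement(tag_ptr, tag_len) {\n const tag = readString(tag_ptr, tag_len); // decode UTF-8 from linear memory\n return storeHandle(document.createElement(tag)); // integer handle for WASM\n}\n\n// Rust side: declares the import\nextern \"C\" { fn dom_create_element(tag: *const u8, len: u32) -> u32; }" "javascript"))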
(h4 :class "font-semibold mt-4 mb-2" "Strategy B: Command buffer")
(p "Batch DOM operations in WASM memory as a command buffer. Flush to JS in one call. JS walks the buffer and applies all operations. Fewer boundary crossings, but more complex.")
(~docs/code :code (highlight ";; Command buffer format (in shared WASM memory)\n;; [CREATE_ELEMENT, tag_idx, handle_out]\n;; [SET_ATTR, handle, key_idx, val_idx]\n;; [APPEND_CHILD, parent_handle, child_handle]\n;; [SET_TEXT, handle, text_idx]\n;; Then: (flush-dom-commands)" "lisp"))
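(p "A hedged sketch of how the Rust side might fill such a buffer. The opcode values, the " (code "CommandBuffer") " type, and its method names are hypothetical. The sketch also assumes that string operands are constant-pool indices and that the WASM side allocates handle IDs.")
(~docs/code :code (highlight "// Hypothetical opcode values; none of these come from a spec.\nconst CREATE_ELEMENT: u32 = 1;\nconst SET_ATTR: u32 = 2;\nconst APPEND_CHILD: u32 = 3;\n\nstruct CommandBuffer { words: Vec<u32>, next_handle: u32 }\n\nimpl CommandBuffer {\n  fn new() -> Self { CommandBuffer { words: Vec::new(), next_handle: 1 } }\n\n  // Queues element creation and returns the handle JS should bind to the node.\n  fn create_element(&mut self, tag_idx: u32) -> u32 {\n    let handle = self.next_handle;\n    self.next_handle += 1;\n    self.words.extend_from_slice(&[CREATE_ELEMENT, tag_idx, handle]);\n    handle\n  }\n\n  fn set_attr(&mut self, handle: u32, key_idx: u32, val_idx: u32) {\n    self.words.extend_from_slice(&[SET_ATTR, handle, key_idx, val_idx]);\n  }\n\n  fn append_child(&mut self, parent: u32, child: u32) {\n    self.words.extend_from_slice(&[APPEND_CHILD, parent, child]);\n  }\n}\n\nfn main() {\n  let mut buf = CommandBuffer::new();\n  let div = buf.create_element(0);\n  let span = buf.create_element(1);\n  buf.set_attr(span, 2, 3);\n  buf.append_child(div, span);\n  // A flush would hand buf.words to JS in a single boundary crossing.\n  assert_eq!(buf.words.len(), 13);\n}" "rust"))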
(p "Strategy A is simpler and sufficient for SX workloads. Strategy B is an optimisation if profiling shows the boundary crossing matters. " (strong "Start with A, measure, switch to B only if needed.")))
;; -----------------------------------------------------------------------
;; String handling
;; -----------------------------------------------------------------------
(~docs/section :title "String Handling" :id "strings"
(p "WASM has no native string type. Strings must cross the boundary through the module's linear memory, which JS sees as an " (code "ArrayBuffer") ". Options:")
(ul :class "list-disc list-inside space-y-2 mt-2"
(li (strong "Copy on crossing") " — encode to UTF-8 in WASM linear memory, JS reads via " (code "TextDecoder") ". Simple, safe, ~1μs per string.")
(li (strong "String interning") " — the constant pool already interns all string literals. Assign each a numeric ID. DOM operations reference strings by ID. JS maintains a parallel string table. Strings never cross the boundary — only IDs do.")
(li (strong "Hybrid") " — intern constants (attribute names, tag names, class names), copy dynamic strings (computed CSS values, interpolated text)."))
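(p "The interning option can be sketched in Rust as a table that hands out one numeric ID per distinct string. " (code "StringTable") " and " (code "intern") " are hypothetical names, and the step that sends the table to JS at init time is not shown.")
(~docs/code :code (highlight "use std::collections::HashMap;\n\n// Hypothetical intern table: maps each distinct string to a stable numeric ID.\nstruct StringTable { ids: HashMap<String, u32>, strings: Vec<String> }\n\nimpl StringTable {\n  fn new() -> Self { StringTable { ids: HashMap::new(), strings: Vec::new() } }\n\n  // Returns the existing ID for s, or assigns the next free one.\n  fn intern(&mut self, s: &str) -> u32 {\n    if let Some(&id) = self.ids.get(s) { return id; }\n    let id = self.strings.len() as u32;\n    self.strings.push(s.to_string());\n    self.ids.insert(s.to_string(), id);\n    id\n  }\n}\n\nfn main() {\n  let mut table = StringTable::new();\n  let a = table.intern(\"class\");\n  let b = table.intern(\"div\");\n  assert_eq!(table.intern(\"class\"), a); // same string, same ID\n  assert_ne!(a, b);\n  // table.strings would be shared with JS once; later DOM calls pass only IDs.\n}" "rust"))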
(p "String interning is the right default, with copying reserved for dynamic strings (in effect the hybrid option). Most DOM attribute values in SX are constant strings, and dynamic strings (like CSSX colour output) are the minority. The constant pool already has all the static strings; just share it with JS at init time."))
;; -----------------------------------------------------------------------
;; Memory management
;; -----------------------------------------------------------------------
(~docs/section :title "Memory & Closures" :id "memory"
(p "SX values that the VM must manage:")
(ul :class "list-disc list-inside space-y-2 mt-2"
(li (strong "Closures") " — lambda captures free variables. Rust: " (code "Rc<Closure>") " with captured env as " (code "Vec<Value>") ".")
(li (strong "Signals") " — reference-counted mutable cells with subscriber lists. The lists hold weak references to computed nodes to avoid reference cycles.")
(li (strong "Lists/Dicts") " — immutable by convention (SX doesn't mutate collections). Arena-allocate per evaluation, free the arena when done.")
(li (strong "DOM handles") " — opaque integers referencing JS-side DOM nodes. A handle table in JS maps handle IDs to actual DOM objects. Handles are freed when the island disposes."))
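(p "One way these representations could look in Rust, as a sketch rather than a settled design. The field names and the " (code "Computed") " type are assumptions, and the sketch reads the signal bullet as: a signal's subscriber list holds weak references to computed nodes.")
(~docs/code :code (highlight "use std::cell::RefCell;\nuse std::rc::{Rc, Weak};\n\n#[derive(Clone)]\nenum Value { Nil, Num(f64), Closure(Rc<Closure>) }\n\n// A closure: where its code starts, plus the captured free variables.\nstruct Closure { code_offset: usize, captured: Vec<Value> }\n\n// A computed node that depends on one or more signals.\nstruct Computed { dirty: RefCell<bool> }\n\n// The subscriber list holds weak references, so no Rc cycle forms.\nstruct Signal { value: RefCell<Value>, subscribers: RefCell<Vec<Weak<Computed>>> }\n\nimpl Signal {\n  fn set(&self, v: Value) {\n    *self.value.borrow_mut() = v;\n    // Forget subscribers that were already freed; mark the rest dirty.\n    self.subscribers.borrow_mut().retain(|w| match w.upgrade() {\n      Some(c) => { *c.dirty.borrow_mut() = true; true }\n      None => false,\n    });\n  }\n}\n\nfn main() {\n  let sig = Signal { value: RefCell::new(Value::Nil), subscribers: RefCell::new(Vec::new()) };\n  let node = Rc::new(Computed { dirty: RefCell::new(false) });\n  sig.subscribers.borrow_mut().push(Rc::downgrade(&node));\n  sig.set(Value::Num(1.0));\n  assert!(*node.dirty.borrow());\n  drop(node); // e.g. the owning island was disposed\n  sig.set(Value::Num(2.0));\n  assert!(sig.subscribers.borrow().is_empty());\n}" "rust"))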
(p "The " (code "with-island-scope") " pattern already models the cleanup boundary. In Rust: each island gets its own " (code "Arena") " + signal scope. When the island is removed from the DOM, drop the arena and release the island's entries in the JS handle table, so all closures, signals, and DOM handles for that island are freed in one shot."))
;; -----------------------------------------------------------------------
;; What gets compiled
;; -----------------------------------------------------------------------
(~docs/section :title "What Gets Compiled" :id "compilation"
(p "Not everything needs bytecode. The compilation boundary follows the existing server/client split:")
(div :class "overflow-x-auto rounded border border-stone-200 mb-4"
(table :class "w-full text-left text-sm"
(thead (tr :class "border-b border-stone-200 bg-stone-100"
(th :class "px-3 py-2 font-medium text-stone-600" "Content")
(th :class "px-3 py-2 font-medium text-stone-600" "Format")
(th :class "px-3 py-2 font-medium text-stone-600" "Why")))
(tbody
(tr :class "border-b border-stone-100"
(td :class "px-3 py-2 text-stone-700" "Component definitions")
(td :class "px-3 py-2 text-stone-700" "Bytecode")
(td :class "px-3 py-2 text-stone-600" "Evaluated on every page load, benefits from fast dispatch"))
(tr :class "border-b border-stone-100"
(td :class "px-3 py-2 text-stone-700" "Client library (@client files)")
(td :class "px-3 py-2 text-stone-700" "Bytecode")
(td :class "px-3 py-2 text-stone-600" "CSSX functions, colour computation — pure code that runs client-side"))
(tr :class "border-b border-stone-100"
(td :class "px-3 py-2 text-stone-700" "Page content (SX wire responses)")
(td :class "px-3 py-2 text-stone-700" "SX source or bytecode")
(td :class "px-3 py-2 text-stone-600" "Wire responses are small, parse overhead minimal. Bytecode optional."))
(tr :class "border-b border-stone-100"
(td :class "px-3 py-2 text-stone-700" "Macros")
(td :class "px-3 py-2 text-stone-700" "Expanded at compile time")
(td :class "px-3 py-2 text-stone-600" "VM never sees macros — they're pure compile-time constructs"))
(tr
(td :class "px-3 py-2 text-stone-700" "Server-affinity components")
(td :class "px-3 py-2 text-stone-700" "Not compiled")
(td :class "px-3 py-2 text-stone-600" "Expanded server-side, never sent to client"))))))
;; -----------------------------------------------------------------------
;; Bytecode vs direct WASM compilation
;; -----------------------------------------------------------------------
(~docs/section :title "Bytecode VM vs Direct WASM Compilation" :id "vm-vs-direct"
(p "Two paths to WASM. The choice matters:")
(div :class "overflow-x-auto rounded border border-stone-200 mb-4"
(table :class "w-full text-left text-sm"
(thead (tr :class "border-b border-stone-200 bg-stone-100"
(th :class "px-3 py-2 font-medium text-stone-600" "")
(th :class "px-3 py-2 font-medium text-stone-600" "Bytecode VM in WASM")
(th :class "px-3 py-2 font-medium text-stone-600" "Compile SX → WASM directly")))
(tbody
(tr :class "border-b border-stone-100"
(td :class "px-3 py-2 font-semibold text-stone-700" "Complexity")
(td :class "px-3 py-2 text-stone-700" "Standard VM design — proven pattern")
(td :class "px-3 py-2 text-stone-700" "Full compiler backend (SSA, register alloc, WASM codegen)"))
(tr :class "border-b border-stone-100"
(td :class "px-3 py-2 font-semibold text-stone-700" "Dynamic loading")
(td :class "px-3 py-2 text-stone-700" "Trivial — load bytecode module, eval")
(td :class "px-3 py-2 text-stone-700" "Hard — must instantiate new WASM module per chunk"))
(tr :class "border-b border-stone-100"
(td :class "px-3 py-2 font-semibold text-stone-700" "eval / REPL")
(td :class "px-3 py-2 text-stone-700" "Works — compile + eval at runtime")
(td :class "px-3 py-2 text-stone-700" "Impossible without bundling a compiler"))
(tr :class "border-b border-stone-100"
(td :class "px-3 py-2 font-semibold text-stone-700" "Performance")
(td :class "px-3 py-2 text-stone-700" "Fast — WASM dispatch loop, no JS overhead")
(td :class "px-3 py-2 text-stone-700" "Fastest — native WASM speed, no dispatch"))
(tr :class "border-b border-stone-100"
(td :class "px-3 py-2 font-semibold text-stone-700" "Module size")
(td :class "px-3 py-2 text-stone-700" "One VM module (~100KB) + bytecode per page")
(td :class "px-3 py-2 text-stone-700" "Per-page WASM modules, each self-contained"))
(tr
(td :class "px-3 py-2 font-semibold text-stone-700" "Debugging")
(td :class "px-3 py-2 text-stone-700" "Source maps over bytecode")
(td :class "px-3 py-2 text-stone-700" "DWARF debug info in WASM")))))
(p (strong "Bytecode VM is the right choice.") " SX needs dynamic loading (HTMX responses inject new components), runtime eval (islands, reactive updates), and incremental compilation (page-by-page). Direct WASM compilation is better for static, ahead-of-time scenarios — not for a live hypermedia system."))
;; -----------------------------------------------------------------------
;; Dual target — same spec, runtime choice
;; -----------------------------------------------------------------------
(~docs/section :title "Dual Target: JS or WASM from the Same Spec" :id "dual-target"
(p "The key insight: this is " (strong "not a replacement") " for the JS evaluator. It's " (strong "another compilation target from the same spec") ". The existing bootstrapper pipeline already proves this pattern:")
(~docs/code :code (highlight "eval.sx ──→ bootstrap_js.py ──→ sx-ref.js (browser, JS eval)\n ──→ bootstrap_py.py ──→ sx_ref.py (server, Python eval)\n ──→ bootstrap_rs.py ──→ sx-vm.wasm (browser, WASM eval) ← new" "text"))
(p "All three outputs have identical semantics because they're compiled from the same source. The choice of which to use is a " (strong "deployment decision") ", not an architectural one:")
(ul :class "list-disc list-inside space-y-2 mt-2"
(li (strong "JS-only") " — current default. Works everywhere. Zero WASM dependency. Ship sx-browser.js + SX source text.")
(li (strong "WASM-only") " — maximum performance. Ship sx-vm.wasm + bytecode. Requires WebAssembly support, which all current major browsers provide.")
(li (strong "Progressive") " — try WASM, fall back to JS. Ship both. The server sends bytecode in a " (code "<script type=\"text/sx-bytecode\">") " tag and source in " (code "<script type=\"text/sx\">") ". The boot script picks whichever runtime loaded.")
(li (strong "Per-page") " — heavy pages (data tables, complex islands) use WASM. Simple pages use JS. The server decides based on page complexity."))
(p "The server-side choice is the same pattern. The Python bootstrapped evaluator (" (code "sx_ref.py") ") and the Rust native binary are interchangeable. A deployment could use Rust for production (speed) and Python for development (debugging, hot-reload).")
(p "This is the architectural payoff of the self-hosting spec. Write the semantics once. Compile to every target. Choose at deploy time which targets to use. " (strong "The spec doesn't know or care which host runs it.")))
;; -----------------------------------------------------------------------
;; Implementation phases
;; -----------------------------------------------------------------------
(~docs/section :title "Implementation Phases" :id "phases"
(div :class "overflow-x-auto rounded border border-stone-200 mb-4"
(table :class "w-full text-left text-sm"
(thead (tr :class "border-b border-stone-200 bg-stone-100"
(th :class "px-3 py-2 font-medium text-stone-600" "Phase")
(th :class "px-3 py-2 font-medium text-stone-600" "What")
(th :class "px-3 py-2 font-medium text-stone-600" "Deliverable")))
(tbody
(tr :class "border-b border-stone-100"
(td :class "px-3 py-2 text-stone-700" "1")
(td :class "px-3 py-2 text-stone-700" "Spec the bytecode format in bytecode.sx. Instruction set, constant pool layout, module structure, encoding.")
(td :class "px-3 py-2 text-stone-600" "bytecode.sx — format spec + serializer/deserializer"))
(tr :class "border-b border-stone-100"
(td :class "px-3 py-2 text-stone-700" "2")
(td :class "px-3 py-2 text-stone-700" "Write the compiler in SX. AST → bytecode. Macro expansion, constant folding, tail call detection, closure analysis.")
(td :class "px-3 py-2 text-stone-600" "compile.sx — bootstrapped to Python for server-side compilation"))
(tr :class "border-b border-stone-100"
(td :class "px-3 py-2 text-stone-700" "3")
(td :class "px-3 py-2 text-stone-700" "Write bootstrap_rs.py — Rust bootstrapper. Translates spec to Rust source implementing the platform interface.")
(td :class "px-3 py-2 text-stone-600" "bootstrap_rs.py + generated Rust crate"))
(tr :class "border-b border-stone-100"
(td :class "px-3 py-2 text-stone-700" "4")
(td :class "px-3 py-2 text-stone-700" "Implement bytecode VM in Rust. Dispatch loop, value representation, closure/env model, GC strategy.")
(td :class "px-3 py-2 text-stone-600" "sx-vm crate — native binary + WASM target"))
(tr :class "border-b border-stone-100"
(td :class "px-3 py-2 text-stone-700" "5")
(td :class "px-3 py-2 text-stone-700" "JS bindings for DOM. WASM imports for createElement, setAttribute, appendChild, event listeners. Handle table.")
(td :class "px-3 py-2 text-stone-600" "sx-vm-dom.js — JS glue layer (~2KB)"))
(tr :class "border-b border-stone-100"
(td :class "px-3 py-2 text-stone-700" "6")
(td :class "px-3 py-2 text-stone-700" "Server-side bytecode compilation pipeline. Component registration emits bytecode alongside source. Bytecode hash for caching.")
(td :class "px-3 py-2 text-stone-600" "Bytecode in data-components script tags, fallback to source"))
(tr
(td :class "px-3 py-2 text-stone-700" "7")
(td :class "px-3 py-2 text-stone-700" "Shadow-compare: run JS evaluator and WASM VM in parallel, assert identical DOM output on every page render.")
(td :class "px-3 py-2 text-stone-600" "Confidence to switch over"))))))
;; -----------------------------------------------------------------------
;; Interaction with existing plans
;; -----------------------------------------------------------------------
(~docs/section :title "Interaction with Other Plans" :id "interactions"
(ul :class "list-disc list-inside space-y-2"
(li (strong "Async Eval Convergence") " — must complete first. The spec must be the single evaluator before we add another target. Otherwise we'd be bootstrapping a fork.")
(li (strong "Runtime Slicing") " — the WASM module can be tiered just like the JS runtime. L0 hypermedia needs no VM at all (pure HTML). L1 DOM ops needs a minimal VM. L2 islands needs signals. The WASM module should be tree-shakeable.")
(li (strong "Content-Addressed Components") " — bytecode modules are ideal for content addressing. Deterministic compilation means the same SX source always produces the same bytecode → same CID. Fetch bytecode by CID from IPFS.")
(li (strong "Self-Hosting Bootstrappers") " — compile.sx is bootstrapped by the existing Python bootstrapper. The Rust bootstrapper translates the spec. Self-hosting chain: SX → Python → Rust → WASM.")
(li (strong "js.sx AOT compiler") " — complementary, not competing. js.sx compiles components to static JS for zero-runtime sites. The WASM VM is for dynamic sites with islands, signals, and runtime eval. Both are valid compilation targets.")))
;; -----------------------------------------------------------------------
;; Principles
;; -----------------------------------------------------------------------
(~docs/section :title "Principles" :id "principles"
(ul :class "list-disc list-inside space-y-2"
(li (strong "The spec remains the single source of truth.") " The bytecode format, compiler, and VM semantics are all specced in .sx. The Rust VM is just another host, like Python and JavaScript.")
(li (strong "Bytecode is an optimisation, not a requirement.") " SX source text remains a valid wire format. The system degrades gracefully — if WASM isn't available, fall back to the JS evaluator. Progressive enhancement.")
(li (strong "The VM is dumb, the compiler is smart.") " Macro expansion, constant folding, tail call detection — all done at compile time. The VM is a simple dispatch loop. This keeps the WASM module small and the compilation fast.")
(li (strong "DOM is on the other side of the wall.") " The VM never touches DOM directly. All DOM operations go through the JS binding layer. This keeps the WASM module platform-independent — the same VM could target Node.js, Deno, native desktop (via webview), or headless rendering.")
(li (strong "Shadow-compare before switching.") " Same principle as async eval convergence. Run both paths in parallel, assert identical output. No big-bang cutover.")))
;; -----------------------------------------------------------------------
;; Outcome
;; -----------------------------------------------------------------------
(~docs/section :title "Outcome" :id "outcome"
(p "After completion:")
(ul :class "list-disc list-inside space-y-2 mt-2"
(li "SX compiles to four targets: JavaScript, Python, Rust (native), Rust (WASM)")
(li "Client wire format is an estimated ~5x smaller (bytecode vs source text)")
(li "No parsing overhead on the client — bytecode loads directly into the VM")
(li "Reactive islands run in a compiled WASM dispatch loop instead of a JavaScript tree-walker")
(li "The architecture proof is complete: one spec, every host, every target")
(li "Content-addressed bytecode modules can be fetched from any CDN or IPFS gateway")))))