OCaml-on-SX: OCaml + ReasonML + Dream on the CEK/VM
The meta-circular demo: SX's native evaluator is OCaml, so implementing OCaml on top of SX closes the loop — the source language of the host is running inside the host it compiles to. Beyond the elegance, it's practically useful: once OCaml expressions run on the SX CEK/VM you get Dream (a clean OCaml web framework) almost for free, and ReasonML is a syntax variant that shares the same transpiler output.
End-state goal: OCaml programs running on the SX CEK/VM, with enough of the standard
library to support Dream's middleware model. Dream-on-SX is the integration target —
a handler/middleware/router API that feels idiomatic while running purely in SX.
ReasonML (Phase 8) adds an alternative syntax frontend that targets the same transpiler.
What this covers that nothing else in the set does
- Strict ML semantics: unlike Haskell, OCaml is call-by-value with explicit `Lazy.t` for laziness. Exhaustive pattern matching. Polymorphic variants. Structural equality.
- First-class modules and functors: modules as runtime values (Phase 4; typed pack/unpack in Phase 5); functors as SX higher-order functions over module records. Unlike Haskell typeclasses, OCaml's module system is explicit and compositional.
- Mutable state without monads: `ref`, `:=`, `!` are primitives. Arrays. `Hashtbl`. The IO model is direct; `Lwt`/Dream map to `perform`/`cek-resume` for async.
- Dream's composable HTTP model: `handler = request -> response promise`, `middleware = handler -> handler`. Algebraically clean; `@@` composition maps to SX function composition trivially.
- ReasonML: same semantics, JS-friendly surface syntax. JSX variant pairs with SX component rendering.
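
The first and third points above can be seen in a few lines of ordinary OCaml. This is only an illustrative sketch of the reference behaviour the SX implementation would need to reproduce; the names `log`, `note`, and `const_zero` are made up for the example.

```ocaml
(* Call-by-value: an argument is evaluated before the call, so its side
   effect happens even though the callee ignores the value. *)
let log = ref []                      (* mutable cell, no monad needed *)
let note msg v = log := msg :: !log; v

let const_zero _ = 0
let strict_result = const_zero (note "evaluated eagerly" 1)

(* Laziness is opt-in: the suspended body runs only when forced, and once. *)
let delayed = lazy (note "forced" 42)
let before_force = List.length !log
let forced_twice = Lazy.force delayed + Lazy.force delayed
let after_force = List.length !log

(* Structural equality compares contents, not identity. *)
let structurally_equal = ([1; 2], "a") = ([1; 2], "a")
```

Here `before_force` is 1 and `after_force` is 2: the eager argument was logged immediately, while the lazy body was logged only on the first `Lazy.force`.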
Ground rules
- Scope: only touch `lib/ocaml/**`, `lib/dream/**`, `lib/reasonml/**`, and `plans/ocaml-on-sx.md`. Do not edit `spec/`, `hosts/`, `shared/`, or other `lib/<lang>/`.
- Shared-file issues go under "Blockers" below with a minimal repro; do not fix here.
- SX files: use `sx-tree` MCP tools only.
- Architecture: OCaml source → AST → SX AST → CEK. No standalone OCaml evaluator. The OCaml AST is walked by an `ocaml-eval` function in SX that produces SX values.
- Type system: deferred until Phase 5. Phases 1–4 are intentionally untyped: get the evaluator right first, then layer HM inference on top.
- Dream: implemented as a library in Phase 7; no separate build step. `Dream.run` wraps SX's existing HTTP server machinery via `perform`/`cek-resume`.
- Commits: one feature per commit. Keep `## Progress log` updated and tick boxes.
Architecture sketch
OCaml source text
│
▼
lib/ocaml/tokenizer.sx — keywords, operators, string/char literals, comments
│
▼
lib/ocaml/parser.sx — OCaml AST: let/let rec, fun, match, if, begin/end,
│ module/struct/functor, type decls, expressions
▼
lib/ocaml/desugar.sx — surface → core: tuple patterns, or-patterns,
│ sequence (;) → (do), when guards, field punning
▼
lib/ocaml/transpile.sx — OCaml AST → SX AST
│
▼
lib/ocaml/runtime.sx — ADT constructors, module primitives, ref/array ops,
│ Stdlib shims, Dream server (phase 7)
▼
SX CEK evaluator (both JS and OCaml hosts)
Semantic mappings
| OCaml construct | SX mapping |
|---|---|
| `let x = e` (top-level) | `(define x e)` |
| `let f x y = e` | `(define (f x y) e)` |
| `let rec f x = e` | `(define (f x) e)`; SX `define` is already recursive |
| `fun x -> e` | `(fn (x) e)` |
| `e1 \|> f` | `(f e1)`; pipe desugars to reverse application |
| `e1; e2` | `(do e1 e2)` |
| `begin e1; e2; e3 end` | `(do e1 e2 e3)` |
| `if c then e1 else e2` | `(if c e1 e2)` |
| `match x with \| P -> e` | `(match x (P e) ...)` via Phase 6 ADT primitive |
| `type t = A \| B of int` | `(define-type t (A) (B v))` |
| `module M = struct ... end` | SX dict `{:let-bindings ...}`; module as record |
| `functor (M : S) -> ...` | `(fn (M) ...)`; functor as SX lambda over module record |
| `open M` | inject M's bindings into scope via `env-merge` |
| `M.field` | `(get M :field)` |
| `{ r with f = v }` | `(dict-set r :f v)` |
| `ref x` | `(make-ref x)`; mutable cell |
| `!r` | `(deref-ref r)` |
| `r := v` | `(set-ref! r v)` |
| `(a, b, c)` | tagged list `(:tuple a b c)` |
| `[1; 2; 3]` | `(list 1 2 3)` |
| `[\| 1; 2; 3 \|]` | |
| `try e with \| Ex -> h` | `(guard (fn (ex) h) e)` via SX exception system |
| `raise Ex` | `(perform (:raise Ex))` |
| `Printf.printf "%d" x` | `(perform (:print (format "%d" x)))` |
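
To make a few of these mappings concrete, here is a minimal sketch written in OCaml rather than SX. The `expr` type and the `to_sx` printer are hypothetical: the real AST node names in `lib/ocaml/parser.sx` and the real transpiler in `lib/ocaml/transpile.sx` may look quite different. Only the emitted SX forms are taken from the table.

```ocaml
(* Hypothetical mini-AST covering a handful of rows from the table. *)
type expr =
  | Int of int
  | Var of string
  | Fun of string * expr            (* fun x -> e *)
  | App of expr * expr              (* f e *)
  | Pipe of expr * expr             (* e |> f *)
  | Seq of expr list                (* e1; e2; ... *)
  | If of expr * expr * expr
  | Ref of expr                     (* ref e *)
  | Deref of expr                   (* !r *)
  | Assign of expr * expr           (* r := v *)
  | ListLit of expr list            (* [a; b; c] *)

(* Print the SX form that each construct maps to. *)
let rec to_sx = function
  | Int n -> string_of_int n
  | Var x -> x
  | Fun (x, body) -> Printf.sprintf "(fn (%s) %s)" x (to_sx body)
  | App (f, arg) -> Printf.sprintf "(%s %s)" (to_sx f) (to_sx arg)
  | Pipe (arg, f) -> to_sx (App (f, arg))   (* reverse application *)
  | Seq es -> form "do" es
  | If (c, t, e) -> form "if" [c; t; e]
  | Ref e -> form "make-ref" [e]
  | Deref r -> form "deref-ref" [r]
  | Assign (r, v) -> form "set-ref!" [r; v]
  | ListLit es -> form "list" es

and form head args =
  "(" ^ String.concat " " (head :: List.map to_sx args) ^ ")"

let example = to_sx (Seq [Assign (Var "r", Int 1); Deref (Var "r")])
```

With these definitions, `example` is the string `(do (set-ref! r 1) (deref-ref r))`.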
Dream semantic mappings (Phase 7)
| Dream construct | SX mapping |
|---|---|
| `handler = request -> response promise` | `(fn (req) (perform (:http-respond ...)))` |
| `middleware = handler -> handler` | `(fn (next) (fn (req) ...))` |
| `Dream.router [routes]` | `(ocaml-dream-router routes)`; dispatch on method+path |
| `Dream.get "/path" h` | route record `{:method "GET" :path "/path" :handler h}` |
| `Dream.scope "/p" [ms] [rs]` | prefix mount with middleware chain |
| `Dream.param req "name"` | path param extracted during routing |
| `m1 @@ m2 @@ handler` | `(m1 (m2 handler))`; `@@` is right-associative application |
| `Dream.session_field req "k"` | `(perform (:session-get req "k"))` |
| `Dream.set_session_field req "k" v` | `(perform (:session-set req "k" v))` |
| `Dream.flash req` | `(perform (:flash-get req))` |
| `Dream.form req` | `(perform (:form-parse req))`; returns `Ok`/`Error` ADT |
| `Dream.websocket handler` | `(perform (:websocket handler))` |
| `Dream.run handler` | starts SX HTTP server with handler as root |
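
The handler and middleware rows can be illustrated without Dream itself. The sketch below uses simplified stand-in types (a synchronous `response` rather than a response promise) and a hypothetical `tag` middleware whose only purpose is to make the nesting order of `@@` visible.

```ocaml
(* Simplified stand-ins: real Dream handlers return a response promise. *)
type request = { meth : string; path : string }
type response = { status : int; body : string }
type handler = request -> response
type middleware = handler -> handler

(* Hypothetical middleware that wraps the body so ordering is visible. *)
let tag (label : string) : middleware =
  fun next req ->
    let resp = next req in
    { resp with body = Printf.sprintf "%s(%s)" label resp.body }

let hello : handler = fun req -> { status = 200; body = "hello " ^ req.path }

(* Because @@ associates to the right, this is tag "a" (tag "b" hello). *)
let app : handler = tag "a" @@ tag "b" @@ hello

let resp = app { meth = "GET"; path = "/" }
```

Here `resp.body` is `a(b(hello /))`: the leftmost middleware is the outermost wrapper, which is what `(m1 (m2 handler))` expresses on the SX side.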
Roadmap
Phase 1 — Tokenizer + parser
- Tokenizer: keywords (`let`, `rec`, `in`, `fun`, `function`, `match`, `with`, `type`, `of`, `module`, `struct`, `end`, `functor`, `sig`, `open`, `include`, `if`, `then`, `else`, `begin`, `try`, `exception`, `raise`, `mutable`, `for`, `while`, `do`, `done`, `and`, `as`, `when`), operators (`->`, `|>`, `<|`, `@@`, `@`, `:=`, `!`, `::`, `**`, `:`, `;`, `;;`), identifiers (lower, upper/ctor), char literals `'c'`, string literals (escaped), int/float literals (incl. hex, exponent, underscores), nested block comments `(* ... *)`. (Labels `~label:` / `?label:` and heredoc `{|...|}` deferred; surface tokens already work via `~`/`?` punct + `{`/`|` punct.)
- Parser: top-level `let`/`let rec`/`type`/`module`/`exception`/`open`/`include` declarations; expressions: literals, identifiers, constructor application, lambda, application (left-assoc), binary ops with precedence table, `if`/`then`/`else`, `match`/`with`, `try`/`with`, `let`/`in`, `begin`/`end`, `fun`/`function`, tuples, list literals, record literals/updates, field access, sequences `;`, unit `()`.
- Patterns: constructor, literal, variable, wildcard `_`, tuple, list cons `::`, list literal, record, `as`, or-pattern `P1 | P2`, `when` guard.
- OCaml is not indentation-sensitive, so no layout algorithm is needed.
- Tests in `lib/ocaml/tests/parse.sx`: 50+ round-trip parse tests.
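
Nested block comments are the one tokenizer item that needs a depth counter rather than a simple scan for the closer. A minimal sketch of that idea, in OCaml for illustration only (the actual tokenizer is written in SX, and `skip_comment` is a hypothetical helper name):

```ocaml
(* Expects a comment opener at index start and returns the index just past
   the matching closer, counting nested openers and closers. *)
let skip_comment (src : string) (start : int) : int =
  let n = String.length src in
  let rec go i depth =
    if depth = 0 then i
    else if i + 1 >= n then failwith "unterminated comment"
    else if src.[i] = '(' && src.[i + 1] = '*' then go (i + 2) (depth + 1)
    else if src.[i] = '*' && src.[i + 1] = ')' then go (i + 2) (depth - 1)
    else go (i + 1) depth
  in
  go (start + 2) 1

let after_comment = skip_comment "(* a (* b *) c *) x" 0
```

In this input the inner closer at index 10 only drops the depth to 1, so scanning continues and `after_comment` is 17, the index just past the outer closer. This sketch ignores string literals inside comments, which the real OCaml lexer does track.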
Phase 2 — Core evaluator (untyped)
- `ocaml-eval` entry: walks OCaml AST, produces SX values.
- `let`/`let rec`/`let ... in` (mutually recursive with `and`).
- Lambda + application (curried by default; auto-curry multi-param defs).
- `fun`/`function` (single-arg lambda with immediate match on arg).
- `if`/`then`/`else`, `begin`/`end`, sequence `;`.
- Arithmetic, comparison, boolean ops, string `^`, `mod`.
- Unit `()` value; `ignore`.
- References: `ref`, `!`, `:=`.
- Mutable record fields.
- `for i = lo to hi do ... done` loop; `while cond do ... done`.
- `try`/`with`: maps to SX `guard`; `raise` via `perform`.
- Tests in `lib/ocaml/tests/eval.sx`: 50+ tests, pure + imperative.
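
The "curried by default" item is the part of the evaluator most worth pinning down early. The following is a small environment-passing evaluator in OCaml that shows one way to get it: multi-parameter definitions are rewritten into nested single-argument lambdas before evaluation. All type and function names here are hypothetical, and recursion (`let rec`) is deliberately left out to keep the sketch short.

```ocaml
type expr =
  | Int of int
  | Var of string
  | Add of expr * expr
  | Fun of string * expr
  | App of expr * expr
  | Let of string * expr * expr

type value =
  | VInt of int
  | VClosure of string * expr * env
and env = (string * value) list

(* Multi-parameter definitions become nested single-argument lambdas. *)
let curry params body = List.fold_right (fun p acc -> Fun (p, acc)) params body

let rec eval env = function
  | Int n -> VInt n
  | Var x -> List.assoc x env
  | Add (a, b) ->
    (match eval env a, eval env b with
     | VInt x, VInt y -> VInt (x + y)
     | _ -> failwith "type error: + expects ints")
  | Fun (x, body) -> VClosure (x, body, env)
  | App (f, arg) ->
    (match eval env f with
     | VClosure (x, body, cenv) -> eval ((x, eval env arg) :: cenv) body
     | VInt _ -> failwith "type error: not a function")
  | Let (x, e, body) -> eval ((x, eval env e) :: env) body

(* let add x y = x + y in add 2 3 *)
let program =
  Let ("add", curry ["x"; "y"] (Add (Var "x", Var "y")),
       App (App (Var "add", Int 2), Int 3))

let result = eval [] program
```

`result` is `VInt 5`, and the partial application `App (Var "add", Int 2)` on its own evaluates to a closure, which is the behaviour auto-currying is meant to preserve.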
Phase 3 — ADTs + pattern matching
- `type` declarations: `type t = A | B of t1 * t2 | C of { x: int }`.
- Constructors as tagged lists: `A` → `(:A)`, `B(1, "x")` → `(:B 1 "x")`.
- `match`/`with`: constructor, literal, variable, wildcard, tuple, list cons/nil, `as` binding, or-patterns, nested patterns, `when` guard.
- Exhaustiveness: runtime error on incomplete match (no compile-time check yet).
- Built-in types: `option` (`None`/`Some`), `result` (`Ok`/`Error`), `list` (nil/cons), `bool`, `unit`, `exn`.
- `exception` declarations; built-in: `Not_found`, `Invalid_argument`, `Failure`, `Match_failure`.
- Polymorphic variants (surface syntax `` `Tag value ``; runtime same tagged list).
- Tests in `lib/ocaml/tests/adt.sx`: 40+ tests: ADTs, match, option/result.
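
As a rough model of matching against tagged-list constructors, the OCaml sketch below returns the variable bindings produced by a successful match and falls through to a runtime failure when no clause applies. The `value` and `pat` types and the names `match_pat` and `run_match` are hypothetical, and `when` guards are omitted.

```ocaml
type value =
  | VInt of int
  | VStr of string
  | VTagged of string * value list   (* constructor tag plus payload *)

type pat =
  | PWild
  | PVar of string
  | PInt of int
  | PCtor of string * pat list
  | POr of pat * pat
  | PAs of pat * string

(* Some bindings on success, None when the pattern does not match. *)
let rec match_pat pat v : (string * value) list option =
  match pat, v with
  | PWild, _ -> Some []
  | PVar x, _ -> Some [ (x, v) ]
  | PInt n, VInt m when n = m -> Some []
  | PCtor (tag, ps), VTagged (tag', vs)
    when tag = tag' && List.length ps = List.length vs ->
    List.fold_left2
      (fun acc p v ->
         match acc with
         | None -> None
         | Some bs ->
           (match match_pat p v with
            | None -> None
            | Some bs' -> Some (bs @ bs')))
      (Some []) ps vs
  | POr (p1, p2), _ ->
    (match match_pat p1 v with Some bs -> Some bs | None -> match_pat p2 v)
  | PAs (p, x), _ -> Option.map (fun bs -> (x, v) :: bs) (match_pat p v)
  | _ -> None

(* Try clauses in order; an incomplete match is a runtime error. *)
let run_match v clauses =
  let rec go = function
    | [] -> failwith "Match_failure"
    | (pat, body) :: rest ->
      (match match_pat pat v with
       | Some bindings -> body bindings
       | None -> go rest)
  in
  go clauses
```

For example, matching the pattern `PCtor ("B", [PVar "n"; PWild])` against `VTagged ("B", [VInt 1; VStr "x"])` yields `Some [("n", VInt 1)]`.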
Phase 4 — Modules + functors
- `module M = struct let x = 1 let f y = x + y end` → SX dict `{:x 1 :f <fn>}`.
- `module type S = sig val x : int val f : int -> int end` → interface record (runtime stub; typed checking in Phase 5).
- `module M : S = struct ... end`: coercive sealing (runtime: pass-through).
- `functor (M : S) -> struct ... end` → SX `(fn (M) ...)`.
- `module F = Functor(Base)`: functor application.
- `open M`: merge M's dict into current env (`env-merge`).
- `include M`: same as `open` at structure level.
- `M.name`: dict get via `:name` key.
- First-class modules (pack/unpack): deferred to Phase 5.
- Standard module hierarchy: `List`, `Option`, `Result`, `String`, `Char`, `Int`, `Float`, `Bool`, `Unit`, `Printf`, `Format` (stubs, filled in Phase 6).
- Tests in `lib/ocaml/tests/modules.sx`: 30+ tests.
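
The module-as-record encoding can be checked against native OCaml directly. The sketch below defines a signature, a structure, and a functor, then repeats the same shape with a plain record and an ordinary function, which mirrors the dict-and-lambda encoding described above. The names `Base`, `Double`, `base`, and `double` are illustrative only.

```ocaml
(* Native OCaml: a signature, a sealed structure, and a functor. *)
module type S = sig
  val x : int
  val f : int -> int
end

module Base : S = struct
  let x = 1
  let f y = x + y
end

module Double (M : S) = struct
  let x = 2 * M.x
  let f y = M.f (M.f y)
end

module D = Double (Base)

(* The same shape with the module as a record and the functor as a plain
   function from record to record. *)
type s = { x : int; f : int -> int }

let base = { x = 1; f = (fun y -> 1 + y) }
let double (m : s) : s = { x = 2 * m.x; f = (fun y -> m.f (m.f y)) }
let d = double base
```

Both encodings agree: `D.x` and `d.x` are 2, and `D.f 10` and `d.f 10` are 12. Any divergence between the two in the SX implementation would point at a bug in the module runtime.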
Phase 5 — Hindley-Milner type inference
- Algorithm W: `gen`/`inst`, `unify`, `infer-expr`, `infer-decl`.
- Type variables: `'a`, `'b`; unification with occurs check.
- Let-polymorphism: generalise at let-bindings.
- ADT types: `type 'a option = None | Some of 'a`.
- Function types, tuple types, record types.
- Type signatures: `val f : int -> int`; verify against inferred type.
- Module type checking: seal against `sig` (Phase 4 stubs become real checks).
- Error reporting: position-tagged errors with expected vs actual types.
- First-class modules: `(module M : S)` pack; `(val m : S)` unpack.
- No rank-2 polymorphism, no GADTs (out of scope).
- Tests in `lib/ocaml/tests/types.sx`: 60+ inference tests.
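
The core of Algorithm W is unification with an occurs check. Here is a compact OCaml sketch over a deliberately tiny type language (ints, bools, variables, arrows). It uses an association-list substitution for clarity, not efficiency, and none of the names are taken from the planned SX implementation.

```ocaml
type ty =
  | TInt
  | TBool
  | TVar of int                 (* type variables, numbered *)
  | TArrow of ty * ty

(* A substitution maps type-variable ids to types. *)
type subst = (int * ty) list

(* Resolve every bound variable in a type, following chains. *)
let rec apply (s : subst) = function
  | TVar v as t ->
    (match List.assoc_opt v s with
     | Some t' -> apply s t'
     | None -> t)
  | TArrow (a, b) -> TArrow (apply s a, apply s b)
  | (TInt | TBool) as t -> t

let rec occurs v = function
  | TVar w -> v = w
  | TArrow (a, b) -> occurs v a || occurs v b
  | TInt | TBool -> false

exception Type_error of string

(* Extend s so that t1 and t2 become equal, or raise Type_error. *)
let rec unify (s : subst) t1 t2 : subst =
  match apply s t1, apply s t2 with
  | TInt, TInt | TBool, TBool -> s
  | TVar v, TVar w when v = w -> s
  | TVar v, t | t, TVar v ->
    if occurs v t then raise (Type_error "occurs check") else (v, t) :: s
  | TArrow (a1, b1), TArrow (a2, b2) -> unify (unify s a1 a2) b1 b2
  | _ -> raise (Type_error "constructor mismatch")

(* Unify  'a -> int  with  bool -> 'b. *)
let solved = unify [] (TArrow (TVar 0, TInt)) (TArrow (TBool, TVar 1))
```

After this, `apply solved (TVar 0)` is `TBool` and `apply solved (TVar 1)` is `TInt`. Unifying `TVar 0` with `TArrow (TVar 0, TInt)` raises `Type_error "occurs check"`, which is what rules out infinite types.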
Phase 6 — Standard library
- `List`: `map`, `filter`, `fold_left`, `fold_right`, `length`, `rev`, `append`, `concat`, `flatten`, `iter`, `iteri`, `mapi`, `for_all`, `exists`, `find`, `find_opt`, `mem`, `assoc`, `assq`, `sort`, `stable_sort`, `nth`, `hd`, `tl`, `init`, `combine`, `split`, `partition`.
- `Option`: `map`, `bind`, `fold`, `get`, `value`, `join`, `iter`, `to_list`, `to_result`, `is_none`, `is_some`.
- `Result`: `map`, `bind`, `fold`, `get_ok`, `get_error`, `map_error`, `to_option`, `is_ok`, `is_error`.
- `String`: `length`, `get`, `sub`, `concat`, `split_on_char`, `trim`, `uppercase_ascii`, `lowercase_ascii`, `contains`, `starts_with`, `ends_with`, `index_opt`, `replace_all` (non-stdlib but needed).
- `Char`: `code`, `chr`, `escaped`, `lowercase_ascii`, `uppercase_ascii`.
- `Int`/`Float`: arithmetic, `to_string`, `of_string_opt`, `min_int`, `max_int`.
- `Hashtbl`: `create`, `add`, `replace`, `find`, `find_opt`, `remove`, `mem`, `iter`, `fold`, `length`; backed by SX mutable dict.
- `Map.Make` functor: balanced BST backed by SX sorted dict.
- `Set.Make` functor.
- `Printf`: `sprintf`, `printf`, `eprintf`; format strings via `(format ...)`.
- `Sys`: `argv`, `getenv_opt`, `getcwd` via `perform` IO.
- Scoreboard runner: `lib/ocaml/conformance.sh` + `scoreboard.json`.
- Target: 150+ tests across all stdlib modules.
Phase 7 — Dream web framework (lib/dream/)
The five types: `request`, `response`, `handler = request -> response promise`, `middleware = handler -> handler`, `route`. Everything else is a function over these.
- Core types in `lib/dream/types.sx`: request/response records, route record.
- Router in `lib/dream/router.sx`:
  - `dream-get path handler`, `dream-post path handler`, etc. for all HTTP methods.
  - `dream-scope prefix middlewares routes`: prefix mount with middleware chain.
  - `dream-router routes`: dispatch tree, returns handler; no match → 404.
  - Path param extraction: `:name` segments, `**` wildcard.
  - `dream-param req name`: retrieve matched path param.
- Middleware in `lib/dream/middleware.sx`:
  - `dream-pipeline middlewares handler`: compose middleware left-to-right.
  - `dream-no-middleware`: identity.
  - Logger: `(dream-logger next req)` logs method, path, status, timing.
  - Content-type sniffer.
- Sessions in `lib/dream/session.sx`:
  - Cookie-backed session middleware.
  - `dream-session-field req key`, `dream-set-session-field req key val`.
  - `dream-invalidate-session req`.
- Flash messages in `lib/dream/flash.sx`:
  - `dream-flash-middleware`: single-request cookie store.
  - `dream-add-flash-message req category msg`.
  - `dream-flash-messages req`: returns list of `(category, msg)`.
- Forms + CSRF in `lib/dream/form.sx`:
  - `dream-form req`: returns `(Ok fields)` or `(Error :csrf-token-invalid)`.
  - `dream-multipart req`: streaming multipart form data.
  - CSRF middleware: stateless signed tokens, session-scoped.
  - `dream-csrf-tag req`: returns hidden input fragment for SX templates.
- WebSockets in `lib/dream/websocket.sx`:
  - `dream-websocket handler`: upgrades request; handler `(fn (ws) ...)`.
  - `dream-send ws msg`, `dream-receive ws`, `dream-close ws`.
- Static files: `dream-static root-path` serves files, ETags, range requests.
- `dream-run`: wires root handler into SX's `perform (:http-listen ...)`.
- Demos in `lib/dream/demos/`:
  - `hello.ml` → `lib/dream/demos/hello.sx`: "Hello, World!" route.
  - `counter.ml` → `lib/dream/demos/counter.sx`: in-memory counter with sessions.
  - `chat.ml` → `lib/dream/demos/chat.sx`: multi-room WebSocket chat.
  - `todo.ml` → `lib/dream/demos/todo.sx`: CRUD list with forms + CSRF.
- Tests in `lib/dream/tests/`: routing dispatch, middleware composition, session round-trip, CSRF accept/reject, flash read-after-write; 60+ tests.
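
Path parameter extraction is the piece of the router with the most edge cases, so a small reference model helps. The OCaml sketch below matches a route pattern against a request path segment by segment. It assumes that a `:name` segment binds exactly one path segment and that a trailing `**` accepts any remainder without binding it; both are assumptions for illustration, and `segments` and `match_path` are hypothetical names.

```ocaml
(* Split a path on slashes, dropping empty segments. *)
let segments path =
  List.filter (fun s -> s <> "") (String.split_on_char '/' path)

(* Some params when the pattern matches the path, None otherwise. *)
let match_path (pattern : string) (path : string)
  : (string * string) list option =
  let rec go pats segs acc =
    match pats, segs with
    | [], [] -> Some (List.rev acc)
    | [ "**" ], _ -> Some (List.rev acc)
    | p :: ps, s :: ss when String.length p > 1 && p.[0] = ':' ->
      let name = String.sub p 1 (String.length p - 1) in
      go ps ss ((name, s) :: acc)
    | p :: ps, s :: ss when p = s -> go ps ss acc
    | _ -> None
  in
  go (segments pattern) (segments path) []

let params = match_path "/users/:id/posts/:post" "/users/7/posts/42"
```

`params` is `Some [("id", "7"); ("post", "42")]`, while `match_path "/users/:id" "/users"` is `None`, the case a router would turn into a 404.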
Phase 8 — ReasonML syntax variant (lib/reasonml/)
ReasonML is OCaml with a JS-friendly surface: semicolons, `let` with `=` everywhere, `=>` for lambdas, `switch` for match, `{j|...|j}` string interpolation. The semantics are the same: a different tokenizer + parser feeds the same `lib/ocaml/transpile.sx` output.
- Tokenizer in `lib/reasonml/tokenizer.sx`:
  - `let x = e;` binding syntax (semicolons required).
  - `(x, y) => e` arrow function syntax.
  - `switch (x) { | Pat => e | ... }` for match.
  - JSX: `<Comp prop=val />`, `<div>children</div>`.
  - String interpolation: `{j|hello $(name)|j}`.
  - Type annotations: `x : int`, `let f : int => int = x => x + 1`.
- Parser in `lib/reasonml/parser.sx`:
  - Produce same OCaml AST nodes as `lib/ocaml/parser.sx`.
  - JSX → SX component calls: `<Comp x=1 />` → `(~comp :x 1)`.
  - Multi-arg functions: `(x, y) => e` → auto-curried two-argument function.
- Shared transpiler: `lib/reasonml/transpile.sx` delegates to `lib/ocaml/transpile.sx` (parse → ReasonML AST → OCaml AST → SX AST).
- Tests in `lib/reasonml/tests/`: tokenizer, parser, eval, JSX; 40+ tests.
- ReasonML Dream demos: translate Phase 7 demos to ReasonML syntax.
The meta-circular angle
SX is bootstrapped to OCaml (hosts/ocaml/). Running OCaml inside SX running on OCaml is
the "mother tongue" closure: OCaml → SX → OCaml. This means:
- The OCaml host's native pattern matching and ADTs are exact reference semantics for the SX-level implementation; any mismatch is a bug.
- The SX `match`/`define-type` primitives (Phase 6 of the primitives roadmap) were built knowing OCaml was the intended target.
- When debugging the transpiler, the OCaml REPL is always available as an oracle.
- Dream running in SX can serve the sx.rose-ash.com docs site: the framework that describes the runtime it runs on.
Key dependencies
- Phase 6 ADT primitive of the primitives roadmap (`define-type`/`match`): required before Phase 3.
- `perform`/`cek-resume` IO suspension: required before Phase 7 (Dream async).
- HO forms and first-class lambdas: already in spec, no blocker.
- Module system (Phase 4) is independent of type inference (Phase 5), so the two can overlap.
- ReasonML (Phase 8) can start once OCaml parser is stable (after Phase 2).
Progress log
Newest first.
- 2026-05-07 Phase 1: `lib/ocaml/tokenizer.sx` consuming `lib/guest/lex.sx` via `prefix-rename`. Covers idents, ctors, 51 keywords, numbers (int / float / hex / exponent / underscored), strings (with escapes), chars (with escapes), type variables (`'a`), nested block comments, and 26 operator/punct tokens (incl. `->` `|>` `<-` `:=` `::` `;;` `@@` `<>` `&&` `||` `**` etc.). 58/58 tokenizer tests pass via `lib/ocaml/test.sh` driving `sx_server.exe`.
Blockers
(none yet)