rose-ash/plans/ocaml-on-sx.md
giles c8bfd22786
ocaml: phase 6 String/Char/Int/Float/Printf modules (+13 tests, 278 total)
Host primitives _string_length / _string_sub / _char_code / etc. exposed
in the base env (underscore-prefixed to avoid user clash). lib/ocaml/
runtime.sx wraps them into OCaml-syntax modules: String (length, get,
sub, concat, uppercase/lowercase_ascii, starts_with), Char (code, chr,
lowercase/uppercase_ascii), Int (to_string, of_string, abs, max, min),
Float.to_string, Printf stubs.

Also added print_string / print_endline / print_int IO builtins.
2026-05-08 09:10:06 +00:00

# OCaml-on-SX: OCaml + ReasonML + Dream on the CEK/VM
The meta-circular demo: SX's native evaluator is OCaml, so implementing OCaml on top of
SX closes the loop — the source language of the host is running inside the host it
compiles to. Beyond the elegance, it's practically useful: once OCaml expressions run on
the SX CEK/VM you get Dream (a clean OCaml web framework) almost for free, and ReasonML
is a syntax variant that shares the same transpiler output.
End-state goal: **OCaml programs running on the SX CEK/VM**, with enough of the standard
library to support Dream's middleware model. Dream-on-SX is the integration target —
a `handler`/`middleware`/`router` API that feels idiomatic while running purely in SX.
ReasonML (Phase 8) adds an alternative syntax frontend that targets the same transpiler.
## What this covers that nothing else in the set does
- **Strict ML semantics** — unlike Haskell, OCaml is call-by-value with explicit `Lazy.t`
for laziness. Pattern matches are checked for exhaustiveness. Polymorphic variants. Structural equality.
- **First-class modules and functors** — modules as values (phase 4); functors as SX
higher-order functions over module records. Unlike Haskell typeclasses, OCaml's module
system is explicit and compositional.
- **Mutable state without monads** — `ref`, `:=`, `!` are primitives. Arrays. `Hashtbl`.
The IO model is direct; `Lwt`/Dream map to `perform`/`cek-resume` for async.
- **Dream's composable HTTP model** — `handler = request -> response promise`,
`middleware = handler -> handler`. Algebraically clean; an `@@` chain maps directly to
nested SX function application.
- **ReasonML** — same semantics, JS-friendly surface syntax. JSX variant pairs with SX
component rendering.
## Ground rules
- **Scope:** only touch `lib/ocaml/**`, `lib/dream/**`, `lib/reasonml/**`, and
`plans/ocaml-on-sx.md`. Do **not** edit `spec/`, `hosts/`, `shared/`, or other
`lib/<lang>/`.
- **Shared-file issues** go under "Blockers" below with a minimal repro; do not fix here.
- **SX files:** use `sx-tree` MCP tools only.
- **Architecture:** OCaml source → AST → SX AST → CEK. No standalone OCaml evaluator.
The OCaml AST is walked by an `ocaml-eval` function in SX that produces SX values.
- **Type system:** deferred until Phase 5. Phases 1-4 are intentionally untyped:
get the evaluator right first, then layer HM inference on top.
- **Dream:** implemented as a library in Phase 7; no separate build step. `Dream.run`
wraps SX's existing HTTP server machinery via `perform`/`cek-resume`.
- **Commits:** one feature per commit. Keep `## Progress log` updated and tick boxes.
## Architecture sketch
```
OCaml source text
    │
    ▼
lib/ocaml/tokenizer.sx  — keywords, operators, string/char literals, comments
    │
    ▼
lib/ocaml/parser.sx     — OCaml AST: let/let rec, fun, match, if, begin/end,
    │                     module/struct/functor, type decls, expressions
    ▼
lib/ocaml/desugar.sx    — surface → core: tuple patterns, or-patterns,
    │                     sequence (;) → (do), when guards, field punning
    ▼
lib/ocaml/transpile.sx  — OCaml AST → SX AST
    │
    ▼
lib/ocaml/runtime.sx    — ADT constructors, module primitives, ref/array ops,
    │                     Stdlib shims, Dream server (phase 7)
    ▼
SX CEK evaluator (both JS and OCaml hosts)
```
## Semantic mappings
| OCaml construct | SX mapping |
|----------------|-----------|
| `let x = e` (top-level) | `(define x e)` |
| `let f x y = e` | `(define (f x y) e)` |
| `let rec f x = e` | `(define (f x) e)` — SX define is already recursive |
| `fun x -> e` | `(fn (x) e)` |
| `e1 \|> f` | `(f e1)` — pipe desugars to reverse application |
| `e1; e2` | `(do e1 e2)` |
| `begin e1; e2; e3 end` | `(do e1 e2 e3)` |
| `if c then e1 else e2` | `(if c e1 e2)` |
| `match x with \| P -> e` | `(match x (P e) ...)` via Phase 6 ADT primitive |
| `type t = A \| B of int` | `(define-type t (A) (B v))` |
| `module M = struct ... end` | SX dict `{:let-bindings ...}` — module as record |
| `functor (M : S) -> ...` | `(fn (M) ...)` — functor as SX lambda over module record |
| `open M` | inject M's bindings into scope via `env-merge` |
| `M.field` | `(get M :field)` |
| `{ r with f = v }` | `(dict-set r :f v)` |
| `ref x` | `(make-ref x)` — mutable cell |
| `!r` | `(deref-ref r)` |
| `r := v` | `(set-ref! r v)` |
| `(a, b, c)` | tagged list `(:tuple a b c)` |
| `[1; 2; 3]` | `(list 1 2 3)` |
| `[\| 1; 2; 3 \|]` | `(make-array 1 2 3)` (Phase 6) |
| `try e with \| Ex -> h` | `(guard (fn (ex) h) e)` via SX exception system |
| `raise Ex` | `(perform (:raise Ex))` |
| `Printf.printf "%d" x` | `(perform (:print (format "%d" x)))` |
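
As a rough illustration of the table above, the sketch below renders a handful of its rows as an OCaml function from a toy AST to SX source text. The `expr` type and `to_sx` are hypothetical names invented for this example. The real pipeline lives in the `.sx` files listed in the architecture sketch, and its node shapes differ.

```ocaml
(* Hypothetical toy AST; the real parser's node shapes are richer. *)
type expr =
  | Int of int
  | Var of string
  | Fun of string * expr        (* fun x -> e *)
  | App of expr * expr          (* f x *)
  | Pipe of expr * expr         (* e |> f *)
  | Seq of expr * expr          (* e1; e2 *)
  | If of expr * expr * expr    (* if c then a else b *)
  | Deref of expr               (* !r *)
  | Assign of expr * expr       (* r := v *)

(* Render an expression as SX text, one case per table row. *)
let rec to_sx = function
  | Int n -> string_of_int n
  | Var x -> x
  | Fun (x, body) -> Printf.sprintf "(fn (%s) %s)" x (to_sx body)
  | App (f, a) -> Printf.sprintf "(%s %s)" (to_sx f) (to_sx a)
  | Pipe (e, f) -> to_sx (App (f, e))   (* pipe is reverse application *)
  | Seq (a, b) -> Printf.sprintf "(do %s %s)" (to_sx a) (to_sx b)
  | If (c, t, e) ->
      Printf.sprintf "(if %s %s %s)" (to_sx c) (to_sx t) (to_sx e)
  | Deref r -> Printf.sprintf "(deref-ref %s)" (to_sx r)
  | Assign (r, v) -> Printf.sprintf "(set-ref! %s %s)" (to_sx r) (to_sx v)

let () =
  (* r := !r |> succ *)
  print_endline (to_sx (Assign (Var "r", Pipe (Deref (Var "r"), Var "succ"))))
  (* prints: (set-ref! r (succ (deref-ref r))) *)
```

Only binary `Seq` is modelled here; the table flattens longer sequences into a single `(do ...)` form.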
## Dream semantic mappings (Phase 7)
| Dream construct | SX mapping |
|----------------|-----------|
| `handler = request -> response promise` | `(fn (req) (perform (:http-respond ...)))` |
| `middleware = handler -> handler` | `(fn (next) (fn (req) ...))` |
| `Dream.router [routes]` | `(ocaml-dream-router routes)` — dispatch on method+path |
| `Dream.get "/path" h` | route record `{:method "GET" :path "/path" :handler h}` |
| `Dream.scope "/p" [ms] [rs]` | prefix mount with middleware chain |
| `Dream.param req "name"` | path param extracted during routing |
| `m1 @@ m2 @@ handler` | `(m1 (m2 handler))` — right fold; `@@` is right-associative |
| `Dream.session_field req "k"` | `(perform (:session-get req "k"))` |
| `Dream.set_session_field req "k" v` | `(perform (:session-set req "k" v))` |
| `Dream.flash req` | `(perform (:flash-get req))` |
| `Dream.form req` | `(perform (:form-parse req))` — returns Ok/Error ADT |
| `Dream.websocket handler` | `(perform (:websocket handler))` |
| `Dream.run handler` | starts SX HTTP server with handler as root |
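
The handler and middleware rows can be tried out in plain OCaml without Dream. The sketch below uses hypothetical stand-in record types and drops the promise wrapper so that it runs with the standard library alone. `tag`, `hello`, and `pipeline` are names invented for the example.

```ocaml
(* Hypothetical stand-ins for Dream's types; promises are elided so the
   sketch runs without Dream or Lwt. *)
type request = { meth : string; path : string }
type response = { status : int; body : string }
type handler = request -> response
type middleware = handler -> handler

(* A middleware that appends a marker to the body on the way out. *)
let tag (label : string) : middleware =
  fun next req ->
    let resp = next req in
    { resp with body = resp.body ^ label }

let hello : handler =
  fun req -> { status = 200; body = req.meth ^ " " ^ req.path }

(* [@@] is right-associative application, so this reads as
   tag "[a]" (tag "[b]" hello). *)
let app : handler = tag "[a]" @@ tag "[b]" @@ hello

(* The same chain expressed as a right fold over a middleware list. *)
let pipeline (ms : middleware list) (h : handler) : handler =
  List.fold_right (fun m acc -> m acc) ms h

let () =
  let req = { meth = "GET"; path = "/x" } in
  print_endline (app req).body;   (* prints: GET /x[b][a] *)
  assert (app req = pipeline [ tag "[a]"; tag "[b]" ] hello req)
```

The innermost middleware sees the response first, which is why `[b]` is appended before `[a]`.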
## Roadmap
### Phase 1 — Tokenizer + parser
- [x] **Tokenizer:** keywords (`let`, `rec`, `in`, `fun`, `function`, `match`, `with`,
`type`, `of`, `module`, `struct`, `end`, `functor`, `sig`, `open`, `include`,
`if`, `then`, `else`, `begin`, `try`, `exception`, `raise`, `mutable`,
`for`, `while`, `do`, `done`, `and`, `as`, `when`), operators (`->`, `|>`,
`<|`, `@@`, `@`, `:=`, `!`, `::`, `**`, `:`, `;`, `;;`), identifiers (lower,
upper/ctor), char literals `'c'`, string literals (escaped),
int/float literals (incl. hex, exponent, underscores), nested block
comments `(* ... *)`. _(labels `~label:` / `?label:` and heredoc `{|...|}`
deferred — surface tokens already work via `~`/`?` punct + `{`/`|` punct.)_
- [~] **Parser:** expressions: literals, identifiers, constructor application,
lambda, application (left-assoc), binary ops with precedence (29 ops via
`lib/guest/pratt.sx`), `if`/`then`/`else`, `let`/`in`, `let rec`,
`fun`/`->`, `match`/`with`, tuples, list literals, sequences `;`,
`begin`/`end`, unit `()`. Top-level decls: `let [rec] name params* = expr`
and bare expressions, `;;`-separated via `ocaml-parse-program`. _(Pending:
`type`/`module`/`exception`/`open`/`include` decls, `try`/`with`,
`function`, record literals/updates, field access, `and` mutually-recursive
bindings.)_
- [~] **Patterns:** constructor (nullary + with args, incl. flattened tuple
args `Pair (a, b)` → `(:pcon "Pair" PA PB)`), literal (int/string/char/
bool/unit), variable, wildcard `_`, tuple, list cons `::`, list literal.
_(Pending: record patterns, `as` binding, or-pattern `P1 | P2`, `when`
guard.)_
- [ ] OCaml is **not** indentation-sensitive — no layout algorithm needed.
- [ ] Tests in `lib/ocaml/tests/parse.sx` — 50+ round-trip parse tests.
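
Nested block comments are the one tokenizer item above that needs a depth counter rather than a simple scan for `*)`. A minimal sketch of that idea follows, written in OCaml for illustration. `skip_comment` is a hypothetical helper, not the code in `lib/ocaml/tokenizer.sx`, and it ignores string literals inside comments, which real OCaml also tracks.

```ocaml
(* Return the index just past the comment that opens at [start], where
   s.[start] = '(' and s.[start + 1] = '*'. Raises Failure if the
   comment is never closed. *)
let skip_comment (s : string) (start : int) : int =
  let n = String.length s in
  let rec go i depth =
    if i + 1 >= n then failwith "unterminated comment"
    else if s.[i] = '(' && s.[i + 1] = '*' then go (i + 2) (depth + 1)
    else if s.[i] = '*' && s.[i + 1] = ')' then
      if depth = 1 then i + 2 else go (i + 2) (depth - 1)
    else go (i + 1) depth
  in
  go start 0

let () =
  let src = "(* outer (* inner *) still outer *) let x = 1" in
  let stop = skip_comment src 0 in
  (* Prints the text after the comment, starting with the space before "let". *)
  print_endline (String.sub src stop (String.length src - stop))
```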
### Phase 2 — Core evaluator (untyped)
- [x] `ocaml-eval` entry: walks OCaml AST, produces SX values.
- [x] `let`/`let rec`/`let ... in`. Mutually recursive `let rec f = … and
g = …` works at top level via `(:def-rec-mut BINDINGS)`; placeholders
are bound first, rhs evaluated in the joint env, cells filled in.
`let x = … and y = …` (non-rec) emits `(:def-mut BINDINGS)` —
sequential bindings against the parent env.
- [x] Lambda + application (curried by default — auto-curry multi-param defs).
- [x] `fun`/`function` (single-arg lambda with immediate match on arg).
- [x] `if`/`then`/`else`, `begin`/`end`, sequence `;`.
- [x] Arithmetic, comparison, boolean ops, string `^`, `mod`.
- [x] Unit `()` value; `ignore`.
- [x] References: `ref`, `!`, `:=`.
- [ ] Mutable record fields.
- [x] `for i = lo to hi do ... done` loop (incl. `downto` direction);
`while cond do ... done`.
- [x] `try`/`with` — maps to SX `guard`; `raise` is a builtin that calls
host SX `raise`. `failwith` and `invalid_arg` ship as builtins.
- [ ] Tests in `lib/ocaml/tests/eval.sx` — 50+ tests, pure + imperative.
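
The placeholder-cell scheme described for mutually recursive `let rec ... and ...` can be mimicked directly in OCaml with `ref` cells. This is only an analogy for the three steps (allocate, evaluate in the joint environment, fill). The cell and function names are invented for the example, and the evaluator itself is written in SX.

```ocaml
(* Step 1: allocate one placeholder cell per binding. *)
let even_cell : (int -> bool) ref = ref (fun _ -> failwith "unfilled")
let odd_cell : (int -> bool) ref = ref (fun _ -> failwith "unfilled")

(* Step 2: evaluate each right-hand side in the joint environment, where
   both names resolve to a dereference of their cell. *)
let even_rhs = fun n -> if n = 0 then true else !odd_cell (n - 1)
let odd_rhs = fun n -> if n = 0 then false else !even_cell (n - 1)

(* Step 3: fill the cells, tying the knot. *)
let () =
  even_cell := even_rhs;
  odd_cell := odd_rhs

let () =
  Printf.printf "%b %b\n" (!even_cell 10) (!odd_cell 7)
  (* prints: true true *)
```

Because the closures only dereference the cells when called, it does not matter that the cells still hold placeholders while the right-hand sides are being built.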
### Phase 3 — ADTs + pattern matching
- [ ] `type` declarations: `type t = A | B of t1 * t2 | C of { x: int }`.
_(Parser + evaluator currently inferred-arity at runtime; type decls
pending.)_
- [x] Constructors as tagged lists: `A` → `("A")`, `B(1, "x")` → `("B" 1 "x")`.
- [~] `match`/`with`: constructor, literal, variable, wildcard, tuple, list
cons/nil, nested patterns. _(Pending: `as` binding, or-patterns,
`when` guard.)_
- [x] Exhaustiveness: runtime error on incomplete match (no compile-time check yet).
- [ ] Built-in types: `option` (`None`/`Some`), `result` (`Ok`/`Error`),
`list` (nil/cons), `bool`, `unit`, `exn`.
- [ ] `exception` declarations; built-in: `Not_found`, `Invalid_argument`,
`Failure`, `Match_failure`.
- [ ] Polymorphic variants (surface syntax `` `Tag value ``; runtime same tagged list).
- [ ] Tests in `lib/ocaml/tests/adt.sx` — 40+ tests: ADTs, match, option/result.
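
To make the tagged-list encoding and the clause-by-clause matcher concrete, here is a small model in OCaml. It is a sketch under assumptions: the `value` and `pattern` types and the `pmatch` / `run_match` names are hypothetical, cover only a subset of the pattern forms listed above, and are not the SX implementation.

```ocaml
(* Hypothetical runtime model: a constructor value is a tag plus its
   (flattened) arguments, mirroring the tagged-list encoding. *)
type value =
  | VInt of int
  | VCon of string * value list

type pattern =
  | PWild
  | PVar of string
  | PInt of int
  | PCon of string * pattern list

(* Return Some bindings on success, None on mismatch. *)
let rec pmatch (p : pattern) (v : value) : (string * value) list option =
  match p, v with
  | PWild, _ -> Some []
  | PVar x, _ -> Some [ (x, v) ]
  | PInt a, VInt b when a = b -> Some []
  | PCon (t, ps), VCon (t', vs)
    when t = t' && List.length ps = List.length vs ->
      (* Head and arity agree: match arguments pairwise, left to right. *)
      List.fold_left2
        (fun acc p' v' ->
          match acc with
          | None -> None
          | Some bs -> Option.map (fun bs' -> bs @ bs') (pmatch p' v'))
        (Some []) ps vs
  | _ -> None

(* Try clauses in order; fail at run time when none matches. *)
let rec run_match v = function
  | [] -> failwith "Match_failure"
  | (p, body) :: rest -> (
      match pmatch p v with
      | Some env -> body env
      | None -> run_match v rest)

let () =
  let v = VCon ("Pair", [ VInt 1; VCon ("Some", [ VInt 2 ]) ]) in
  let clauses =
    [ (PCon ("Pair", [ PInt 0; PWild ]), fun _ -> "zero first");
      (PCon ("Pair", [ PVar "a"; PCon ("Some", [ PVar "b" ]) ]),
       fun env -> "bound: " ^ String.concat "," (List.map fst env)) ]
  in
  print_endline (run_match v clauses)   (* prints: bound: a,b *)
```

The arity check in the `PCon` case is where the all-flatten convention matters: a constructor applied to a tuple is stored with its components as separate arguments, so the pattern must be flattened the same way.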
### Phase 4 — Modules + functors
- [x] `module M = struct let x = 1 let f y = x + y end` → SX dict
`{"x" 1 "f" <fn>}`.
- [~] `module type S = sig val x : int val f : int -> int end` — signature
annotations are parsed-and-skipped (`skip-optional-sig`); typed
checking deferred to Phase 5.
- [x] `module M : S = struct ... end` — coercive sealing (signature ignored).
- [x] `functor (M : S) -> struct ... end` via shorthand `module F (M) = …`.
- [x] `module F = Functor(Base)` — functor application; multi-param via
`module P = F(A)(B)`.
- [x] `open M` — merge M's dict into current env (via
`ocaml-env-merge-dict`). Module path `M.Sub` resolves via
`ocaml-resolve-module-path`.
- [x] `include M` — at top level same as `open`; inside a module also
copies M's bindings into the surrounding module's exports.
- [x] `M.name` — dict get via field access.
- [ ] First-class modules (pack/unpack) — deferred to Phase 5.
- [ ] Standard module hierarchy: `List`, `Option`, `Result`, `String`, `Char`,
`Int`, `Float`, `Bool`, `Unit`, `Printf`, `Format` (stubs, filled in Phase 6).
- [ ] Tests in `lib/ocaml/tests/modules.sx` — 30+ tests.
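
The "module as dict, functor as function over dicts" encoding can be sketched in OCaml itself. The model below is illustrative only: `Dict`, `value`, `modul`, and `twice` are hypothetical names, and the real evaluator builds SX dicts and host-SX closures rather than OCaml maps.

```ocaml
(* Hypothetical model: a module is a dict from names to values. *)
module Dict = Map.Make (String)

type value = Int of int | Fn of (int -> int)
type modul = value Dict.t

(* module M = struct let x = 1 let f y = x + y end *)
let m : modul =
  let x = 1 in
  Dict.empty |> Dict.add "x" (Int x) |> Dict.add "f" (Fn (fun y -> x + y))

(* module Twice (B) = struct include B let g y = B.f (B.f y) end *)
let twice (b : modul) : modul =
  match Dict.find "f" b with
  | Fn f -> Dict.add "g" (Fn (fun y -> f (f y))) b   (* include B, then add g *)
  | Int _ -> failwith "f is not a function"

(* module T = Twice(M) *)
let t : modul = twice m

let () =
  match Dict.find "g" t, Dict.find "x" t with
  | Fn g, Int x -> Printf.printf "T.g 5 = %d, T.x = %d\n" (g 5) x
  | _ -> failwith "unexpected module shape"
  (* prints: T.g 5 = 7, T.x = 1 *)
```

Functor application is then just a function call, and `include` amounts to starting from the argument dict instead of an empty one.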
### Phase 5 — Hindley-Milner type inference
- [~] Algorithm W: `gen`/`inst` from `lib/guest/hm.sx`, `unify` from
`lib/guest/match.sx`, `infer-expr` written here. Covers atoms, var,
lambda, app, let, if, op, neg, not. _(Pending: tuples, lists,
pattern matching, let-rec, modules.)_
- [x] Type variables: `'a`, `'b`; unification with occur-check (kit).
- [x] Let-polymorphism: generalise at let-bindings (kit `hm-generalize`).
- [ ] ADT types: `type 'a option = None | Some of 'a`.
- [~] Function types `T1 -> T2` work; tuples/records pending.
- [ ] Type signatures: `val f : int -> int` — verify against inferred type.
- [ ] Module type checking: seal against `sig` (Phase 4 stubs become real checks).
- [ ] Error reporting: position-tagged errors with expected vs actual types.
- [ ] First-class modules: `(module M : S)` pack; `(val m : (module S))` unpack.
- [ ] No rank-2 polymorphism, no GADTs (out of scope).
- [ ] Tests in `lib/ocaml/tests/types.sx` — 60+ inference tests.
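
For readers unfamiliar with the unification step that Algorithm W relies on, here is a compact standalone sketch with an occurs check. It is written independently for illustration and is not the `unify` from `lib/guest/match.sx`; the `ty` and `subst` representations are assumptions made for the example.

```ocaml
type ty =
  | TInt
  | TBool
  | TVar of int
  | TArrow of ty * ty

(* A substitution maps type-variable ids to types. *)
type subst = (int * ty) list

(* Resolve every bound variable in [t] under [s]. *)
let rec apply (s : subst) (t : ty) : ty =
  match t with
  | TInt | TBool -> t
  | TVar v -> (
      match List.assoc_opt v s with
      | Some t' -> apply s t'
      | None -> t)
  | TArrow (a, b) -> TArrow (apply s a, apply s b)

let rec occurs (v : int) (t : ty) : bool =
  match t with
  | TInt | TBool -> false
  | TVar w -> v = w
  | TArrow (a, b) -> occurs v a || occurs v b

(* Extend [s] so that [a] and [b] become equal, or raise Failure. *)
let rec unify (s : subst) (a : ty) (b : ty) : subst =
  match apply s a, apply s b with
  | TInt, TInt | TBool, TBool -> s
  | TVar v, TVar w when v = w -> s
  | TVar v, t | t, TVar v ->
      if occurs v t then failwith "occurs check" else (v, t) :: s
  | TArrow (a1, b1), TArrow (a2, b2) -> unify (unify s a1 a2) b1 b2
  | _ -> failwith "type mismatch"

let () =
  (* 'a -> 'a against bool -> 'b forces 'b = bool. *)
  let s = unify [] (TArrow (TVar 0, TVar 0)) (TArrow (TBool, TVar 1)) in
  assert (apply s (TVar 1) = TBool);
  (* 'a against 'a -> int is rejected (the self-application case). *)
  match unify [] (TVar 0) (TArrow (TVar 0, TInt)) with
  | _ -> assert false
  | exception Failure msg -> print_endline msg   (* prints: occurs check *)
```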
### Phase 6 — Standard library
- [~] `List`: `map`, `filter`, `fold_left`, `fold_right`, `length`, `rev`,
`append`, `iter`, `for_all`, `exists`, `mem`, `nth`, `hd`, `tl`,
`rev_append`. _(Pending: concat/flatten, iteri/mapi, find/find_opt,
assoc/assq, sort, init, combine, split, partition.)_
- [~] `Option`: `map`, `bind`, `value`, `get`, `is_none`, `is_some`.
_(Pending: fold/join/iter/to_list/to_result.)_
- [~] `Result`: `map`, `bind`, `is_ok`, `is_error`. _(Pending:
fold/get_ok/get_error/map_error/to_option.)_
- [~] `String`: `length`, `get`, `sub`, `concat`, `uppercase_ascii`,
`lowercase_ascii`, `starts_with`. _(Pending: split_on_char, trim,
contains, ends_with, index_opt, replace_all.)_
- [~] `Char`: `code`, `chr`, `lowercase_ascii`, `uppercase_ascii`.
_(Pending: escaped.)_
- [~] `Int`: `to_string`, `of_string`, `abs`, `max`, `min`.
_(Pending: arithmetic helpers, min_int/max_int.)_
- [~] `Float`: `to_string`. _(Pending: of_string, arithmetic helpers.)_
- [~] `Printf`: stub `sprintf`/`printf`. _(Real format-string
interpretation pending.)_
- [ ] `String`: `length`, `get`, `sub`, `concat`, `split_on_char`, `trim`,
`uppercase_ascii`, `lowercase_ascii`, `contains`, `starts_with`, `ends_with`,
`index_opt`, `replace_all` (non-stdlib but needed).
- [ ] `Char`: `code`, `chr`, `escaped`, `lowercase_ascii`, `uppercase_ascii`.
- [ ] `Int`/`Float`: arithmetic, `to_string`, `of_string_opt`, `min_int`, `max_int`.
- [ ] `Hashtbl`: `create`, `add`, `replace`, `find`, `find_opt`, `remove`, `mem`,
`iter`, `fold`, `length` — backed by SX mutable dict.
- [ ] `Map.Make` functor — balanced BST backed by SX sorted dict.
- [ ] `Set.Make` functor.
- [ ] `Printf`: `sprintf`, `printf`, `eprintf` — format strings via `(format ...)`.
- [ ] `Sys`: `argv`, `getenv_opt`, `getcwd` — via `perform` IO.
- [ ] Scoreboard runner: `lib/ocaml/conformance.sh` + `scoreboard.json`.
- [ ] Target: 150+ tests across all stdlib modules.
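
The progress log notes that the first stdlib slice is written in OCaml syntax rather than SX. The snippet below suggests what such a slice could look like using only constructs from Phases 2-4 (`let rec`, `function`, `match`, modules). It is a guess at the style, not a copy of `lib/ocaml/runtime.sx`, and the module is named `MyList` here to avoid shadowing the real `List`.

```ocaml
module MyList = struct
  let rec length = function [] -> 0 | _ :: t -> 1 + length t

  let rec rev_append l acc =
    match l with [] -> acc | h :: t -> rev_append t (h :: acc)

  let rev l = rev_append l []

  let rec map f = function [] -> [] | h :: t -> f h :: map f t

  let rec fold_left f acc = function
    | [] -> acc
    | h :: t -> fold_left f (f acc h) t
end

let () =
  let xs = MyList.map (fun x -> x * x) [ 1; 2; 3 ] in
  Printf.printf "%d %d\n"
    (MyList.length xs)
    (MyList.fold_left ( + ) 0 (MyList.rev xs))
  (* prints: 3 14 *)
```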
### Phase 7 — Dream web framework (`lib/dream/`)
The five types: `request`, `response`, `handler = request -> response promise`,
`middleware = handler -> handler`, `route`. Everything else is a function over these.
- [ ] **Core types** in `lib/dream/types.sx`: request/response records, route record.
- [ ] **Router** in `lib/dream/router.sx`:
- `dream-get path handler`, `dream-post path handler`, etc. for all HTTP methods.
- `dream-scope prefix middlewares routes` — prefix mount with middleware chain.
- `dream-router routes` — dispatch tree, returns handler; no match → 404.
- Path param extraction: `:name` segments, `**` wildcard.
- `dream-param req name` — retrieve matched path param.
- [ ] **Middleware** in `lib/dream/middleware.sx`:
- `dream-pipeline middlewares handler` — compose middleware left-to-right.
- `dream-no-middleware` — identity.
- Logger: `(dream-logger next req)` — logs method, path, status, timing.
- Content-type sniffer.
- [ ] **Sessions** in `lib/dream/session.sx`:
- Cookie-backed session middleware.
- `dream-session-field req key`, `dream-set-session-field req key val`.
- `dream-invalidate-session req`.
- [ ] **Flash messages** in `lib/dream/flash.sx`:
- `dream-flash-middleware` — single-request cookie store.
- `dream-add-flash-message req category msg`.
- `dream-flash-messages req` — returns list of `(category, msg)`.
- [ ] **Forms + CSRF** in `lib/dream/form.sx`:
- `dream-form req` — returns `(Ok fields)` or `(Error :csrf-token-invalid)`.
- `dream-multipart req` — streaming multipart form data.
- CSRF middleware: stateless signed tokens, session-scoped.
- `dream-csrf-tag req` — returns hidden input fragment for SX templates.
- [ ] **WebSockets** in `lib/dream/websocket.sx`:
- `dream-websocket handler` — upgrades request; handler `(fn (ws) ...)`.
- `dream-send ws msg`, `dream-receive ws`, `dream-close ws`.
- [ ] **Static files:** `dream-static root-path` — serves files, ETags, range requests.
- [ ] **`dream-run`**: wires root handler into SX's `perform (:http-listen ...)`.
- [ ] **Demos** in `lib/dream/demos/`:
- `hello.ml` → `lib/dream/demos/hello.sx`: "Hello, World!" route.
- `counter.ml` → `lib/dream/demos/counter.sx`: in-memory counter with sessions.
- `chat.ml` → `lib/dream/demos/chat.sx`: multi-room WebSocket chat.
- `todo.ml` → `lib/dream/demos/todo.sx`: CRUD list with forms + CSRF.
- [ ] Tests in `lib/dream/tests/`: routing dispatch, middleware composition,
session round-trip, CSRF accept/reject, flash read-after-write — 60+ tests.
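
Route dispatch with `:name` segments and a `**` wildcard is the part of the router most worth pinning down before writing tests. A minimal sketch in OCaml follows. `match_path`, `route`, and `dispatch` are hypothetical names for this example, handlers return plain strings instead of responses, and the planned `lib/dream/router.sx` may differ in every detail.

```ocaml
(* Match a route pattern such as "/users/:id" against a request path.
   Returns the captured params on success. *)
let match_path (pattern : string) (path : string)
    : (string * string) list option =
  let segs s = List.filter (fun x -> x <> "") (String.split_on_char '/' s) in
  let rec go ps xs acc =
    match ps, xs with
    | [], [] -> Some (List.rev acc)
    | "**" :: _, _ -> Some (List.rev acc)   (* wildcard accepts the rest *)
    | p :: ps', x :: xs' when String.length p > 1 && p.[0] = ':' ->
        let name = String.sub p 1 (String.length p - 1) in
        go ps' xs' ((name, x) :: acc)
    | p :: ps', x :: xs' when p = x -> go ps' xs' acc
    | _ -> None
  in
  go (segs pattern) (segs path) []

type route = {
  meth : string;
  pattern : string;
  handler : (string * string) list -> string;
}

(* First route whose method and path both match wins; otherwise 404. *)
let dispatch (routes : route list) (meth : string) (path : string) : string =
  let rec try_routes = function
    | [] -> "404"
    | r :: rest -> (
        match match_path r.pattern path with
        | Some params when r.meth = meth -> r.handler params
        | _ -> try_routes rest)
  in
  try_routes routes

let () =
  let routes =
    [ { meth = "GET"; pattern = "/users/:id";
        handler = (fun ps -> "user " ^ List.assoc "id" ps) } ]
  in
  print_endline (dispatch routes "GET" "/users/42");   (* prints: user 42 *)
  print_endline (dispatch routes "GET" "/nope")        (* prints: 404 *)
```

A linear scan is enough for a sketch; the dispatch tree mentioned above would replace `try_routes` once route tables grow.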
### Phase 8 — ReasonML syntax variant (`lib/reasonml/`)
ReasonML is OCaml with a JS-friendly surface: semicolons, `let` with `=` everywhere,
`=>` for lambdas, `switch` for match, `{j|...|j}` string interpolation. Same semantics —
different tokenizer + parser, same `lib/ocaml/transpile.sx` output.
- [ ] **Tokenizer** in `lib/reasonml/tokenizer.sx`:
- `let x = e;` binding syntax (semicolons required).
- `(x, y) => e` arrow function syntax.
- `switch (x) { | Pat => e | ... }` for match.
- JSX: `<Comp prop=val />`, `<div>children</div>`.
- String interpolation: `{j|hello $(name)|j}`.
- Type annotations: `x : int`, `let f : int => int = x => x + 1`.
- [ ] **Parser** in `lib/reasonml/parser.sx`:
- Produce same OCaml AST nodes as `lib/ocaml/parser.sx`.
- JSX → SX component calls: `<Comp x=1 />` → `(~comp :x 1)`.
- Multi-arg functions: `(x, y) => e` → auto-curried pair.
- [ ] Shared transpiler: `lib/reasonml/transpile.sx` delegates to
`lib/ocaml/transpile.sx` (parse → ReasonML AST → OCaml AST → SX AST).
- [ ] Tests in `lib/reasonml/tests/`: tokenizer, parser, eval, JSX — 40+ tests.
- [ ] ReasonML Dream demos: translate Phase 7 demos to ReasonML syntax.
## The meta-circular angle
SX is bootstrapped to OCaml (`hosts/ocaml/`). Running OCaml inside SX running on OCaml is
the "mother tongue" closure: OCaml → SX → OCaml. This means:
- The OCaml host's native pattern matching and ADTs are exact reference semantics for
the SX-level implementation — any mismatch is a bug.
- The SX `match` / `define-type` primitives (Phase 6 of the primitives roadmap) were
built knowing OCaml was the intended target.
- When debugging the transpiler, the OCaml REPL is always available as oracle.
- Dream running in SX can serve the sx.rose-ash.com docs site — the framework that
describes the runtime it runs on.
## Key dependencies
- **Phase 6 ADT primitive** (`define-type`/`match`) — required before Phase 3.
- **`perform`/`cek-resume`** IO suspension — required before Phase 7 (Dream async).
- **HO forms** and first-class lambdas — already in spec, no blocker.
- **Module system** (Phase 4) is independent of type inference (Phase 5) — can overlap.
- **ReasonML** (Phase 8) can start once OCaml parser is stable (after Phase 2).
## Progress log
_Newest first._
- 2026-05-08 Phase 6 — extended stdlib slice (+13 tests, 278 total).
Host primitives exposed via `_string_*`, `_char_*`, `_int_*`,
`_string_of_*` underscore-prefixed builtins so the OCaml-side
`lib/ocaml/runtime.sx` modules can wrap them: String (length, get,
sub, concat, uppercase_ascii, lowercase_ascii, starts_with), Char
(code, chr, lowercase_ascii, uppercase_ascii), Int (to_string,
of_string, abs, max, min), Float.to_string, Printf stubs. Also added
`print_string`/`print_endline`/`print_int` builtins.
- 2026-05-08 Phase 5 — Hindley-Milner type inference, paired-sequencing
consumer of `lib/guest/hm.sx` (algebra) and `lib/guest/match.sx`
(unify). `lib/ocaml/infer.sx` ships Algorithm W rules for OCaml AST:
atoms, var (instantiate), fun (auto-curry through fresh-tv), app
(unify against arrow), let (generalize over rhs), if (unify branches),
neg/not, op (treat as app of builtin). Builtin env types `+`/`-`/etc.
as monomorphic int->int->int and `=`/`<>` as polymorphic 'a->'a->bool.
Tested: literals, +1, identity polymorphism `'a -> 'a`, let-poly so
`let id = fun x -> x in id true : Bool`, `twice` infers
`('a -> 'a) -> 'a -> 'a`. Mandate satisfied: OCaml-on-SX is the
deferred second consumer for lib-guest Step 8. 265/265 (+14).
- 2026-05-08 Phase 2 — `let ... and ...` mutual recursion at top level.
Parser collects all bindings into a list, emitting `(:def-rec-mut)` or
`(:def-mut)` when there are 2+. Eval allocates a placeholder cell per
recursive binding, builds an env with all of them visible, then fills
the cells. Even/odd mutual-recursion test passes. 251/251 (+3).
- 2026-05-08 Phase 6 — `lib/ocaml/runtime.sx` minimal stdlib slice
written entirely in OCaml syntax: List (length, rev, rev_append, map,
filter, fold_left/right, append, iter, mem, for_all, exists, hd, tl,
nth), Option (map, bind, value, get, is_none, is_some), Result (map,
bind, is_ok, is_error). Loaded once via `ocaml-load-stdlib!`, cached
in `ocaml-stdlib-env`; `ocaml-run` and `ocaml-run-program` layer user
code on top via `ocaml-base-env`. The fact that these are written in
OCaml (not SX) and parse + evaluate cleanly is a substrate-validation
win: every parser, eval, match, ref, and module path proven by a
single nontrivial OCaml program. 248/248 (+23).
- 2026-05-08 Phase 4 — functors + module aliases (+5 tests, 225 total).
Parser: `module F (M) = struct DECLS end` → `(:functor-def NAME PARAMS
DECLS)`. `module N = expr` (where expr isn't `struct`) → `(:module-alias
NAME BODY-SRC)`. Functor params accept `(P)` or `(P : Sig)` (signatures
parsed-and-skipped). Eval: `ocaml-make-functor` builds a curried
host-SX closure that takes module dicts and returns a module dict;
`ocaml-resolve-module-path` extended for `:app` so `F(A)`, `F(A)(B)`,
`Outer.Inner` all resolve to dicts. Tested: 1-arg functor, 2-arg
curried `Pair(One)(Two)`, module alias, submodule alias, identity
functor with include. Phase 4 LOC ~290 (still well under 2000).
- 2026-05-08 Phase 4 — `open M` / `include M` (+5 tests, 220 total).
Parser: top-level `open Path` / `include Path` decls; path is `Ctor (.
Ctor)*`. Eval resolves the path via `ocaml-resolve-module-path` (the
same `:con`-as-module-lookup escape hatch used for `:field`); merges
the dict bindings into the current env via `ocaml-env-merge-dict`.
`include` inside a module also adds the bindings to the module's
resulting dict, so `module Sphere = struct include Math let area r =
... end` exposes both Math's `pi` and Sphere's `area`. Phase 4 LOC
cumulative: ~165.
- 2026-05-08 Phase 4 — modules + field access (+11 tests, 215 total). Parser:
`module M = struct DECLS end` decl in `ocaml-parse-program`. Body parsed
by sub-tokenising the source between `struct` and the matching `end`,
tracking nesting via `struct`/`begin`/`sig`/`end`. Field access added
as a postfix layer above `parse-atom`, binding tighter than application:
`f r.x` → `(:app f (:field r "x"))`. Eval: `(:module-def NAME DECLS)`
builds a dict via new `ocaml-eval-module` that runs decls in a sub-env;
`(:field EXPR NAME)` looks up the field, with the special case that
`(:con NAME)` heads are interpreted as module-name lookups instead of
nullary ctors. Tested: simple module, multi-decl module, nested modules
(`Outer.Inner.v`), `let rec` inside a module, module containing tuple
pattern match. Phase 4 LOC: ~110 (well under 2000 budget).
- 2026-05-08 Phase 2 — `try`/`with` + `raise` builtin. Parser produces
`(:try EXPR CLAUSES)`; eval delegates to SX `guard` with `else`
matching the raised value against clause patterns and re-raising on
no-match. `raise`/`failwith`/`invalid_arg` exposed as builtins;
failwith builds `("Failure" msg)` so `Failure msg -> ...` patterns
match. 204/204 (+6).
- 2026-05-08 Phase 2 — `function | pat -> body | …` parser + eval.
Sugar for `fun x -> match x with | …`. AST: `(:function CLAUSES)`
evaluated to a unary closure that runs `ocaml-match-clauses` on the
argument. `let rec` knot also triggers when rhs is `:function`, so
`let rec map f = function | [] -> [] | h::t -> f h :: map f t` works.
ocaml-match-eval refactored to share `ocaml-match-clauses` with the
function form. 198/198 (+4).
- 2026-05-08 Phase 2 — `for`/`while` loops. `(:for NAME LO HI DIR BODY)`
with `:ascend`/`:descend` direction (`to`/`downto`); `(:while COND BODY)`.
Both eval to unit; `for` re-binds the loop var on each iteration. 194/194 (+5).
- 2026-05-08 Phase 2 — references (`ref`/`!`/`:=`). `ref` is a builtin
that boxes its argument in a one-element list (the mutable cell);
prefix `!` parses to `(:deref EXPR)` and reads `(nth cell 0)`; `:=`
joins the precedence table at the lowest binop level (right-assoc) and
short-circuits in eval to mutate via `set-nth!`. Closures capture refs
by sharing the underlying list. 189/189 (+6).
- 2026-05-08 Phase 3 — pattern matching evaluator + constructors (+18
tests, 183 total). Constructor application: `(:app (:con NAME) arg)`
builds a tagged list `(NAME …args)` with tuple args flattened (so
`Pair (a, b)` → `("Pair" a b)` matches the parser's pattern flatten).
Standalone ctor `(:con NAME)` → `(NAME)` (nullary). Pattern matcher:
:pwild / :pvar / :plit (unboxed compare) / :pcon (head + arity match) /
:pcons (cons-decompose) / :plist (length+items) / :ptuple (after `tuple`
tag). Match drives clauses until first success; runtime error on
exhaustion. Tested with option match, literal match, tuple match,
recursive list functions (`len`, `sum`), nested ctor (`Pair(a,b)`).
Note: arity flattening happens for any tuple-arg ctor — without ADT
declarations there's no way to distinguish `Some (1,2)` (single tuple
payload) from `Pair (1,2)` (two-arg ctor). All-flatten convention is
consistent across parser + evaluator.
- 2026-05-08 Phase 2 — `lib/ocaml/eval.sx`: ocaml-eval + ocaml-run +
ocaml-run-program. Coverage: atoms, var lookup, :app (curried),
:op (arithmetic/comparison/boolean/^/mod/::/|>), :neg, :not, :if,
:seq, :tuple, :list, :fun (auto-curried host-SX closures), :let,
:let-rec (recursive knot via one-element-list mutable cell). Initial
env exposes `not`/`succ`/`pred`/`abs`/`max`/`min`/`fst`/`snd`/`ignore`
as host-SX functions. Tests: literals, arithmetic, comparison, boolean,
string concat, conditionals, lambda + closures + recursion (fact 5,
fib 10, sum 100), sequences, top-level program decls, |> pipe. 165/165
passing (+42).
- 2026-05-07 Phase 1 — sequence operator `;`. Lowest-precedence binary;
`e1; e2; e3` → `(:seq e1 e2 e3)`. Two-phase grammar: `parse-expr-no-seq`
is the prior expression entry point; new `parse-expr` wraps it with
`;` chaining. List-literal items still use `parse-expr-no-seq` so `;`
retains its separator role inside `[…]`. Match-clause bodies use the
seq variant and stop at `|`, matching real OCaml semantics. Trailing `;`
before `end`/`)`/`|`/`in`/`then`/`else`/eof is permitted. 123/123 tests
passing (+10).
- 2026-05-07 Phase 1 — `match`/`with` + pattern parser. Patterns: wildcard,
literal, var, ctor (nullary + with arg, with tuple-arg flattening so
`Pair (a, b)` → `(:pcon "Pair" PA PB)`), tuple, list literal, cons `::`
(right-assoc), parens, unit. Match clauses: leading `|` optional, body
parsed via `parse-expr`. AST: `(:match SCRUT CLAUSES)` where each clause
is `(:case PAT BODY)`. 113/113 tests passing (+9). Note: parse-expr is
used for case bodies, so a trailing `| pat -> body` after a complex body
will be reached because `|` is not in the binop table for level 1.
- 2026-05-07 Phase 1 — top-level program parser `ocaml-parse-program`. Parses
a sequence of `let [rec] name params* = expr` decls and bare expressions
separated by `;;`. Output `(:program DECLS)` with each decl one of `(:def …)`,
`(:def-rec …)`, `(:expr E)`. Decl bodies parsed by re-feeding the source
slice through `ocaml-parse` (cheap stand-in until shared-state refactor).
104/104 tests now passing (+9).
- 2026-05-07 Phase 1 — `lib/ocaml/parser.sx` expression parser consuming
`lib/guest/pratt.sx` for binop precedence (29 operators across 8 levels,
incl. keyword-spelled binops `mod`/`land`/`lor`/`lxor`/`lsl`/`lsr`/`asr`).
Atoms (literals + var/con/unit/list), application (left-assoc), prefix
`-`/`not`, tuples, parens, `if`/`then`/`else`, `fun x y -> body`,
`let`/`let rec` with function shorthand. AST shapes match Haskell-on-SX
conventions (`(:int N)` `(:op OP L R)` `(:fun PARAMS BODY)` etc.). Total
95/95 tests now passing via `lib/ocaml/test.sh`.
- 2026-05-07 Phase 1 — `lib/ocaml/tokenizer.sx` consuming `lib/guest/lex.sx`
via `prefix-rename`. Covers idents, ctors, 51 keywords, numbers (int / float
/ hex / exponent / underscored), strings (with escapes), chars (with escapes),
type variables (`'a`), nested block comments, and 26 operator/punct tokens
(incl. `->` `|>` `<-` `:=` `::` `;;` `@@` `<>` `&&` `||` `**` etc.). 58/58
tokenizer tests pass via `lib/ocaml/test.sh` driving `sx_server.exe`.
## Blockers
_(none yet)_