rose-ash/plans/ocaml-on-sx.md
giles 13fb1bd7a9
ocaml: phase 5.1 newton_sqrt.ml baseline (Newton's method, sqrt(2)*1000 = 1414)
Newton's method for square root:

  let sqrt_newton x =
    let g = ref 1.0 in
    for _ = 1 to 20 do
      g := (!g +. x /. !g) /. 2.0
    done;
    !g

20 iterations is more than enough to converge for x=2 — result is
~1.414213562. Multiplied by 1000 and int_of_float'd: 1414.

First baseline exercising:
  - for _ = 1 to N do ... done (wildcard loop variable)
  - pure float arithmetic with +. /.
  - the int_of_float truncate-toward-zero fix from iter 117

38 baseline programs total.
2026-05-09 08:29:01 +00:00

# OCaml-on-SX: OCaml + ReasonML + Dream on the CEK/VM
The meta-circular demo: SX's native evaluator is OCaml, so implementing OCaml on top of
SX closes the loop — the source language of the host is running inside the host it
compiles to. Beyond the elegance, it's practically useful: once OCaml expressions run on
the SX CEK/VM you get Dream (a clean OCaml web framework) almost for free, and ReasonML
is a syntax variant that shares the same transpiler output.
End-state goal: **OCaml programs running on the SX CEK/VM**, with enough of the standard
library to support Dream's middleware model. Dream-on-SX is the integration target —
a `handler`/`middleware`/`router` API that feels idiomatic while running purely in SX.
ReasonML (Phase 8) adds an alternative syntax frontend that targets the same transpiler.
## What this covers that nothing else in the set does
- **Strict ML semantics** — unlike Haskell, OCaml is call-by-value with explicit `Lazy.t`
for laziness. Pattern match is exhaustive. Polymorphic variants. Structural equality.
- **First-class modules and functors** — modules as values (phase 4); functors as SX
higher-order functions over module records. Unlike Haskell typeclasses, OCaml's module
system is explicit and compositional.
- **Mutable state without monads** — `ref`, `:=`, `!` are primitives. Arrays. `Hashtbl`.
The IO model is direct; `Lwt`/Dream map to `perform`/`cek-resume` for async.
- **Dream's composable HTTP model** — `handler = request -> response promise`,
`middleware = handler -> handler`. Algebraically clean; `@@` chains map to nested SX
function application trivially.
- **ReasonML** — same semantics, JS-friendly surface syntax. JSX variant pairs with SX
component rendering.
## Ground rules
- **Scope:** only touch `lib/ocaml/**`, `lib/dream/**`, `lib/reasonml/**`, and
`plans/ocaml-on-sx.md`. Do **not** edit `spec/`, `hosts/`, `shared/`, or other
`lib/<lang>/`.
- **Shared-file issues** go under "Blockers" below with a minimal repro; do not fix here.
- **SX files:** use `sx-tree` MCP tools only.
- **Architecture:** OCaml source → AST → SX AST → CEK. No standalone OCaml evaluator.
The OCaml AST is walked by an `ocaml-eval` function in SX that produces SX values.
- **Type system:** deferred until Phase 5. Phases 1–4 are intentionally untyped:
get the evaluator right first, then layer HM inference on top.
- **Dream:** implemented as a library in Phase 7; no separate build step. `Dream.run`
wraps SX's existing HTTP server machinery via `perform`/`cek-resume`.
- **Commits:** one feature per commit. Keep `## Progress log` updated and tick boxes.
## Architecture sketch
```
OCaml source text
    │
    ▼
lib/ocaml/tokenizer.sx — keywords, operators, string/char literals, comments
    │
    ▼
lib/ocaml/parser.sx — OCaml AST: let/let rec, fun, match, if, begin/end,
    │                 module/struct/functor, type decls, expressions
    ▼
lib/ocaml/desugar.sx — surface → core: tuple patterns, or-patterns,
    │                  sequence (;) → (do), when guards, field punning
    ▼
lib/ocaml/transpile.sx — OCaml AST → SX AST
    │
    ▼
lib/ocaml/runtime.sx — ADT constructors, module primitives, ref/array ops,
    │                  Stdlib shims, Dream server (phase 7)
    ▼
SX CEK evaluator (both JS and OCaml hosts)
```
## Semantic mappings
| OCaml construct | SX mapping |
|----------------|-----------|
| `let x = e` (top-level) | `(define x e)` |
| `let f x y = e` | `(define (f x y) e)` |
| `let rec f x = e` | `(define (f x) e)` — SX define is already recursive |
| `fun x -> e` | `(fn (x) e)` |
| `e1 \|> f` | `(f e1)` — pipe desugars to reverse application |
| `e1; e2` | `(do e1 e2)` |
| `begin e1; e2; e3 end` | `(do e1 e2 e3)` |
| `if c then e1 else e2` | `(if c e1 e2)` |
| `match x with \| P -> e` | `(match x (P e) ...)` via Phase 6 ADT primitive |
| `type t = A \| B of int` | `(define-type t (A) (B v))` |
| `module M = struct ... end` | SX dict `{:let-bindings ...}` — module as record |
| `functor (M : S) -> ...` | `(fn (M) ...)` — functor as SX lambda over module record |
| `open M` | inject M's bindings into scope via `env-merge` |
| `M.field` | `(get M :field)` |
| `{ r with f = v }` | `(dict-set r :f v)` |
| `ref x` | `(make-ref x)` — mutable cell |
| `!r` | `(deref-ref r)` |
| `r := v` | `(set-ref! r v)` |
| `(a, b, c)` | tagged list `(:tuple a b c)` |
| `[1; 2; 3]` | `(list 1 2 3)` |
| `[| 1; 2; 3 |]` | `(make-array 1 2 3)` (Phase 6) |
| `try e with \| Ex -> h` | `(guard (fn (ex) h) e)` via SX exception system |
| `raise Ex` | `(perform (:raise Ex))` |
| `Printf.printf "%d" x` | `(perform (:print (format "%d" x)))` |
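As a rough illustration of the table above, here is a small OCaml fragment that touches several of the mapped constructs at once (`ref`, `!`, `:=`, `;`, `match`, `try`/`raise`, `|>`). The names `Empty_input`, `add_all`, `describe` and `safe_head` are made up for this sketch and are not part of the plan.

```ocaml
(* Hypothetical names throughout; each comment points at a table row. *)
exception Empty_input                    (* exception used by raise / try *)

let total = ref 0                        (* ref x *)

let add_all xs =
  List.iter (fun x -> total := !total + x) xs;  (* fun, :=, !, e1; e2 *)
  !total

let describe n =
  match n with                           (* match x with | P -> e *)
  | 0 -> "zero"
  | n when n < 0 -> "negative"
  | _ -> "positive"

let safe_head xs =
  try (match xs with [] -> raise Empty_input | x :: _ -> x)
  with Empty_input -> -1                 (* try e with | Ex -> h *)

let result = [1; 2; 3] |> add_all        (* e1 |> f is f e1 *)
```

Under native OCaml, `result` is 6 and `safe_head []` is -1; the transpiled SX program should agree.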
## Dream semantic mappings (Phase 7)
| Dream construct | SX mapping |
|----------------|-----------|
| `handler = request -> response promise` | `(fn (req) (perform (:http-respond ...)))` |
| `middleware = handler -> handler` | `(fn (next) (fn (req) ...))` |
| `Dream.router [routes]` | `(ocaml-dream-router routes)` — dispatch on method+path |
| `Dream.get "/path" h` | route record `{:method "GET" :path "/path" :handler h}` |
| `Dream.scope "/p" [ms] [rs]` | prefix mount with middleware chain |
| `Dream.param req "name"` | path param extracted during routing |
| `m1 @@ m2 @@ handler` | `(m1 (m2 handler))` (right-nested: `@@` is right-associative application) |
| `Dream.session_field req "k"` | `(perform (:session-get req "k"))` |
| `Dream.set_session_field req "k" v` | `(perform (:session-set req "k" v))` |
| `Dream.flash req` | `(perform (:flash-get req))` |
| `Dream.form req` | `(perform (:form-parse req))` — returns Ok/Error ADT |
| `Dream.websocket handler` | `(perform (:websocket handler))` |
| `Dream.run handler` | starts SX HTTP server with handler as root |
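The handler and middleware shapes in the table can be modelled in a few lines of plain OCaml. This is a toy, synchronous sketch and not the Dream library: the record fields, `tag`, and `hello` are assumptions made for illustration, and the `promise` in Dream's real handler type is omitted.

```ocaml
(* Toy model only: real Dream handlers return a response promise. *)
type request = { meth : string; path : string }
type response = { status : int; body : string }
type handler = request -> response
type middleware = handler -> handler

(* Hypothetical middleware: run the inner handler, then append a label. *)
let tag (label : string) : middleware =
  fun next req ->
    let resp = next req in
    { resp with body = resp.body ^ "[" ^ label ^ "]" }

let hello : handler = fun req -> { status = 200; body = "hi " ^ req.path }

(* @@ is right-associative: this is tag "outer" (tag "inner" hello). *)
let app : handler = tag "outer" @@ tag "inner" @@ hello

let resp = app { meth = "GET"; path = "/" }
```

Here `resp.body` is `"hi /[inner][outer]"`: the innermost middleware sees the response first on the way out, which is the nesting the `(m1 (m2 handler))` mapping has to preserve.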
## Roadmap
### Phase 1 — Tokenizer + parser
- [x] **Tokenizer:** keywords (`let`, `rec`, `in`, `fun`, `function`, `match`, `with`,
`type`, `of`, `module`, `struct`, `end`, `functor`, `sig`, `open`, `include`,
`if`, `then`, `else`, `begin`, `try`, `exception`, `raise`, `mutable`,
`for`, `while`, `do`, `done`, `and`, `as`, `when`), operators (`->`, `|>`,
`<|`, `@@`, `@`, `:=`, `!`, `::`, `**`, `:`, `;`, `;;`), identifiers (lower,
upper/ctor), char literals `'c'`, string literals (escaped),
int/float literals (incl. hex, exponent, underscores), nested block
comments `(* ... *)`. _(labels `~label:` / `?label:` and heredoc `{|...|}`
deferred — surface tokens already work via `~`/`?` punct + `{`/`|` punct.)_
- [~] **Parser:** expressions: literals, identifiers, constructor application,
lambda, application (left-assoc), binary ops with precedence (29 ops via
`lib/guest/pratt.sx`), `if`/`then`/`else`, `let`/`in`, `let rec`,
`fun`/`->`, `match`/`with`, tuples, list literals, sequences `;`,
`begin`/`end`, unit `()`. Top-level decls: `let [rec] name params* = expr`
and bare expressions, `;;`-separated via `ocaml-parse-program`. _(Pending:
`type`/`module`/`exception`/`open`/`include` decls, `try`/`with`,
`function`, record literals/updates, field access, `and` mutually-recursive
bindings.)_
- [x] **Patterns:** constructor (nullary + with args, incl. flattened tuple
args), literal (int/string/char/bool/unit), variable, wildcard `_`,
tuple, list cons `::`, list literal, record `{ f = pat; … }`,
`as` binding, or-pattern `(P1 | P2 | …)` (parens-only — top-level
`|` is the clause separator). Match clauses support `when` guard
via `(:case-when PAT GUARD BODY)`.
- [ ] OCaml is **not** indentation-sensitive — no layout algorithm needed.
- [ ] Tests in `lib/ocaml/tests/parse.sx` — 50+ round-trip parse tests.
### Phase 2 — Core evaluator (untyped)
- [x] `ocaml-eval` entry: walks OCaml AST, produces SX values.
- [x] `let`/`let rec`/`let ... in`. Mutually recursive `let rec f = … and
g = …` works at top level via `(:def-rec-mut BINDINGS)`; placeholders
are bound first, rhs evaluated in the joint env, cells filled in.
`let x = … and y = …` (non-rec) emits `(:def-mut BINDINGS)` —
sequential bindings against the parent env.
- [x] Lambda + application (curried by default — auto-curry multi-param defs).
- [x] `fun`/`function` (single-arg lambda with immediate match on arg).
- [x] `if`/`then`/`else`, `begin`/`end`, sequence `;`.
- [x] Arithmetic, comparison, boolean ops, string `^`, `mod`.
- [x] Unit `()` value; `ignore`.
- [x] References: `ref`, `!`, `:=`.
- [x] Mutable record fields via `r.f <- v` — uses host SX `dict-set!`
to mutate the underlying record dict in place. All record fields
are de-facto mutable (the `mutable` keyword in type-decls is
currently parsed-and-discarded).
- [x] `for i = lo to hi do ... done` loop; `while cond do ... done` (incl.
`downto` direction).
- [x] `try`/`with` — maps to SX `guard`; `raise` is a builtin that calls
host SX `raise`. `failwith` and `invalid_arg` ship as builtins.
- [ ] Tests in `lib/ocaml/tests/eval.sx` — 50+ tests, pure + imperative.
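A short sketch of the kind of program the Phase 2 items are meant to run: mutually recursive `let rec … and …`, `for` loops in both directions over a `ref`, and in-place record mutation. All names here are illustrative and not taken from `lib/ocaml/tests/eval.sx`.

```ocaml
(* Mutual recursion via let rec ... and ... *)
let rec is_even n = if n = 0 then true else is_odd (n - 1)
and is_odd n = if n = 0 then false else is_even (n - 1)

(* Ascending for-loop accumulating into a ref. *)
let sum_to n =
  let acc = ref 0 in
  for i = 1 to n do acc := !acc + i done;
  !acc

(* downto loop: consing n, n-1, ..., 1 yields an ascending list. *)
let countdown n =
  let out = ref [] in
  for i = n downto 1 do out := i :: !out done;
  !out

(* Mutable record field updated with <-. *)
type counter = { mutable hits : int }

let bump c =
  c.hits <- c.hits + 1;
  c.hits
```

Note that native OCaml rejects `c.hits <- …` unless the field is declared `mutable`, whereas the plan's evaluator currently treats every field as mutable, so programs accepted here are a superset of what the reference compiler accepts.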
### Phase 3 — ADTs + pattern matching
- [x] `type` declarations: `type [params] t = | A | B of t1 [* t2] | …`.
Parser emits `(:type-def NAME PARAMS CTORS)`. Runtime treats decls
as no-ops since constructors are dispatched dynamically by tag.
Phase 5 will register ctor types here for HM checking.
- [x] Constructors as tagged lists: `A` → `("A")`, `B(1, "x")` → `("B" 1 "x")`.
- [~] `match`/`with`: constructor, literal, variable, wildcard, tuple, list
cons/nil, nested patterns. _(Pending: `as` binding, or-patterns,
`when` guard.)_
- [x] Exhaustiveness: runtime error on incomplete match (no compile-time check yet).
- [ ] Built-in types: `option` (`None`/`Some`), `result` (`Ok`/`Error`),
`list` (nil/cons), `bool`, `unit`, `exn`.
- [x] `exception` declarations: `exception NAME [of TYPE]`. Parser emits
`(:exception-def NAME [ARG-TYPE-SRC])`. Runtime no-op since
raise/match work on tagged ctor values. Built-ins:
`Failure`/`Invalid_argument` via `failwith`/`invalid_arg`.
- [x] Polymorphic variants (surface syntax `` `Tag value ``; runtime same
tagged list as nominal ctors). Tokenizer recognises backtick + ctor;
parser/eval treat them identically to nominal ctors. Type system
handling deferred (proper row types).
- [ ] Tests in `lib/ocaml/tests/adt.sx` — 40+ tests: ADTs, match, option/result.
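To make the Phase 3 items concrete, the following sketch defines a small recursive ADT and matches on it, plus a `when`-guarded match over `option` and a `result`-returning helper. The type `expr` and the function names are hypothetical examples, not files from the repository.

```ocaml
(* Under the plan's encoding, Add (Num 1, Num 2) would be the tagged
   list ("Add" ("Num" 1) ("Num" 2)). *)
type expr =
  | Num of int
  | Add of expr * expr
  | Mul of expr * expr
  | Neg of expr

let rec eval = function
  | Num n -> n
  | Add (a, b) -> eval a + eval b
  | Mul (a, b) -> eval a * eval b
  | Neg e -> - (eval e)

(* Guards and literal patterns on the built-in option type. *)
let classify = function
  | Some n when n < 0 -> "negative"
  | Some 0 -> "zero"
  | Some _ -> "positive"
  | None -> "missing"

(* The built-in result type: Ok / Error. *)
let safe_div a b =
  if b = 0 then Error "division by zero" else Ok (a / b)
```

For example, `eval (Add (Num 2, Mul (Num 3, Neg (Num 4))))` is `2 + 3 * (-4)`, that is, -10.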
### Phase 4 — Modules + functors
- [x] `module M = struct let x = 1 let f y = x + y end` → SX dict
`{"x" 1 "f" <fn>}`.
- [x] `module type S = sig val x : int val f : int -> int end` parses
via `parse-decl-module-type`. Signature contents are skipped
(sig..end nesting tracked) — runtime no-op since types are
structural. AST: `(:module-type-def NAME)`.
- [x] `module M : S = struct ... end` — coercive sealing (signature ignored).
- [x] `functor (M : S) -> struct ... end` via shorthand `module F (M) = …`.
- [x] `module F = Functor(Base)` — functor application; multi-param via
`module P = F(A)(B)`.
- [x] `open M` — merge M's dict into current env (via
`ocaml-env-merge-dict`). Module path `M.Sub` resolves via
`ocaml-resolve-module-path`.
- [x] `include M` — at top level same as `open`; inside a module also
copies M's bindings into the surrounding module's exports.
- [x] `M.name` — dict get via field access.
- [ ] First-class modules (pack/unpack) — deferred to Phase 5.
- [ ] Standard module hierarchy: `List`, `Option`, `Result`, `String`, `Char`,
`Int`, `Float`, `Bool`, `Unit`, `Printf`, `Format` (stubs, filled in Phase 6).
- [ ] Tests in `lib/ocaml/tests/modules.sx` — 30+ tests.
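The functor items above can be exercised with a program along these lines. `MakeSet` is a hypothetical, list-backed stand-in written for this sketch (in the spirit of `Set.Make`, but not the stdlib implementation); the input list is the one used by the `unique_set.ml` baseline in the progress log.

```ocaml
module type ORD = sig
  type t
  val compare : t -> t -> int
end

module IntOrd = struct
  type t = int
  let compare (a : int) (b : int) = compare a b  (* Stdlib.compare *)
end

(* Hypothetical functor: a set kept as a sorted, duplicate-free list. *)
module MakeSet (O : ORD) = struct
  let empty = []
  let rec add x = function
    | [] -> [x]
    | y :: rest as l ->
      let c = O.compare x y in
      if c = 0 then l                 (* already present *)
      else if c < 0 then x :: l       (* insert before y *)
      else y :: add x rest
  let cardinal = List.length
  let elements s = s
end

module IntSet = MakeSet (IntOrd)

let uniques =
  List.fold_left (fun s x -> IntSet.add x s) IntSet.empty
    [3; 1; 4; 1; 5; 9; 2; 6; 5; 3; 5; 8; 9; 7; 9]
```

Under the plan's mapping, `MakeSet` becomes an SX function from a module dict to a module dict, and `IntSet.cardinal uniques` evaluates to 9.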
### Phase 5.1 — Conformance scoreboard
- [x] `lib/ocaml/conformance.sh` runs the full test suite, classifies
each test by description prefix into a suite (tokenize, parser,
eval-core, phase2-refs, phase2-loops, phase2-function, phase2-exn,
phase3-adt, phase4-modules, phase5-hm, phase6-stdlib, let-and,
phase1-params, misc), and emits `scoreboard.json` + `scoreboard.md`.
- [~] Baseline OCaml programs at `lib/ocaml/baseline/` exercised through
`ocaml-run-program`. Currently 5/5: factorial.ml (recursion),
list_ops.ml (List.map + fold_left), option_match.ml (option +
pattern match), module_use.ml (module + ref + closure +
sequenced calls), sum_squares.ml (for-loop + ref). Real OCaml
testsuite vendoring is the next step.
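A baseline program is a small OCaml file whose final value is compared against a known answer. The sketch below follows the shape of the `hanoi.ml` entry in the progress log (move count for n=10) and adds a truncation check for `int_of_float`; the exact file contents in `lib/ocaml/baseline/` are not shown in this plan, so treat this as an approximation.

```ocaml
(* Move count for Tower of Hanoi; to_ avoids the for-loop keyword `to`. *)
let rec hanoi n from to_ via =
  if n = 0 then 0
  else hanoi (n - 1) from via to_ + 1 + hanoi (n - 1) via to_ from

let moves = hanoi 10 "a" "c" "b"      (* 2^10 - 1 *)

(* int_of_float truncates toward zero in OCaml, in both directions. *)
let trunc_pos = int_of_float 3.99
let trunc_neg = int_of_float (-3.99)
```

Native OCaml gives `moves = 1023`, `trunc_pos = 3` and `trunc_neg = -3`, which is the oracle the SX-hosted evaluator is checked against.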
### Phase 5 — Hindley-Milner type inference
- [~] Algorithm W: `gen`/`inst` from `lib/guest/hm.sx`, `unify` from
`lib/guest/match.sx`, `infer-expr` written here. Covers atoms, var,
lambda, app, let, if, op, neg, not. _(Pending: tuples, lists,
pattern matching, let-rec, modules.)_
- [x] Type variables: `'a`, `'b`; unification with occur-check (kit).
- [x] Let-polymorphism: generalise at let-bindings (kit `hm-generalize`).
- [x] ADT types: `option`/`result` ctors seeded;
`ocaml-hm-register-type-def!` registers user types from `:type-def`.
`ocaml-type-of-program` threads decls through the env, registering
types and binding `let` schemes. `:con NAME` / `:pcon NAME …`
instantiate from the registry. Ctor arg types parsed via
`ocaml-hm-parse-type-src` — handles primitives (`int`/`bool`/
`string`/`float`/`unit`), tyvars `'a`, simple parametric `T list`/
`T option`. Multi-arg/complex types fall back to a fresh tv.
- [~] Function types `T1 -> T2` work; tuples (`'a * 'b`) and lists
(`'a list`) supported. Records pending.
- [ ] Type signatures: `val f : int -> int` — verify against inferred type.
- [ ] Module type checking: seal against `sig` (Phase 4 stubs become real checks).
- [ ] Error reporting: position-tagged errors with expected vs actual types.
- [ ] First-class modules: `(module M : S)` pack; `(val m : (module S))` unpack.
- [ ] No rank-2 polymorphism, no GADTs (out of scope).
- [ ] Tests in `lib/ocaml/tests/types.sx` — 60+ inference tests.
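Let-polymorphism, the core of the Phase 5 work, can be seen in a few lines. The types in the comments are what OCaml's own inference reports; the names are illustrative only.

```ocaml
(* id is generalised at its let-binding: 'a -> 'a *)
let id x = x

(* So it can be used at two different types in one expression. *)
let pair = (id 1, id "one")           (* int * string *)

(* compose : ('a -> 'b) -> ('c -> 'a) -> 'c -> 'b *)
let compose f g x = f (g x)

let inc_then_show = compose string_of_int (fun n -> n + 1)  (* int -> string *)

(* Lambda-bound variables stay monomorphic, so HM rejects this:
   let bad = (fun f -> (f 1, f "one")) id *)
```

An inference test would assert that `id` gets the scheme `'a -> 'a` and that the commented-out `bad` fails to unify `int` with `string`.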
### Phase 6 — Standard library
- [~] `List`: `map`, `filter`, `fold_left`, `fold_right`, `length`, `rev`,
`append`, `iter`, `for_all`, `exists`, `mem`, `nth`, `hd`, `tl`,
`rev_append`, `concat`/`flatten`, `init`, `iteri`, `mapi`, `find`,
`find_opt`, `assoc`, `assoc_opt`, `partition`, `sort`,
`stable_sort`, `combine`, `split`, `iter2`, `fold_left2`, `map2`.
30+ functions covered.
- [~] `Option`: `map`, `bind`, `value`, `get`, `is_none`, `is_some`,
`iter`, `fold`, `to_list`. _(Pending: join/to_result.)_
- [~] `Result`: `map`, `bind`, `is_ok`, `is_error`, `get_ok`,
`get_error`, `map_error`, `to_option`. _(Pending: fold/join.)_
- [~] `Hashtbl`: `create`, `add`, `find`, `find_opt`, `replace`, `mem`,
`length`. Backed by a one-element list cell holding a SX dict;
keys coerced to strings via `str` for polymorphic-key support.
- [~] `Buffer`: `create`, `add_string`, `add_char`, `contents`, `length`,
`clear`, `reset`. Backed by a ref holding a list of strings; reverse +
`String.concat` on `contents`. Mostly-OCaml impl.
- [~] `Stack`: `create`, `push`, `pop`, `top`, `is_empty`, `length`,
`clear`. Backed by a ref-holding-list (LIFO).
- [~] `Queue`: `create`, `push`, `pop`, `is_empty`, `length`, `clear`.
Backed by a `(front, back)` tuple-of-lists pair (amortised O(1)
enqueue/dequeue via list reversal).
- [~] `Sys`: `os_type` (`"SX"`), `word_size`, `max_array_length`,
`max_string_length`, `executable_name`, `big_endian`, `unix`,
`win32`, `cygwin`. Constants only; `argv`/`getenv_opt`/`command`
pending (would need host platform integration).
- [x] `String`: `length`, `get`, `sub`, `concat`, `uppercase_ascii`,
`lowercase_ascii`, `starts_with`, `ends_with`, `contains`, `trim`,
`split_on_char`, `replace_all`, `index_of`.
- [~] `Char`: `code`, `chr`, `lowercase_ascii`, `uppercase_ascii`.
_(Pending: escaped.)_
- [~] `Int`: `to_string`, `of_string`, `abs`, `max`, `min`.
_(Pending: arithmetic helpers, min_int/max_int.)_
- [~] `Float`: `to_string`, `sqrt`, `sin`, `cos`, `pow`, `floor`,
`ceil`, `round`, `pi`. _(Pending: of_string.)_
- [~] `Printf`: stub `sprintf`/`printf`. _(Real format-string
interpretation pending.)_
- [ ] `String`: `length`, `get`, `sub`, `concat`, `split_on_char`, `trim`,
`uppercase_ascii`, `lowercase_ascii`, `contains`, `starts_with`, `ends_with`,
`index_opt`, `replace_all` (non-stdlib but needed).
- [ ] `Char`: `code`, `chr`, `escaped`, `lowercase_ascii`, `uppercase_ascii`.
- [ ] `Int`/`Float`: arithmetic, `to_string`, `of_string_opt`, `min_int`, `max_int`.
- [ ] `Hashtbl`: `create`, `add`, `replace`, `find`, `find_opt`, `remove`, `mem`,
`iter`, `fold`, `length` — backed by SX mutable dict.
- [x] `Map.Make` functor — sorted association list backed
(insert/find/remove/mem/cardinal/bindings); not a balanced tree, so
operations are linear-time, with ordering supplied by the `Ord` argument.
- [x] `Set.Make` functor — sorted list backed
(add/mem/remove/elements/cardinal).
- [ ] `Printf`: `sprintf`, `printf`, `eprintf` — format strings via `(format ...)`.
- [ ] `Sys`: `argv`, `getenv_opt`, `getcwd` — via `perform` IO.
- [ ] Scoreboard runner: `lib/ocaml/conformance.sh` + `scoreboard.json`.
- [ ] Target: 150+ tests across all stdlib modules.
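The `(front, back)` representation mentioned for `Queue` is the classic two-list queue. The sketch below is a persistent (immutable) version written for illustration; the plan's `Queue` is mutable and its actual code is not shown here, so the function signatures are assumptions.

```ocaml
(* Two-list queue: push onto back, pop from front, reverse when front
   runs dry. Each element is reversed at most once, hence amortised O(1). *)
let empty = ([], [])

let push x (front, back) = (front, x :: back)

let pop = function
  | (x :: front, back) -> Some (x, (front, back))
  | ([], back) ->
    (match List.rev back with
     | [] -> None
     | x :: front -> Some (x, (front, [])))

(* Drain a queue into a list, in FIFO order. *)
let to_list q =
  let rec go q acc =
    match pop q with
    | None -> List.rev acc
    | Some (x, q') -> go q' (x :: acc)
  in
  go q []

let drained = to_list (empty |> push 1 |> push 2 |> push 3)
```

`drained` is `[1; 2; 3]`, confirming first-in, first-out order even though elements are stored reversed in `back`.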
### Phase 7 — Dream web framework (`lib/dream/`)
The five types: `request`, `response`, `handler = request -> response promise`,
`middleware = handler -> handler`, `route`. Everything else is a function over these.
- [ ] **Core types** in `lib/dream/types.sx`: request/response records, route record.
- [ ] **Router** in `lib/dream/router.sx`:
- `dream-get path handler`, `dream-post path handler`, etc. for all HTTP methods.
- `dream-scope prefix middlewares routes` — prefix mount with middleware chain.
- `dream-router routes` — dispatch tree, returns handler; no match → 404.
- Path param extraction: `:name` segments, `**` wildcard.
- `dream-param req name` — retrieve matched path param.
- [ ] **Middleware** in `lib/dream/middleware.sx`:
- `dream-pipeline middlewares handler` — compose middleware left-to-right.
- `dream-no-middleware` — identity.
- Logger: `(dream-logger next req)` — logs method, path, status, timing.
- Content-type sniffer.
- [ ] **Sessions** in `lib/dream/session.sx`:
- Cookie-backed session middleware.
- `dream-session-field req key`, `dream-set-session-field req key val`.
- `dream-invalidate-session req`.
- [ ] **Flash messages** in `lib/dream/flash.sx`:
- `dream-flash-middleware` — single-request cookie store.
- `dream-add-flash-message req category msg`.
- `dream-flash-messages req` — returns list of `(category, msg)`.
- [ ] **Forms + CSRF** in `lib/dream/form.sx`:
- `dream-form req` — returns `(Ok fields)` or `(Err :csrf-token-invalid)`.
- `dream-multipart req` — streaming multipart form data.
- CSRF middleware: stateless signed tokens, session-scoped.
- `dream-csrf-tag req` — returns hidden input fragment for SX templates.
- [ ] **WebSockets** in `lib/dream/websocket.sx`:
- `dream-websocket handler` — upgrades request; handler `(fn (ws) ...)`.
- `dream-send ws msg`, `dream-receive ws`, `dream-close ws`.
- [ ] **Static files:** `dream-static root-path` — serves files, ETags, range requests.
- [ ] **`dream-run`**: wires root handler into SX's `perform (:http-listen ...)`.
- [ ] **Demos** in `lib/dream/demos/`:
- `hello.ml` → `lib/dream/demos/hello.sx`: "Hello, World!" route.
- `counter.ml` → `lib/dream/demos/counter.sx`: in-memory counter with sessions.
- `chat.ml` → `lib/dream/demos/chat.sx`: multi-room WebSocket chat.
- `todo.ml` → `lib/dream/demos/todo.sx`: CRUD list with forms + CSRF.
- [ ] Tests in `lib/dream/tests/`: routing dispatch, middleware composition,
session round-trip, CSRF accept/reject, flash read-after-write — 60+ tests.
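Path parameter extraction with `:name` segments could work roughly as follows. This is a sketch in OCaml rather than SX, with hypothetical function names (`segments`, `match_route`); the `**` wildcard from the router item is left out.

```ocaml
(* Split "/users/42" into ["users"; "42"], dropping empty segments. *)
let segments path =
  String.split_on_char '/' path |> List.filter (fun s -> s <> "")

(* Match a route pattern against a request path. Returns the bound
   parameters in order, or None if the path does not match. *)
let match_route pattern path =
  let rec go ps ss acc =
    match ps, ss with
    | [], [] -> Some (List.rev acc)
    | p :: ps', s :: ss' when String.length p > 0 && p.[0] = ':' ->
      (* ":id" binds the name "id" to the actual segment. *)
      let name = String.sub p 1 (String.length p - 1) in
      go ps' ss' ((name, s) :: acc)
    | p :: ps', s :: ss' when p = s -> go ps' ss' acc
    | _ -> None
  in
  go (segments pattern) (segments path) []
```

For instance, `match_route "/users/:id/posts/:post" "/users/42/posts/7"` returns `Some [("id", "42"); ("post", "7")]`, and a `dream-param`-style lookup is then an association-list search over that result.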
### Phase 8 — ReasonML syntax variant (`lib/reasonml/`)
ReasonML is OCaml with a JS-friendly surface: semicolons, `let` with `=` everywhere,
`=>` for lambdas, `switch` for match, `{j|...|j}` string interpolation. Same semantics —
different tokenizer + parser, same `lib/ocaml/transpile.sx` output.
- [ ] **Tokenizer** in `lib/reasonml/tokenizer.sx`:
- `let x = e;` binding syntax (semicolons required).
- `(x, y) => e` arrow function syntax.
- `switch (x) { | Pat => e | ... }` for match.
- JSX: `<Comp prop=val />`, `<div>children</div>`.
- String interpolation: `{j|hello $(name)|j}`.
- Type annotations: `x : int`, `let f : int => int = x => x + 1`.
- [ ] **Parser** in `lib/reasonml/parser.sx`:
- Produce same OCaml AST nodes as `lib/ocaml/parser.sx`.
- JSX → SX component calls: `<Comp x=1 />` → `(~comp :x 1)`.
- Multi-arg functions: `(x, y) => e` → auto-curried two-argument function.
- [ ] Shared transpiler: `lib/reasonml/transpile.sx` delegates to
`lib/ocaml/transpile.sx` (parse → ReasonML AST → OCaml AST → SX AST).
- [ ] Tests in `lib/reasonml/tests/`: tokenizer, parser, eval, JSX — 40+ tests.
- [ ] ReasonML Dream demos: translate Phase 7 demos to ReasonML syntax.
## The meta-circular angle
SX is bootstrapped to OCaml (`hosts/ocaml/`). Running OCaml inside SX running on OCaml is
the "mother tongue" closure: OCaml → SX → OCaml. This means:
- The OCaml host's native pattern matching and ADTs are exact reference semantics for
the SX-level implementation — any mismatch is a bug.
- The SX `match` / `define-type` primitives (Phase 6 of the primitives roadmap) were
built knowing OCaml was the intended target.
- When debugging the transpiler, the OCaml REPL is always available as oracle.
- Dream running in SX can serve the sx.rose-ash.com docs site — the framework that
describes the runtime it runs on.
## Key dependencies
- **Phase 6 ADT primitive** (`define-type`/`match`) — required before Phase 3.
- **`perform`/`cek-resume`** IO suspension — required before Phase 7 (Dream async).
- **HO forms** and first-class lambdas — already in spec, no blocker.
- **Module system** (Phase 4) is independent of type inference (Phase 5) — can overlap.
- **ReasonML** (Phase 8) can start once OCaml parser is stable (after Phase 2).
## Progress log
_Newest first._
- 2026-05-08 Phase 1+6 — Buffer module + parser fix for `f !x` (+3
tests, 425 total). Parser: at-app-start? and parse-app's loop now
recognise `!` as the prefix-deref of an application argument, so
`String.concat "" (List.rev !b)` parses as `(... (deref b))`. Buffer
uses a ref holding a string list; contents reverses and concats.
- 2026-05-08 Phase 5.1 — btree.ml baseline (13/13 pass). Polymorphic
binary search tree (`type 'a tree = Leaf | Node of 'a * 'a tree *
'a tree`) with insert + in-order traversal. Tests parametric ADT,
recursive match, List.append, List.fold_left.
- 2026-05-09 Phase 5.1 — newton_sqrt.ml baseline (Newton's method
for sqrt, sqrt(2)*1000 truncated → 1414). 20 iterations of
`g := (g + x/g) / 2` converges to ~1.414213562 for x=2. Multiplied
by 1000 and int_of_float'd gives 1414. First baseline that
exercises `for _ = 1 to N do ... done` (wildcard loop variable),
pure float arithmetic with `+.` `/.`, and the `int_of_float` fix
from iteration 117. 38 baseline programs total.
- 2026-05-09 Phase 5.1 — hanoi.ml baseline (Tower of Hanoi move
count, n=10 → 1023). Classic doubly-recursive solution returning
the number of moves: `hanoi n from to via = hanoi (n-1) from via
to + 1 + hanoi (n-1) via to from`. Counts to 2^10 - 1 = 1023 for
n=10, exercising tail-position addition + 4-arg recursion +
conditional base case. (Uses `to_` instead of `to` to avoid
collision with the `to` keyword in for-loops — OCaml conventional
workaround.) 37 baseline programs total.
- 2026-05-09 Phase 5.1 — validate.ml baseline (Either-based input
validation, 3 errors × 100 + 117 sum = 417). validate_int returns
`Left msg` on empty / non-digit, `Right (int_of_string s)` on a
digit-only string. process folds inputs with a tuple accumulator
`(errs, sum)`, branching on the result. ["12"; "abc"; "5"; "";
"100"; "x"] → (3, 117) → 417. Exercises Either constructors used
bare (no qualification), char range comparison, tuple-pattern
destructuring on let-binding, recursive helper inside if-else. 36
baseline programs total.
- 2026-05-09 Phase 5.1 — word_freq.ml baseline (Map.Make on String,
count distinct words → 8). Defines a StringOrd module + applies
Map.Make to it. Folds the input through SMap.find_opt + SMap.add to
count each word, then reports SMap.cardinal. "the quick brown fox
jumps over the lazy dog" — "the" appears twice, so 8 distinct
words. First baseline using Map.Make on a string-keyed map. 35
baseline programs total.
- 2026-05-09 Phase 6 — Either module + Hashtbl.copy (+4 tests, 602
total). Either: left, right, is_left, is_right, find_left,
find_right, map_left, map_right, fold, equal, compare. Constructors
are bare `Left x` / `Right x` (per OCaml 4.12+). Hashtbl.copy
builds a fresh cell, walks `_hashtbl_to_list`, and re-adds; mutating
one copy doesn't touch the other (verified by `Hashtbl.length t +
Hashtbl.length t2 = 3` after a fork-and-add).
- 2026-05-09 Phase 5.1 — json_pretty.ml baseline (recursive ADT
serialization). Defines a JSON-like ADT (JNull / JBool / JInt /
JStr / JList) and recursively pretty-prints to a string, then
measures length. Tests algebraic data types with five constructors
(one nullary, three single-arg, one list-arg), recursive `match`
with five arms, `String.concat "," (List.map ...)`, and string
concatenation. `[1,true,null,"hi",[2,3]]` → 24 chars. 34 baseline
programs total.
- 2026-05-09 Phase 5.1 — shuffle.ml baseline (Fisher-Yates with
deterministic Random.init seed). In-place swap loop using `for i =
n - 1 downto 1` and `a.(i) <- a.(j)`. Sum is invariant under
permutation, so the test value (55 for [1..10]) verifies that the
shuffle is a valid permutation regardless of which one. Exercises
Random.init / Random.int + Array.of_list / to_list / length /
arr.(i) / arr.(i) <- v + downto loop. 33 baseline programs total.
- 2026-05-09 Phase 5.1 — pi_leibniz.ml baseline (Leibniz formula,
1000 terms × 100 → 314). Side-quest: `int_of_float` was wrong —
defined as identity in iteration 94 instead of truncation. Fixed
to `if f < 0.0 then ceil else floor` (truncate toward zero, real
OCaml semantics). Float.to_int still uses floor since OCaml's
documentation says "result is unspecified if the argument is nan
or falls outside the int range" — close enough for our scope. 32
baseline programs total.
- 2026-05-09 Phase 6 — Float module fleshed out (+6 tests, 598
total). New Float members: zero, one, minus_one, abs, neg, add,
sub, mul, div, max, min, equal, compare, to_int, of_int,
of_string. Most just lift the host operators (`+.` is already
available as a global). Aligns Float with Int module's API and
unblocks idiomatic float arithmetic in baselines.
- 2026-05-09 Phase 5.1 — balance.ml baseline (paren/bracket/brace
balance using Stack). is_balanced walks a string; on opener push,
on closer check stack non-empty + top matches expected opener (else
fail). Returns ok && is_empty stack at end. 5 test cases:
"({[abc]d}e)" ✓, "(a]" ✗, "{[}]" ✗ (mismatched closers), "(())" ✓,
"" ✓ → 3 balanced. Exercises Stack.create / push / pop / is_empty /
s.[!i] / while + bool ref short-circuit. 31 baseline programs total.
- 2026-05-09 Phase 5.1 — safe_div.ml baseline + Result.equal /
compare / iter_error (+3 tests, 592 total). safe_div divides only
if divisor non-zero, returns `Error "..."` otherwise. sum_safe folds
pairs with `Ok q -> acc+q | Error _ -> acc`.
`[(10,2);(20,4);(30,0);(50,5)]` → 5+5+0+10 = 20. Result additions:
equal/compare take separate eq/cmp for Ok and Error sides; Ok < Error
(-1) and Error > Ok (1). 30 baseline programs total.
- 2026-05-09 Phase 6 — List.equal / List.compare (+5 tests, 589
total). Both take an inner predicate / comparator and walk both
lists in lockstep. equal short-circuits on first mismatch.
compare returns -1 if a is a strict prefix, 1 if b is, 0 if both
empty, otherwise the first non-zero element comparison. Mirrors
real OCaml's signatures: `List.equal eq a b`, `List.compare cmp
a b`.
- 2026-05-09 Phase 6 — Bool module + Option.equal / Option.compare
(+5 tests, 584 total). Bool: equal, compare (false < true via if
ladder), to_string, of_string, not_, to_int. Option additions
take an `eq` or `cmp` parameter for the inner-value check, mirroring
real OCaml's signature: `Option.equal eq a b`, `Option.compare cmp
a b`. None < Some _ for compare.
- 2026-05-09 Phase 5.1 — bag.ml baseline + String.equal/compare/cat/
empty (+3 tests, 579 total). bag.ml: split a sentence on spaces,
count word frequency in a Hashtbl, return the maximum count.
Sentence "the quick brown fox jumps over the lazy dog the fox" has
"the"×3 as the most frequent → 3. Exercises String.split_on_char +
Hashtbl.find_opt/replace + Hashtbl.fold over (k, v) tuples. 29
baseline programs total. String additions: equal, compare (via host
`<`/`>`), cat (alias of `^`), empty.
- 2026-05-09 Phase 5.1 — fraction.ml baseline (rational arithmetic
via record + gcd canonicalization). Defines `type frac = { num;
den }`, `make` that reduces via gcd and forces den > 0, `add` and
`mul` constructors. Computes (1/2 + 1/3) + (2/3 * 3/4) = 4/3, sums
num + den = 7. Exercises records, recursive gcd, `mod`, `abs`,
integer division, and the new `Int.rem`-style truncate-zero
division semantics from iteration 94. 28 baseline programs total.
- 2026-05-09 Phase 6 — Seq module (eager, list-backed) (+4 tests,
576 total). Real OCaml's Seq is lazy (a thunk producing
Cons / Nil); ours is just a list, which is adequate for most
baseline programs that don't rely on infinite sequences. API:
empty, cons, return, is_empty, iter, iteri, map, filter,
filter_map, fold_left, length, take, drop, append, to_list,
of_list, init, unfold. unfold takes a step fn
`acc -> (elt * acc) option` and threads through until it returns None. Lets us write
`Seq.fold_left (+) 0 (Seq.unfold (fun n -> if n > 4 then None else
Some (n, n + 1)) 1)` → 10.
- 2026-05-09 Phase 5.1 — unique_set.ml baseline (Set.Make + IntOrd
functor app, count uniques in [3;1;4;1;5;9;2;6;5;3;5;8;9;7;9] →
9). First baseline that exercises the functor pipeline end to
end: defines an Ord module with `type t = int` + `compare`, applies
Set.Make to it, then folds the input list adding each element to
the set and queries `IntSet.cardinal`. 27 baseline programs total.
- 2026-05-09 Phase 4 — Set.Make / Map.Make functor application
smoke tests (+3 tests, 572 total). Functors were already wired
through ocaml-make-functor in eval.sx but had no explicit tests
for the user-defined Ord application path. Confirms that
`module S = Set.Make (IntOrd) ;; let s = ... in S.elements s`,
`S.mem 2 s`, and `Map.Make (IntOrd) ;; M.cardinal m` all work end
to end.
- 2026-05-09 Phase 6 — Filename module + Char.compare/equal/escaped
(+7 tests, 569 total). Filename: basename, dirname, extension,
chop_extension, concat, is_relative + dir_sep / current_dir_name /
parent_dir_name constants. Forward-slash only, doesn't try to
detect Windows separators. Char additions: equal, compare (via
code subtraction), escaped (handles `\n`/`\t`/`\r`/`\\`/`\"`).
- 2026-05-09 Phase 4 — basic labeled / optional argument syntax
(label dropped, positional semantics) (+3 tests, 562 total). Three
parser changes:
(1) `at-app-start?` returns true on op `~` or `?` so the app loop
keeps consuming labeled args;
(2) the app arg parser handles `~name:VAL` (drop label, parse VAL),
`?name:VAL` (same), and `~name` punning (treat as `(:var name)`);
(3) `try-consume-param!` drops `~` / `?` and treats the following
ident as a regular positional param name.
Order in the call must match definition order — we don't reorder
args by label name. Optional args don't auto-wrap in Some, so the
function body sees the raw value for `?x:V`. Lets us write
`let f ~x ~y = x + y in f ~x:3 ~y:7` and `let x = 4 in let y = 5
in f ~x ~y` (punning).
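  A small sketch of where the positional shortcut becomes observable. The results in the comments are stock OCaml's; the divergence noted for the out-of-order call is an inference from the description above, not a recorded test result, and `sub` is a hypothetical name:
  ```ocaml
  (* Non-commutative body, so argument order is observable. *)
  let sub ~x ~y = x - y

  (* Labels in definition order: the supported form. *)
  let in_order = sub ~x:3 ~y:7

  (* Punning: ~x is short for ~x:x. *)
  let punned =
    let x = 4 in
    let y = 5 in
    sub ~x ~y

  (* Labels out of order: stock OCaml reorders by label, so this is
     still 3 - 7. With labels dropped and positional binding, the
     same call would presumably bind x = 7 and y = 3 instead. *)
  let swapped = sub ~y:7 ~x:3
  ```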
- 2026-05-09 Phase 5.1 — merge_sort.ml baseline (user-implemented
mergesort, sorted sum = 44). Stress-tests `let (a, b) = split rest
in (x :: a, y :: b)` (let-tuple destructuring inside a recursive
match arm), nested match-in-match for the merge step, and
the (op) operator section `(+)` as fold accumulator. 26 baseline
programs total.
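  A sketch of the shapes this baseline stresses, written for stock OCaml (not the actual merge_sort.ml source; the input list in the last line is made up for illustration):
  ```ocaml
  (* Deal elements alternately into two lists; the let-tuple
     destructuring sits inside a recursive match arm. *)
  let rec split = function
    | [] -> ([], [])
    | [x] -> ([x], [])
    | x :: y :: rest ->
      let (a, b) = split rest in
      (x :: a, y :: b)

  (* Merge two sorted lists with a nested match-in-match. *)
  let rec merge xs ys =
    match xs with
    | [] -> ys
    | x :: xt ->
      (match ys with
       | [] -> xs
       | y :: yt ->
         if x <= y then x :: merge xt ys else y :: merge xs yt)

  let rec msort = function
    | [] -> []
    | [x] -> [x]
    | l ->
      let (a, b) = split l in
      merge (msort a) (msort b)

  (* (+) as the fold accumulator function. *)
  let sorted_sum xs = List.fold_left (+) 0 (msort xs)
  ```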
- 2026-05-09 Phase 4 — top-level `let f (a, b) = body` tuple-param
decl (+3 tests, 559 total). parse-decl-let (which lives outside
the ocaml-parse scope and lacks parse-pattern access) uses a
source-slicing approach: detect `(IDENT, ...)`, scan tokens to
matching `)`, slice the pattern source string, store as
(synth_name, pat_src). After collecting params, wrap the rhs
source string with `match SN with PAT_SRC -> (RHS_SRC)` for each
tuple-param, innermost first, then ocaml-parse the wrapped
string. End result is the same shape as the inner-let case: a
function whose body destructures a synthetic name.
- 2026-05-09 Phase 4 — `let f (a, b) = body in body2` tuple-param on
inner-let bindings (+3 tests, 556 total). Mirrors iteration 101's
parse-fun change inside parse-let's parse-one!: same `(IDENT, ...)`
detection, same `__pat_N` synth name, same innermost-first match
wrapping — but applied to the rhs of the let-binding (which is the
function value). Lets us write `let f (a, b) = a + b in f (3, 7)`,
`let g x (a, b) = x + a + b in g 1 (2, 3)`, and `let h (a, b)
(c, d) = a * b + c * d in h (1, 2) (3, 4)`.
- 2026-05-09 Phase 4 — `fun (a, b) -> body` tuple-param destructuring
(+4 tests, 553 total). parse-fun's collect-params now detects
`(IDENT, ...)` (lookahead at peek-tok-at 1/2 to distinguish from
`(x : T)` and `()` cases), generates a synthetic `__pat_N` name as
the actual fun param, and remembers the pattern in tuple-binds.
After parsing the body, wraps it innermost-first with one
`match __pat_N with PAT -> ...` per tuple-param. Also retroactively
simplifies `Hashtbl.keys`/`values` from
`fun pair -> match pair with (k, _) -> k` to plain
`fun (k, _) -> k`.
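  The desugaring can be written out by hand in stock OCaml. This is a sketch of the resulting shape only, not parser output; `h` and `h_desugared` are hypothetical names:
  ```ocaml
  (* Surface syntax: two tuple parameters. *)
  let h = fun (a, b) (c, d) -> a * b + c * d

  (* Each tuple param becomes a synthetic name, and the body is
     wrapped in one match per param. Wrapping innermost-first means
     the last param's match ends up closest to the body. *)
  let h_desugared =
    fun __pat_0 __pat_1 ->
      match __pat_0 with
      | (a, b) ->
        (match __pat_1 with
         | (c, d) -> a * b + c * d)
  ```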
- 2026-05-09 Phase 6 — Random module (LCG-based, deterministic) (+4
tests, 549 total). Linear-congruential PRNG with mutable seed
(`_state` ref). API: init, self_init, int, bool, float, bits.
`int bound` returns `|state| mod bound` after stepping. Same seed
reproduces same sequence — useful for testing shuffles and Monte
Carlo demos. Real OCaml's Random used a lagged-Fibonacci generator before 5.0 (LXM since); ours is
simpler but adequate for baseline programs.
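  A minimal LCG sketch in stock OCaml on a 64-bit host. The multiplier, increment and modulus are assumptions (the classic ANSI C example constants), not necessarily what runtime.sx uses, and `Lcg` is a hypothetical module name:
  ```ocaml
  module Lcg = struct
    (* Assumed constants; the real runtime values may differ. *)
    let a = 1103515245
    let c = 12345
    let m = 2147483648 (* 2^31 *)

    (* Mutable seed, as in the entry above. *)
    let state = ref 0

    let init seed = state := seed

    (* Step the generator, then reduce |state| into [0, bound). *)
    let int bound =
      state := (a * !state + c) mod m;
      abs !state mod bound
  end
  ```
  Re-running `Lcg.init` with the same seed replays the same sequence, which is the property the shuffle and Monte Carlo tests rely on.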
- 2026-05-09 Phase 6 — Hashtbl.keys / values / bindings / remove /
clear / reset / to_seq / to_seq_keys / to_seq_values (+4 tests, 545
total). Two new host primitives `_hashtbl_remove` and
`_hashtbl_clear`; the rest are pure OCaml-syntax helpers in
runtime.sx that map over `_hashtbl_to_list`. `keys` and `values`
pattern-match the (k, v) tuples to extract one side. Note: a
detour to also support top-level `let (a, b) = expr` was reverted:
`parse-decl-let` lives in the outer ocaml-parse-program scope
which doesn't have access to parse-pattern; will need a slice +
inner-parse trick later.
- 2026-05-09 Phase 4 — `let PATTERN = expr in body` tuple
destructuring (+3 tests, 541 total). When `let` is followed by `(`,
parse-let now reads a full pattern, expects `=`, then `in`, and
desugars to `(:match expr ((:case PATTERN body)))`. Reuses the
pattern parser used by match. `let (a, b, c) = (10, 20, 30) in
a+b+c` → 60. Also retroactively cleans up the Printf width-pos
packing hack from iteration 97 — it's now `let (width, spec_pos)
= parse_width_loop after_flags in ...` like real OCaml.
- 2026-05-09 Phase 6 — Printf width specifiers `%5d` / `%-5d` /
`%05d` / `%4s` etc. (+5 tests, 538 total). Walker now parses
optional `-` (left-align) and `0` (zero-pad) flags after `%`, then
optional decimal width digits, then the spec letter. After
formatting the arg into a base string, pads to the width using
spaces (or zeros if `0` flag and not `-`). Encoded width+spec_pos
return as `width * 1000000 + spec_pos` because the parser does not
yet support tuple destructuring in `let` (TODO: lift that
limitation; for now this round-trips losslessly for any practical
width). Examples: `%5d` 42 = "   42", `%-5d|` 42 = "42   |",
`%05d` 42 = "00042".
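  The padding step on its own, as a stock-OCaml sketch. `pad` and its labels are hypothetical names, and the sketch ignores signs (real `%05d` puts the zeros after a leading `-`):
  ```ocaml
  (* Pad an already-formatted base string out to [width]. *)
  let pad ~left_align ~zero_pad width s =
    let missing = width - String.length s in
    if missing <= 0 then s
    else if left_align then s ^ String.make missing ' '
    else String.make missing (if zero_pad then '0' else ' ') ^ s

  let plain = pad ~left_align:false ~zero_pad:false 5 (string_of_int 42)
  let left = pad ~left_align:true ~zero_pad:false 5 (string_of_int 42) ^ "|"
  let zeros = pad ~left_align:false ~zero_pad:true 5 (string_of_int 42)
  ```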
- 2026-05-09 Phase 6 — Printf.sprintf adds %i, %u (aliases of %d),
%x (lowercase hex), %X (uppercase hex), %o (octal) (+5 tests, 533
total). New host primitives `_int_to_hex_lower`, `_int_to_hex_upper`,
`_int_to_octal` build the digit string by repeated host
`floor (/ n base)` + `mod`. The Printf walker fans out specs to the
right host helper. Examples: `%x` 255 = "ff", `%X` 4096 = "1000",
`%o` 8 = "10", multi: `%x %X %o` 255 4096 8 = "ff 1000 10".
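  The repeated divide-and-mod loop, sketched in stock OCaml for non-negative inputs only (`digits_of_int` and the wrappers are hypothetical names, not the host primitives themselves):
  ```ocaml
  (* Build the digit string right to left: n mod base is the lowest
     digit, n / base is what remains. *)
  let digits_of_int base n =
    let digit d = String.make 1 ("0123456789abcdef".[d]) in
    let rec go n acc =
      let acc = digit (n mod base) ^ acc in
      if n < base then acc else go (n / base) acc
    in
    go n ""

  let hex_lower n = digits_of_int 16 n
  let hex_upper n = String.uppercase_ascii (digits_of_int 16 n)
  let octal n = digits_of_int 8 n
  ```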
- 2026-05-09 Phase 6 — List.sort upgraded from O(n²) insertion sort
to O(n log n) mergesort (+3 tests, 528 total). split + merge are
inner functions of sort; tuple destructuring on the split result is
expressed via nested match (pattern parser needs explicit
paren-wrapping of tuple patterns inside match arms in some places —
inline let-tuple destructuring on a match RHS would be cleaner if
multi-binding `let (a, b) = ...` were promoted, but this works
today). Should make sort-using baselines noticeably faster on
larger lists; existing sort_uniq automatically benefits.
- 2026-05-09 Phase 4 — integer `/` is now truncate-toward-zero on
ints, IEEE on floats. Both operands integral → host floor/ceil based
on sign; otherwise host `/`. Fixes `Int.rem` (which was returning 0
for `Int.rem 17 5` because `a / b` was producing a float). Also adds
Int.{max_int,min_int,zero,one,minus_one,succ,pred,neg,add,sub,mul,
div,rem,equal,compare} and global max_int/min_int/abs_float/
float_of_int/int_of_float (+5 tests, 525 total).
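  The floor/ceil-by-sign rule, sketched in stock OCaml by going through floats the way a float-dividing host would (`trunc_div` and `rem` are hypothetical names; precision is only adequate for ints that fit a double exactly):
  ```ocaml
  (* Divide as floats, then round toward zero: floor for non-negative
     quotients, ceil for negative ones. *)
  let trunc_div a b =
    let q = float_of_int a /. float_of_int b in
    int_of_float (if q >= 0.0 then floor q else ceil q)

  (* Remainder defined from the truncated quotient. *)
  let rem a b = a - b * trunc_div a b
  ```
  With a truncating quotient, `rem 17 5` comes out as 2 rather than 0, matching stock `Int.rem`.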
- 2026-05-09 Phase 6 — Array.sort/stable_sort/fast_sort + sub +
append + exists + for_all + mem (+5 tests, 520 total). All
delegate to the corresponding List operation on the cell's
underlying list (sort mutates by replacing the cell, the rest are
pure observers). Array round-trip via of_list → sort → to_list
works as expected.
- 2026-05-09 Phase 5.1 — brainfuck.ml baseline (subset interpreter,
five `+++++.` groups → cumulative 5+10+15+20+25 = 75). No loop
brackets — the interpreter only handles `> < + - .`, but that's
enough to exercise Array.make, arr.(i), arr.(i) <- v, prog.[!pc],
ref/!/:=, while loop with conditional update via nested if/else.
25 baseline programs total.
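  A sketch of such a subset interpreter in stock OCaml (not the actual brainfuck.ml; here `.` is assumed to add the current cell to a running total instead of printing, which is what makes the cumulative sum observable, and `run` is a hypothetical name):
  ```ocaml
  let run prog =
    let tape = Array.make 8 0 in
    let ptr = ref 0 in
    let pc = ref 0 in
    let out = ref 0 in
    while !pc < String.length prog do
      let c = prog.[!pc] in
      (* Conditional update via a nested if/else chain. *)
      if c = '>' then incr ptr
      else if c = '<' then decr ptr
      else if c = '+' then tape.(!ptr) <- tape.(!ptr) + 1
      else if c = '-' then tape.(!ptr) <- tape.(!ptr) - 1
      else if c = '.' then out := !out + tape.(!ptr);
      incr pc
    done;
    !out

  (* Five "+++++." groups: the cell reads 5, 10, 15, 20, 25. *)
  let total = run "+++++.+++++.+++++.+++++.+++++."
  ```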
- 2026-05-09 Phase 5.1 — sieve.ml baseline (Sieve of Eratosthenes,
count of primes ≤ 50 = 15). Stresses Array.make + arr.(i) +
arr.(i) <- v + nested for/while loops + `begin..end` block. 24
baseline programs total.
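  A stock-OCaml sketch of a sieve in the same shape (not the actual sieve.ml; `count_primes` is a hypothetical name):
  ```ocaml
  let count_primes n =
    let is_prime = Array.make (n + 1) true in
    is_prime.(0) <- false;
    if n >= 1 then is_prime.(1) <- false;
    for i = 2 to n do
      if is_prime.(i) then begin
        (* Cross out multiples of i, starting at i * i. *)
        let j = ref (i * i) in
        while !j <= n do
          is_prime.(!j) <- false;
          j := !j + i
        done
      end
    done;
    let count = ref 0 in
    for i = 0 to n do
      if is_prime.(i) then incr count
    done;
    !count
  ```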
- 2026-05-09 Phase 4 — `arr.(i)` and `arr.(i) <- v` array indexing
syntax (+3 tests, 515 total). parse-atom-postfix's `.(...)` branch
now disambiguates between let-open and array-get based on whether
the head is a module path (`:con` or a `:field` chain rooted in a
`:con`). Module paths still emit `(:let-open M EXPR)`; everything
else emits `(:array-get ARR I)`. Eval handles `:array-get` by
reading the cell's underlying list at index. The `<-` assignment
handler now also accepts `:array-get` lhs and rewrites the cell
with one position changed. Lets us write idiomatic OCaml array code:
let a = Array.make 5 0 in
for i = 0 to 4 do a.(i) <- i * i done;
a.(3) + a.(4) (* = 25 *)
- 2026-05-09 Phase 6 — Array module (ref-of-list backing) + (op)
operator sections (+6 tests, 512 total). Array implements
make/length/get/set/init/iter/iteri/map/mapi/fold_left/to_list/
of_list/copy/blit/fill in OCaml syntax in runtime.sx; backing is a
`ref of list` so set is O(n) but mutation works. (op) sections in
parse-atom: when the token after `(` is a binop and the next is
`)`, emit `(:fun ("a" "b") (:op OP a b))`, so `(+)` becomes `fun a b
-> a + b`. Recognises any binop in the precedence table including
`mod`, `land`, `^`, `@`, `::`, etc. Lets us write `List.fold_left
(+) 0 xs` and `((-) 10)` partial applications.
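  A sketch of the ref-of-list backing in stock OCaml, showing why `set` is O(n): it rebuilds the list and swaps it into the cell. The function names mirror the Array API, but the bodies are assumptions and not the runtime.sx source:
  ```ocaml
  (* An "array" is a mutable cell holding an immutable list. *)
  let make n v = ref (List.init n (fun _ -> v))

  let get a i = List.nth !a i

  (* Rebuild the list with position i changed, then replace the cell. *)
  let set a i v = a := List.mapi (fun j x -> if j = i then v else x) !a

  let to_list a = !a

  let demo =
    let a = make 5 0 in
    for i = 0 to 4 do set a i (i * i) done;
    get a 3 + get a 4
  ```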
- 2026-05-09 Phase 5.1 — csv.ml baseline (split on '\n' then ',',
parse-int the second field, fold-left). Exercises char escapes
inside string literals, two-stage String.split_on_char, mixed
List.fold_left + int_of_string + List.nth. Sums column 2 of a
4-row inline CSV → 1+2+3+4 = 10. 23 baseline programs total.
- 2026-05-09 Phase 4 — polymorphic variants confirmation (+3 tests,
506 total). The tokenizer was already classifying `` `Tag `` as a
ctor identical to a nominal one, but it had never been exercised by
tests. Now verified that nullary, n-ary, and list-of-polyvariants
patterns all match: `` `Foo ``, `` `Pair (5, 7) ``, `` [`On; `Off] ``.
Effectively free since OCaml-on-SX is dynamic — there's no
structural row inference, but matching by tag works.
- 2026-05-09 Phase 6 — List.sort_uniq / List.find_map (+2 tests, 503
total). sort_uniq sorts then dedups consecutive equals. find_map
walks until the user fn returns `Some v` and returns it (or `None`
on empty/all-None). Closes two of the most-asked-for list ops; both
defined in OCaml syntax in runtime.sx.
- 2026-05-09 Phase 6 — String.iter / iteri / fold_left / fold_right /
to_seq / of_seq (+3 tests, 501 total). All implemented in OCaml
syntax inside the runtime stdlib; iter / iteri walk via index +
side-effecting `f`, fold_left / fold_right thread an accumulator,
to_seq returns a char list, of_seq concats a char list back to a
string. Round-trip: `String.of_seq (List.rev (String.to_seq
"hello"))` → "olleh".
- 2026-05-09 Phase 5.1 — frequency.ml baseline + Format module alias
(+2 tests, 498 total). frequency.ml builds a Hashtbl of char→count
via `Hashtbl.find_opt` + `Hashtbl.replace` inside a `for` loop, then
uses `Hashtbl.fold` to find the maximum count. `count_chars
"abracadabra"` → max is 5 (a×5). Format module added as a thin
alias of Printf — sprintf / printf / asprintf all delegate.
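  A sketch of the counting pattern in stock OCaml (not the actual frequency.ml; it assumes `count_chars` returns the maximum count directly):
  ```ocaml
  let count_chars s =
    let tbl = Hashtbl.create 16 in
    for i = 0 to String.length s - 1 do
      let c = s.[i] in
      (* find_opt + replace: read the old count, write count + 1. *)
      let n = match Hashtbl.find_opt tbl c with Some n -> n | None -> 0 in
      Hashtbl.replace tbl c (n + 1)
    done;
    (* Fold over all bindings, keeping the largest count. *)
    Hashtbl.fold (fun _ n best -> max n best) tbl 0
  ```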
- 2026-05-09 Phase 4 — `lazy EXPR` + `Lazy.force` (+2 tests, 496
total). Tokenizer already had `lazy` as a keyword. parse-prefix now
emits `(:lazy EXPR)`; eval creates a one-element cell with state
`("Thunk" expr env)`. Host primitive `_lazy_force` flips the cell to
`("Forced" v)` on first call and returns the cached value on
subsequent calls. Memoization confirmed by tracking a side-effect
counter through two forces (counter increments only once).
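  The cell-flipping scheme can be modelled in stock OCaml. This sketch uses a closure where the evaluator stores an expression plus environment, so it approximates the mechanism, not the implementation; `make_lazy` and `force` are hypothetical names:
  ```ocaml
  type 'a state = Thunk of (unit -> 'a) | Forced of 'a

  let make_lazy f = ref (Thunk f)

  (* First force runs the thunk and flips the cell to Forced;
     later forces return the cached value. *)
  let force cell =
    match !cell with
    | Forced v -> v
    | Thunk f ->
      let v = f () in
      cell := Forced v;
      v

  (* Side-effect counter to observe memoization. *)
  let counter = ref 0
  let cell = make_lazy (fun () -> incr counter; 42)
  let first = force cell
  let second = force cell
  ```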
- 2026-05-09 Phase 6 — Hashtbl.iter / Hashtbl.fold (+2 tests, 494
total). New host primitive `_hashtbl_to_list` returns the entries
as a list of OCaml tuples (`("tuple" k v)` form, matching the AST
representation that pattern matching expects). Hashtbl.iter / fold
in runtime walk that list with the user fn. Closes a long-standing
gap: previously Hashtbl was opaque after writing to it.
- 2026-05-09 Phase 6 — Printf.sprintf with %d/%s/%f/%c/%b/%% (+4
tests) and global `string_of_int`/`string_of_float`/`string_of_bool`
(+1 test). 492 total. sprintf walks fmt char-by-char accumulating
a prefix; on a recognised spec it returns a one-arg fn that formats
the arg and recurses on the rest of fmt — naturally curries to the
right arity since the spec count drives the chain. Dynamic typing
lets us return either a string (no specs) or a function (≥1 spec)
from the same expression, which OCaml proper would reject.
Examples:
Printf.sprintf "x=%d" 42 = "x=42"
Printf.sprintf "%s = %d" "answer" 42 = "answer = 42"
Printf.sprintf "%d%%" 50 = "50%"
- 2026-05-09 Phase 4 — `assert EXPR` (+3 tests, 487 total). Tokenizer
already classified `assert` as a keyword; parse-prefix now handles
it like `not` (advance, recur, wrap). Eval evaluates the operand and
returns nil on truthy, raises `Assert_failure` on false (host-side
error so existing try/with handles it). `try (assert false; 0) with
_ -> 99` → 99.
- 2026-05-09 Phase 5.1 — levenshtein.ml baseline (recursive edit
distance, no memo). Sums distances for five short pairs:
("abc","abx")=1 + ("ab","ba")=2 + ("abc","axyc")=2 +
("","abcd")=4 + ("ab","")=2 = 11. Exercises curried four-arg
recursion + s.[i] equality test + min nested twice + mixed empty
string base cases.
- 2026-05-09 Phase 5.1 — caesar.ml baseline (ROT13 with String.init +
s.[i] + Char.code/chr). Side-quests:
(1) top-level `let r = expr in body` is now treated as an expression
decl when has-matching-in? returns true at the dispatcher. Slices via
skip-let-rhs-boundary which already opens depth on a leading let
with matching in;
(2) added String.make / String.init / String.map to runtime;
(3) bumped lib/ocaml/baseline/run.sh per-program timeout 240→480s
for headroom on contended hosts.
Test = `Char.code r.[0] + Char.code r.[4]` after ROT13 round-trip on
"hello" → 215 (h+o).
- 2026-05-08 Phase 4 — `s.[i]` string indexing syntax (+3 tests, 484
total). parse-atom-postfix now handles `.[expr]` after `.`,
emitting `(:string-get S I)`; eval reduces to host `(nth s i)`.
Pairs with the existing `M.(expr)` and `.field` postfixes — all three
share one dot loop. `let s = "hi" in for i = 0 to String.length s -
1 do n := !n + Char.code s.[i] done; !n` returns 209 (h+i).
- 2026-05-08 Phase 5.1 — roman.ml baseline (Roman numeral greedy
encoding). Side-quest: top-level `let () = expr` was unsupported by
ocaml-parse-program — now parse-decl-let recognises `()` as a unit
binding (`__unit_NN` synthetic name), matching the inner-let handling
in parse-let. roman.ml uses recursive pattern match on
`(int * string) list` greedy table + `List.fold_left + String.length`
to compute the cumulative length of 9 encoded numbers (44).
Bumped test.sh server timeout 180→360s for headroom on contended
systems.
- 2026-05-08 Phase 4 — `M.(expr)` local-open expression form (+3
tests, 481 total). Implemented in parse-atom-postfix: after
consuming `.`, if next token is `(`, parse the inner expression and
emit `(:let-open M EXPR)` instead of `:field`. Cleanly composes with
existing `:let-open` evaluator. `List.(length [1;2;3])` → 3,
`Option.(map (fun x -> x * 10) (Some 4))` → Some 40.
- 2026-05-08 Phase 4 — `let open M in body` local opens (+3 tests, 478
total). Parser detects `let open` as a separate let-form, parses M
as a path (Ctor(.Ctor)*), and emits `(:let-open PATH BODY)`. Eval
resolves the path to a module dict and merges its bindings into the
env for body evaluation. `let open List in map (fun x -> x * 2)
[1;2;3]` → `[2;4;6]`.
- 2026-05-08 Phase 4 — `:def-mut` / `:def-rec-mut` inside module
bodies (+2 tests, 475 total). `ocaml-eval-module` now handles
multi-binding `let .. and ..` decls. `module M = struct let rec a n =
... and b n = ... end` works.
- 2026-05-08 Phase 5.1 — bfs.ml baseline (20/20 pass). Graph
breadth-first search using Queue + Hashtbl visited-set + List.assoc_opt
+ List.iter. Returns the count of reachable nodes (6 for the demo
graph A→B→D→F, A→C→{D,E}, E→F).
- 2026-05-08 Phase 1 — type annotations on let-bindings and parens
expressions (+4 tests, 473 total). `let NAME [PARAMS] : T = expr`
and `(expr : T)` parse and skip the type source. Runtime no-op
(dynamic). Works in inline let, top-level let, and parenthesised
expressions: `let f (x : int) : int = x + 1 in f 41`.
- 2026-05-08 Phase 1+5.1 — type aliases + poly_stack baseline (+3
tests, 469 total + 19 baseline). Parser dispatch on the post-`=`
token: `|` or `Ctor` → sum, `{` → record, otherwise → alias (skip
to boundary). AST `(:type-alias NAME PARAMS)` with body discarded.
Runtime no-op. poly_stack.ml baseline exercises a functor whose
parameter has `type t = int` (type alias) + `let show : t ->
string`. Stack uses ref + module field lookup to format ints.
- 2026-05-08 Phase 2+3 — `try ... with | pat when GUARD -> body` guard
support (+3 tests, 467 total). parse-try mirrors match/function;
eval-try clause loop now dispatches on `case`/`case-when` and falls
through to next clause when guard is false.
- 2026-05-08 Phase 1+3 — `function | pat when GUARD -> body | …`
guard support (+3 tests, 464 total). `parse-function` mirrors the
match-clause when-handling.
- 2026-05-08 Phase 5.1 — anagrams.ml baseline (18/18 pass). Counts
anagram-equivalence groups via Hashtbl + List.sort + String.get +
for-loop. `["eat";"tea";"tan";"ate";"nat";"bat"]` → 3 groups.
- 2026-05-08 Phase 5.1 — lambda_calc.ml baseline (17/17 pass). Untyped
lambda calculus interpreter using two ADTs (`type term = Var | Abs |
App | Num`, `type value = VNum | VClos`), an env as `(string * value)
list`, and recursive eval. `(\x.\y.x) 7 99 = 7` end-to-end. Demonstrates
the substrate handles a non-trivial AST + closure-based evaluator
written in OCaml-on-SX.
- 2026-05-08 Phase 6 — Char predicates: is_digit/is_alpha/is_alnum/
is_whitespace/is_upper/is_lower (+7 tests, 461 total). All written
in OCaml in runtime.sx using Char.code + ASCII range checks.
- 2026-05-08 Phase 5 — HM for top-level `let..and..` decls (+3
tests, 454 total). `ocaml-type-of-program` now handles `:def-mut`
(sequential generalization) and `:def-rec-mut` (mutual recursion
with shared tvs) decls. Mutual `even`/`odd` and `map`/`length`
type-check at top level.
- 2026-05-08 Phase 5.1 — memo_fib.ml baseline (16/16 pass). Memoized
fibonacci using `Hashtbl.find_opt` + `Hashtbl.add`. fib(25) = 75025.
Demonstrates mutable dict semantics through the OCaml stdlib API.
- 2026-05-08 Phase 5.1 — queens.ml baseline (15/15 pass). 4-queens
count via recursive backtracking with `List.fold_left`. Returns 2
(the two solutions of 4-queens). Per-program timeout in run.sh
bumped to 240s — tree-walking interpreter is slow on heavy recursion
but correct. The substrate handles full backtracking + safe-check
recursion + list-driven candidate enumeration end-to-end.
- 2026-05-08 Phase 5.1 — mutable_record.ml baseline (14/14 pass).
Counter-style record with two mutable fields, bump function uses
`r.f <- v` to mutate. End-to-end validates type decl + record
literal + field access + field assignment + sequence operator.
- 2026-05-08 Phase 2 — mutable record fields `r.f <- v` (+4 tests, 451
total). `<-` added to op-table at level 1 (same as `:=`). Eval
short-circuits on `<-` to mutate the lhs's field via host SX
`dict-set!`. Tested with for-loop accumulator (`for i = 1 to 5 do
r.x <- r.x + i done`) and string-field reassignment. The `mutable`
keyword in record-type decls is parsed-and-discarded; runtime
semantics: every field is mutable.
- 2026-05-08 Phase 1+3 — record type declarations `type r = { x : int;
mutable y : string }` (+3 tests, 447 total). Parser dispatches on
`{` after `=` to parse field list (`mutable` keyword tracked).
AST: `(:type-def-record NAME PARAMS FIELDS)` with FIELDS each being
`(NAME)` or `(:mutable NAME)`. Runtime is no-op (records already
work as dynamic dicts). Field-type sources are skipped; HM type
registration deferred.
- 2026-05-08 Phase 5.1 — fizzbuzz.ml baseline (12/12 pass). Classic
fizzbuzz using ref-cell accumulator, for-loop, mod, if/else if chain,
String.concat, Int.to_string. Verifies output via String.length.
- 2026-05-08 Phase 2+6 — print primitives wired to host `display` (+2
tests, 444 total). `print_string` / `print_endline` / `print_int` /
`print_newline` now use SX `display` (no auto-newline) plus an
explicit `"\n"` for endline. Prior version referenced `print`/
`println` host primitives that don't exist. `let _ = expr ;;`
top-level decl works as expected (already supported by the
wildcard-param parser).
- 2026-05-08 Phase 5 — HM let-mut / let-rec-mut inference (+3 tests,
442 total). `ocaml-infer-let-mut` infers each rhs in the parent env
and generalizes sequentially; `ocaml-infer-let-rec-mut` pre-binds
all names with fresh tvs, infers each rhs against the joint env,
unifies, generalizes, then infers body. Mutual recursion works:
`let rec even n = ... and odd n = ... in even : Int -> Bool`.
- 2026-05-08 Phase 6 — Option/Result/Bytes extensions (+9 tests, 439
total). Option: join, to_result, some, none. Result: value, iter,
fold. Bytes: length, get, of_string, to_string, concat, sub (thin
alias of String — SX has no separate immutable byte type). Ordering
fix: Bytes module placed after String so its closures capture String
in scope.
- 2026-05-08 Phase 6 — `Stack` and `Queue` modules in OCaml (+5 tests,
430 total). Stack: ref-holding-list LIFO with push/pop/top/length/
is_empty/clear. Queue: amortised O(1) two-list `(front, back)` queue
with push/pop/length/is_empty/clear. Both written entirely in OCaml
via lib/ocaml/runtime.sx.
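  A sketch of the two-list queue in stock OCaml. It assumes a ref holding the `(front, back)` pair and a `failwith` on empty pop; the runtime's actual error behaviour is not stated above:
  ```ocaml
  let create () = ref ([], [])

  (* Push is O(1): cons onto the back list. *)
  let push x q =
    let (front, back) = !q in
    q := (front, x :: back)

  (* Pop takes from the front; when the front is empty, the back list
     is reversed once and becomes the new front (amortised O(1)). *)
  let pop q =
    match !q with
    | (x :: front, back) -> q := (front, back); x
    | ([], back) ->
      (match List.rev back with
       | [] -> failwith "Queue.pop: empty queue"
       | x :: front -> q := (front, []); x)

  let length q =
    let (front, back) = !q in
    List.length front + List.length back
  ```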
- 2026-05-08 Phase 5.1+1+2 — calc.ml baseline (11/11 pass) — a
recursive-descent calculator parsing `(1 + 2) * 3 + 4` to 13. Two
parser bugs fixed along the way: parse-let now handles inline
`let rec ... and ... in body` via new `:let-rec-mut` / `:let-mut`
AST shapes (eval supports both); `has-matching-in?` no longer stops
at `and` (which is internal to a let-rec, not a decl boundary). The
baseline exercises mutually-recursive functions, while-loops, and
ref-cell-driven imperative parsing.
- 2026-05-08 Phase 5.1 — word_count.ml baseline (10/10 pass). Uses
Map.Make(StrOrd) + List.fold_left to count word frequencies; tests
the full functor pipeline with a real OCaml idiom.
- 2026-05-08 Phase 6 — Map/Set extensions: iter/fold/map/filter/
is_empty + Set.union/inter (+4 tests, 422 total). Functor
bodies grow naturally — all in OCaml syntax.
- 2026-05-08 Phase 6 — `Map.Make` / `Set.Make` functors written in
OCaml (+4 tests, 418 total). Sorted association list / sorted list
backed (linear ops, but correct). Both take an `Ord` module supplying
`compare`. Tested: `module IntMap = Map.Make(IntOrd) ;; IntMap.find
` and `IntSet.elements (IntSet.add 3 (IntSet.add 1 …))` returning
`[1; 2; 3]`. Strong substrate-validation for the functor system —
Map.Make is a non-trivial functor implemented entirely on top of the
OCaml-on-SX evaluator.
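  A sketch of a sorted-association-list map functor in stock OCaml, to show the shape being validated. `Make_assoc_map` and `ORD` are hypothetical names and only a few operations are covered; this is not the runtime.sx source:
  ```ocaml
  module type ORD = sig
    type t
    val compare : t -> t -> int
  end

  module Make_assoc_map (Ord : ORD) = struct
    (* Invariant: bindings are kept sorted by key, no duplicates. *)
    type 'v t = (Ord.t * 'v) list

    let empty : 'v t = []

    (* Linear insert that preserves the sort order. *)
    let rec add k v m =
      match m with
      | [] -> [(k, v)]
      | (k', v') :: rest ->
        let c = Ord.compare k k' in
        if c < 0 then (k, v) :: m
        else if c = 0 then (k, v) :: rest
        else (k', v') :: add k v rest

    (* Linear lookup; can stop early once keys get larger. *)
    let rec find_opt k m =
      match m with
      | [] -> None
      | (k', v) :: rest ->
        let c = Ord.compare k k' in
        if c = 0 then Some v
        else if c < 0 then None
        else find_opt k rest

    let bindings m = m
  end

  module IntOrd = struct
    type t = int
    let compare (a : int) (b : int) = compare a b
  end

  module IntMap = Make_assoc_map (IntOrd)

  let m = IntMap.(add 2 "two" (add 1 "one" (add 3 "three" empty)))
  ```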
- 2026-05-08 Phase 6 — `Sys` module constants (+5 tests, 414 total).
os_type, word_size, max_array_length, max_string_length,
executable_name, big_endian, unix, win32, cygwin. Constants-only
for now; `argv`/`getenv_opt`/`command` need host platform integration.
- 2026-05-08 Phase 5 — parse simple type sources in user type-defs
(+3 tests, 409 total). `ocaml-hm-parse-type-src` recognises
primitive type names, tyvars `'a`, and `T list`/`T option`-style
parametric types. Replaces the old "default to Int" placeholder so
`type t = TStr of string` correctly registers `TStr : string -> t`.
Multi-arg / function types still fall back to a fresh tv.
- 2026-05-08 Phase 6 — String extensions: ends_with/contains/trim/
split_on_char/replace_all/index_of (+6 tests, 406 total). Wraps host
primitives via `_string_*` builtins.
- 2026-05-08 Phase 6 — `List.take/drop/filter_map/flat_map/concat_map`
(+6 tests, 400 total). Common functional helpers, all written in
OCaml. **400-test milestone.**
- 2026-05-08 Phase 1+3 — or-patterns `(P1 | P2 | ...)` parens-only
(+5 tests, 394 total). Parser: when `|` follows a pattern inside
parens, build `(:por ALT1 ALT2 ...)`. Eval: try alternatives, succeed
on the first match. Top-level `|` remains the clause separator (no
lookahead needed). Examples: `(1 | 2 | 3) -> ...`, `(Red | Green) -> 1`.
- 2026-05-08 Phase 4 — `module type S = sig … end` parser (+3 tests,
389 total). Signatures are parsed-and-discarded — sig..end balanced
skipping. AST: `(:module-type-def NAME)`. Runtime no-op (signatures
are type-level). Allows real OCaml code with module type decls to
parse and run without removing the sig blocks.
- 2026-05-08 Phase 1+3 — record patterns `{ f = pat; … }` (+4 tests,
386 total). Parser adds `(:precord (FIELD PAT) …)` alongside
the existing record-literal `{` handling. Eval matches against
dicts: required fields must be present and each pat must match the
value. Can mix with literals: `{ x = 1; y = y }` matches only when
x is 1.
- 2026-05-08 Phase 5.1 — expr_eval.ml baseline (9/9 pass). A tiny
arithmetic-expression evaluator using ADT (`type expr = Lit | Add |
Mul | Neg`) + recursive eval + pattern match — exercises the full
type-decl + ctor + match pipeline end-to-end. Per-program timeout
bumped to 120s in run.sh.
- 2026-05-08 Phase 3 — polymorphic variants `` `Tag `` (+4 tests, 382
total). Tokenizer recognises backtick followed by an upper ident,
tokenizing identically to nominal ctors. Parser and evaluator treat
them as ctors — same tagged-list runtime. Match patterns `` `Red ``
/ `` `Pair (a, b) `` work without any extra wiring. Proper row
types in HM deferred.
- 2026-05-08 Phase 6 — Float module: sqrt/sin/cos/pow/floor/ceil/round
+ pi constant (+6 tests, 378 total). Wraps host SX math primitives
via `_float_*` builtins.
- 2026-05-08 Phase 1+5+6 — Float arithmetic (`+.` `-.` `*.` `/.`)
(+5 tests, 372 total). Tokenizer recognises the dotted operators.
Parser table places them at int's level (7 / 8). Eval routes them
to host SX `+`/`-`/`*`/`/` (which works for both ints and floats).
HM types them `Float -> Float -> Float`; `1.5 +. 2.5 : Float`.
Float type added to formatter as a plain `Float` ctor.
- 2026-05-08 Phase 6 — List.combine/split/iter2/fold_left2/map2 (+4
tests, 367 total). Mechanical pair-walk OCaml implementations,
failwith on length mismatch (Stdlib proper raises `Invalid_argument` there). List module
now covers 30+ functions.
- 2026-05-08 Phase 5.1 — baseline expanded to 8 programs (8/8 pass).
Added: closures.ml (curried adders), quicksort.ml (recursive sort
on lists), exception_handle.ml (exception decl + raise + try/with).
All 8 programs together exercise let-rec, modules, refs, for-loops,
pattern matching, exceptions, lambdas, list functions, arithmetic.
Run.sh streamlined to one sx_server invocation per program (was
two). End-to-end runtime ≈2 min for the suite.
- 2026-05-08 Phase 5.1 — `lib/ocaml/baseline/` with five sample OCaml
programs (.ml files), driven by `lib/ocaml/baseline/run.sh` through
`ocaml-run-program (file-read F)`. All 5/5 pass: factorial,
list_ops, option_match, module_use (module + ref + closure +
sequenced calls), sum_squares (for-loop). To make module_use parse,
parser's `skip-let-rhs-boundary!` now looks ahead for a matching `in`
before any decl-keyword — distinguishes nested let-in from a new
top-level decl. Test 274 (`let x = 1 let y = 2`) still works because
its body has no inner `in`.
- 2026-05-08 Phase 5 — HM with user `type` declarations (+6 tests, 363
total). `ocaml-hm-ctors` is now a mutable list cell; user type-defs
register their constructors via `ocaml-hm-register-type-def!`. New
`ocaml-type-of-program` processes top-level decls in order: type-def
registers ctors, def/def-rec generalize, exception-def is a no-op,
expr returns its inferred type. Examples:
`type color = Red | Green | Blue;; Red : color`
`type shape = Circle of int | Square of int;; let area s = match s
with | Circle r -> r * r | Square s -> s * s;; area : shape -> Int`
Caveat: ctor arg types parsed as raw source strings; the registry
defaults to `int` for any single-arg ctor. Proper type-source parsing
is pending.
- 2026-05-08 Phase 5 — HM let-rec inference + `::`/`@` operator types
(+6 tests, 357 total). `ocaml-infer-let-rec` pre-binds the function
name to a fresh tv, infers rhs (which can recursively call name),
unifies inferred-rhs-type with the tv, generalizes, then infers
body. Builtin env now types `:: : 'a -> 'a list -> 'a list` and
`@ : 'a list -> 'a list -> 'a list`. Now `let rec fact …`,
`let rec map f xs = match xs with … h :: t -> f h :: map f t`, and
`let rec sum …` all infer correctly:
`fact : Int -> Int`
`len : 'a list -> Int`
`map : ('a -> 'b) -> 'a list -> 'b list`
`sum [1;2;3] : Int`
- 2026-05-08 Phase 5 — HM constructor inference for option/result (+7
tests, 351 total). `ocaml-hm-ctor-env` registers None/Some (`'a option`),
Ok/Error (`('a, 'b) result`). `:con NAME` instantiates the scheme;
`:pcon NAME ARG-PATS` walks arg patterns through the constructor's
arrow type, unifying each. Pretty-printer renders `Int option` and
`(Int, 'b) result`. Examples:
`fun x -> Some x : 'a -> 'a option`
`fun o -> match o with | None -> 0 | Some n -> n : Int option -> Int`
- 2026-05-08 Phase 5 — HM pattern-matching inference (+5 tests, 344
total). `ocaml-infer-pat` covers wild, var, lit, cons, list, tuple,
as. `ocaml-infer-match` unifies each clause's pattern type with the
scrutinee, runs the body in the env extended with pattern-bound vars,
and unifies all body types via a fresh result tv. Examples:
`fun lst -> match lst with | [] -> 0 | h :: _ -> h : Int list -> Int`.
Constructor patterns fall through to a fresh tv for now (need a ctor
type registry from `type` decls — pending).
- 2026-05-08 Phase 6 — `List.sort` + polymorphic `compare` (+7 tests,
339 total). `compare` is a host primitive that returns -1/0/1 like
Stdlib.compare, defers to host SX `<`/`>`. `List.sort` is implemented
in OCaml as insertion sort: O(n²) but correct, and passes all tests
including descending custom comparator and string sort.
- 2026-05-08 Phase 6 — `Hashtbl` (+6 tests, 332 total). Backing store is
a one-element list cell holding a SX dict; keys are coerced to
strings via `str` so any value type can serve as a key. API: create,
add, replace, find, find_opt, mem, length.
- 2026-05-08 Phase 5 — HM extensions for tuples and lists (+7 tests,
326 total). Tuple type `(hm-con "*" TYPES)`, list type `(hm-con
"list" (TYPE))`. `ocaml-infer-tuple` threads substitution through
each item; `ocaml-infer-list` unifies all elements with a fresh
`'a` (giving `'a list` for `[]`). Pretty-printer renders `Int * Int`
and `Int list` like real OCaml. `fun x y -> (x, y) : 'a -> 'b -> 'a
* 'b`. `fun x -> [x; x] : 'a -> 'a list`.
- 2026-05-08 Phase 6 — expanded stdlib slice (+15 tests, 319 total).
List: concat/flatten, init, find/find_opt, partition, mapi/iteri,
assoc/assoc_opt. Option: iter, fold, to_list. Result: get_ok,
get_error, map_error, to_option. Also fixed parser's
skip-to-boundary! to track `let..in` / `begin..end` / `struct..end`
/ `for/while..done` nesting via a depth counter so nested let
expressions inside top-level decl bodies don't trip over the
decl-boundary detector. Stdlib functions like `init` use `begin..end`
to make nested-let intent explicit.
- 2026-05-08 Phase 3 — `exception` declarations (+4 tests, 304 total).
`exception NAME [of TYPE]` parses to `(:exception-def NAME [ARG-SRC])`.
Runtime is a no-op: exception values are just tagged ctor values, so
the existing `raise`/`try`/`with` machinery works without any extra
wiring.
- 2026-05-08 Phase 3 — `type` declarations (+5 tests, 300 total). Parser
handles `type [PARAMS] NAME = | Ctor [of T1 [* T2]*] | ...`, with
optional `'a` or `('a, 'b)` type parameters. Argument types are
captured as raw source strings (treated opaquely at runtime). Runtime
is a no-op since ctor application + match already work dynamically.
300th test! Constructors `Red`/`Green`/`Blue` and `Circle of int` /
`Square of int` round-trip through parse + eval cleanly.
- 2026-05-08 Phase 3 — `as` aliases + `when` guards in match (+6 tests,
295 total). Parser: pattern parser wraps with `as ident` → `(:pas
PAT NAME)`. Match's `one` consumes optional `when GUARD-EXPR` → emits
`(:case-when PAT GUARD BODY)` instead of `:case`. Eval `:pas` matches
inner pattern then also binds the alias name; `case-when` checks the
guard after a successful match and falls through if false. Or-pat
`(P1 | P2)` deferred — ambiguous with clause separator without
parens-only support.
- 2026-05-08 Phase 1+2 — record literals `{ x = 1; y = 2 }` and
functional update `{ r with x = 99 }`. Parser produces `(:record (F E)
...)` and `(:record-update BASE-EXPR (F E) ...)`. Eval builds a dict
from field bindings; record-update merges over the base dict (the same
dict-based representation we already use for modules). Field access
via existing `:field` postfix. Record patterns deferred. 289/289 (+6).
- 2026-05-08 Phase 5.1 — `lib/ocaml/conformance.sh` + `scoreboard.json`
+ `scoreboard.md`. Classifies tests into 14 suites by description
prefix and emits structured pass/fail counts. Current: 284 pass / 0
fail (one test counted twice in classifier, hence 284 vs 283
underlying). Vendoring real OCaml testsuite is the next step but
needs more stdlib coverage to make .ml files runnable end-to-end.
- 2026-05-08 Phase 1 — unit `()` and wildcard `_` parameters in `let f ()
= …` / `fun _ -> …` / `let f _ = …`. Parser helper `try-consume-param!`
now handles ident, wildcard `_` (renamed to `__wild_N`), unit `()`
(renamed to `__unit_N`), and typed `(x : T)` (signature skipped).
Same for top-level `parse-decl-let`. test.sh timeout extended from
60s to 180s for the growing suite. 283/283 (+5).
- 2026-05-08 Phase 6 — extended stdlib slice (+13 tests, 278 total).
Host primitives exposed via `_string_*`, `_char_*`, `_int_*`,
`_string_of_*` underscore-prefixed builtins so the OCaml-side
`lib/ocaml/runtime.sx` modules can wrap them: String (length, get,
sub, concat, uppercase_ascii, lowercase_ascii, starts_with), Char
(code, chr, lowercase_ascii, uppercase_ascii), Int (to_string,
of_string, abs, max, min), Float.to_string, Printf stubs. Also added
`print_string`/`print_endline`/`print_int` builtins.
- 2026-05-08 Phase 5 — Hindley-Milner type inference, paired-sequencing
consumer of `lib/guest/hm.sx` (algebra) and `lib/guest/match.sx`
(unify). `lib/ocaml/infer.sx` ships Algorithm W rules for OCaml AST:
atoms, var (instantiate), fun (auto-curry through fresh-tv), app
(unify against arrow), let (generalize over rhs), if (unify branches),
neg/not, op (treat as app of builtin). The builtin env gives `+`/`-`/etc.
the monomorphic type int->int->int and `=`/`<>` the polymorphic type
'a->'a->bool.
Tested: literals, +1, identity polymorphism `'a -> 'a`, let-poly so
`let id = fun x -> x in id true : Bool`, `twice` infers
`('a -> 'a) -> 'a -> 'a`. Mandate satisfied: OCaml-on-SX is the
deferred second consumer for lib-guest Step 8. 265/265 (+14).
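
  The two polymorphism cases named in this entry can be checked against the native OCaml type checker. This is only a sketch of the expected typings, not the contents of `lib/ocaml/infer.sx`; the name `poly` is hypothetical.

  ```ocaml
  (* The explicit `'a.` annotation is accepted only if the definition is
     genuinely polymorphic, i.e. has type ('a -> 'a) -> 'a -> 'a. *)
  let twice : 'a. ('a -> 'a) -> 'a -> 'a = fun f x -> f (f x)

  (* Let-polymorphism: `id` is generalized at the `let`, so it can be
     used at bool and at int in the same body. A lambda-bound `id` would
     not type-check here. *)
  let poly =
    let id = fun x -> x in
    (id true, id 1)
  ```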
- 2026-05-08 Phase 2 — `let ... and ...` mutual recursion at top level.
Parser collects all bindings into a list, emitting `(:def-rec-mut)` or
`(:def-mut)` when there are 2+. Eval allocates a placeholder cell per
recursive binding, builds an env with all of them visible, then fills
the cells. Even/odd mutual-recursion test passes. 251/251 (+3).
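
  A rough model of the placeholder-cell approach, written in plain OCaml next to the surface form. The cell names and the use of `ref` are assumptions for illustration; the evaluator itself is written in SX and its cell representation may differ.

  ```ocaml
  (* Surface form: both names are visible in both bodies. *)
  let rec even n = if n = 0 then true else odd (n - 1)
  and odd n = if n = 0 then false else even (n - 1)

  (* Model of the evaluation strategy: allocate one placeholder per
     binding, then fill the cells with closures that read each other. *)
  let even_cell : (int -> bool) ref = ref (fun _ -> failwith "unfilled")
  let odd_cell : (int -> bool) ref = ref (fun _ -> failwith "unfilled")

  let () =
    even_cell := (fun n -> if n = 0 then true else !odd_cell (n - 1));
    odd_cell := (fun n -> if n = 0 then false else !even_cell (n - 1))
  ```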
- 2026-05-08 Phase 6 — `lib/ocaml/runtime.sx` minimal stdlib slice
written entirely in OCaml syntax: List (length, rev, rev_append, map,
filter, fold_left/right, append, iter, mem, for_all, exists, hd, tl,
nth), Option (map, bind, value, get, is_none, is_some), Result (map,
bind, is_ok, is_error). Loaded once via `ocaml-load-stdlib!`, cached
in `ocaml-stdlib-env`; `ocaml-run` and `ocaml-run-program` layer user
code on top via `ocaml-base-env`. That these modules are written in
OCaml (not SX) and parse + evaluate cleanly is a substrate-validation
win: every parser, eval, match, ref, and module path is proven by a
single nontrivial OCaml program. 248/248 (+23).
- 2026-05-08 Phase 4 — functors + module aliases (+5 tests, 225 total).
Parser: `module F (M) = struct DECLS end` → `(:functor-def NAME PARAMS
DECLS)`. `module N = expr` (where expr isn't `struct`) → `(:module-alias
NAME BODY-SRC)`. Functor params accept `(P)` or `(P : Sig)` (signatures
parsed-and-skipped). Eval: `ocaml-make-functor` builds a curried
host-SX closure that takes module dicts and returns a module dict;
`ocaml-resolve-module-path` extended for `:app` so `F(A)`, `F(A)(B)`,
`Outer.Inner` all resolve to dicts. Tested: 1-arg functor, 2-arg
curried `Pair(One)(Two)`, module alias, submodule alias, identity
functor with include. Phase 4 LOC ~290 (still well under 2000).
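
  A sketch of the curried two-argument functor case in plain OCaml. The application `Pair(One)(Two)` is named in this entry, but the module bodies and the signature `INT` are hypothetical. Native OCaml requires the `(P : Sig)` parameter form, so that is the form used.

  ```ocaml
  module type INT = sig
    val v : int
  end

  module One = struct let v = 1 end
  module Two = struct let v = 2 end

  (* Two parameters: applied one module at a time, like a curried function. *)
  module Pair (A : INT) (B : INT) = struct
    let sum = A.v + B.v
    let both = (A.v, B.v)
  end

  module P = Pair (One) (Two)

  (* Module alias: a second name for the same module. *)
  module Alias = P
  ```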
- 2026-05-08 Phase 4 — `open M` / `include M` (+5 tests, 220 total).
Parser: top-level `open Path` / `include Path` decls; path is `Ctor (.
Ctor)*`. Eval resolves the path via `ocaml-resolve-module-path` (the
same `:con`-as-module-lookup escape hatch used for `:field`); merges
the dict bindings into the current env via `ocaml-env-merge-dict`.
`include` inside a module also adds the bindings to the module's
resulting dict, so `module Sphere = struct include Math let area r =
... end` exposes both Math's `pi` and Sphere's `area`. Phase 4 LOC
cumulative: ~165.
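
  The `Sphere` example from this entry, filled out so that it runs under native OCaml. The entry elides the body of `area`; the formula below, the value of `pi`, and the helper `square` are stand-ins chosen for illustration, not the tested code.

  ```ocaml
  module Math = struct
    let pi = 3.14159
    let square x = x *. x
  end

  module Sphere = struct
    include Math                         (* re-exports pi and square *)
    let area r = 4.0 *. pi *. square r   (* hypothetical body *)
  end

  (* `open` brings the bindings into scope without re-exporting them. *)
  let unit_area =
    let open Sphere in
    area 1.0
  ```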
- 2026-05-08 Phase 4 — modules + field access (+11 tests, 215 total). Parser:
`module M = struct DECLS end` decl in `ocaml-parse-program`. Body parsed
by sub-tokenising the source between `struct` and the matching `end`,
tracking nesting via `struct`/`begin`/`sig`/`end`. Field access added
as a postfix layer above `parse-atom`, binding tighter than application:
`f r.x` → `(:app f (:field r "x"))`. Eval: `(:module-def NAME DECLS)`
builds a dict via new `ocaml-eval-module` that runs decls in a sub-env;
`(:field EXPR NAME)` looks up the field, with the special case that
`(:con NAME)` heads are interpreted as module-name lookups instead of
nullary ctors. Tested: simple module, multi-decl module, nested modules
(`Outer.Inner.v`), `let rec` inside a module, module containing tuple
pattern match. Phase 4 LOC: ~110 (well under 2000 budget).
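
  A small sketch of nested module paths and of field access binding tighter than application, in plain OCaml. `Outer.Inner.v` is named in this entry; the other names and values are hypothetical, and the record type declaration is needed only by the native compiler.

  ```ocaml
  module Outer = struct
    module Inner = struct
      let v = 7
    end
    let rec fact n = if n <= 1 then 1 else n * fact (n - 1)
  end

  type cell = { x : int }

  let double n = 2 * n
  let r = { x = 4 }

  (* Parses as double (r.x), not (double r).x *)
  let d = double r.x
  ```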
- 2026-05-08 Phase 2 — `try`/`with` + `raise` builtin. Parser produces
`(:try EXPR CLAUSES)`; eval delegates to SX `guard` with `else`
matching the raised value against clause patterns and re-raising on
no-match. `raise`/`failwith`/`invalid_arg` exposed as builtins;
failwith builds `("Failure" msg)` so `Failure msg -> ...` patterns
match. 204/204 (+6).
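
  Both behaviours described here (matching `Failure msg`, and re-raising when no clause matches) can be sketched in plain OCaml. The names `safe_div` and `reraised` are hypothetical, and `Not_found` is used only because native OCaml predefines it.

  ```ocaml
  (* failwith raises Failure msg, which the handler pattern picks apart. *)
  let safe_div a b =
    try Ok (if b = 0 then failwith "division by zero" else a / b)
    with Failure msg -> Error msg

  (* The inner handler has no clause for Not_found, so the exception
     propagates to the outer handler. *)
  let reraised =
    try
      (try raise Not_found with Failure _ -> "inner")
    with Not_found -> "outer"
  ```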
- 2026-05-08 Phase 2 — `function | pat -> body | …` parser + eval.
Sugar for `fun x -> match x with | …`. AST: `(:function CLAUSES)`
evaluated to a unary closure that runs `ocaml-match-clauses` on the
argument. `let rec` knot also triggers when rhs is `:function`, so
`let rec map f = function | [] -> [] | h::t -> f h :: map f t` works.
ocaml-match-eval refactored to share `ocaml-match-clauses` with the
function form. 198/198 (+4).
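
  The desugaring can be seen by writing `map` both ways in plain OCaml. The first definition is the one quoted in this entry; `map_explicit` is a hypothetical name for the expanded form.

  ```ocaml
  (* Sugared form, as quoted in the entry. *)
  let rec map f = function
    | [] -> []
    | h :: t -> f h :: map f t

  (* Equivalent explicit form: a unary fun whose body is a match. *)
  let rec map_explicit f = fun xs ->
    match xs with
    | [] -> []
    | h :: t -> f h :: map_explicit f t
  ```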
- 2026-05-08 Phase 2 — `for`/`while` loops. `(:for NAME LO HI DIR BODY)`
with `:ascend`/`:descend` direction (`to`/`downto`); `(:while COND BODY)`.
Both eval to unit and re-bind the loop var per iteration. 194/194 (+5).
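
  A sketch of the three loop forms in plain OCaml. All function names are hypothetical. The last definition shows what per-iteration rebinding means in native OCaml: each closure sees its own `i`. That the SX evaluator behaves identically for captured loop variables is an assumption based on the wording of this entry.

  ```ocaml
  (* Ascending for loop. *)
  let sum_to n =
    let acc = ref 0 in
    for i = 1 to n do acc := !acc + i done;
    !acc

  (* Descending for loop: conses n, n-1, ..., 1 in turn. *)
  let ascending_list n =
    let out = ref [] in
    for i = n downto 1 do out := i :: !out done;
    !out

  (* While loop: count integer halvings until the value reaches 1. *)
  let halvings n =
    let k = ref n and steps = ref 0 in
    while !k > 1 do k := !k / 2; incr steps done;
    !steps

  (* Each iteration binds a fresh i, so the closures differ. *)
  let captured =
    let thunks = ref [] in
    for i = 1 to 3 do thunks := (fun () -> i) :: !thunks done;
    List.map (fun f -> f ()) !thunks
  ```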
- 2026-05-08 Phase 2 — references (`ref`/`!`/`:=`). `ref` is a builtin
that boxes its argument in a one-element list (the mutable cell);
prefix `!` parses to `(:deref EXPR)` and reads `(nth cell 0)`; `:=`
joins the precedence table at the lowest binop level (right-assoc) and
short-circuits in eval to mutate via `set-nth!`. Closures capture refs
by sharing the underlying list. 189/189 (+6).
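
  The sharing behaviour can be sketched in plain OCaml with two closures over one cell. `make_counter`, `bump` and `peek` are hypothetical names. Native OCaml represents `ref` as a record with a mutable field rather than a one-element list, so only the observable behaviour is being illustrated.

  ```ocaml
  let make_counter () =
    let cell = ref 0 in
    (* := binds loosest: this is cell := ((!cell) + 1) *)
    let bump () = (cell := !cell + 1; !cell) in
    let peek () = !cell in
    (bump, peek)

  (* Both closures observe the same cell. *)
  let observed =
    let (bump, peek) = make_counter () in
    let a = bump () in
    let b = bump () in
    (a, b, peek ())
  ```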
- 2026-05-08 Phase 3 — pattern matching evaluator + constructors (+18
tests, 183 total). Constructor application: `(:app (:con NAME) arg)`
builds a tagged list `(NAME …args)` with tuple args flattened (so
`Pair (a, b)` → `("Pair" a b)` matches the parser's pattern flatten).
Standalone ctor `(:con NAME)` → `(NAME)` (nullary). Pattern matcher:
:pwild / :pvar / :plit (unboxed compare) / :pcon (head + arity match) /
:pcons (cons-decompose) / :plist (length+items) / :ptuple (after `tuple`
tag). Match drives clauses until first success; runtime error on
exhaustion. Tested with option match, literal match, tuple match,
recursive list functions (`len`, `sum`), nested ctor (`Pair(a,b)`).
Note: arity flattening happens for any tuple-arg ctor — without ADT
declarations there's no way to distinguish `Some (1,2)` (single tuple
payload) from `Pair (1,2)` (two-arg ctor). All-flatten convention is
consistent across parser + evaluator.
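
  For comparison, this is how native OCaml separates the two cases in the note, using the type declaration that the evaluator did not have at this point. The type `pair` and all function names are hypothetical. How the all-flatten convention treats a clause such as `Some p` applied to `Some (1, 2)` is not stated in this entry and is not claimed here.

  ```ocaml
  (* Two-argument constructor: the arguments must be matched separately. *)
  type pair = Pair of int * int

  let sum_pair = function Pair (a, b) -> a + b

  (* Some carries one payload that happens to be a tuple, so native
     OCaml can bind it whole. *)
  let swap_opt = function
    | Some p -> Some (snd p, fst p)
    | None -> None

  (* Recursive list functions of the kind named in the entry. *)
  let rec len = function [] -> 0 | _ :: t -> 1 + len t
  let rec sum = function [] -> 0 | h :: t -> h + sum t
  ```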
- 2026-05-08 Phase 2 — `lib/ocaml/eval.sx`: ocaml-eval + ocaml-run +
ocaml-run-program. Coverage: atoms, var lookup, :app (curried),
:op (arithmetic/comparison/boolean/^/mod/::/|>), :neg, :not, :if,
:seq, :tuple, :list, :fun (auto-curried host-SX closures), :let,
:let-rec (recursive knot via one-element-list mutable cell). Initial
env exposes `not`/`succ`/`pred`/`abs`/`max`/`min`/`fst`/`snd`/`ignore`
as host-SX functions. Tests: literals, arithmetic, comparison, boolean,
string concat, conditionals, lambda + closures + recursion (fact 5,
fib 10, sum 100), sequences, top-level program decls, |> pipe. 165/165
passing (+42).
- 2026-05-07 Phase 1 — sequence operator `;`. Lowest-precedence binary;
`e1; e2; e3` → `(:seq e1 e2 e3)`. Two-phase grammar: `parse-expr-no-seq`
is the prior expression entry point; new `parse-expr` wraps it with
`;` chaining. List-literal items still use `parse-expr-no-seq` so `;`
retains its separator role inside `[…]`. Match-clause bodies use the
seq variant and stop at `|`, matching real OCaml semantics. Trailing `;`
before `end`/`)`/`|`/`in`/`then`/`else`/eof is permitted. 123/123 tests
passing (+10).
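
  The three roles of `;` described here (sequencing, list separator, and sequencing inside a match-clause body that stops at `|`) can be sketched in plain OCaml. All names are hypothetical.

  ```ocaml
  let trace : string list ref = ref []
  let note s = trace := s :: !trace

  (* Sequencing: evaluate left to right, keep the last value. *)
  let result = (note "a"; note "b"; 42)

  (* Inside [...] the same token separates items. *)
  let items = [1; 2; 3]

  (* A clause body may be a sequence; it ends at the next `|`. *)
  let pick n =
    match n with
    | 0 -> note "zero"; "z"
    | _ -> note "other"; "o"
  ```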
- 2026-05-07 Phase 1 — `match`/`with` + pattern parser. Patterns: wildcard,
literal, var, ctor (nullary + with arg, with tuple-arg flattening so
`Pair (a, b)` → `(:pcon "Pair" PA PB)`), tuple, list literal, cons `::`
(right-assoc), parens, unit. Match clauses: leading `|` optional, body
parsed via `parse-expr`. AST: `(:match SCRUT CLAUSES)` where each clause
is `(:case PAT BODY)`. 113/113 tests passing (+9). Note: `parse-expr` is
used for case bodies; a following `| pat -> body` clause is still
reached after a complex body because `|` is not in the binop table for
level 1, so the body parser stops at it.
- 2026-05-07 Phase 1 — top-level program parser `ocaml-parse-program`. Parses
a sequence of `let [rec] name params* = expr` decls and bare expressions
separated by `;;`. Output `(:program DECLS)` with each decl one of `(:def …)`,
`(:def-rec …)`, `(:expr E)`. Decl bodies parsed by re-feeding the source
slice through `ocaml-parse` (cheap stand-in until shared-state refactor).
104/104 tests now passing (+9).
- 2026-05-07 Phase 1 — `lib/ocaml/parser.sx` expression parser consuming
`lib/guest/pratt.sx` for binop precedence (29 operators across 8 levels,
incl. keyword-spelled binops `mod`/`land`/`lor`/`lxor`/`lsl`/`lsr`/`asr`).
Atoms (literals + var/con/unit/list), application (left-assoc), prefix
`-`/`not`, tuples, parens, `if`/`then`/`else`, `fun x y -> body`,
`let`/`let rec` with function shorthand. AST shapes match Haskell-on-SX
conventions (`(:int N)` `(:op OP L R)` `(:fun PARAMS BODY)` etc.). Total
95/95 tests now passing via `lib/ocaml/test.sh`.
- 2026-05-07 Phase 1 — `lib/ocaml/tokenizer.sx` consuming `lib/guest/lex.sx`
via `prefix-rename`. Covers idents, ctors, 51 keywords, numbers (int / float
/ hex / exponent / underscored), strings (with escapes), chars (with escapes),
type variables (`'a`), nested block comments, and 26 operator/punct tokens
(incl. `->` `|>` `<-` `:=` `::` `;;` `@@` `<>` `&&` `||` `**` etc.). 58/58
tokenizer tests pass via `lib/ocaml/test.sh` driving `sx_server.exe`.
## Blockers
_(none yet)_