Defines:
type frac = { num : int; den : int }
let rec gcd a b = if b = 0 then a else gcd b (a mod b)
let make n d = (* canonicalise: gcd-reduce and
force den > 0 *)
let add x y = make (x.num * y.den + y.num * x.den) (x.den * y.den)
let mul x y = make (x.num * y.num) (x.den * y.den)
Test:
let r = add (make 1 2) (make 1 3) in (* 5/6 *)
let s = mul (make 2 3) (make 3 4) in (* 1/2 *)
let t = add r s in (* 5/6 + 1/2 = 4/3 *)
t.num + t.den (* = 7 *)
Exercises records, recursive gcd, mod, abs, and integer division. The
truncate-toward-zero semantics from iter 94 are essential here: with
float division, make would diverge from real OCaml's behaviour.
28 baseline programs total.
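A minimal runnable sketch of the program above. The body of make is elided in the message, so the version here is an assumption (reject a zero denominator, reduce by the gcd of the absolute values, move the sign onto the numerator), not necessarily the baseline's code:

```ocaml
type frac = { num : int; den : int }

let rec gcd a b = if b = 0 then a else gcd b (a mod b)

(* Assumed body: gcd-reduce and force den > 0. *)
let make n d =
  if d = 0 then invalid_arg "make: zero denominator";
  let g = gcd (abs n) (abs d) in
  let s = if d < 0 then -1 else 1 in
  { num = s * n / g; den = s * d / g }

let add x y = make (x.num * y.den + y.num * x.den) (x.den * y.den)
let mul x y = make (x.num * y.num) (x.den * y.den)

(* 5/6 + 1/2 = 4/3, so num + den = 7 *)
let t = add (add (make 1 2) (make 1 3)) (mul (make 2 3) (make 3 4))
let result = t.num + t.den
```

Integer division matters in make: s * n / g must be exact, which only holds when / truncates rather than producing a float.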
First baseline that exercises the functor pipeline end to end:
module IntOrd = struct
type t = int
let compare a b = a - b
end
module IntSet = Set.Make (IntOrd)
let unique_count xs =
let s = List.fold_left (fun s x -> IntSet.add x s) IntSet.empty xs in
IntSet.cardinal s
Counts unique elements in [3;1;4;1;5;9;2;6;5;3;5;8;9;7;9]:
{1,2,3,4,5,6,7,8,9} -> 9
The input has 15 elements with 9 unique values. The 'type t = int'
declaration in IntOrd is required by real OCaml; OCaml-on-SX is
dynamic and would accept it without, but we include it for source
fidelity. 27 baseline programs total.
User-implemented mergesort that exercises features added across the
last few iterations:
let rec split lst = match lst with
| x :: y :: rest ->
let (a, b) = split rest in (* iter 98 let-tuple destruct *)
(x :: a, y :: b)
| ...
let rec merge xs ys = match xs with
| x :: xs' ->
match ys with (* nested match-in-match *)
| y :: ys' -> ...
...
List.fold_left (+) 0 (sort [...]) (* iter 89 (op) section *)
Sum of [3;1;4;1;5;9;2;6;5;3;5] = 44 regardless of order, so the
result doubles as a smoke test of correctness: if the sort drops or
duplicates an element, the sum diverges. 26 baseline programs total.
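One way the elided cases could be filled in, as a sketch. The base cases of split and merge and the comparison x <= y are assumptions, since the message only shows the shape of each function:

```ocaml
(* Deal elements alternately into two lists. *)
let rec split lst = match lst with
  | x :: y :: rest ->
    let (a, b) = split rest in
    (x :: a, y :: b)
  | [x] -> ([x], [])
  | [] -> ([], [])

(* Merge two sorted lists; the inner match is parenthesised so its
   arms do not swallow the outer ones. *)
let rec merge xs ys = match xs with
  | [] -> ys
  | x :: xs' ->
    (match ys with
     | [] -> xs
     | y :: ys' ->
       if x <= y then x :: merge xs' ys else y :: merge xs ys')

let rec sort lst = match lst with
  | [] | [_] -> lst
  | _ ->
    let (a, b) = split lst in
    merge (sort a) (sort b)

let sorted = sort [3; 1; 4; 1; 5; 9; 2; 6; 5; 3; 5]
let total = List.fold_left (+) 0 sorted
```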
Five '+++++.' groups, cumulative accumulator 5+10+15+20+25 = 75.
This is a brainfuck *subset* — only > < + - . (no [ ] looping). That's
intentional: the goal is to stress imperative idioms that the recently
added Array module + array indexing syntax + s.[i] make ergonomic, all
in one program.
Exercises:
Array.make 256 0
arr.(!ptr)
arr.(!ptr) <- arr.(!ptr) + 1
prog.[!pc]
ref / ! / :=
while + nested if/else if/else if for op dispatch
25 baseline programs total.
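A sketch of an interpreter for this subset. It assumes '.' adds the current cell to a running accumulator (the message says "cumulative accumulator" but does not show the code), and the name run is hypothetical:

```ocaml
(* Interpret the loop-free subset > < + - . and return the sum of all
   values "printed" by '.'. *)
let run prog =
  let arr = Array.make 256 0 in
  let ptr = ref 0 in
  let pc = ref 0 in
  let acc = ref 0 in
  while !pc < String.length prog do
    let op = prog.[!pc] in
    if op = '>' then ptr := !ptr + 1
    else if op = '<' then ptr := !ptr - 1
    else if op = '+' then arr.(!ptr) <- arr.(!ptr) + 1
    else if op = '-' then arr.(!ptr) <- arr.(!ptr) - 1
    else if op = '.' then acc := !acc + arr.(!ptr);
    pc := !pc + 1
  done;
  !acc

(* Five '+++++.' groups: the cell holds 5, 10, 15, 20, 25 at each '.' *)
let result = run "+++++.+++++.+++++.+++++.+++++."
```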
Counts primes <= 50, expected 15.
Stresses the recently-added Array module + the new array-indexing
syntax together with nested control flow:
let sieve = Array.make (n + 1) true in
sieve.(0) <- false;
sieve.(1) <- false;
for i = 2 to n do
if sieve.(i) then begin
let j = ref (i * i) in
while !j <= n do
sieve.(!j) <- false;
j := !j + i
done
end
done;
...
Exercises: Array.make, arr.(i), arr.(i) <- v, nested for/while,
begin..end blocks, ref/!/:=, integer arithmetic. 24 baseline
programs total.
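The counting step is elided above. A complete sketch, assuming the count is taken with Array.fold_left over the finished sieve (the function name count_primes is hypothetical, and n >= 1 is assumed so that index 1 exists):

```ocaml
let count_primes n =
  let sieve = Array.make (n + 1) true in
  sieve.(0) <- false;
  sieve.(1) <- false;
  for i = 2 to n do
    if sieve.(i) then begin
      (* cross out multiples of i, starting at i * i *)
      let j = ref (i * i) in
      while !j <= n do
        sieve.(!j) <- false;
        j := !j + i
      done
    end
  done;
  Array.fold_left (fun c is_prime -> if is_prime then c + 1 else c) 0 sieve

let result = count_primes 50
```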
Inline CSV-like text:
a,1,extra
b,2,extra
c,3,extra
d,4,extra
Two-stage String.split_on_char: first on '\n' for rows, then on ','
for fields per row. List.fold_left accumulates int_of_string of the
second field across rows. Result = 1+2+3+4 = 10.
Exercises char escapes inside string literals ('\n'), nested
String.split_on_char, List.fold_left with a non-trivial closure body,
and int_of_string. 23 baseline programs total.
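A sketch of the two-stage split described above. The handling of rows with fewer than two fields (skip them) is an assumption:

```ocaml
let csv = "a,1,extra\nb,2,extra\nc,3,extra\nd,4,extra"

(* Stage 1 splits on '\n' into rows, stage 2 splits each row on ','. *)
let total =
  List.fold_left
    (fun acc row ->
       match String.split_on_char ',' row with
       | _ :: n :: _ -> acc + int_of_string n
       | _ -> acc)
    0
    (String.split_on_char '\n' csv)
```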
frequency.ml exercises the recently-added Hashtbl.iter / fold +
Hashtbl.find_opt + s.[i] indexing + for-loop together: build a
char-count table for 'abracadabra' then take the max via
Hashtbl.fold. Expected = 5 (a x 5). Total 25 baseline programs.
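A minimal sketch of the char-count-then-max idea, using only the calls named above. The function name max_freq is hypothetical:

```ocaml
let max_freq s =
  let tbl = Hashtbl.create 16 in
  for i = 0 to String.length s - 1 do
    let c = s.[i] in
    (match Hashtbl.find_opt tbl c with
     | Some n -> Hashtbl.replace tbl c (n + 1)
     | None -> Hashtbl.add tbl c 1)
  done;
  (* take the largest count over all entries *)
  Hashtbl.fold (fun _ n best -> max n best) tbl 0

let result = max_freq "abracadabra"
```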
Format module added as a thin alias of Printf — sprintf, printf, and
asprintf all delegate to Printf.sprintf. The dynamic runtime doesn't
distinguish boxes/breaks, so format strings work the same as in
Printf and most Format-using OCaml programs now compile.
Recursive Levenshtein edit distance with no memoization (the test
strings are short enough for the exponential-without-memo version to
fit in <2 minutes on contended hosts). Sums distances for five short
pairs:
('abc','abx') + ('ab','ba') + ('abc','axyc') + ('','abcd') + ('ab','')
= 1 + 2 + 2 + 4 + 2 = 11
Exercises:
* curried four-arg recursion
* s.[i] equality test (char comparison)
* min nested twice for the three-way recurrence
* mixed empty-string base cases
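A sketch of the un-memoized recurrence, assuming the four curried arguments are the two strings plus prefix lengths i and j (the names lev and distance are hypothetical):

```ocaml
(* Edit distance between the first i chars of s and the first j of t. *)
let rec lev s t i j =
  if i = 0 then j
  else if j = 0 then i
  else
    let cost = if s.[i - 1] = t.[j - 1] then 0 else 1 in
    min
      (min (lev s t (i - 1) j + 1) (lev s t i (j - 1) + 1))
      (lev s t (i - 1) (j - 1) + cost)

let distance s t = lev s t (String.length s) (String.length t)

let total =
  distance "abc" "abx" + distance "ab" "ba" + distance "abc" "axyc"
  + distance "" "abcd" + distance "ab" ""
```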
Side-quests required to land caesar.ml:
1. Top-level 'let r = expr in body' is now an expression decl, not a
broken decl-let. ocaml-parse-program's dispatch now checks
has-matching-in? at every top-level let; if matched, slices via
skip-let-rhs-boundary (which already opens depth on a leading let
with matching in) and ocaml-parse on the slice, wrapping as :expr.
2. runtime.sx: added String.make / String.init / String.map. Used by
caesar.ml's encode = String.init n (fun i -> shift_char s.[i] k).
3. baseline run.sh per-program timeout 240->480s (system load on the
shared host frequently exceeds 240s for large baselines).
caesar.ml exercises:
* the new top-level let-in expression dispatch
* s.[i] string indexing
* Char.code / Char.chr round-trip math
* String.init with a closure that captures k
Test value: Char.code r.[0] + Char.code r.[4] after ROT13(ROT13('hello')) = 104 + 111 = 215.
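A sketch of the test value computation. The body of shift_char is an assumption (rotate lowercase letters only, leave everything else unchanged); only the String.init call shape comes from the message:

```ocaml
(* Assumed: rotate 'a'..'z' by k, pass other characters through. *)
let shift_char c k =
  if c >= 'a' && c <= 'z' then
    Char.chr ((Char.code c - Char.code 'a' + k) mod 26 + Char.code 'a')
  else c

let encode s k =
  String.init (String.length s) (fun i -> shift_char s.[i] k)

(* ROT13 applied twice is the identity on lowercase text. *)
let r = encode (encode "hello" 13) 13
let result = Char.code r.[0] + Char.code r.[4]
```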
Side-quest emerged from adding roman.ml baseline (Roman numeral greedy
encoding): top-level 'let () = expr' was unsupported because
ocaml-parse-program's parse-decl-let consumed an ident strictly. Now
parse-decl-let recognises a leading '()' as a unit binding and
synthesises a __unit_NN name (matching how parse-let already handles
inner-let unit patterns).
roman.ml exercises:
* tuple list literal [(int * string); ...]
* recursive pattern match on tuple-cons
* String.length + List.fold_left
* the new top-level let () support (sanity in a comment, even though
the program ends with a bare expression for the test harness)
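A sketch of greedy Roman encoding over a tuple list, matching the features listed. The table contents and the names encode and roman are assumptions, and the sample inputs are illustrative rather than the baseline's:

```ocaml
let table =
  [ (1000, "M"); (900, "CM"); (500, "D"); (400, "CD"); (100, "C");
    (90, "XC"); (50, "L"); (40, "XL"); (10, "X"); (9, "IX"); (5, "V");
    (4, "IV"); (1, "I") ]

(* Greedy: emit the largest symbol that still fits, else move on. *)
let rec encode n pairs = match pairs with
  | [] -> ""
  | (v, s) :: rest ->
    if n >= v then s ^ encode (n - v) pairs else encode n rest

let roman n = encode n table

(* Reduce several encodings to one integer via String.length. *)
let total_len =
  List.fold_left (fun acc n -> acc + String.length (roman n)) 0 [1994; 4; 9]

let () = ignore total_len
```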
Bumped lib/ocaml/test.sh server timeout 180->360s — the recent surge in
test count plus a CPU-contended host was crowding out the sole epoch
reaching the deeper smarts.
Graph BFS using Queue + Hashtbl visited-set + List.assoc_opt + List.iter.
Returns 6 for a graph where A reaches B/C/D/E/F. Demonstrates three
stdlib modules (Queue, Hashtbl, List) cooperating in a real algorithm.
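A sketch of such a BFS. The adjacency list below is hypothetical (the message does not give the edges), chosen so that all six nodes are reachable from A:

```ocaml
(* Hypothetical graph; E and F have no entry, exercising the None case. *)
let graph =
  [ ("A", ["B"; "C"]); ("B", ["D"]); ("C", ["E"]); ("D", ["F"]) ]

(* Count the nodes reachable from start, including start itself. *)
let bfs start =
  let visited = Hashtbl.create 16 in
  let q = Queue.create () in
  Hashtbl.add visited start true;
  Queue.push start q;
  let count = ref 0 in
  while not (Queue.is_empty q) do
    let node = Queue.pop q in
    incr count;
    let nbrs =
      match List.assoc_opt node graph with
      | Some l -> l
      | None -> []
    in
    List.iter
      (fun n ->
         if not (Hashtbl.mem visited n) then begin
           Hashtbl.add visited n true;
           Queue.push n q
         end)
      nbrs
  done;
  !count

let result = bfs "A"
```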
Parser: in parse-decl-type, dispatch on the post-= token:
'|' or Ctor -> sum type
'{' -> record type
otherwise -> type alias (skip to boundary)
AST (:type-alias NAME PARAMS) with body discarded. Runtime no-op since
SX has no nominal types.
poly_stack.ml baseline exercises:
module type ELEMENT = sig type t val show : t -> string end
module IntElem = struct type t = int let show x = ... end
module Make (E : ELEMENT) = struct ... use E.show ... end
module IntStack = Make(IntElem)
Demonstrates the substrate handles signature decls + abstract types +
functor parameter with sig constraint.
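A sketch of what the elided bodies might contain. The stack operations (empty, push, show) and the use of string_of_int are assumptions; only the module skeleton comes from the message:

```ocaml
module type ELEMENT = sig
  type t
  val show : t -> string
end

module IntElem = struct
  type t = int
  let show x = string_of_int x
end

(* A list-backed stack whose rendering goes through E.show. *)
module Make (E : ELEMENT) = struct
  type t = E.t list
  let empty : t = []
  let push (x : E.t) (s : t) : t = x :: s
  let show (s : t) = String.concat "," (List.map E.show s)
end

module IntStack = Make (IntElem)

let rendered = IntStack.(show (push 3 (push 2 (push 1 empty))))
```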
Group anagrams by canonical (sorted-chars) key using Hashtbl +
List.sort. Demonstrates char-by-char traversal via String.get + for-loop +
ref accumulator + Hashtbl as a multi-valued counter.
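A sketch of the grouping, assuming the key is built by collecting characters with String.get into a ref list and sorting them. The names key and group and the sample words are hypothetical:

```ocaml
(* Canonical key: the word's characters in sorted order. *)
let key w =
  let chars = ref [] in
  for i = 0 to String.length w - 1 do
    chars := String.get w i :: !chars
  done;
  let sorted = List.sort compare !chars in
  String.concat "" (List.map (String.make 1) sorted)

(* Map each key to the list of words that share it. *)
let group words =
  let tbl = Hashtbl.create 16 in
  List.iter
    (fun w ->
       let k = key w in
       let prev =
         match Hashtbl.find_opt tbl k with
         | Some l -> l
         | None -> []
       in
       Hashtbl.replace tbl k (w :: prev))
    words;
  tbl

let groups = group ["eat"; "tea"; "tan"; "ate"; "nat"; "bat"]
```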
Untyped lambda calculus interpreter inside OCaml-on-SX:
type term = Var of string | Abs of string * term | App of term * term | Num of int
type value = VNum of int | VClos of string * term * env
let rec eval env t = match t with ...
(\x.\y.x) 7 99 = 7. The substrate handles two ADTs, recursive eval,
closure-based env, and pattern matching all written as a single
self-contained OCaml program — strong validation.
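A sketch of a call-by-value evaluator along these lines. The constructor payloads for Var and App, the association-list env, and the error case are assumptions, since the message elides them:

```ocaml
type term =
  | Var of string
  | Abs of string * term
  | App of term * term
  | Num of int

type value =
  | VNum of int
  | VClos of string * term * env
and env = (string * value) list

let rec eval env t = match t with
  | Num n -> VNum n
  | Var x -> List.assoc x env
  | Abs (x, body) -> VClos (x, body, env)
  | App (f, a) ->
    (match eval env f with
     | VClos (x, body, cenv) -> eval ((x, eval env a) :: cenv) body
     | VNum _ -> failwith "cannot apply a number")

(* (\x.\y.x) 7 99 *)
let k = Abs ("x", Abs ("y", Var "x"))
let result = eval [] (App (App (k, Num 7), Num 99))
```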
Memoized fibonacci using Hashtbl.find_opt + Hashtbl.add.
fib(25) = 75025. Demonstrates mutable Hashtbl through the OCaml
stdlib API in real recursive code.
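A minimal sketch of the memoization pattern, assuming one global table keyed by n (the name memo is hypothetical):

```ocaml
let memo : (int, int) Hashtbl.t = Hashtbl.create 64

let rec fib n =
  if n < 2 then n
  else
    match Hashtbl.find_opt memo n with
    | Some v -> v
    | None ->
      let v = fib (n - 1) + fib (n - 2) in
      Hashtbl.add memo n v;
      v

let result = fib 25
```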
4-queens via recursive backtracking + List.fold_left. Returns 2 (the
two solutions of 4-queens). Per-program timeout in run.sh bumped to
240s — the tree-walking interpreter is slow on heavy recursion but
correct.
The substrate handles full backtracking + safe-check recursion +
list-driven candidate enumeration end-to-end.
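A sketch of the backtracking count. The board representation (a list of column indices, most recent row first) and the names safe and solve are assumptions:

```ocaml
(* q is safe if no earlier queen shares its column or a diagonal;
   d is the row distance to each earlier queen. *)
let safe q placed =
  let rec go d = function
    | [] -> true
    | c :: rest -> c <> q && abs (c - q) <> d && go (d + 1) rest
  in
  go 1 placed

(* Count the ways to complete the board from the given row. *)
let rec solve n row placed =
  if row = n then 1
  else
    List.fold_left
      (fun acc col ->
         if safe col placed then acc + solve n (row + 1) (col :: placed)
         else acc)
      0
      (List.init n (fun i -> i))

let result = solve 4 0 []
```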
Counter-style record with two mutable fields. Validates the new
r.f <- v field mutation end-to-end through type decl + record literal
+ field access + field assignment + sequence operator.
type counter = { mutable count : int; mutable last : int }
let bump c = c.count <- c.count + 1 ; c.last <- c.count
After 5 bumps: count=5, last=5, sum=10.
Polymorphic binary search tree with insert + in-order traversal.
Exercises parametric ADT (type 'a tree = Leaf | Node of 'a * 'a tree
* 'a tree), recursive match, List.append, List.fold_left.
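A sketch of the tree operations, assuming duplicates are ignored on insert. The sample keys are illustrative:

```ocaml
type 'a tree = Leaf | Node of 'a * 'a tree * 'a tree

let rec insert x t = match t with
  | Leaf -> Node (x, Leaf, Leaf)
  | Node (v, l, r) ->
    if x < v then Node (v, insert x l, r)
    else if x > v then Node (v, l, insert x r)
    else t

(* Left subtree, then the node's value, then the right subtree. *)
let rec inorder t = match t with
  | Leaf -> []
  | Node (v, l, r) -> List.append (inorder l) (v :: inorder r)

let tree = List.fold_left (fun t x -> insert x t) Leaf [5; 3; 8; 1; 4; 3]
let sorted = inorder tree
```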
Classic fizzbuzz using ref-cell accumulator, for-loop, mod, if/else if
chain, String.concat, Int.to_string. Output verified via String.length
of the comma-joined result for n=15: 57.
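A sketch of the program shape described, assuming the accumulator is a reversed string list joined at the end:

```ocaml
let fizzbuzz n =
  let acc = ref [] in
  for i = 1 to n do
    let s =
      if i mod 15 = 0 then "FizzBuzz"
      else if i mod 3 = 0 then "Fizz"
      else if i mod 5 = 0 then "Buzz"
      else Int.to_string i
    in
    acc := s :: !acc
  done;
  String.concat "," (List.rev !acc)

let result = String.length (fizzbuzz 15)
```

For n=15 the 15 items contribute 43 characters and the 14 commas the rest, giving 57.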
Recursive-descent calculator parses '(1 + 2) * 3 + 4' = 13. Two parser
bugs fixed:
1. parse-let now handles inline 'let rec a () = ... and b () = ... in
body' via new (:let-rec-mut BINDINGS BODY) and (:let-mut BINDINGS
BODY) AST shapes; eval handles both.
2. has-matching-in? lookahead no longer stops at 'and' — 'and' is
internal to let-rec, not a decl boundary. Without this fix, the
inner 'let rec a () = ... and b () = ...' inside a let-decl rhs
would have been treated as the start of a new top-level decl.
Baseline exercises mutually-recursive functions, while-loops, ref-cell
imperative parsing, and ADT-based AST construction.
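A sketch of a calculator with the same ingredients: an inline 'let rec ... and ... in' group, while-loops, a ref-cell cursor, and an ADT for the parse tree. The grammar, constructor names, and helper names are assumptions, not the baseline's code:

```ocaml
type expr = Num of int | Add of expr * expr | Mul of expr * expr

let parse s =
  let n = String.length s in
  let pos = ref 0 in
  let skip () = while !pos < n && s.[!pos] = ' ' do incr pos done in
  let peek () = skip (); if !pos < n then Some s.[!pos] else None in
  (* expr := term ('+' term)*   term := factor ('*' factor)* *)
  let rec expr () =
    let lhs = ref (term ()) in
    while peek () = Some '+' do
      incr pos;
      lhs := Add (!lhs, term ())
    done;
    !lhs
  and term () =
    let lhs = ref (factor ()) in
    while peek () = Some '*' do
      incr pos;
      lhs := Mul (!lhs, factor ())
    done;
    !lhs
  and factor () =
    match peek () with
    | Some '(' ->
      incr pos;
      let e = expr () in
      (match peek () with
       | Some ')' -> incr pos; e
       | _ -> failwith "expected ')'")
    | Some c when c >= '0' && c <= '9' ->
      let v = ref 0 in
      while !pos < n && s.[!pos] >= '0' && s.[!pos] <= '9' do
        v := !v * 10 + (Char.code s.[!pos] - Char.code '0');
        incr pos
      done;
      Num !v
    | _ -> failwith "unexpected token"
  in
  expr ()

let rec eval e = match e with
  | Num n -> n
  | Add (a, b) -> eval a + eval b
  | Mul (a, b) -> eval a * eval b

let result = eval (parse "(1 + 2) * 3 + 4")
```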
Uses Map.Make(StrOrd) + List.fold_left to count word frequencies;
exercises the full functor pipeline with a real-world idiom:
let inc_count m word =
match StrMap.find_opt word m with
| None -> StrMap.add word 1 m
| Some n -> StrMap.add word (n + 1) m
let count words = List.fold_left inc_count StrMap.empty words
10/10 baseline programs pass.
A tiny arithmetic-expression evaluator using:
type expr = Lit of int | Add of expr*expr | Mul of expr*expr | Neg of expr
let rec eval e = match e with | Lit n -> n | Add (a,b) -> ...
Exercises type-decl + multi-arg ctor + recursive match end-to-end.
Per-program timeout in run.sh bumped to 120s.
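A sketch completing the elided match arms in the obvious way. The sample expression is illustrative and not taken from the baseline:

```ocaml
type expr =
  | Lit of int
  | Add of expr * expr
  | Mul of expr * expr
  | Neg of expr

let rec eval e = match e with
  | Lit n -> n
  | Add (a, b) -> eval a + eval b
  | Mul (a, b) -> eval a * eval b
  | Neg a -> - (eval a)

(* (2 + 3) * -(4) *)
let result = eval (Mul (Add (Lit 2, Lit 3), Neg (Lit 4)))
```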
lib/ocaml/baseline/{factorial,list_ops,option_match,module_use,sum_squares}.ml
exercised through ocaml-run-program (file-read F). lib/ocaml/baseline/
run.sh runs them and compares against expected.json — all 5 pass.
To make module_use.ml (with nested let-in) parse, parser's
skip-let-rhs-boundary! now uses has-matching-in? lookahead: a let at
depth 0 in a let-decl rhs opens a nested block IFF a matching in
exists before any decl-keyword. Without that in, the let is a new
top-level decl (preserves test 274 'let x = 1 let y = 2').
This is the first piece of Phase 5.1 'vendor a slice of OCaml
testsuite' — handcrafted fixtures for now, real testsuite TBD.