Iterative Levenshtein DP with rolling 1D arrays for O(min(m,n))
space. Distances:
kitten -> sitting : 3
saturday -> sunday : 3
abc -> abc : 0
"" -> abcde : 5
intention -> execution : 5
----------------------------
total : 16
Complementary to the existing levenshtein.ml which uses the
exponential recursive form (only sums tiny strings); this one is
the practical iterative variant used for real ED.
Tests the recently-fixed <- with bare `if` rhs:
curr.(j) <- (if m1 < c then m1 else c) + 1
153 baseline programs total.
Array-backed binary min-heap with explicit size tracking via ref:
let push a size x =
a.(!size) <- x; size := !size + 1; sift_up a (!size - 1)
let pop a size =
let m = a.(0) in
size := !size - 1;
a.(0) <- a.(!size);
sift_down a !size 0;
m
Push [9;4;7;1;8;3;5;2;6], pop nine times -> 1,2,3,4,5,6,7,8,9.
Fold-as-decimal: ((((((((1*10+2)*10+3)*10+4)*10+5)*10+6)*10+7)*10+8)*10+9 = 123456789.
Tests recursive sift_up + sift_down, in-place array swap,
parent/lchild/rchild index arithmetic, combined push/pop session
with refs.
152 baseline programs total.
Polynomial rolling hash mod 1000003 with base 257:
- precompute base^(m-1)
- slide window updating hash in O(1) per step
- verify hash match with O(m) memcmp to skip false positives
rolling_match "abcabcabcabcabcabc" "abc" = 6
Six non-overlapping copies of "abc" at positions 0,3,6,9,12,15.
Tests `for _ = 0 to m - 2 do … done` unused loop variable
(uses underscore wildcard pattern), Char.code arithmetic, mod
arithmetic with intermediate negative subtractions, complex nested
if/begin branching with inner break-via-flag.
151 baseline programs total.
Classic CLRS Huffman code example. ADT:
type tree = Leaf of int * char | Node of int * tree * tree
Build by repeatedly merging two lightest trees (sorted-list pq):
let rec build_tree lst = match lst with
| [t] -> t
| a :: b :: rest ->
let merged = Node (weight a + weight b, a, b) in
build_tree (insert merged rest)
weighted path length (= total Huffman bits):
leaves {(5,a) (9,b) (12,c) (13,d) (16,e) (45,f)} -> 224
Tests sum-typed ADT with mixed arities, `function` keyword
pattern matching, recursive sorted insert, depth-counting recursion.
150 baseline programs total.
Previously `a.(i) <- if c then x else y` failed with
"unexpected token keyword if" because parse-binop-rhs called
parse-prefix for the rhs, which doesn't accept if/match/let/fun.
Real OCaml allows full expressions on the rhs of <-/:=. Fix:
special-case prec-1 ops in parse-binop-rhs to call parse-expr-no-seq
instead of parse-prefix. The recursive parse-binop-rhs with
min-prec restored after picks up any further chained <- (since both
ops are right-associative with no higher-prec binops above them).
Manacher baseline updated to use bare `if` on rhs of <-,
removing the parens workaround from iter 235. 607/607 regressions
remain clean.
Manacher's algorithm: insert # separators (length 2n+1) to unify
odd/even cases, then maintain palindrome radii p[] alongside a
running (center, right) pair to skip work via mirror reflection.
Linear time.
manacher "babadaba" = 7 (* witness: "abadaba", positions 1..7 *)
Note: requires parenthesizing the if-expression on the rhs of <-:
p.(i) <- (if pm < v then pm else v)
Real OCaml parses bare `if` at <-rhs since the rhs is at expr
level; our parser places <-rhs at binop level which doesn't include
`if` / `match` / `let`. Workaround until we relax the binop
RHS grammar.
149 baseline programs total.
Floyd-Warshall all-pairs shortest path with triple-nested for-loop:
for k = 0 to n - 1 do
for i = 0 to n - 1 do
for j = 0 to n - 1 do
if d.(i).(k) + d.(k).(j) < d.(i).(j) then
d.(i).(j) <- d.(i).(k) + d.(k).(j)
done
done
done
Graph (4 nodes, directed):
0->1 weight 5, 0->3 weight 10, 1->2 weight 3, 2->3 weight 1
Direct edge 0->3 = 10, but path 0->1->2->3 = 5+3+1 = 9.
Tests 2D array via Array.init with closure, nested .(i).(j) read
+ write, triple-nested for, in-place mutation under aliasing.
148 baseline programs total.
Modified merge sort that counts inversions during the merge step:
when an element from the right half is selected, the remaining
elements of the left half (mid - i + 1) all form inversions with
that right element.
count_inv [|8; 4; 2; 1; 3; 5; 7; 6|] = 12
Inversions of [8;4;2;1;3;5;7;6]:
with 8: (8,4)(8,2)(8,1)(8,3)(8,5)(8,7)(8,6) = 7
with 4: (4,2)(4,1)(4,3) = 3
with 2: (2,1) = 1
with 7: (7,6) = 1
total = 12
Tests: let rec ... and ... mutual recursion, while + ref + array
mutation, in-place sort with auxiliary scratch array.
145 baseline programs total.
Kahn's algorithm BFS topological sort:
let topo_sort n adj =
let in_deg = Array.make n 0 in
for i = 0 to n - 1 do
List.iter (fun j -> in_deg.(j) <- in_deg.(j) + 1) adj.(i)
done;
let q = Queue.create () in
for i = 0 to n - 1 do
if in_deg.(i) = 0 then Queue.push i q
done;
let count = ref 0 in
while not (Queue.is_empty q) do
let u = Queue.pop q in
count := !count + 1;
List.iter (fun v ->
in_deg.(v) <- in_deg.(v) - 1;
if in_deg.(v) = 0 then Queue.push v q
) adj.(u)
done;
!count
Graph: 0->{1,2}; 1->{3}; 2->{3,4}; 3->{5}; 4->{5}; 5.
Acyclic, so all 6 nodes can be ordered.
Tests Queue.{create,push,pop,is_empty}, mutable array via closure.
144 baseline programs total.
Standard 1D 0/1 knapsack DP with reverse inner loop:
let knapsack values weights cap =
let n = Array.length values in
let dp = Array.make (cap + 1) 0 in
for i = 0 to n - 1 do
let v = values.(i) and w = weights.(i) in
for c = cap downto w do
let take = dp.(c - w) + v in
if take > dp.(c) then dp.(c) <- take
done
done;
dp.(cap)
values: [|6; 10; 12; 15; 20|]
weights: [|1; 2; 3; 4; 5|]
knapsack v w 8 = 36 (* take items with weights 1, 2, 5 *)
Tests for-downto + array literal access in the same hot loop.
143 baseline programs total.
Classic 2D DP for longest common subsequence, optimized to use
two rolling 1D arrays (prev / curr) for O(min(m,n)) space:
for i = 1 to m do
for j = 1 to n do
if s1.[i-1] = s2.[j-1] then curr.(j) <- prev.(j-1) + 1
else if prev.(j) >= curr.(j-1) then curr.(j) <- prev.(j)
else curr.(j) <- curr.(j-1)
done;
for j = 0 to n do prev.(j) <- curr.(j) done
done;
prev.(n)
lcs "ABCBDAB" "BDCAB" = 4
Two valid LCS witnesses: BCAB and BDAB.
Avoids Array.make_matrix (not in our runtime) by manual rolling.
142 baseline programs total.
Disjoint-set union with path compression:
let make_uf n = Array.init n (fun i -> i)
let rec find p x =
if p.(x) = x then x
else begin let r = find p p.(x) in p.(x) <- r; r end
let union p x y =
let rx = find p x in let ry = find p y in
if rx <> ry then p.(rx) <- ry
After unioning (0,1), (2,3), (4,5), (6,7), (0,2), (4,6):
{0,1,2,3} {4,5,6,7} {8} {9} --> 4 components.
Tests Array.init with closure, recursive find, in-place .(i)<-r.
139 baseline programs total.
Hoare quickselect with Lomuto partition: recursively narrows the
range to whichever side contains the kth index. Mutates the array
in place via .(i)<-v. The median (k=4) of [7;2;9;1;5;6;3;8;4] is 5.
let rec quickselect arr lo hi k =
if lo = hi then arr.(lo)
else begin
let pivot = arr.(hi) in
let i = ref lo in
for j = lo to hi - 1 do
if arr.(j) < pivot then begin
let t = arr.(!i) in
arr.(!i) <- arr.(j); arr.(j) <- t;
i := !i + 1
end
done;
...
end
Exercises array literal syntax + in-place mutation in the same
program, ensuring [|...|] yields a mutable backing.
138 baseline programs total.
Added parser support for OCaml array literal syntax:
[| e1; e2; ...; en |] --> Array.of_list [e1; e2; ...; en]
[||] --> Array.of_list []
Desugaring keeps the array representation unchanged (ref-of-list)
since Array.of_list is a no-op constructor for that backing.
Tokenizer emits [, |, |, ] as separate ops; parser detects [ followed
by | and enters array-literal mode, terminating on |].
Baseline lis.ml exercises the syntax:
let lis arr =
let n = Array.length arr in
let dp = Array.make n 1 in
for i = 1 to n - 1 do
for j = 0 to i - 1 do
if arr.(j) < arr.(i) && dp.(j) + 1 > dp.(i) then
dp.(i) <- dp.(j) + 1
done
done;
let best = ref 0 in
for i = 0 to n - 1 do
if dp.(i) > !best then best := dp.(i)
done;
!best
lis [|10; 22; 9; 33; 21; 50; 41; 60; 80|] = 6
137 baseline programs total.
Classic Josephus problem solved with the standard recurrence:
let rec josephus n k =
if n = 1 then 0
else (josephus (n - 1) k + k) mod n
josephus 50 3 + 1 = 11
50 people stand in a circle, every 3rd is eliminated; the last
survivor is at position 11 (1-indexed). Tests recursion + mod.
136 baseline programs total.
Counts integer partitions via classic DP:
let partition_count n =
let dp = Array.make (n + 1) 0 in
dp.(0) <- 1;
for k = 1 to n do
for i = k to n do
dp.(i) <- dp.(i) + dp.(i - k)
done
done;
dp.(n)
partition_count 15 = 176
Tests Array.make, .(i)<-/.(i) array access, nested for-loops, refs.
135 baseline programs total.
Uses Euclid's formula: for coprime m > k of opposite parity, the
triple (m^2 - k^2, 2mk, m^2 + k^2) is a primitive Pythagorean.
let count_primitive_triples n =
let c = ref 0 in
for m = 2 to 50 do
let kk = ref 1 in
while !kk < m do
if (m - !kk) mod 2 = 1 && gcd m !kk = 1 then begin
let h = m * m + !kk * !kk in
if h <= n then c := !c + 1
end;
kk := !kk + 1
done
done;
!c
count_primitive_triples 100 = 16
The 16 triples include the classics (3,4,5), (5,12,13), (8,15,17),
(7,24,25), and end with (65,72,97).
134 baseline programs total.
A Harshad (or Niven) number is divisible by its digit sum:
let count_harshad limit =
let c = ref 0 in
for n = 1 to limit do
if n mod (digit_sum n) = 0 then c := !c + 1
done;
!c
count_harshad 100 = 33
All single-digit numbers (1..9) qualify trivially. Plus 10, 12, 18,
20, 21, 24, 27, 30, 36, 40, 42, 45, 48, 50, 54, 60, 63, 70, 72, 80,
81, 84, 90, 100 (24 more) = 33 total under 100.
133 baseline programs total.
Walks digits via mod 10 / div 10, accumulating the reversed value:
let reverse_int n =
let m = ref n in
let r = ref 0 in
while !m > 0 do
r := !r * 10 + !m mod 10;
m := !m / 10
done;
!r
reverse 12345 + reverse 100 + reverse 7
= 54321 + 1 + 7
= 54329
Trailing zeros collapse (reverse 100 = 1, not 001).
132 baseline programs total.
Walks the pin-knockdown list applying strike/spare bonuses through
a 10-frame counter:
strike (10): score 10 + next 2 throws, advance i+1
spare (a + b = 10): score 10 + next 1 throw, advance i+2
open (a + b < 10): score a + b, advance i+2
Frame ten special-cases are handled implicitly: the input includes
bonus throws naturally and the while-loop terminates after frame 10.
bowling_score [10; 7; 3; 9; 0; 10; 0; 8; 8; 2; 0; 6;
10; 10; 10; 8; 1]
= 20+19+9+18+8+10+6+30+28+19
= 167
131 baseline programs total.
Single-helper tail-recursive loop threading an accumulator:
let factorial n =
let rec go n acc =
if n <= 1 then acc
else go (n - 1) (n * acc)
in
go n 1
factorial 12 = 479_001_600
Companion to factorial.ml (10! = 3628800 via doubly-recursive
style); same answer-shape, different evaluator stress: this version
has constant stack depth.
130 baseline programs total — milestone.
safe_div returns None on division by zero; safe_chain stitches two
divisions, propagating None on either failure:
let safe_div a b =
if b = 0 then None else Some (a / b)
let safe_chain a b c =
match safe_div a b with
| None -> None
| Some q -> safe_div q c
Test:
safe_chain 100 2 5 = Some 10
safe_chain 100 0 5 = None -> -1
safe_chain 50 5 0 = None -> -1
safe_chain 1000 10 5 = Some 20
10 - 1 - 1 + 20 = 28
Tests Option chaining + match-on-result with sentinel default.
Demonstrates the canonical 'fail-early on None' pattern.
129 baseline programs total.
19-arm match returning the English word for each number 1..19, then
sum String.length:
let number_to_words n =
match n with
| 1 -> 'one' | 2 -> 'two' | ... | 19 -> 'nineteen'
| _ -> ''
total_letters 19 = 36 + 70 = 106
(1-9) (10-19)
Real PE17 covers 1..1000 (answer 21124) but needs more elaborate
number-to-words logic (compounds, 'and', 'thousand'). 1..19 keeps
the program small while exercising literal-pattern match dispatch
on many arms.
128 baseline programs total.
let palindrome_sum lo hi =
let total = ref 0 in
for n = lo to hi do
if is_pal n then total := !total + n
done;
!total
palindrome_sum 100 999 = 49500
There are 90 three-digit palindromes (form aba; 9 choices for a, 10
for b). Average value 550, sum 49500.
Companion to palindrome.ml (predicate-only) and paren_depth.ml.
127 baseline programs total.
PE12 with target = 10:
let count_divisors n =
let c = ref 0 in
let i = ref 1 in
while !i * !i <= n do
if n mod !i = 0 then begin
c := !c + 1;
if !i * !i <> n then c := !c + 1
end;
i := !i + 1
done;
!c
let first_triangle_with_divs target =
walk triangles T(n) = T(n-1) + n until count_divisors T > target
T(15) = 120 has 16 divisors — first to exceed 10. Real PE12 uses
target 500 (answer 76576500); 10 stays well under budget.
126 baseline programs total.
Perfect numbers = those where the proper-divisor sum equals n. Three
exist under 500: 6, 28, 496. (8128 is the next; 33550336 the one
after that.)
Same div_sum machinery as euler21_small.ml / abundant.ml (the
trial-division up to sqrt-n).
Original 10000 limit timed out at 10 minutes under contention (496
itself takes thousands of trials at the inner loop). 500 stays under
budget while still finding all three small perfects.
125 baseline programs total — milestone.
A number n is abundant if its proper-divisor sum exceeds n. Reuses
the trial-division div_sum helper:
let count_abundant n =
let c = ref 0 in
for i = 12 to n - 1 do
if div_sum i > i then c := !c + 1
done;
!c
count_abundant 100 = 21
Abundant numbers under 100, starting at 12, 18, 20, 24, 30, 36, 40,
42, 48, 54, 56, 60, 66, 70, 72, 78, 80, 84, 88, 90, 96 -> 21.
Companion to euler21_small.ml (amicable). The classification:
perfect: d(n) = n (e.g. 6, 28)
abundant: d(n) > n (e.g. 12, 18)
deficient:d(n) < n (everything else)
124 baseline programs total.
Numbers that read the same in base 10 and base 2:
1, 3, 5, 7, 9, 33, 99, 313, 585, 717
sum = 1772
Implementation:
pal_dec n check decimal palindrome via index walk
to_binary n build binary string via mod 2 / div 2 stack
pal_bin n check binary palindrome
euler36 limit scan 1..limit-1, sum where both palindromes
Real PE36 uses 10^6 (answer 872187). 1000 takes ~9 minutes on
contended host but stays within reasonable budget for the
spec-level evaluator.
123 baseline programs total.
Build the Champernowne string '12345678910111213...' until at
least 1500 chars; product of digits at positions 1, 10, 100, 1000
is 1 * 1 * 5 * 3 = 15.
Initial implementation timed out: 'String.length (Buffer.contents
buf) < 1500' rebuilt the full string each iteration (O(n^2) in our
spec-level evaluator). Fixed by tracking length separately from
the Buffer:
let len = ref 0 in
while !len < 1500 do
let s = string_of_int !i in
Buffer.add_string buf s;
len := !len + String.length s;
i := !i + 1
done
Real PE40 uses positions up to 10^6 (answer 210); 1000 keeps under
budget while exercising the same string-build + char-pick pattern.
122 baseline programs total.
Number that equals the sum of factorials of its digits:
145 = 1! + 4! + 5! = 1 + 24 + 120
Implementation:
fact n iterative factorial
digit_fact_sum n walk digits, sum fact(digit)
euler34 limit scan 3..limit, accumulate matches
The only other factorion is 40585 = 4!+0!+5!+8!+5!. Real PE34 sums
both (= 40730); 2000 keeps under our search budget.
121 baseline programs total.
Compute every power a^b for a, b in [2..N] and count distinct
values. Hashtbl as a set with unit-payload (iter-168 idiom):
let euler29 n =
let h = Hashtbl.create 64 in
for a = 2 to n do
for b = 2 to n do
let p = ref 1 in
for _ = 1 to b do p := !p * a done;
Hashtbl.replace h !p ()
done
done;
Hashtbl.length h
For N=5: 16 powers minus one duplicate (4^2 = 2^4 = 16) -> 15.
Real PE29 uses N=100 (answer 9183).
120 baseline programs total — milestone.
div_sum computes proper divisor sum via trial division up to sqrt(n):
let div_sum n =
let s = ref 1 in
let i = ref 2 in
while !i * !i <= n do
if n mod !i = 0 then begin
s := !s + !i;
let q = n / !i in
if q <> !i then s := !s + q
end;
i := !i + 1
done;
if n = 1 then 0 else !s
Outer loop finds amicable pairs (a, b) with d(a) = b, d(b) = a,
a != b. Only pair under 300 is (220, 284); 220 + 284 = 504.
Real PE21 uses 10000 (answer 31626). 300 keeps the run under
budget while exercising the same divisor-sum trick.
119 baseline programs total.
Numbers equal to the sum of cubes of their digits:
153 = 1 + 125 + 27
370 = 27 + 343 + 0
371 = 27 + 343 + 1
407 = 64 + 0 + 343
sum = 1301
Implementation:
pow_digit_sum n p walk digits of n, accumulate d^p
euler30 p limit scan 2..limit and sum where pow_digit_sum n p = n
Real PE30 uses 5th powers (answer 443839); the cube version
exercises the same algorithm in a smaller search space.
118 baseline programs total.
For each layer 1..(n-1)/2, the four corners of an Ulam spiral are
spaced 2*layer apart. Step k four times per layer, accumulate:
let euler28 n =
let s = ref 1 in
let k = ref 1 in
for layer = 1 to (n - 1) / 2 do
let step = 2 * layer in
for _ = 1 to 4 do
k := !k + step;
s := !s + !k
done
done;
!s
euler28 7 = 1 + (3+5+7+9) + (13+17+21+25) + (31+37+43+49) = 261
Real PE28 uses 1001x1001 (answer 669171001); 7x7 is fast.
117 baseline programs total.
collatz_len walks n through n/2 if even, 3n+1 if odd, counting
steps. Outer loop scans 2..N tracking the best length and arg-best:
let euler14 limit =
let best = ref 0 in
let best_n = ref 0 in
for n = 2 to limit do
let l = collatz_len n in
if l > !best then begin
best := l;
best_n := n
end
done;
!best_n
euler14 100 = 97 (97 generates a 118-step chain)
Real PE14 uses limit = 1_000_000 (answer 837799); 100 exercises the
same algorithm in <2 minutes on our contended host.
116 baseline programs total.
Computes 2^n via for-loop multiplication, then walks the digits via
mod 10 / div 10:
let euler16 n =
let p = ref 1 in
for _ = 1 to n do p := !p * 2 done;
let sum = ref 0 in
let m = ref !p in
while !m > 0 do
sum := !sum + !m mod 10;
m := !m / 10
done;
!sum
euler16 15 = 3 + 2 + 7 + 6 + 8 = 26 (= digit sum of 32768)
Real PE16 asks for 2^1000 which exceeds float precision; 2^15 stays
safe and exercises the same digit-decomposition pattern.
115 baseline programs total.
Iteratively grows two refs while the larger is below 10^(n-1),
counting iterations:
let euler25 n =
let a = ref 1 in
let b = ref 1 in
let i = ref 2 in
let target = ref 1 in
for _ = 1 to n - 1 do target := !target * 10 done;
while !b < !target do
let c = !a + !b in
a := !b;
b := c;
i := !i + 1
done;
!i
euler25 12 = 55 (F(55) = 139_583_862_445, 12 digits)
Real PE25 asks for 1000 digits (answer 4782); 12 keeps within
safe-int while exercising the identical algorithm.
114 baseline programs total — 200 iterations landed.
PE3's worked example. Trial-division streaming: when the current
factor divides m, divide and update largest; otherwise bump factor:
let largest_prime_factor n =
let m = ref n in
let factor = ref 2 in
let largest = ref 0 in
while !m > 1 do
if !m mod !factor = 0 then begin
largest := !factor;
m := !m / !factor
end else factor := !factor + 1
done;
!largest
largest_prime_factor 13195 = 29 (= 5 * 7 * 13 * 29)
The full PE3 number 600851475143 exceeds JS safe-int (2^53 ≈ 9e15
in float terms; 6e11 is fine but the intermediate 'i mod !factor'
on the way to 6857 can overflow precision). 13195 keeps the program
portable across hosts.
113 baseline programs total.
Scaled-down PE7 (real version asks for the 10001st prime = 104743).
Trial-division within an outer while loop searching forward from 2,
short-circuited via bool ref:
let nth_prime n =
let count = ref 0 in
let i = ref 1 in
let result = ref 0 in
while !count < n do
i := !i + 1;
let p = ref true in
let j = ref 2 in
while !j * !j <= !i && !p do
if !i mod !j = 0 then p := false;
j := !j + 1
done;
if !p then begin
count := !count + 1;
if !count = n then result := !i
end
done;
!result
nth_prime 100 = 541
100 keeps the run under our 3-minute budget while exercising the
same algorithm.
112 baseline programs total.
Scaled-down Project Euler #4. Real version uses 3-digit numbers
yielding 906609 = 913 * 993; that's an 810k-iteration nested loop
that times out under our contended-host spec-level evaluator.
The 2-digit version (10..99) is fast enough and tests the same
algorithm:
9009 = 91 * 99 (the only 2-digit-product palindrome > 9000)
Implementation:
is_pal n index-walk comparing s.[i] to s.[len-1-i]
euler4 lo hi nested for with running max + early-skip via
'p > !m && is_pal p' short-circuit
111 baseline programs total.
Sieve of Eratosthenes followed by a sum loop:
let sieve_sum n =
let s = Array.make (n + 1) true in
s.(0) <- false;
s.(1) <- false;
for i = 2 to n do
if s.(i) then begin
let j = ref (i * i) in
while !j <= n do
s.(!j) <- false;
j := !j + i
done
end
done;
let total = ref 0 in
for i = 2 to n do
if s.(i) then total := !total + i
done;
!total
Real PE10 asks for sum below 2,000,000; that's a ~2-3 second loop in
native OCaml but minutes-to-hours under our contended-host
spec-level evaluator. 100 keeps the run under 3 minutes while still
exercising the same algorithm.
110 baseline programs total.
Iteratively takes lcm of running result with i:
let rec gcd a b = if b = 0 then a else gcd b (a mod b)
let lcm a b = a * b / gcd a b
let euler5 n =
let r = ref 1 in
for i = 2 to n do
r := lcm !r i
done;
!r
euler5 20 = 232792560
= 2^4 * 3^2 * 5 * 7 * 11 * 13 * 17 * 19
Tests gcd_lcm composition (iter 140) on a fresh problem.
109 baseline programs total.
Two ref lists accumulating in reverse, then List.rev'd — preserves
original order:
let partition pred xs =
let yes = ref [] in
let no = ref [] in
List.iter (fun x ->
if pred x then yes := x :: !yes
else no := x :: !no
) xs;
(List.rev !yes, List.rev !no)
partition (fun x -> x mod 2 = 0) [1..10]
-> ([2;4;6;8;10], [1;3;5;7;9])
evens sum * 100 + odds sum = 30 * 100 + 25 = 3025
Tests higher-order predicate, tuple return, and iter-98 let-tuple
destructuring on the call site.
108 baseline programs total.
Trial division up to sqrt(n) with early-exit via bool ref:
let is_prime n =
if n < 2 then false
else
let p = ref true in
let i = ref 2 in
while !i * !i <= n && !p do
if n mod !i = 0 then p := false;
i := !i + 1
done;
!p
Outer count_primes loops 2..n calling is_prime, accumulating count.
Returns 25 — the canonical prime-counting function pi(100).
107 baseline programs total.
DP recurrence:
C(0) = 1
C(n) = sum_{j=0}^{n-1} C(j) * C(n-1-j)
let catalan n =
let dp = Array.make (n + 1) 0 in
dp.(0) <- 1;
for i = 1 to n do
for j = 0 to i - 1 do
dp.(i) <- dp.(i) + dp.(j) * dp.(i - 1 - j)
done
done;
dp.(n)
C(5) = 42 — also the count of distinct binary trees with 5 internal
nodes, balanced paren strings of length 10, monotonic lattice paths,
etc.
106 baseline programs total.
Two functions:
classify n maps i to a polymorphic variant
FizzBuzz | Fizz | Buzz | Num of int
score x pattern-matches the variant to a weight
FizzBuzz=100, Fizz=10, Buzz=5, Num n=n
For i in 1..30:
FizzBuzz at 15, 30: 2 * 100 = 200
Fizz at 3,6,9,12,18,21,24,27: 8 * 10 = 80
Buzz at 5,10,20,25: 4 * 5 = 20
Num: rest (16 numbers) = 240
total = 540
Exercises polymorphic-variant match (iter 87) including a
payload-bearing 'Num n' arm.
105 baseline programs total.
Sort, then compare two candidates:
p1 = product of three largest values
p2 = product of two smallest (potentially negative) values and the largest
For [-10;-10;1;3;2]:
sorted = [-10;-10;1;2;3]
p1 = 3 * 2 * 1 = 6
p2 = (-10) * (-10) * 3 = 300
max = 300
Tests List.sort + Array.of_list + arr.(n-i) end-walk + candidate-pick
via if-then-else.
104 baseline programs total.
Find the unique Pythagorean triple with a + b + c = 1000 and
return their product.
The naive triple loop timed out under host contention (10-minute
cap exceeded with ~333 * 999 ~= 333k inner iterations of complex
checks). Rewritten with algebraic reduction:
a + b + c = 1000 AND a^2 + b^2 = c^2
=> b = (500000 - 1000 * a) / (1000 - a)
so only the outer a-loop is needed (333 iterations). Single-pass
form:
for a = 1 to 333 do
let num = 500000 - 1000 * a in
let den = 1000 - a in
if num mod den = 0 then begin
let b = num / den in
if b > a then
let c = 1000 - a - b in
if c > b then result := a * b * c
end
done
Triple (200, 375, 425), product 31875000.
103 baseline programs total.
Project Euler #6: difference between square of sum and sum of squares
for 1..100.
let euler6 n =
let sum = ref 0 in
let sum_sq = ref 0 in
for i = 1 to n do
sum := !sum + i;
sum_sq := !sum_sq + i * i
done;
!sum * !sum - !sum_sq
euler6 100 = 5050^2 - 338350 = 25502500 - 338350 = 25164150
102 baseline programs total.
Sum of even-valued Fibonacci numbers up to 4,000,000:
let euler2 limit =
let a = ref 1 in
let b = ref 2 in
let sum = ref 0 in
while !a <= limit do
if !a mod 2 = 0 then sum := !sum + !a;
let c = !a + !b in
a := !b;
b := c
done;
!sum
Sequence: 1, 2, 3, 5, 8, 13, 21, 34, ... Only every third term
(2, 8, 34, 144, ...) is even. Sum below 4M: 4613732.
101 baseline programs total.
Project Euler #1: sum of all multiples of 3 or 5 below 1000.
let euler1 limit =
let sum = ref 0 in
for i = 1 to limit - 1 do
if i mod 3 = 0 || i mod 5 = 0 then sum := !sum + i
done;
!sum
euler1 1000 = 233168
Trivial DSL exercise but symbolically meaningful: this is the 100th
baseline program.
100 baseline programs total — milestone.
canonical builds a sorted-by-frequency string representation:
let canonical s =
let chars = Array.make 26 0 in
for i = 0 to String.length s - 1 do
let k = Char.code s.[i] - Char.code 'a' in
if k >= 0 && k < 26 then chars.(k) <- chars.(k) + 1
done;
expand into a-z order via a Buffer
For 'eat', 'tea', 'ate' -> all canonicalise to 'aet'. For 'tan',
'nat' -> 'ant'. For 'bat' -> 'abt'.
group_anagrams folds the input, accumulating per-key string lists;
final answer is Hashtbl.length (number of distinct groups):
['eat'; 'tea'; 'tan'; 'ate'; 'nat'; 'bat'] -> 3 groups
99 baseline programs total.
Tracks two bool refs (inc, dec). For each pair of consecutive
elements: if h < prev clear inc, if h > prev clear dec. Returns
inc OR dec at the end:
let is_monotonic xs =
match xs with
| [] -> true
| [_] -> true
| _ ->
let inc = ref true in
let dec = ref true in
let rec walk prev rest = ... in
(match xs with h :: t -> walk h t | [] -> ());
!inc || !dec
Five test cases:
[1;2;3;4] inc only true
[4;3;2;1] dec only true
[1;2;1] neither false
[5;5;5] both (constant) true
[] empty true (vacuous)
sum = 4
98 baseline programs total.
O(n) time / O(1) space majority vote algorithm:
let majority xs =
let cand = ref 0 in
let count = ref 0 in
List.iter (fun x ->
if !count = 0 then begin
cand := x;
count := 1
end else if x = !cand then count := !count + 1
else count := !count - 1
) xs;
!cand
The candidate is updated to the current element whenever count
reaches zero. When a strict majority exists, this guarantees the
result.
majority [3;3;4;2;4;4;2;4;4] = 4 (5 of 9, > n/2)
97 baseline programs total.
Two running sums modulo 65521:
a = (1 + sum of bytes) mod 65521
b = sum of running 'a' values mod 65521
checksum = b * 65536 + a
let adler32 s =
let a = ref 1 in
let b = ref 0 in
let m = 65521 in
for i = 0 to String.length s - 1 do
a := (!a + Char.code s.[i]) mod m;
b := (!b + !a) mod m
done;
!b * 65536 + !a
For 'Wikipedia': 0x11E60398 = 300286872 (the canonical test value).
Tests for-loop accumulating two refs together, modular arithmetic,
and Char.code on s.[i].
96 baseline programs total.
Single-formula generation:
gray[i] = i lxor (i lsr 1)
For n = 4, generates 16 values, each differing from its neighbour
by one bit. Output is a permutation of 0..15, so its sum equals the
natural-sequence sum 120; +16 entries -> 136.
Tests lsl / lxor / lsr together (the iter-127 bitwise ops) plus
Array.make / Array.fold_left.
95 baseline programs total.
Walks list keeping a previous-value reference; increments cur on
match, resets to 1 otherwise. Uses 'Some y when y = x' guard pattern
in match for the prev-value comparison:
let max_run xs =
let max_so_far = ref 0 in
let cur = ref 0 in
let last = ref None in
List.iter (fun x ->
(match !last with
| Some y when y = x -> cur := !cur + 1
| _ -> cur := 1);
last := Some x;
if !cur > !max_so_far then max_so_far := !cur
) xs;
!max_so_far
Three test cases:
[1;1;2;2;2;2;3;3;1;1;1] max run = 4 (the 2's)
[1;2;3;4;5] max run = 1
[] max run = 0
sum = 5
Tests 'when' guard pattern in match arm + Option ref + ref-mutation
sequence inside List.iter closure body.
94 baseline programs total.
Two-pointer walk:
let is_subseq s t =
let i = ref 0 in
let j = ref 0 in
while !i < n && !j < m do
if s.[!i] = t.[!j] then i := !i + 1;
j := !j + 1
done;
!i = n
advance i only on match; always advance j. Pattern matches if i
reaches n.
Five test cases:
'abc' in 'ahbgdc' yes
'axc' in 'ahbgdc' no (no x in t)
'' in 'anything' yes (empty trivially)
'abc' in 'abc' yes
'abcd' in 'abc' no (s longer)
sum = 3
93 baseline programs total.
Sort intervals by start, then sweep maintaining a current (cs, ce)
window — extend ce if next start <= ce, else push current and start
fresh:
let merge_intervals xs =
let sorted = List.sort (fun (a, _) (b, _) -> a - b) xs in
let rec aux acc cur xs =
match xs with
| [] -> List.rev (cur :: acc)
| (s, e) :: rest ->
let (cs, ce) = cur in
if s <= ce then aux acc (cs, max e ce) rest
else aux (cur :: acc) (s, e) rest
in
match sorted with
| [] -> []
| h :: rest -> aux [] h rest
[(1,3);(2,6);(8,10);(15,18);(5,9)]
-> [(1,10); (15,18)]
total length = 9 + 3 = 12
Tests List.sort with custom comparator using tuple patterns, plus
tuple destructuring in lambda + let-tuple from accumulator + match
arms.
92 baseline programs total.
Counts position-wise differences between two strings of equal
length; returns -1 sentinel for length mismatch:
let hamming s t =
if String.length s <> String.length t then -1
else
let d = ref 0 in
for i = 0 to String.length s - 1 do
if s.[i] <> t.[i] then d := !d + 1
done;
!d
Three test cases:
'karolin' vs 'kathrin' 3 (positions 2,3,4)
'1011101' vs '1001001' 2 (positions 2,4)
'abc' vs 'abcd' -1 (length mismatch)
sum 4
91 baseline programs total.
For each character, XOR with the corresponding key char (key cycled
via 'i mod kn'):
let xor_cipher key text =
let buf = Buffer.create n in
for i = 0 to n - 1 do
let c = Char.code text.[i] in
let k = Char.code key.[i mod kn] in
Buffer.add_string buf (String.make 1 (Char.chr (c lxor k)))
done;
Buffer.contents buf
XOR is its own inverse, so encrypt + decrypt with the same key yields
the original. Test combines:
- String.length decoded = 6
- decoded = 'Hello!' -> 1
- 6 * 100 + 1 = 601
Tests Char.code + Char.chr round-trip, the iter-127 lxor operator,
Buffer.add_string + String.make 1, and key-cycling via mod.
90 baseline programs total.
Composite Simpson's 1/3 rule with 100 panels:
let simpson f a b n =
let h = (b -. a) /. float_of_int n in
let sum = ref (f a +. f b) in
for i = 1 to n - 1 do
let x = a +. float_of_int i *. h in
let coef = if i mod 2 = 0 then 2.0 else 4.0 in
sum := !sum +. coef *. f x
done;
h *. !sum /. 3.0
The 1-4-2-4-...-4-1 coefficient pattern is implemented via even/odd
index dispatch. Endpoints get coefficient 1.
For x^2 over [0, 1], exact value is 1/3 ~= 0.33333. Scaled by 30000
gives 9999.99..., int_of_float -> 10000.
Tests higher-order function (passing the integrand 'fun x -> x *. x'),
float arithmetic in for-loop, and float_of_int for index->x conversion.
89 baseline programs total.
Newton's method on integers, converging when y >= x:
let isqrt n =
if n < 2 then n
else
let x = ref n in
let y = ref ((!x + 1) / 2) in
while !y < !x do
x := !y;
y := (!x + n / !x) / 2
done;
!x
Test cases:
isqrt 144 = 12 (perfect square)
isqrt 200 = 14 (floor of sqrt(200) ~= 14.14)
isqrt 1000000 = 1000
isqrt 2 = 1
sum = 1027
Companion to newton_sqrt.ml (iter 124, float Newton). Tests integer
division semantics from iter 94 and a while-until-convergence loop.
88 baseline programs total.
Uses the identities:
F(2k) = F(k) * (2 * F(k+1) - F(k))
F(2k+1) = F(k)^2 + F(k+1)^2
to compute Fibonacci in O(log n) recursive depth instead of O(n).
let rec fib_pair n =
if n = 0 then (0, 1)
else
let (a, b) = fib_pair (n / 2) in
let c = a * (2 * b - a) in
let d = a * a + b * b in
if n mod 2 = 0 then (c, d)
else (d, c + d)
Each call returns the pair (F(n), F(n+1)). fib(40) = 102334155 fits
in JS safe-int (< 2^53). Tests tuple returns with let-tuple
destructuring + recursion on n / 2.
86 baseline programs total.
Standard two-finger merge with nested match-in-match:
let rec merge xs ys =
match xs with
| [] -> ys
| x :: xs' ->
match ys with
| [] -> xs
| y :: ys' ->
if x <= y then x :: merge xs' (y :: ys')
else y :: merge (x :: xs') ys'
Used as a building block in merge_sort.ml (iter 104) but called out
as its own baseline here.
merge [1;4;7;10] [2;3;5;8;9] = [1;2;3;4;5;7;8;9;10]
length 9, sum 49, product 441.
85 baseline programs total.
Fast exponentiation by squaring with modular reduction:
let rec pow_mod base exp m =
if exp = 0 then 1
else if exp mod 2 = 0 then
let half = pow_mod base (exp / 2) m in
(half * half) mod m
else
(base * pow_mod base (exp - 1) m) mod m
Even exponent halves and squares (O(log n)); odd decrements and
multiplies. mod-reduction at each step keeps intermediates bounded.
pow_mod 2 30 1000003 + pow_mod 3 20 13 + pow_mod 5 17 100 = 738639
84 baseline programs total.
Same 'tree = Leaf | Node of int * tree * tree' ADT as iter-159
max_path_tree.ml, but the recursion ignores the value:
let rec depth t = match t with
| Leaf -> 0
| Node (_, l, r) ->
let dl = depth l in
let dr = depth r in
1 + (if dl > dr then dl else dr)
For the test tree:
1
/ 2 3
/ 4 5
/
8
longest path is 1 -> 2 -> 5 -> 8, depth = 4.
Tests wildcard pattern in constructor 'Node (_, l, r)', two nested
let-bindings in match arm, inline if-as-expression for max.
83 baseline programs total.
Walk input with Hashtbl.mem + Hashtbl.add seen x () (unit-payload
turns the table into a set); on first occurrence cons to the result
list; reverse at the end:
let stable_unique xs =
let seen = Hashtbl.create 8 in
let result = ref [] in
List.iter (fun x ->
if not (Hashtbl.mem seen x) then begin
Hashtbl.add seen x ();
result := x :: !result
end
) xs;
List.rev !result
For [3;1;4;1;5;9;2;6;5;3;5;8;9]:
result = [3;1;4;5;9;2;6;8] (input order, dupes dropped)
length = 8, sum = 38 total = 46
Tests Hashtbl as a set abstraction (unit-payload), the rev-build
idiom, and 'not (Hashtbl.mem seen x)' membership negation.
82 baseline programs total.
Inverse of run_length.ml from iteration 130. Takes a list of
(value, count) tuples and expands:
let rec rle_decode pairs =
match pairs with
| [] -> []
| (x, n) :: rest ->
let rec rep k = if k = 0 then [] else x :: rep (k - 1) in
rep n @ rle_decode rest
rle_decode [(1,3); (2,2); (3,4); (1,2)]
= [1;1;1; 2;2; 3;3;3;3; 1;1]
sum = 3 + 4 + 12 + 2 = 21.
Tests tuple-cons pattern, inner-let recursion, list concat (@), and
the 'List.fold_left (+) 0' invariant on encoding round-trips.
81 baseline programs total.
Defines unary Peano numerals with two recursive functions for
arithmetic:
type peano = Zero | Succ of peano
let rec plus a b = match a with
| Zero -> b
| Succ a' -> Succ (plus a' b)
let rec mul a b = match a with
| Zero -> Zero
| Succ a' -> plus b (mul a' b)
mul is defined inductively: mul Zero _ = Zero; mul (Succ a) b =
b + (a * b).
to_int (mul (from_int 5) (from_int 6)) = 30
The result is a Peano value with 30 nested Succ wrappers; to_int
unrolls them to a host int. Tests recursive ADT with a single-arg
constructor + four mutually-defined recursive functions (no rec/and
needed since each is defined separately).
80 baseline programs total — milestone.
Companion to coin_change.ml (min coins). Counts distinct multisets
via the unbounded-knapsack DP:
let count_ways coins target =
let dp = Array.make (target + 1) 0 in
dp.(0) <- 1;
List.iter (fun c ->
for i = c to target do
dp.(i) <- dp.(i) + dp.(i - c)
done
) coins;
dp.(target)
Outer loop over coins, inner DP relaxes dp.(i) += dp.(i - c). The
order matters — coin in outer, amount in inner — to count multisets
rather than ordered sequences.
count_ways [1; 2; 5; 10; 25] 50 = 406.
79 baseline programs total.
One-pass walk tracking current depth and a high-water mark:
let max_depth s =
let d = ref 0 in
let m = ref 0 in
for i = 0 to String.length s - 1 do
if s.[i] = '(' then begin
d := !d + 1;
if !d > !m then m := !d
end
else if s.[i] = ')' then d := !d - 1
done;
!m
Three inputs:
'((1+2)*(3-(4+5)))' 3 (innermost (4+5) at depth 3)
'(((deep)))' 3
'()()()' 1 (no nesting)
sum 7
Tests for-loop char comparison s.[i] = '(' and the high-water-mark
idiom with two refs.
78 baseline programs total.
Each pass:
1. find_max in [0..size-1]
2. if max not at the right end, flip max to position 0 (if needed)
3. flip the size-prefix to push max to the end
Inner 'flip k' reverses prefix [0..k] using two pointer refs lo/hi.
Inner 'find_max k' walks 1..k tracking the max-position.
pancake_sort [3;1;4;1;5;9;2;6]
= 9 flips * 100 + a.(0) + a.(n-1)
= 9 * 100 + 1 + 9
= 910
The output combines flip count and sorted endpoints, so the test
verifies both that the sort terminates and that it sorts correctly.
Tests two inner functions closing over the same Array, ref-based
two-pointer flip, and downto loop with conditional flip dispatch.
77 baseline programs total.
Iterative two-ref Fibonacci with modular reduction every step:
let fib_mod n m =
let a = ref 0 in
let b = ref 1 in
for _ = 1 to n do
let c = (!a + !b) mod m in
a := !b;
b := c
done;
!a
The 100th Fibonacci is 354_224_848_179_261_915_075, well past JS
safe-int (2^53). Modular reduction every step keeps intermediate
values within int53 precision so the answer is exact in our
runtime. fib(100) mod 1000003 = 391360.
76 baseline programs total.
Walks digits right-to-left, doubles every other starting from the
second-from-right; if a doubled value > 9, subtract 9. Sum must be
divisible by 10:
let luhn s =
let n = String.length s in
let total = ref 0 in
for i = 0 to n - 1 do
let d = Char.code s.[n - 1 - i] - Char.code '0' in
let v = if i mod 2 = 1 then
let dd = d * 2 in
if dd > 9 then dd - 9 else dd
else d
in
total := !total + v
done;
!total mod 10 = 0
Test cases:
'79927398713' valid
'79927398710' invalid
'4532015112830366' valid (real Visa test)
'1234567890123456' invalid
sum = 2
Tests right-to-left index walk via 'n - 1 - i', Char.code '0'
arithmetic for digit conversion, and nested if-then-else.
75 baseline programs total.
Bottom-up DP minimum-path through a triangle:
2
3 4
6 5 7
4 1 8 3
let min_path_triangle rows =
initialise dp from last row;
for r = n - 2 downto 0 do
for c = 0 to row_len - 1 do
dp.(c) <- row.(c) + min(dp.(c), dp.(c+1))
done
done;
dp.(0)
The optimal path 2 -> 3 -> 5 -> 1 sums to 11.
Tests downto loop, Array.of_list inside loop body, nested arr.(i)
reads + writes, and inline if-then-else for min.
74 baseline programs total.
Recursive ADT for binary trees:
type tree = Leaf | Node of int * tree * tree
let rec max_path t =
match t with
| Leaf -> 0
| Node (v, l, r) ->
let lp = max_path l in
let rp = max_path r in
v + (if lp > rp then lp else rp)
For the test tree:
1
/ 2 3
/ \ \
4 5 7
paths sum: 1+2+4=7, 1+2+5=8, 1+3+7=11. max = 11.
Tests 3-arg Node constructor with positional arg destructuring, two
nested let-bindings, and if-then-else as an inline expression.
73 baseline programs total.
Extended Euclidean returns a triple (gcd, x, y) such that
a*x + b*y = gcd:
let rec ext_gcd a b =
if b = 0 then (a, 1, 0)
else
let (g, x1, y1) = ext_gcd b (a mod b) in
(g, y1, x1 - (a / b) * y1)
let mod_inverse a m =
let (_, x, _) = ext_gcd a m in
((x mod m) + m) mod m
Three invariants checked:
inv(3, 11) = 4 (3*4 = 12 = 1 mod 11)
inv(5, 26) = 21 (5*21 = 105 = 1 mod 26)
inv(7, 13) = 2 (7*2 = 14 = 1 mod 13)
sum = 27
Tests recursive triple-tuple return, tuple-pattern destructuring on
let-binding (with wildcard for unused fields), and nested
let-binding inside the recursive call site.
72 baseline programs total.
Three small functions:
hist xs build a Hashtbl of count-by-value
max_value h Hashtbl.fold to find the max bin
total h Hashtbl.fold to sum all bins
For the 15-element list [1;2;3;1;4;5;1;2;6;7;1;8;9;1;0]:
total = 15
max_value = 5 (the number 1 appears 5 times)
product = 75
Companion to bag.ml (string keys) and frequency.ml (char keys) —
same Hashtbl.fold + Hashtbl.find_opt pattern, exercised on int
keys this time.
71 baseline programs total.
Standard amortising-mortgage formula:
payment = P * r * (1 + r)^n / ((1 + r)^n - 1)
where r = annual_rate / 12, n = years * 12.
let payment principal annual_rate years =
let r = annual_rate /. 12.0 in
let n = years * 12 in
let pow_r = ref 1.0 in
for _ = 1 to n do pow_r := !pow_r *. (1.0 +. r) done;
principal *. r *. !pow_r /. (!pow_r -. 1.0)
For 200,000 at 5% over 30 years: monthly payment ~= $1073.64,
int_of_float -> 1073.
Manual (1+r)^n via for-loop instead of Float.pow keeps the program
portable to any environment where pow is restricted.
Tests float arithmetic precedence, for-loop accumulation in a float
ref, int_of_float on the result.
70 baseline programs total — milestone.
One-liner that swaps the lists on every recursive call:
let rec zigzag xs ys =
match xs with
| [] -> ys
| x :: xs' -> x :: zigzag ys xs'
This works because each call emits the head of xs and recurses with
ys as the new xs and the rest of xs as the new ys.
zigzag [1;3;5;7;9] [2;4;6;8;10] = [1;2;3;4;5;6;7;8;9;10] sum = 55
Tests recursive list cons + arg-swap idiom that is concise but
non-obvious to readers expecting symmetric-handling.
68 baseline programs total.
Two utility functions:
prefix_sums xs builds Array of len n+1 such that
arr.(i) = sum of xs[0..i-1]
range_sum p lo hi = p.(hi+1) - p.(lo)
For [3;1;4;1;5;9;2;6;5;3]:
range_sum 0 4 = 14 (3+1+4+1+5)
range_sum 5 9 = 25 (9+2+6+5+3)
range_sum 2 7 = 27 (4+1+5+9+2+6)
sum = 66
Tests List.iter mutating Array indexed by a ref counter, plus the
classic prefix-sum technique for O(1) range queries.
67 baseline programs total.
Board as 9-element flat int array, 0=empty, 1=X, 2=O. Three
predicate functions:
check_row b r check_col b c check_diag b
each return the winning player's mark or 0. Main 'winner' loops
i = 0..2 calling row(i)/col(i) then check_diag, threading via a
result ref.
Test board:
X X X
. O .
. . O
X wins on row 0 -> winner returns 1.
Tests Array.of_list with row-major 'b.(r * 3 + c)' indexing,
multi-fn collaboration, and structural equality on int values.
66 baseline programs total.
Pure recursion — at each element, take it or don't:
let rec count_subsets xs target =
match xs with
| [] -> if target = 0 then 1 else 0
| x :: rest ->
count_subsets rest target
+ count_subsets rest (target - x)
For [1;2;3;4;5;6;7;8] target 10, the recursion tree has 2^8 = 256
leaves. Returns 8 — number of subsets summing to 10:
1+2+3+4, 1+2+7, 1+3+6, 1+4+5, 2+3+5, 2+8, 3+7, 4+6 = 8
Tests doubly-recursive list traversal pattern with single-arg list
+ accumulator-via-target.
65 baseline programs total.
Iterative Collatz / hailstone sequence:
let collatz_length n =
let m = ref n in
let count = ref 0 in
while !m > 1 do
if !m mod 2 = 0 then m := !m / 2
else m := 3 * !m + 1;
count := !count + 1
done;
!count
27 is the famous 'long-running' Collatz starter. Reaches a peak of
9232 mid-sequence and takes 111 steps to bottom out at 1.
64 baseline programs total.
Walks list with List.iteri, checking if target - x is already in
the hashtable; if yes, the earlier index plus current is the
answer; otherwise record the current pair.
twosum [2;7;11;15] 9 = (0, 1) 2+7
twosum [3;2;4] 6 = (1, 2) 2+4
twosum [3;3] 6 = (0, 1) 3+3
Sum of i+j over each pair: 1 + 3 + 1 = 5.
Tests Hashtbl.find_opt + add (the iter-99 cleanup), List.iteri, and
tuple destructuring on let-binding (iter 98 'let (i, j) = twosum
... in').
63 baseline programs total.
Bisection method searching for f(x) = 0 in [lo, hi] over 50
iterations:
let bisect f lo hi =
let lo = ref lo and hi = ref hi in
for _ = 1 to 50 do
let mid = (!lo +. !hi) /. 2.0 in
if f mid = 0.0 || f !lo *. f mid < 0.0 then hi := mid
else lo := mid
done;
!lo
Solving x^2 - 2 = 0 in [1, 2] via 'bisect (fun x -> x *. x -. 2.0)
1.0 2.0' converges to ~1.41421356... -> int_of_float (r *. 100) =
141.
Tests:
- higher-order function passing
- multi-let 'let lo = ref ... and hi = ref ...'
- float arithmetic
- int_of_float truncate-toward-zero (iter 117)
62 baseline programs total.
36-character digit alphabet '0..9A..Z' supports any base 2..36. Loop
divides the magnitude by base and prepends the digit:
while !m > 0 do
acc := String.make 1 digits.[!m mod base] ^ !acc;
m := !m / base
done
Special-cases n = 0 -> '0' and prepends '-' for negatives.
Test cases (length, since the strings differ in alphabet):
255 hex 'FF' 2
1024 binary '10000000000' 11
100 dec '100' 3
0 any base '0' 1
sum 17
Combines digits.[i] (string indexing) + String.make 1 ch + String
concatenation in a loop.
61 baseline programs total.
Three refs threading through a while loop:
m remaining quotient
d current divisor
result accumulator (built in reverse, List.rev at end)
while !m > 1 do
if !m mod !d = 0 then begin
result := !d :: !result;
m := !m / !d
end else
d := !d + 1
done
360 = 2^3 * 3^2 * 5 factors to [2;2;2;3;3;5], sum 17.
60 baseline programs total — milestone.
Models a bank account using a mutable record + a user exception:
type account = { mutable balance : int }
exception Insufficient
let withdraw acct amt =
if amt > acct.balance then raise Insufficient
else acct.balance <- acct.balance - amt
Sequence:
start 100
deposit 50 150
withdraw 30 120
withdraw 200 raises Insufficient
handler returns acct.balance (= 120, transaction rolled back)
Combines mutable record fields, user exception declaration,
try-with-bare-pattern, and verifies that a raise in the middle of a
sequence doesn't leave a partial mutation.
59 baseline programs total.
to_counts builds a 256-slot int array of character frequencies:
let to_counts s =
let counts = Array.make 256 0 in
for i = 0 to String.length s - 1 do
let c = Char.code s.[i] in
counts.(c) <- counts.(c) + 1
done;
counts
same_counts compares two arrays element-by-element via for loop +
bool ref. is_anagram composes them.
Four pairs:
listen ~ silent true
hello !~ world false
anagram ~ nagaram true
abc !~ abcd false (length differs)
sum 2
Exercises Array.make + arr.(i) + arr.(i) <- v + nested for loops +
Char.code + s.[i].
57 baseline programs total.
Defines a user exception with int payload:
exception Negative of int
let safe_sqrt n =
if n < 0 then raise (Negative n)
else <integer sqrt via while loop>
let try_sqrt n =
try safe_sqrt n with
| Negative x -> -x
try_sqrt 16 -> 4
try_sqrt 25 -> 5
try_sqrt -7 -> 7 (handler returns -(-7) = 7)
try_sqrt 100 -> 10
sum -> 26
Tests exception declaration with int payload, raise with carry, and
try-with arm pattern-matching the constructor with payload binding.
56 baseline programs total.
Defines a parametric tree:
type 'a tree = Leaf of 'a | Node of 'a tree list
let rec flatten t =
match t with
| Leaf x -> [x]
| Node ts -> List.concat (List.map flatten ts)
Test tree has 3 levels of nesting:
Node [Leaf 1; Node [Leaf 2; Leaf 3];
Node [Node [Leaf 4]; Leaf 5; Leaf 6];
Leaf 7]
flattens to [1;2;3;4;5;6;7] -> sum = 28.
Tests parametric ADT, mutual recursion via map+self, List.concat.
55 baseline programs total.
Two-line baseline:
let rec gcd a b = if b = 0 then a else gcd b (a mod b)
let lcm a b = a * b / gcd a b
gcd 36 48 = 12
lcm 4 6 = 12
lcm 12 18 = 36
sum = 60
Tests mod arithmetic and the integer-division fix from iteration 94
(without truncate-toward-zero, 'lcm 4 6 = 4 * 6 / 2 = 12.0' rather
than the expected 12).
54 baseline programs total.
zip walks both lists in lockstep, truncating at the shorter. unzip
uses tuple-pattern destructuring on the recursive result.
let pairs = zip [1;2;3;4] [10;20;30;40] in
let (xs, ys) = unzip pairs in
List.fold_left (+) 0 xs * List.fold_left (+) 0 ys
= 10 * 100
= 1000
Exercises:
- tuple-cons patterns in match scrutinee: 'match (xs, ys) with'
- tuple constructor in return value: '(a :: la, b :: lb)'
- the iter-98 let-tuple destructuring: 'let (la, lb) = unzip rest'
53 baseline programs total.
Recursive 4-arm match on (a, b) tuples threading a carry:
match (a, b) with
| ([], []) -> if carry = 0 then [] else [carry]
| (x :: xs, []) -> (s mod 10) :: aux xs [] (s / 10) where s = x + carry
| ([], y :: ys) -> ...
| (x :: xs, y :: ys) -> ... where s = x + y + carry
Little-endian digit lists. Three tests:
[9;9;9] + [1] = [0;0;0;1] (=1000, digit sum 1)
[5;6;7] + [8;9;1] = [3;6;9] (=963, digit sum 18)
[9;9;9;9;9;9;9;9] + [1] length 9 (carry propagates 8x)
Sum = 1 + 18 + 9 = 28.
Exercises tuple-pattern match on nested list-cons with the integer
arithmetic and carry-threading idiom typical of multi-precision
implementations.
52 baseline programs total.
Recursive ADT with three constructors (Num/Add/Mul). simp does
bottom-up rewrite using algebraic identities:
x + 0 -> x
0 + x -> x
x * 0 -> 0
0 * x -> 0
x * 1 -> x
1 * x -> x
constant folding for Num + Num and Num * Num
Uses tuple pattern in nested match: 'match (simp a, simp b) with'.
Add (Mul (Num 3, Num 5), Add (Num 0, Mul (Num 1, Num 7)))
-> simp -> Add (Num 15, Num 7)
-> eval -> 22
51 baseline programs total.
Triple-nested for loop with row-major indexing:
for i = 0 to n - 1 do
for j = 0 to n - 1 do
for k = 0 to n - 1 do
c.(i * n + j) <- c.(i * n + j) + a.(i * n + k) * b.(k * n + j)
done
done
done
For 3x3 matrices A=[[1..9]] and B=[[9..1]], the resulting C has sum
621. Tests deeply nested for loops on Array, Array.make + arr.(i) +
arr.(i) <- v + Array.fold_left.
50 baseline programs total — milestone.
Iterative binary search on a sorted int array:
let bsearch arr target =
let n = Array.length arr in
let lo = ref 0 and hi = ref (n - 1) in
let found = ref (-1) in
while !lo <= !hi && !found = -1 do
let mid = (!lo + !hi) / 2 in
if arr.(mid) = target then found := mid
else if arr.(mid) < target then lo := mid + 1
else hi := mid - 1
done;
!found
For [1;3;5;7;9;11;13;15;17;19;21]:
bsearch a 13 = 6
bsearch a 5 = 2
bsearch a 100 = -1
sum = 7
Exercises Array.of_list + arr.(i) + multi-let 'let lo = ... and
hi = ...' + while + multi-arm if/else if/else.
49 baseline programs total.
Two-pointer palindrome check:
let is_palindrome s =
let n = String.length s in
let rec check i j =
if i >= j then true
else if s.[i] <> s.[j] then false
else check (i + 1) (j - 1)
in
check 0 (n - 1)
Tests on six strings:
racecar = true
hello = false
abba = true
'' = true (vacuously, i >= j on entry)
'a' = true
'ab' = false
Sum = 4.
Uses s.[i] <> s.[j] (string-get + structural inequality), recursive
2-arg pointer advancement, and a multi-clause if/else if/else for
the three cases.
48 baseline programs total.
Bottom-up dynamic programming. dp[i] = minimum coins to make
amount i.
let dp = Array.make (target + 1) (target + 1) in (* sentinel *)
dp.(0) <- 0;
for i = 1 to target do
List.iter (fun c ->
if c <= i && dp.(i - c) + 1 < dp.(i) then
dp.(i) <- dp.(i - c) + 1
) coins
done
Sentinel 'target + 1' means impossible — any real solution uses at
most 'target' coins.
coin_change [1; 5; 10; 25] 67 = 6 (= 25+25+10+5+1+1)
Exercises Array.make + arr.(i) + arr.(i) <- v + nested
for/List.iter + guard 'c <= i'.
47 baseline programs total.
Kadane's algorithm in O(n):
let max_subarray xs =
let max_so_far = ref min_int in
let cur = ref 0 in
List.iter (fun x ->
cur := max x (!cur + x);
max_so_far := max !max_so_far !cur
) xs;
!max_so_far
For [-2;1;-3;4;-1;2;1;-5;4] the optimal subarray is [4;-1;2;1] = 6.
Exercises min_int (iter 94), max as global, ref / ! / :=, and
List.iter with two side-effecting steps in one closure body.
46 baseline programs total.
next_row prepends 1, walks adjacent pairs (x, y) emitting x+y,
appends a final 1:
let rec next_row prev =
let rec aux a =
match a with
| [_] -> [1]
| x :: y :: rest -> (x + y) :: aux (y :: rest)
| [] -> []
in
1 :: aux prev
row n iterates next_row n times starting from [1] using a ref +
'for _ = 1 to n do r := next_row !r done'.
row 10 = [1;10;45;120;210;252;210;120;45;10;1]
List.nth (row 10) 5 = 252 = C(10, 5)
Exercises three-arm match including [_] singleton wildcard, x :: y
:: rest binding, and the for-loop with wildcard counter. 45 baseline
programs total.
Run-length encoding via tail-recursive 4-arg accumulator:
let rle xs =
let rec aux xs cur n acc =
match xs with
| [] -> List.rev ((cur, n) :: acc)
| h :: t ->
if h = cur then aux t cur (n + 1) acc
else aux t h 1 ((cur, n) :: acc)
in
match xs with
| [] -> []
| h :: t -> aux t h 1 []
rle [1;1;1;2;2;3;3;3;3;1;1] = [(1,3);(2,2);(3,4);(1,2)]
sum of counts = 11 (matches input length)
The sum-of-counts test verifies that the encoding preserves total
length — drops or duplicates would diverge.
44 baseline programs total.
Defines a recursive str_contains that walks the haystack with
String.sub to find a needle substring. Real OCaml's String.contains
only accepts a single char, so this baseline implements its own
substring search to stay portable.
let rec str_contains s sub i =
if i + sl > nl then false
else if String.sub s i sl = sub then true
else str_contains s sub (i + 1)
count_matching splits text on newlines, folds with the predicate.
'the quick brown fox\nfox runs fast\nthe dog\nfoxes are clever'
needle = 'fox'
matches = 3 (lines 1, 2, 4 — 'foxes' contains 'fox')
43 baseline programs total.
The binop precedence table already had land/lor/lxor/lsl/lsr/asr
(iter 0 setup) but eval-op fell through to 'unknown operator' for
all of them. SX doesn't expose host bitwise primitives, so each is
implemented in eval.sx via arithmetic on the host:
land/lor/lxor: mask & shift loop, accumulating 1<<k digits
lsl k: repeated * 2 k times
lsr k: repeated floor (/ 2) k times
asr: aliased to lsr (no sign extension at our bit width)
bits.ml baseline: popcount via 'while m > 0 do if m land 1 = 1 then
... ; m := m lsr 1 done'. Sum of popcount(1023, 5, 1024, 0xff) = 10
+ 2 + 1 + 8 = 21.
5 land 3 = 1
5 lor 3 = 7
5 lxor 3 = 6
1 lsl 8 = 256
256 lsr 4 = 16
41 baseline programs total.
Classic Ackermann function:
let rec ack m n =
if m = 0 then n + 1
else if n = 0 then ack (m - 1) 1
else ack (m - 1) (ack m (n - 1))
ack(3, 4) = 125, expanding to ~6700 evaluator frames — a useful
stress test of the call stack and control transfer. Real OCaml
evaluates this in milliseconds; ours takes ~2 minutes on a
contended host but completes correctly.
40 baseline programs total.
Stack-based RPN evaluator:
let eval_rpn tokens =
let stack = Stack.create () in
List.iter (fun tok ->
if tok is operator then
let b = Stack.pop stack in
let a = Stack.pop stack in
Stack.push (apply tok a b) stack
else
Stack.push (int_of_string tok) stack
) tokens;
Stack.pop stack
For tokens [3 4 + 2 * 5 -]:
3 4 + -> 7
7 2 * -> 14
14 5 - -> 9
Exercises Stack.create / push / pop, mixed branch on string
equality, multi-arm if/else if for operator dispatch, int_of_string
for token parsing.
39 baseline programs total.
Newton's method for square root:
let sqrt_newton x =
let g = ref 1.0 in
for _ = 1 to 20 do
g := (!g +. x /. !g) /. 2.0
done;
!g
20 iterations is more than enough to converge for x=2 — result is
~1.414213562. Multiplied by 1000 and int_of_float'd: 1414.
First baseline exercising:
- for _ = 1 to N do ... done (wildcard loop variable)
- pure float arithmetic with +. /.
- the int_of_float truncate-toward-zero fix from iter 117
38 baseline programs total.
Classic doubly-recursive solution returning the move count:
hanoi n from to via =
if n = 0 then 0
else hanoi (n-1) from via to + 1 + hanoi (n-1) via to from
For n = 10, returns 2^10 - 1 = 1023.
Exercises 4-arg recursion, conditional base case, and tail-position
addition. Uses 'to_' instead of 'to' for the destination param to
avoid collision with the 'to' keyword in for-loops — the OCaml
conventional workaround.
37 baseline programs total.
validate_int returns Left msg on empty / non-digit, Right
(int_of_string s) on a digit-only string. process folds inputs with a
tuple accumulator (errs, sum), branching on the result.
['12'; 'abc'; '5'; ''; '100'; 'x']
-> 3 errors (abc, '', x), valid sum = 12+5+100 = 117
-> errs * 100 + sum = 417
Exercises:
- Either constructors used bare (Left/Right without 'Either.'
qualification)
- char range comparison: c >= '0' && c <= '9'
- tuple-pattern destructuring on let-binding (iter 98)
- recursive helper defined inside if-else
- List.fold_left with tuple accumulator
36 baseline programs total.
First baseline using Map.Make on a string-keyed map:
module StringOrd = struct
type t = string
let compare = String.compare
end
module SMap = Map.Make (StringOrd)
let count_words text =
let words = String.split_on_char ' ' text in
List.fold_left (fun m w ->
let n = match SMap.find_opt w m with
| Some n -> n
| None -> 0
in
SMap.add w (n + 1) m
) SMap.empty words
For 'the quick brown fox jumps over the lazy dog' ('the' appears
twice), SMap.cardinal -> 8.
Complements bag.ml (Hashtbl-based) and unique_set.ml (Set.Make)
with a sorted Map view of the same kind of counting problem. 35
baseline programs total.
Either module (mirrors OCaml 4.12+ stdlib):
left x / right x
is_left / is_right
find_left / find_right (return Option)
map_left / map_right (single-side mappers)
fold lf rf e (case dispatch)
equal eq_l eq_r a b
compare cmp_l cmp_r a b (Left < Right)
Constructors are bare 'Left x' / 'Right x' (OCaml 4.12+ exposes them
directly without an explicit type-decl).
Hashtbl.copy:
build a fresh cell with _hashtbl_create
walk _hashtbl_to_list and re-add each (k, v)
mutating one copy doesn't touch the other
(Hashtbl.length t + Hashtbl.length t2 = 3 after fork-and-add
verifies that adds to t2 don't appear in t)
Defines a JSON-like algebraic data type:
type json =
| JNull
| JBool of bool
| JInt of int
| JStr of string
| JList of json list
Recursively serialises to a string via match-on-constructor, then
measures the length:
JList [JInt 1; JBool true; JNull; JStr 'hi'; JList [JInt 2; JInt 3]]
-> '[1,true,null,"hi",[2,3]]' length 24
Exercises:
- five-constructor ADT (one nullary, three single-arg, one list-arg)
- recursive match
- String.concat ',' (List.map to_string xs)
- string-cat with embedded escaped quotes
34 baseline programs total.
In-place Fisher-Yates shuffle using:
Random.init 42 deterministic seed
let a = Array.of_list xs
for i = n - 1 downto 1 do reverse iteration
let j = Random.int (i + 1)
let tmp = a.(i) in
a.(i) <- a.(j);
a.(j) <- tmp
done
Sum is invariant under permutation, so the test value (55 for
[1..10] = 1+2+...+10) verifies the shuffle is a valid permutation
regardless of which permutation the seed yields.
Exercises Random.init / Random.int + Array.of_list / to_list /
length / arr.(i) / arr.(i) <- v + downto loop + multi-statement
sequencing within for-body.
33 baseline programs total.
pi_leibniz.ml: Leibniz formula for pi.
pi/4 = 1 - 1/3 + 1/5 - 1/7 + ...
pi ~= 4 * sum_{k=0}^{n-1} (-1)^k / (2k+1)
For n=1000, pi ~= 3.140593. Multiply by 100 and int_of_float -> 314.
Side-quest: int_of_float was wrongly defined as identity in
iteration 94. Fixed to:
let int_of_float f =
if f < 0.0 then _float_ceil f else _float_floor f
(truncate toward zero, mirroring real OCaml's int_of_float). The
identity definition was a stub from when integer/float dispatch was
not yet split — now they're separate, the stub is wrong.
Float.to_int still uses floor since OCaml's docs say the result is
unspecified for nan / out-of-range; close enough for our scope.
32 baseline programs total.
Both take an inner predicate / comparator and walk both lists in
lockstep:
equal eq a b short-circuits on first mismatch
compare cmp a b -1 if a is a strict prefix
1 if b is
0 if both empty
otherwise first non-zero element comparison
Mirrors real OCaml's signatures.
List.equal (=) [1;2;3] [1;2;3] = true
List.equal (=) [1;2;3] [1;2;4] = false
List.compare compare [1;2;3] [1;2;4] = -1
List.compare compare [1;2] [1;2;3] = -1
List.compare compare [] [] = 0
Bool module:
equal a b = a = b
compare a b = 0 if equal, 1 if a, -1 if b (false < true)
to_string 'true' / 'false'
of_string s = s = 'true'
not_ wraps host not
to_int true=1, false=0
Option additions (take eq/cmp parameter for the inner value):
equal eq a b None=None, otherwise eq the inner values
compare cmp a b None < Some _; otherwise cmp inner
Option.equal (=) (Some 1) (Some 1) = true
Option.equal (=) (Some 1) None = false
Option.compare compare (Some 5) (Some 3) = 1
bag.ml: split a sentence on spaces, count each word in a Hashtbl,
return the maximum count via Hashtbl.fold.
count_words 'the quick brown fox jumps over the lazy dog the fox'
-> Hashtbl with 'the' = 3 as the max
-> 3
Exercises String.split_on_char + Hashtbl.find_opt/replace +
Hashtbl.fold (k v acc -> ...). Together with frequency.ml from
iter 84 we now have two Hashtbl-counting baselines exercising
slightly different idioms. 29 baseline programs total.
String additions:
equal a b = a = b
compare a b = -1 / 0 / 1 via host < / >
cat a b = a ^ b
empty = '' (constant)
Defines:
type frac = { num : int; den : int }
let rec gcd a b = if b = 0 then a else gcd b (a mod b)
let make n d = (* canonicalise: gcd-reduce and
force den > 0 *)
let add x y = make (x.num * y.den + y.num * x.den) (x.den * y.den)
let mul x y = make (x.num * y.num) (x.den * y.den)
Test:
let r = add (make 1 2) (make 1 3) in (* 5/6 *)
let s = mul (make 2 3) (make 3 4) in (* 1/2 *)
let t = add r s in (* 5/6 + 1/2 = 4/3 *)
t.num + t.den (* = 7 *)
Exercises records, recursive gcd, mod, abs, integer division (the
truncate-toward-zero semantics from iter 94 are essential here —
make would diverge from real OCaml's behaviour with float division).
28 baseline programs total.
Real OCaml's Seq.t is 'unit -> Cons of elt * Seq.t | Nil' — a lazy
thunk that lets you build infinite sequences. Ours is just a list,
which gives the right shape for everything in baseline programs that
don't rely on laziness (taking from infinite sequences would force
memory).
API: empty, cons, return, is_empty, iter, iteri, map, filter,
filter_map, fold_left, length, take, drop, append, to_list,
of_list, init, unfold.
unfold takes a step fn 'acc -> Option (elt * acc)' and threads
through until it returns None:
Seq.fold_left (+) 0
(Seq.unfold (fun n -> if n > 4 then None
else Some (n, n + 1))
1)
= 1 + 2 + 3 + 4 = 10
First baseline that exercises the functor pipeline end to end:
module IntOrd = struct
type t = int
let compare a b = a - b
end
module IntSet = Set.Make (IntOrd)
let unique_count xs =
let s = List.fold_left (fun s x -> IntSet.add x s) IntSet.empty xs in
IntSet.cardinal s
Counts unique elements in [3;1;4;1;5;9;2;6;5;3;5;8;9;7;9]:
{1,2,3,4,5,6,7,8,9} -> 9
The input has 15 elements with 9 unique values. The 'type t = int'
declaration in IntOrd is required by real OCaml; OCaml-on-SX is
dynamic and would accept it without, but we include it for source
fidelity. 27 baseline programs total.
Functors were already wired through ocaml-make-functor in eval.sx
(curried host closure consuming module dicts) but had no explicit
tests for the user-defined Ord application path. This commit adds
three smoke tests that confirm:
module IntOrd = struct let compare a b = a - b end
module S = Set.Make (IntOrd)
S.elements (fold-add [5;1;3;1;5]) sums to 9 (dedupe + sort)
S.mem 2 (S.add 1 (S.add 2 (S.add 3 S.empty))) = true
M.cardinal (M.add 1 'a' (M.add 2 'b' M.empty)) = 2
The Ord parameter is properly threaded through the functor body —
elements are sorted in compare order and dedupe works.
Three parser changes:
1. at-app-start? returns true on op '~' or '?' so the app loop
keeps consuming labeled args.
2. The app arg parser handles:
~name:VAL drop label, parse VAL as the arg
?name:VAL same
~name punning -- treat as (:var name)
?name same
3. try-consume-param! drops '~' or '?' and treats the following
ident as a regular positional param name.
Caveats:
- Order in the call must match definition order; we don't reorder
by label name.
- Optional args don't auto-wrap in Some, so the function body sees
the raw value for ?x:V.
Lets us write idiomatic-looking OCaml even though the runtime is
positional underneath:
let f ~x ~y = x + y in f ~x:3 ~y:7 = 10
let x = 4 in let y = 5 in f ~x ~y = 20 (punning)
let f ?x ~y = x + y in f ?x:1 ~y:2 = 3
User-implemented mergesort that exercises features added across the
last few iterations:
let rec split lst = match lst with
| x :: y :: rest ->
let (a, b) = split rest in (* iter 98 let-tuple destruct *)
(x :: a, y :: b)
| ...
let rec merge xs ys = match xs with
| x :: xs' ->
match ys with (* nested match-in-match *)
| y :: ys' -> ...
...
List.fold_left (+) 0 (sort [...]) (* iter 89 (op) section *)
Sum of [3;1;4;1;5;9;2;6;5;3;5] = 44 regardless of order, so the
result is also a smoke test of the implementation correctness — if
merge_sort drops or duplicates an element the sum diverges. 26
baseline programs total.
parse-decl-let lives in the outer ocaml-parse-program scope and does
not have access to parse-pattern (which is local to ocaml-parse).
Source-slicing approach instead:
1. detect '(IDENT, ...)' in collect-params
2. scan tokens to the matching ')' (tracking nested parens)
3. slice the pattern source string from src
4. push (synth_name, pat_src) onto tuple-srcs
Then after collecting params, the rhs source string gets wrapped with
'match SN with PAT_SRC -> (RHS_SRC)' for each tuple-param,
innermost-first, and the final string is fed through ocaml-parse.
End result is the same AST shape as the iteration-102 inner-let
case: a function whose body destructures a synthetic name.
let f (a, b) = a + b ;; f (3, 7) = 10
let g x (a, b) = x + a + b ;; g 1 (2, 3) = 6
let h (a, b) (c, d) = a * b + c * d
;; h (1, 2) (3, 4) = 14
Mirrors iteration 101's parse-fun change inside parse-let's
parse-one!:
- same '(IDENT, ...)' detection on collect-params
- same __pat_N synth name for the function param
- same innermost-first match-wrapping
Difference: for inner-let the wrapping is applied to the rhs of the
let-binding (which is the function value), not directly to a fun
body.
let f (a, b) = a + b in f (3, 7) = 10
let g x (a, b) = x + a + b in g 1 (2, 3) = 6
let h (a, b) (c, d) = a * b + c * d
in h (1, 2) (3, 4) = 14
parse-fun's collect-params now detects '(IDENT, ...)' as a
tuple-pattern parameter (lookahead at peek-tok-at 1/2 distinguishes
from '(x : T)' and '()' cases that try-consume-param! already
handles). For each tuple param it:
1. parse-pattern to get the full pattern AST
2. generate a synthetic __pat_N name as the actual fun parameter
3. push (synth_name, pattern) onto tuple-binds
After parsing the body, wraps it innermost-first with one
'match __pat_N with PAT -> ...' per tuple-param. The user-visible
result is a (:fun (params...) body) where params are all simple
names but the body destructures.
Also retroactively simplifies Hashtbl.keys/values from
'fun pair -> match pair with (k, _) -> k' to plain
'fun (k, _) -> k', closing the iteration-99 workaround.
(fun (a, b) -> a + b) (3, 7) = 10
List.map (fun (a, b) -> a * b)
[(1, 2); (3, 4); (5, 6)] = [2; 12; 30]
List.map (fun (k, _) -> k)
[("a", 1); ("b", 2)] = ["a"; "b"]
(fun a (b, c) d -> a + b + c + d) 1 (2, 3) 4 = 10
Linear-congruential PRNG with mutable seed (_state ref). API:
init s seed the PRNG
self_init () default seed (1)
int bound 0 <= n < bound
bool () fair coin
float bound uniform in [0, bound)
bits () 30 bits
Stepping rule:
state := (state * 1103515245 + 12345) mod 2147483647
result := |state| mod bound
Same seed reproduces the same sequence. Real OCaml's Random uses
Lagged Fibonacci; ours is simpler but adequate for shuffles and
Monte Carlo demos in baseline programs.
Random.init 42; Random.int 100 = 48
Random.init 1; Random.int 10 = 0
Two new host primitives:
_hashtbl_remove t k -> dissoc the key from the underlying dict
_hashtbl_clear t -> reset the cell to {}
Eight new OCaml-syntax helpers in runtime.sx Hashtbl module:
bindings t = _hashtbl_to_list t
keys t = List.map (fun (k, _) -> k) (...)
values t = List.map (fun (_, v) -> v) (...)
to_seq t = bindings t
to_seq_keys / to_seq_values
remove / clear / reset
The keys/values implementations use a 'fun pair -> match pair with
(k, _) -> k' indirection because parse-fun does not currently allow
tuple patterns directly on parameters. Same restriction we worked
around in iteration 98's let-pattern desugaring.
Also: a detour attempting to add top-level 'let (a, b) = expr'
support was started but reverted — parse-decl-let in the outer
ocaml-parse-program scope does not have access to parse-pattern
(which is local to ocaml-parse). Will need a slice + re-parse trick
later.
When 'let' is followed by '(', parse-let now reads a full pattern
(via the existing parse-pattern used by match), expects '=', then
'in', and desugars to:
let PATTERN = EXPR in BODY => match EXPR with PATTERN -> BODY
This reuses the entire pattern-matching machinery, so any pattern
the match parser accepts works here too — paren-tuples, nested
tuples, cons patterns, list patterns. No 'rec' allowed for pattern
bindings (real OCaml's restriction).
let (a, b) = (1, 2) in a + b = 3
let (a, b, c) = (10, 20, 30) in a + b + c = 60
let pair = (5, 7) in
let (x, y) = pair in x * y = 35
Also retroactively cleaned up Printf's iter-97 width-pos packing
hack ('width * 1000000 + spec_pos') — it's now
'let (width, spec_pos) = parse_width_loop after_flags in ...' like
real OCaml.
The Printf walker now parses optional flags + width digits between
'%' and the spec letter:
- left-align (default is right-align)
0 zero-pad (default is space-pad; only honoured when not left-aligned)
Nd... decimal width digits (any number)
After formatting the argument into a base string with the existing
spec dispatch (%d/%i/%u/%s/%f/%c/%b/%x/%X/%o), the result is padded
to the requested width.
Workaround: width and spec_pos are returned packed as
width * 1000000 + spec_pos
because the parser does not yet support tuple destructuring in let
('let (a, b) = expr in body' fails with 'expected ident'). TODO: lift
that limitation; for now the encoding round-trips losslessly for any
practical width.
Printf.sprintf '%5d' 42 = ' 42'
Printf.sprintf '%-5d|' 42 = '42 |'
Printf.sprintf '%05d' 42 = '00042'
Printf.sprintf '%4s' 'hi' = ' hi'
Printf.sprintf 'hi=%-3d, hex=%04x' 9 15 = 'hi=9 , hex=000f'
The previous List.sort was O(n^2) insertion sort. Replaced with a
straightforward mergesort:
split lst -> alternating-take into ([odd], [even])
merge xs ys -> classic two-finger merge under cmp
sort cmp xs -> base cases [], [x]; otherwise split + recursive
sort on each half + merge
Tuple destructuring on the split result is expressed via nested
match — let-tuple-destructuring would be cleaner but works today.
This benefits sort_uniq (which calls sort first), Set.Make.add via
sort etc., and any user program using List.sort. Stable_sort is
already aliased to sort.
Three things in this commit:
1. Integer / is now truncate-toward-zero on ints, IEEE on floats. The
eval-op handler for '/' checks (number? + (= (round x) x)) on both
sides; if both integral, applies host floor/ceil based on sign;
otherwise falls through to host '/'.
2. Fixes Int.rem, which was returning 0 because (a - b * (a / b))
was using float division: 17 - 5 * 3.4 = 0.0. Now Int.rem 17 5 = 2.
3. Int module fleshed out:
max_int / min_int / zero / one / minus_one,
succ / pred / neg, add / sub / mul / div / rem,
equal, compare.
Also adds globals: max_int, min_int, abs_float, float_of_int,
int_of_float (the latter two are identity in our dynamic runtime).
17 / 5 = 3
-17 / 5 = -3 (trunc toward zero)
Int.rem 17 5 = 2
Int.compare 5 3 = 1
Eight new Array functions, all in OCaml syntax inside runtime.sx,
delegating to the corresponding List operation on the cell's
underlying list:
sort cmp a -> a := List.sort cmp !a (* mutates the cell *)
stable_sort = sort
fast_sort = sort
append a b -> ref (List.append !a !b)
sub a pos n -> ref (take n (drop pos !a))
exists p -> List.exists p !a
for_all p -> List.for_all p !a
mem x a -> List.mem x !a
Round-trip:
let a = Array.of_list [3;1;4;1;5;9;2;6] in
Array.sort compare a;
Array.to_list a = [1;1;2;3;4;5;6;9]
Five '+++++.' groups, cumulative accumulator 5+10+15+20+25 = 75.
This is a brainfuck *subset* — only > < + - . (no [ ] looping). That's
intentional: the goal is to stress imperative idioms that the recently
added Array module + array indexing syntax + s.[i] make ergonomic, all
in one program.
Exercises:
Array.make 256 0
arr.(!ptr)
arr.(!ptr) <- arr.(!ptr) + 1
prog.[!pc]
ref / ! / :=
while + nested if/else if/else if for op dispatch
25 baseline programs total.
Counts primes <= 50, expected 15.
Stresses the recently-added Array module + the new array-indexing
syntax together with nested control flow:
let sieve = Array.make (n + 1) true in
sieve.(0) <- false;
sieve.(1) <- false;
for i = 2 to n do
if sieve.(i) then begin
let j = ref (i * i) in
while !j <= n do
sieve.(!j) <- false;
j := !j + i
done
end
done;
...
Exercises: Array.make, arr.(i), arr.(i) <- v, nested for/while,
begin..end blocks, ref/!/:=, integer arithmetic. 24 baseline
programs total.
parse-atom-postfix's '.()' branch now disambiguates between let-open
and array-get based on whether the head is a module path (':con' or
':field' chain rooted in ':con'). Module paths still emit
(:let-open M EXPR); everything else emits (:array-get ARR I).
Eval handles :array-get by reading the cell's underlying list at
index. The '<-' assignment handler now also accepts :array-get lhs
and rewrites the cell with one position changed.
Idiomatic OCaml array code now works:
let a = Array.make 5 0 in
for i = 0 to 4 do a.(i) <- i * i done;
a.(3) + a.(4) = 25
let a = Array.init 4 (fun i -> i + 1) in
a.(0) + a.(1) + a.(2) + a.(3) = 10
List.(length [1;2;3]) = 3 (* unchanged: List is a module *)
Array module (runtime.sx, OCaml syntax):
Backed by a 'ref of list'. make/length/get/init build the cell;
set rewrites the underlying list with one cell changed (O(n) but
works for short arrays in baseline programs). Includes
iter/iteri/map/mapi/fold_left/to_list/of_list/copy/blit/fill.
(op) operator sections (parser.sx, parse-atom):
When the token after '(' is a binop (any op with non-zero
precedence in the binop table) and the next token is ')', emit
(:fun ('a' 'b') (:op OP a b)) — i.e. (+) becomes fun a b -> a + b.
Recognises every binop including 'mod', 'land', '^', '@', '::',
etc.
Lets us write:
List.fold_left (+) 0 [1;2;3;4;5] = 15
let f = ( * ) in f 6 7 = 42
List.map ((-) 10) [1;2;3] = [9;8;7]
let a = Array.make 5 7 in
Array.set a 2 99;
Array.fold_left (+) 0 a = 127
Inline CSV-like text:
a,1,extra
b,2,extra
c,3,extra
d,4,extra
Two-stage String.split_on_char: first on '\n' for rows, then on ','
for fields per row. List.fold_left accumulates int_of_string of the
second field across rows. Result = 1+2+3+4 = 10.
Exercises char escapes inside string literals ('\n'), nested
String.split_on_char, List.fold_left with a non-trivial closure body,
and int_of_string. 23 baseline programs total.
Tokenizer already classified backtick-uppercase as a ctor identical
to a nominal one, but it had never been exercised by the suite. This
commit adds three smoke tests confirming that nullary, n-ary, and
list-of-polyvariant patterns all match:
let x = polyvar(Foo) in match x with polyvar(Foo) -> 1 | polyvar(Bar) -> 2
let x = polyvar(Pair) (5, 7) in
match x with polyvar(Pair) (a, b) -> a + b | _ -> 0
List.map (fun x -> match x with polyvar(On) -> 1 | polyvar(Off) -> 0)
[polyvar(On); polyvar(Off); polyvar(On)]
(In the actual SX, polyvar(X) is the literal backtick-X — backticks
in this commit message are escaped to avoid shell interpretation.)
Since OCaml-on-SX is dynamic, there's no structural row inference,
but matching by tag works.
sort_uniq:
Sort with the user comparator, then walk the sorted list dropping
any element equal to its predecessor. Output is sorted and unique.
List.sort_uniq compare [3;1;2;1;3;2;4] = [1;2;3;4]
find_map:
Walk until the user fn returns Some v; return that. If all None,
return None.
List.find_map (fun x -> if x > 5 then Some (x * 2) else None)
[1;2;3;6;7]
= Some 12
Both defined in OCaml syntax in runtime.sx — no host primitive
needed since they're pure list traversals over existing operations.
Six new String functions, all in OCaml syntax inside runtime.sx:
iter : index-walk with side-effecting f
iteri : iter with index
fold_left : thread accumulator left-to-right
fold_right: thread accumulator right-to-left
to_seq : return a char list (lazy in real OCaml; eager here)
of_seq : concat a char list back to a string
Round-trip:
String.of_seq (List.rev (String.to_seq "hello")) = "olleh"
Note: real OCaml's Seq is lazy. We return a plain list because the
existing stdlib already provides exhaustive list operations and we
don't yet have lazy sequences. If a baseline needs Seq.unfold or
similar, we'll graduate to a proper Seq module then.
frequency.ml exercises the recently-added Hashtbl.iter / fold +
Hashtbl.find_opt + s.[i] indexing + for-loop together: build a
char-count table for 'abracadabra' then take the max via
Hashtbl.fold. Expected = 5 (a x 5). Total 25 baseline programs.
Format module added as a thin alias of Printf — sprintf, printf, and
asprintf all delegate to Printf.sprintf. The dynamic runtime doesn't
distinguish boxes/breaks, so format strings work the same as in
Printf and most Format-using OCaml programs now compile.
Tokenizer already had 'lazy' as a keyword. This commit wires it through:
parser : parse-prefix emits (:lazy EXPR), like the existing 'assert'
handler.
eval : creates a one-element cell with state ('Thunk' expr env).
host : _lazy_force flips the cell to ('Forced' v) on first call
and returns the cached value thereafter.
runtime : module Lazy = struct let force lz = _lazy_force lz end.
Memoisation confirmed by tracking a side-effect counter through two
forces of the same lazy:
let counter = ref 0 in
let lz = lazy (counter := !counter + 1; 42) in
let a = Lazy.force lz in
let b = Lazy.force lz in
(a + b) * 100 + !counter = 8401 (= 84*100 + 1)
New host primitive _hashtbl_to_list returns the entries as a list of
OCaml tuples — ('tuple' k v) form, matching the AST representation
that the pattern-match VM (:ptuple) expects. Without that exact
shape, '(k, v) :: rest' patterns fail to match.
Hashtbl.iter / Hashtbl.fold in runtime walk that list with the user
fn. This closes a long-standing gap: previously Hashtbl was opaque
once values were written (we could only find_opt one key at a time).
let t = Hashtbl.create 4 in
Hashtbl.add t "a" 1; Hashtbl.add t "b" 2; Hashtbl.add t "c" 3;
Hashtbl.fold (fun _ v acc -> acc + v) t 0 = 6
Replaces the stub sprintf in runtime.sx with a real implementation:
walk fmt char-by-char accumulating a prefix; on recognised %X return a
one-arg fn that formats the arg and recurses on the rest of fmt. The
function self-curries to the spec count — there's no separate arity
machinery, just a closure chain.
Specs: %d (int), %s (string), %f (float), %c (char/string in our model),
%b (bool), %% (literal). Unknown specs pass through.
Same expression returns a string (no specs) or a function (>=1 spec) —
OCaml proper would reject this; works fine in OCaml-on-SX's dynamic
runtime.
Also adds top-level aliases:
string_of_int = _string_of_int
string_of_float = _string_of_float
string_of_bool = if b then "true" else "false"
int_of_string = _int_of_string
Printf.sprintf "x=%d" 42 = "x=42"
Printf.sprintf "%s = %d" "answer" 42 = "answer = 42"
Printf.sprintf "%d%%" 50 = "50%"
Tokenizer already classified 'assert' as a keyword; this commit wires
it through:
parser : parse-prefix dispatches like 'not' — advance, recur, wrap
as (:assert EXPR).
eval : evaluate operand; nil on truthy, host-error 'Assert_failure'
on false. Caught cleanly by existing try/with.
assert true; 42 = 42
let x = 5 in assert (x = 5); x + 1 = 6
try (assert false; 0) with _ -> 99 = 99
Recursive Levenshtein edit distance with no memoization (the test
strings are short enough for the exponential-without-memo version to
fit in <2 minutes on contended hosts). Sums distances for five short
pairs:
('abc','abx') + ('ab','ba') + ('abc','axyc') + ('','abcd') + ('ab','')
= 1 + 2 + 2 + 4 + 2 = 11
Exercises:
* curried four-arg recursion
* s.[i] equality test (char comparison)
* min nested twice for the three-way recurrence
* mixed empty-string base cases
Side-quests required to land caesar.ml:
1. Top-level 'let r = expr in body' is now an expression decl, not a
broken decl-let. ocaml-parse-program's dispatch now checks
has-matching-in? at every top-level let; if matched, slices via
skip-let-rhs-boundary (which already opens depth on a leading let
with matching in) and ocaml-parse on the slice, wrapping as :expr.
2. runtime.sx: added String.make / String.init / String.map. Used by
caesar.ml's encode = String.init n (fun i -> shift_char s.[i] k).
3. baseline run.sh per-program timeout 240->480s (system load on the
shared host frequently exceeds 240s for large baselines).
caesar.ml exercises:
* the new top-level let-in expression dispatch
* s.[i] string indexing
* Char.code / Char.chr round-trip math
* String.init with a closure that captures k
Test value: Char.code r.[0] + Char.code r.[4] after ROT13(ROT13('hello')) = 104 + 111 = 215.
parse-atom-postfix now dispatches three cases after consuming '.':
.field -> existing field/module access
.(EXPR) -> existing local-open
.[EXPR] -> new string-get syntax (this commit)
Eval reduces (:string-get S I) to host (nth S I), which already returns
a one-character string for OCaml's char model.
Lets us write idiomatic OCaml string traversal:
let s = "hi" in
let n = ref 0 in
for i = 0 to String.length s - 1 do
n := !n + Char.code s.[i]
done;
!n (* = 209 *)
Side-quest emerged from adding roman.ml baseline (Roman numeral greedy
encoding): top-level 'let () = expr' was unsupported because
ocaml-parse-program's parse-decl-let consumed an ident strictly. Now
parse-decl-let recognises a leading '()' as a unit binding and
synthesises a __unit_NN name (matching how parse-let already handles
inner-let unit patterns).
roman.ml exercises:
* tuple list literal [(int * string); ...]
* recursive pattern match on tuple-cons
* String.length + List.fold_left
* the new top-level let () support (sanity in a comment, even though
the program ends with a bare expression for the test harness)
Bumped lib/ocaml/test.sh server timeout 180->360s — the recent surge in
test count plus a CPU-contended host was crowding out the sole epoch
reaching the deeper smarts.
In parse-atom-postfix, after consuming '.', if the next token is '(',
parse the inner expression and emit (:let-open M EXPR) instead of
:field. Cleanly composes with the existing :let-open evaluator and
loops to allow chained dot postfixes.
List.(length [1;2;3]) = 3
List.(map (fun x -> x + 1) [1;2;3]) = [2;3;4]
Option.(map (fun x -> x * 10) (Some 4)) = Some 40
Parser detects 'let open' as a separate let-form, parses M as a path
(Ctor(.Ctor)*) directly via inline AST construction (no source slicing
since cur-pos is only available in ocaml-parse-program), and emits
(:let-open PATH BODY).
Eval resolves the path to a module dict and merges its bindings into
the env for body evaluation. Now:
let open List in map (fun x -> x * 2) [1;2;3] = [2;4;6]
let open Option in map (fun x -> x + 1) (Some 5) = Some 6
ocaml-eval-module now handles :def-mut and :def-rec-mut decls so
'module M = struct let rec a n = ... and b n = ... end' works. The
def-rec-mut version uses cell-based mutual recursion exactly as the
top-level version.
Graph BFS using Queue + Hashtbl visited-set + List.assoc_opt + List.iter.
Returns 6 for a graph where A reaches B/C/D/E/F. Demonstrates 4 stdlib
modules (Queue, Hashtbl, List) cooperating in a real algorithm.
let NAME [PARAMS] : T = expr and (expr : T) parse and skip the type
source. Runtime no-op since SX is dynamic. Works in inline let,
top-level let, and parenthesised expressions:
let x : int = 5 ;; x + 1 -> 6
let f (x : int) : int = x + 1 in f 41 -> 42
(5 : int) -> 5
((1 + 2) : int) * 3 -> 9
Parser: in parse-decl-type, dispatch on the post-= token:
'|' or Ctor -> sum type
'{' -> record type
otherwise -> type alias (skip to boundary)
AST (:type-alias NAME PARAMS) with body discarded. Runtime no-op since
SX has no nominal types.
poly_stack.ml baseline exercises:
module type ELEMENT = sig type t val show : t -> string end
module IntElem = struct type t = int let show x = ... end
module Make (E : ELEMENT) = struct ... use E.show ... end
module IntStack = Make(IntElem)
Demonstrates the substrate handles signature decls + abstract types +
functor parameter with sig constraint.
parse-try now consumes optional 'when GUARD-EXPR' before -> and emits
(:case-when PAT GUARD BODY). Eval try clause loop dispatches on case /
case-when and falls through on guard false — same semantics as match.
Examples:
try raise (E 5) with | E n when n > 0 -> n | _ -> 0 = 5
try raise (E (-3)) with | E n when n > 0 -> n | _ -> 0 = 0
try raise (E 5) with | E n when n > 100 -> n | E n -> n + 1000 = 1005
parse-function now consumes optional 'when GUARD-EXPR' before -> and
emits (:case-when PAT GUARD BODY) — same handling as match clauses.
function-style sign extraction now works:
(function | n when n > 0 -> 1 | n when n < 0 -> -1 | _ -> 0)
Group anagrams by canonical (sorted-chars) key using Hashtbl +
List.sort. Demonstrates char-by-char traversal via String.get + for-loop +
ref accumulator + Hashtbl as a multi-valued counter.
Untyped lambda calculus interpreter inside OCaml-on-SX:
type term = Var | Abs of string * term | App | Num of int
type value = VNum of int | VClos of string * term * env
let rec eval env t = match t with ...
(\x.\y.x) 7 99 = 7. The substrate handles two ADTs, recursive eval,
closure-based env, and pattern matching all written as a single
self-contained OCaml program — strong validation.
ocaml-type-of-program now handles :def-mut (sequential generalize) and
:def-rec-mut (pre-bind tvs, infer rhs, unify, generalize all, infer
body — same algorithm as the inline let-rec-mut version).
Mutual top-level recursion now type-checks:
let rec even n = ... and odd n = ...;; even 10 : Bool
let rec map f xs = ... and length lst = ...;; map :
('a -> 'b) -> 'a list -> 'b list
Memoized fibonacci using Hashtbl.find_opt + Hashtbl.add.
fib(25) = 75025. Demonstrates mutable Hashtbl through the OCaml
stdlib API in real recursive code.
4-queens via recursive backtracking + List.fold_left. Returns 2 (the
two solutions of 4-queens). Per-program timeout in run.sh bumped to
240s — the tree-walking interpreter is slow on heavy recursion but
correct.
The substrate handles full backtracking + safe-check recursion +
list-driven candidate enumeration end-to-end.
Counter-style record with two mutable fields. Validates the new
r.f <- v field mutation end-to-end through type decl + record literal
+ field access + field assignment + sequence operator.
type counter = { mutable count : int; mutable last : int }
let bump c = c.count <- c.count + 1 ; c.last <- c.count
After 5 bumps: count=5, last=5, sum=10.
<- added to op-table at level 1 (same as :=). Eval short-circuits on
<- to mutate the lhs's field via host SX dict-set!. The lhs must be a
:field expression; otherwise raises.
Tested:
let r = { x = 1; y = 2 } in r.x <- 5; r.x (5)
let r = { x = 0 } in for i = 1 to 5 do r.x <- r.x + i done; r.x (15)
let r = { name = ...; age = 30 } in r.name <- "Alice"; r.name
The 'mutable' keyword in record type decls is parsed-and-discarded;
runtime semantics: every field is mutable. Phase 2 closes this gap
without changing the dict-based record representation.
type r = { x : int; mutable y : string } parses to
(:type-def-record NAME PARAMS FIELDS) with FIELDS each (NAME) or
(:mutable NAME). Parser dispatches on { after = to parse field list.
Field-type sources are skipped (HM registration TBD). Runtime no-op
since records already work as dynamic dicts.
bash lib/ocaml/conformance.sh now runs lib/ocaml/baseline/run.sh and
aggregates pass/fail counts under a 'baseline' suite. Full-suite
scoreboard now reports both unit-test results and end-to-end OCaml
program runs in a single artifact.
Polymorphic binary search tree with insert + in-order traversal.
Exercises parametric ADT (type 'a tree = Leaf | Node of 'a * 'a tree
* 'a tree), recursive match, List.append, List.fold_left.
Classic fizzbuzz using ref-cell accumulator, for-loop, mod, if/elseif
chain, String.concat, Int.to_string. Output verified via String.length
of the comma-joined result for n=15: 57.
print_string / print_endline / print_int / print_newline now route to
SX display primitive (not the non-existent print/println). print_endline
appends '\n'.
let _ = expr ;; at top level confirmed working via the
wildcard-param parser.
ocaml-infer-let-mut: each rhs inferred in parent env, generalized
sequentially before adding to body env.
ocaml-infer-let-rec-mut: pre-bind all names with fresh tvs; infer
each rhs against the joint env, unify each with its tv, then
generalize all and infer body.
Mutual recursion now type-checks:
let rec even n = if n = 0 then true else odd (n - 1)
and odd n = if n = 0 then false else even (n - 1)
in even : Int -> Bool
Option: join, to_result, some, none.
Result: value, iter, fold.
Bytes: length, get, of_string, to_string, concat, sub — thin alias of
String (SX has no separate immutable byte type).
Ordering fix: Bytes module placed after String so its closures capture
String in scope. Earlier draft put Bytes before String which made
String.* lookups fail with 'not a record/module' (treated as nullary
ctor).
Recursive-descent calculator parses '(1 + 2) * 3 + 4' = 13. Two parser
bugs fixed:
1. parse-let now handles inline 'let rec a () = ... and b () = ... in
body' via new (:let-rec-mut BINDINGS BODY) and (:let-mut BINDINGS
BODY) AST shapes; eval handles both.
2. has-matching-in? lookahead no longer stops at 'and' — 'and' is
internal to let-rec, not a decl boundary. Without this fix, the
inner 'let rec a () = ... and b () = ...' inside a let-decl rhs
would have been treated as the start of a new top-level decl.
Baseline exercises mutually-recursive functions, while-loops, ref-cell
imperative parsing, and ADT-based AST construction.
Parser fix: at-app-start? and parse-app's loop recognise prefix !
as a deref of the next app arg. So 'List.rev !b' parses as
'(:app List.rev (:deref b))' instead of stalling at !.
Buffer module backed by a ref holding string list:
create _ = ref []
add_string b s = b := s :: !b
contents b = String.concat "" (List.rev !b)
add_char/length/clear/reset
Uses Map.Make(StrOrd) + List.fold_left to count word frequencies;
exercises the full functor pipeline with a real-world idiom:
let inc_count m word =
match StrMap.find_opt word m with
| None -> StrMap.add word 1 m
| Some n -> StrMap.add word (n + 1) m
let count words = List.fold_left inc_count StrMap.empty words
10/10 baseline programs pass.
Both written in OCaml inside lib/ocaml/runtime.sx:
module Map = struct module Make (Ord) = struct
let empty = []
let add k v m = ... (* sorted insert via Ord.compare *)
let find_opt / find / mem / remove / bindings / cardinal
end end
module Set = struct module Make (Ord) = struct
let empty = []
let mem / add / remove / elements / cardinal
end end
Sorted association list / sorted list backing — linear ops but
correct. Strong substrate-validation: Map.Make is a non-trivial
functor implemented entirely on top of the OCaml-on-SX evaluator.
os_type="SX", word_size=64, max_array_length, max_string_length,
executable_name="ocaml-on-sx", big_endian=false, unix=true,
win32=false, cygwin=false. Constants-only for now — argv/getenv_opt/
command would need host platform integration.
ocaml-hm-parse-type-src recognises primitive type names (int/bool/
string/float/unit), tyvars 'a, and simple parametric T list / T option.
Replaces the previous int-by-default placeholder in
ocaml-hm-register-type-def!.
So 'type tag = TStr of string | TInt of int' correctly registers
TStr : string -> tag and TInt : int -> tag. Pattern-match on tag
gives proper field types in the body. Multi-arg / function types
still fall back to a fresh tv.
Parser: when | follows a pattern inside parens, build (:por ALT1 ALT2
...). Eval: try alternatives, succeed on first match. Top-level |
remains the clause separator — parens-only avoids ambiguity without
lookahead.
Examples now work:
match n with | (1 | 2 | 3) -> 100 | _ -> 0
match c with | (Red | Green) -> 1 | Blue -> 2
module type S = sig DECLS end is parsed-and-discarded — sig..end
balanced skipping in parse-decl-module-type. AST (:module-type-def
NAME). Runtime no-op (signatures are type-level only).
Allows real OCaml programs with module type decls to parse and run
without stripping the sig blocks.
Parser: { f1 = pat; f2 = pat; ... } in pattern position emits
(:precord (FIELDNAME PAT)...). Mixed with the existing { in
expression position via the at-pattern-atom? whitelist.
Eval: :precord matches against a dict; required fields must be present
and each pat must match the field's value. Can mix literal+var:
'match { x = 1; y = y } with | { x = 1; y = y } -> y' matches only
when x is 1.
A tiny arithmetic-expression evaluator using:
type expr = Lit of int | Add of expr*expr | Mul of expr*expr | Neg of expr
let rec eval e = match e with | Lit n -> n | Add (a,b) -> ...
Exercises type-decl + multi-arg ctor + recursive match end-to-end.
Per-program timeout in run.sh bumped to 120s.
lib/ocaml/baseline/{factorial,list_ops,option_match,module_use,sum_squares}.ml
exercised through ocaml-run-program (file-read F). lib/ocaml/baseline/
run.sh runs them and compares against expected.json — all 5 pass.
To make module_use.ml (with nested let-in) parse, parser's
skip-let-rhs-boundary! now uses has-matching-in? lookahead: a let at
depth 0 in a let-decl rhs opens a nested block IFF a matching in
exists before any decl-keyword. Without that in, the let is a new
top-level decl (preserves test 274 'let x = 1 let y = 2').
This is the first piece of Phase 5.1 'vendor a slice of OCaml
testsuite' — handcrafted fixtures for now, real testsuite TBD.
ocaml-hm-ctors is now a mutable list cell; user type-defs register
their constructors via ocaml-hm-register-type-def!. New
ocaml-type-of-program processes top-level decls in order:
- type-def: register ctors with the scheme inferred from PARAMS+CTORS
- def/def-rec: generalize and bind in the type env
- exception-def: no-op for typing
- expr: return inferred type
Examples:
type color = Red | Green | Blue;; Red : color
type shape = Circle of int | Square of int;;
let area s = match s with
| Circle r -> r * r
| Square s -> s * s;;
area : shape -> Int
Caveat: ctor arg types parsed as raw source strings; registry defaults
to int for any single-arg ctor. Proper type-source parsing pending.
ocaml-infer-let-rec pre-binds the function name to a fresh tv before
inferring rhs (which may recursively call the name), unifies the
inferred rhs type with the tv, generalizes, then infers body.
Builtin env types :: : 'a -> 'a list -> 'a list and @ : 'a list ->
'a list -> 'a list — needed because :op compiles to (:app (:app (:var
OP) L) R) and previously these var lookups failed.
Examples now infer:
let rec fact n = if ... in fact : Int -> Int
let rec len lst = ... in len : 'a list -> Int
let rec map f xs = ... in map : ('a -> 'b) -> 'a list -> 'b list
1 :: [2; 3] : Int list
let rec sum lst = ... in sum [1;2;3] : Int
Scoreboard refreshed: 358/358 across 14 suites.
ocaml-hm-ctor-env registers None/Some : 'a -> 'a option, Ok/Error :
'a -> ('a, 'b) result. :con NAME instantiates the scheme; :pcon NAME
ARG-PATS walks arg patterns through the constructor's arrow type,
unifying each.
Pretty-printer renders 'Int option' and '(Int, 'b) result'.
Examples now infer:
fun x -> Some x : 'a -> 'a option
match Some 5 with | None -> 0 | Some n -> n : Int
fun o -> match o with | None -> 0 | Some n -> n : Int option -> Int
Ok 1 : (Int, 'b) result
Error "oops" : ('a, String) result
User type-defs would extend the registry — pending.
ocaml-infer-pat covers :pwild, :pvar, :plit, :pcons, :plist, :ptuple,
:pas. Returns {:type T :env ENV2 :subst S} where ENV2 has the pattern's
bound names threaded through.
ocaml-infer-match unifies each clause's pattern type with the scrutinee,
runs the body in the env extended with pattern bindings, and unifies
all body types via a fresh result tv.
Examples:
fun lst -> match lst with | [] -> 0 | h :: _ -> h : Int list -> Int
match (1, 2) with | (a, b) -> a + b : Int
Constructor patterns (:pcon) fall through to a fresh tv for now —
proper handling needs a ctor type registry from 'type' declarations.
compare is a host builtin returning -1/0/1 (Stdlib.compare semantics)
deferred to host SX </>. List.sort is insertion-sort in OCaml: O(n²)
but works correctly. List.stable_sort = sort.
Tested: ascending int sort, descending via custom comparator (b - a),
empty list, string sort.
Backing store is a one-element list cell holding a SX dict; keys
coerced to strings via str so int/string keys work uniformly. API:
create, add, replace, find, find_opt, mem, length.
_hashtbl_create / _hashtbl_add / _hashtbl_replace / _hashtbl_find_opt /
_hashtbl_mem / _hashtbl_length primitives wired in eval.sx; OCaml-side
Hashtbl module wraps them in lib/ocaml/runtime.sx.
Tuple type (hm-con "*" TYPES); list type (hm-con "list" (TYPE)).
ocaml-infer-tuple threads substitution through each item left-to-right.
ocaml-infer-list unifies all items with a fresh 'a (giving 'a list for
empty []).
Pretty-printer renders 'Int * Int' for tuples and 'Int list' for lists,
matching standard OCaml notation.
Examples:
fun x y -> (x, y) : 'a -> 'b -> 'a * 'b
fun x -> [x; x] : 'a -> 'a list
[] : 'a list
List: concat/flatten, init, find/find_opt, partition, mapi/iteri,
assoc/assoc_opt. Option: iter/fold/to_list. Result: get_ok/get_error/
map_error/to_option.
Fixed skip-to-boundary! in parser to track let..in / begin..end /
struct..end / for/while..done nesting via a depth counter — without
this, nested-let inside a top-level decl body trips over the
decl-boundary detector. Stdlib functions like List.init / mapi / iteri
use begin..end to make their nested-let intent explicit.
exception NAME [of TYPE] parses to (:exception-def NAME [ARG-SRC]).
Runtime is a no-op: raise/match already work on tagged ctor values, so
'exception E of int;; try raise (E 5) with | E n -> n' end-to-end with
zero new eval logic.
Parser: type [PARAMS] NAME = | Ctor [of T1 [* T2]*] | ...
- PARAMS: optional 'a or ('a, 'b) tyvar list
- AST: (:type-def NAME PARAMS CTORS) with each CTOR (NAME ARG-SOURCES)
- Argument types captured as raw source strings (treated opaquely at
runtime since ctor dispatch is dynamic)
Runtime is a no-op — constructors and pattern matching already work
dynamically. Phase 5 will use these decls to register ctor types for
HM checking.
Pattern parser top wraps cons-pat with 'as ident' -> (:pas PAT NAME).
Match clause parser consumes optional 'when GUARD-EXPR' before -> and
emits (:case-when PAT GUARD BODY) instead of :case.
Eval: :pas matches inner pattern then binds the alias name; case-when
checks the guard after a successful match and falls through to the next
clause if the guard is false.
Or-patterns deferred — ambiguous with clause separator without
parens-only support.
Parser: { f = e; f = e; ... } -> (:record (F E)...). { base with f = e;
... } -> (:record-update BASE (F E)...). Eval builds a dict from field
bindings; record-update merges the new fields over the base dict — the
same dict representation already used for modules.
{ also added to at-app-start? so records are valid arg atoms. Field
access via the existing :field postfix unifies record/module access.
Record patterns deferred to a later iteration.
lib/ocaml/conformance.sh runs the full test suite, classifies each
result by description prefix into one of 14 suites (tokenize, parser,
eval-core, phase2-refs/loops/function/exn, phase3-adt, phase4-modules,
phase5-hm, phase6-stdlib, let-and, phase1-params, misc), and emits
scoreboard.json + scoreboard.md.
Per the briefing: "Once the scoreboard exists (Phase 5.1), it is your
north star." Real OCaml testsuite vendoring deferred — needs more
stdlib + ADT decls to make .ml files runnable.
Parser: try-consume-param! handles ident, wildcard _ (fresh __wild_N
name), unit () (fresh __unit_N), typed (x : T) (skips signature).
parse-fun and parse-let (inline) reuse the helper; top-level
parse-decl-let inlines a similar test.
test.sh timeout bumped from 60s to 180s — the growing suite was hitting
the cap and reporting spurious failures.
OCaml-on-SX is the deferred second consumer for lib/guest/hm.sx step 8.
lib/ocaml/infer.sx assembles Algorithm W on top of the shipped algebra:
- Var: lookup + hm-instantiate.
- Fun: fresh-tv per param, auto-curried via recursion.
- App: unify against hm-arrow, fresh-tv for result.
- Let: generalize rhs over (ftv(t) - ftv(env)) — let-polymorphism.
- If: unify cond with Bool, both branches with each other.
- Op (+, =, <, etc.): builtin signatures (int*int->int monomorphic,
=/<> polymorphic 'a->'a->bool).
Tests pass for: literals, fun x -> x : 'a -> 'a, let id ... id 5/id true,
fun f x -> f (f x) : ('a -> 'a) -> 'a -> 'a (twice).
Pending: tuples, lists, pattern matching, let-rec, modules in HM.
Parser collects multiple bindings via 'and', emitting (:def-rec-mut
BINDINGS) for let-rec chains and (:def-mut BINDINGS) for non-rec.
Single bindings keep the existing (:def …) / (:def-rec …) shapes.
Eval (def-rec-mut): allocate placeholder cell per binding, build joint
env where each name forwards through its cell, then evaluate each rhs
against the joint env and fill the cells. Even/odd mutual-rec works.
lib/ocaml/runtime.sx defines the stdlib in OCaml syntax (not SX): every
function exercises the parser, evaluator, match engine, and module
machinery built in earlier phases. Loaded once via ocaml-load-stdlib!,
cached in ocaml-stdlib-env, layered under user code via ocaml-base-env.
List: length, rev, rev_append, map, filter, fold_left/right, append,
iter, mem, for_all, exists, hd, tl, nth.
Option: map, bind, value, get, is_none, is_some.
Result: map, bind, is_ok, is_error.
Substrate validation: this stdlib is a nontrivial OCaml program — its
mere existence proves the substrate works.
Parser: module F (M) (N) ... = struct DECLS end -> (:functor-def NAME
PARAMS DECLS). module N = expr (non-struct) -> (:module-alias NAME
BODY-SRC). Functor params accept (P) or (P : Sig) — signatures
parsed-and-skipped via skip-optional-sig.
Eval: ocaml-make-functor builds curried host-SX closures from module
dicts to a module dict. ocaml-resolve-module-path extended for :app so
F(A), F(A)(B), and Outer.Inner all resolve to dicts.
Phase 4 LOC ~290 cumulative (still well under 2000).
Parser: open Path and include Path top-level decls; Path is Ctor (.Ctor)*.
Eval resolves via ocaml-resolve-module-path (same :con-as-module-lookup
escape hatch used by :field). open extends the env with the module's
bindings; include also merges into the surrounding module's exports
(when inside a struct...end).
Path resolver lets M.Sub.x work for nested modules. Phase 4 LOC ~165.
module M = struct DECLS end parsed by sub-tokenising the body source
between struct and the matching end (nesting tracked via struct/begin/
sig/end). Field access is a postfix layer above parse-atom, binding
tighter than application: f r.x -> (:app f (:field r "x")).
Eval (:module-def NAME DECLS) builds a dict via ocaml-eval-module
running decls in a sub-env. (:field EXPR NAME) looks up dict fields,
treating (:con NAME) heads as module-name lookups instead of nullary
ctors so M.x works with M as a module.
Phase 4 LOC so far: ~110 lines (well under 2000 budget).
Parser: try EXPR with | pat -> handler | ... -> (:try EXPR CLAUSES).
Eval delegates to SX guard with else matching the raised value against
clause patterns; re-raises on no-match. raise/failwith/invalid_arg
shipped as builtins. failwith "msg" raises ("Failure" msg) so
| Failure msg -> ... patterns match.
Sugar for fun + match. AST (:function CLAUSES) -> unary closure that
runs ocaml-match-clauses on its arg. let rec recognises :function as a
recursive rhs and ties the knot via cell, so
let rec map f = function | [] -> [] | h::t -> f h :: map f t
works. ocaml-match-eval refactored to share clause-walk with function.
Parser: for i = lo to|downto hi do body done, while cond do body done.
AST: (:for NAME LO HI :ascend|:descend BODY) and (:while COND BODY).
Eval re-binds the loop var per iteration; both forms evaluate to unit.
ref is a builtin boxing its arg in a one-element list. Prefix ! parses
to (:deref ...) and reads via (nth cell 0). := joins the binop
precedence table at level 1 right-assoc and mutates via set-nth!.
Closures share the underlying cell.
Two-phase grammar: parse-expr-no-seq (prior entry) + parse-expr wraps
it with ;-chaining. List bodies keep parse-expr-no-seq so ; remains a
separator inside [...]. Match clause bodies use the seq variant and stop
at | — real OCaml semantics. Trailing ; before end/)/|/in/then/else/eof
permitted.
Patterns: wildcard, literal, var, ctor (nullary + arg, flattens tuple
args so Pair(a,b) -> (:pcon "Pair" PA PB)), tuple, list literal, cons
:: (right-assoc), unit. Match: leading | optional, (:match SCRUT
CLAUSES) with each clause (:case PAT BODY). Body parsed via parse-expr
because | is below level-1 binop precedence.
ocaml-parse-program: program = decls + bare exprs, ;;-separated.
Each decl is (:def …), (:def-rec …), or (:expr …). Body parsing
re-feeds the source slice through ocaml-parse — shared-state refactor
deferred.
Ships the algebra for HM-style type inference, riding on
lib/guest/match.sx (terms + unify) and ast.sx (canonical AST):
• Type constructors: hm-tv, hm-arrow, hm-con, hm-int, hm-bool, hm-string
• Schemes: hm-scheme / hm-monotype + accessors
• Free type-vars: hm-ftv, hm-ftv-scheme, hm-ftv-env
• Substitution: hm-apply, hm-apply-scheme, hm-apply-env, hm-compose
• Generalize / Instantiate (with shared fresh-tv counter)
• hm-fresh-tv (counter is a (list N) the caller threads)
• hm-infer-literal (the only fully-closed inference rule)
24 self-tests in lib/guest/tests/hm.sx covering every function above.
The lambda / app / let inference rules — the substitution-threading
core of Algorithm W — intentionally live in HOST CODE rather than the
kit, because each host's AST shape and substitution-threading idiom
differ subtly enough that forcing one shared assembly here proved
brittle in practice (an earlier inline-assembled hm-infer faulted with
"Not callable: nil" only when defined in the kit, despite working when
inline-eval'd or in a separate file — a load/closure interaction not
worth chasing inside this step's budget). The host gets the algebra
plus a spec; assembly stays close to the AST it reasons over.
PARTIAL — algebra + literal rule shipped; full Algorithm W deferred
to host consumers (haskell/infer.sx, lib/ocaml/types.sx when
OCaml-on-SX Phase 5 lands per the brief's sequencing note). Haskell
infer.sx untouched; haskell scoreboard still 156/156 baseline.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
apl-permutations was doing (append acc <new-perms>) which is
O(|acc|) and acc grows ~N! big — total cost O(N!²).
Swapped to (append <new-perms> acc) — append is O(|first|)
so cost is O((n+1)·N!_prev) per layer, total O(N!). q(7)
went from 32s to 12s; q(8)=92 now finishes well within the
300s timeout, so the queens(8) test is restored.
497/497. Phase 8 complete.
Configurable layout pass that inserts virtual open / close / separator
tokens based on indentation. Supports both styles the brief calls out:
• Haskell-flavour: layout opens AFTER a reserved keyword
(let/where/do/of) and resolves to the next token's column. Module
prelude wraps the whole input in an implicit block. Explicit `{`
after the keyword suppresses virtual layout.
• Python-flavour: layout opens via an :open-trailing-fn predicate
fired AFTER the trigger token (e.g. trailing `:`) — and resolves
to the column of the next token, which in real source is on a
fresh line. No module prelude.
Public entry: (layout-pass cfg tokens). Token shape: dict with at
least :type :value :line :col; everything else passes through. Newline
filler tokens are NOT used — line-break detection is via :line.
lib/guest/tests/layout.sx — 6 tests covering both flavours:
haskell-do-block / haskell-explicit-brace / haskell-do-inline /
haskell-module-prelude / python-if-block / python-nested.
Per the brief's gotcha note ("Don't ship lib/guest/layout.sx unless
the haskell scoreboard equals baseline") — haskell/layout.sx is left
UNTOUCHED. The kit isn't yet a drop-in replacement for the full
Haskell 98 algorithm (Note 5, multi-stage pre-pass, etc.) and forcing
a port would risk the 156 currently passing programs. Haskell
scoreboard remains at 156/156 baseline because no haskell file
changed. The synthetic Python-ish fixture is the second consumer per
the brief's wording.
PARTIAL — kit + synthetic fixture shipped; haskell port deferred until
the kit grows the missing Haskell-98 wrinkles.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
programs-e2e.sx exercises the classic-algorithm shapes from
lib/apl/tests/programs/*.apl via the full pipeline (apl-run on
embedded source strings). Tests include factorial-via-∇,
triangular numbers, sum-of-squares, prime-mask building blocks
(divisor counts via outer mod), named-fn composition,
dyadic max-of-two, and a single Newton sqrt step.
The original one-liners (e.g. primes' inline ⍵←⍳⍵) need parser
features we haven't built (compress-as-fn, inline assign) — the
e2e tests use multi-statement equivalents. No file-reading
primitive in OCaml SX, so source is embedded.
Side-fix: ⌿ (first-axis reduce) and ⍀ (first-axis scan) were
silently skipped by the tokenizer — added to apl-glyph-set
and apl-parse-op-glyphs.
Parser: apl-collect-fn-bindings pre-scans stmt-groups for
`name ← { ... }` patterns and populates apl-known-fn-names.
is-fn-tok? consults this list; collect-segments-loop emits
(:fn-name nm) for known names so they parse as functions.
Resolver: apl-resolve-{monadic,dyadic} handle :fn-name by
looking up env, asserting the binding is a dfn, returning
a closure that dispatches to apl-call-dfn{-m,}.
Recursion still works: `fact ← {0=⍵:1 ⋄ ⍵×∇⍵-1} ⋄ fact 5` → 120.
Three small unblockers in one iteration:
- tokenizer: read-digits! now consumes optional ".digits" suffix,
so 3.7 and ¯2.5 are single number tokens.
- tokenizer: ⎕ followed by ← emits a single :name "⎕←" token
(instead of splitting on the assign glyph). Parser registers
⎕← in apl-quad-fn-names; apl-monadic-fn maps to apl-quad-print.
- eval-ast: :str AST nodes evaluate to char arrays. Single-char
strings become rank-0 scalars; multi-char become rank-1 vectors
of single-char strings.
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.