First baseline using Map.Make on a string-keyed map:
module StringOrd = struct
type t = string
let compare = String.compare
end
module SMap = Map.Make (StringOrd)
let count_words text =
let words = String.split_on_char ' ' text in
List.fold_left (fun m w ->
let n = match SMap.find_opt w m with
| Some n -> n
| None -> 0
in
SMap.add w (n + 1) m
) SMap.empty words
For 'the quick brown fox jumps over the lazy dog' ('the' appears
twice), SMap.cardinal -> 8.
Complements bag.ml (Hashtbl-based) and unique_set.ml (Set.Make)
with a sorted Map view of the same kind of counting problem. 35
baseline programs total.
Either module (mirrors OCaml 4.12+ stdlib):
left x / right x
is_left / is_right
find_left / find_right (return Option)
map_left / map_right (single-side mappers)
fold lf rf e (case dispatch)
equal eq_l eq_r a b
compare cmp_l cmp_r a b (Left < Right)
Constructors are bare 'Left x' / 'Right x' (exposed here directly
without an explicit type-decl; real OCaml normally needs Either.Left /
Either.Right, an open, or a type annotation).
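For comparison, a small sketch against real OCaml's Either (4.12+). Note
that the real stdlib takes labelled ~left / ~right arguments for fold and
compare, whereas the list above is positional; the names l, r, describe
and doubled are only illustrative.

```ocaml
(* Requires OCaml >= 4.12 for the Either module. *)
let l : (int, string) Either.t = Either.left 3
let r : (int, string) Either.t = Either.right "x"

(* Case dispatch: one function per side. *)
let describe e =
  Either.fold
    ~left:(fun n -> "int " ^ string_of_int n)
    ~right:(fun s -> "str " ^ s)
    e

(* Single-side mapper: only touches Left values. *)
let doubled = Either.map_left (fun n -> n * 2) l
```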
Hashtbl.copy:
build a fresh cell with _hashtbl_create
walk _hashtbl_to_list and re-add each (k, v)
mutating one copy doesn't touch the other
(Hashtbl.length t + Hashtbl.length t2 = 3 after fork-and-add
verifies that adds to t2 don't appear in t)
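The fork-and-add check can be reproduced with the real stdlib Hashtbl; this
is a sketch of the same test shape, not the runtime.sx code itself.

```ocaml
let t = Hashtbl.create 4
let () = Hashtbl.add t "a" 1

(* Fork, then add only to the copy. *)
let t2 = Hashtbl.copy t
let () = Hashtbl.add t2 "b" 2

(* t still has 1 binding, t2 has 2. *)
let total = Hashtbl.length t + Hashtbl.length t2
```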
Defines a JSON-like algebraic data type:
type json =
| JNull
| JBool of bool
| JInt of int
| JStr of string
| JList of json list
Recursively serialises to a string via match-on-constructor, then
measures the length:
JList [JInt 1; JBool true; JNull; JStr 'hi'; JList [JInt 2; JInt 3]]
-> '[1,true,null,"hi",[2,3]]' length 24
Exercises:
- five-constructor ADT (one nullary, three single-arg, one list-arg)
- recursive match
- String.concat ',' (List.map to_string xs)
- string-cat with embedded escaped quotes
34 baseline programs total.
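A minimal sketch of the serialiser described above. The function name
to_string and the lack of string escaping are assumptions; the baseline
source may differ in detail.

```ocaml
type json =
  | JNull
  | JBool of bool
  | JInt of int
  | JStr of string
  | JList of json list

(* One match arm per constructor; JList recurses through List.map. *)
let rec to_string = function
  | JNull -> "null"
  | JBool b -> string_of_bool b
  | JInt n -> string_of_int n
  | JStr s -> "\"" ^ s ^ "\""
  | JList xs -> "[" ^ String.concat "," (List.map to_string xs) ^ "]"

let rendered =
  to_string
    (JList [JInt 1; JBool true; JNull; JStr "hi"; JList [JInt 2; JInt 3]])
```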
In-place Fisher-Yates shuffle using:
Random.init 42 deterministic seed
let a = Array.of_list xs
for i = n - 1 downto 1 do reverse iteration
let j = Random.int (i + 1) in
let tmp = a.(i) in
a.(i) <- a.(j);
a.(j) <- tmp
done
Sum is invariant under permutation, so the test value (55 for
[1..10] = 1+2+...+10) verifies the shuffle is a valid permutation
regardless of which permutation the seed yields.
Exercises Random.init / Random.int + Array.of_list / to_list /
length / arr.(i) / arr.(i) <- v + downto loop + multi-statement
sequencing within for-body.
33 baseline programs total.
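The fragments above assemble into something like the following; the wrapper
name shuffle is an assumption. The checks rely only on the permutation
invariant, not on which order seed 42 produces.

```ocaml
let shuffle xs =
  let a = Array.of_list xs in
  let n = Array.length a in
  for i = n - 1 downto 1 do
    (* Pick j uniformly from 0..i and swap a.(i) with a.(j). *)
    let j = Random.int (i + 1) in
    let tmp = a.(i) in
    a.(i) <- a.(j);
    a.(j) <- tmp
  done;
  Array.to_list a

let () = Random.init 42
let shuffled = shuffle [1; 2; 3; 4; 5; 6; 7; 8; 9; 10]
let total = List.fold_left (+) 0 shuffled
```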
pi_leibniz.ml: Leibniz formula for pi.
pi/4 = 1 - 1/3 + 1/5 - 1/7 + ...
pi ~= 4 * sum_{k=0}^{n-1} (-1)^k / (2k+1)
For n=1000, pi ~= 3.140593. Multiply by 100 and int_of_float -> 314.
Side-quest: int_of_float was wrongly defined as identity in
iteration 94. Fixed to:
let int_of_float f =
if f < 0.0 then _float_ceil f else _float_floor f
(truncate toward zero, mirroring real OCaml's int_of_float). The
identity definition was a stub from when integer/float dispatch was
not yet split; now that they are separate, the stub is wrong.
Float.to_int still uses floor since OCaml's docs say the result is
unspecified for nan / out-of-range; close enough for our scope.
32 baseline programs total.
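A sketch of the computation in real OCaml, where int_of_float already
truncates toward zero. The loop shape and the name pi_leibniz are
assumptions about how pi_leibniz.ml is written.

```ocaml
(* Partial sum of 4 * sum_{k=0}^{n-1} (-1)^k / (2k+1). *)
let pi_leibniz n =
  let sum = ref 0.0 in
  for k = 0 to n - 1 do
    let sign = if k mod 2 = 0 then 1.0 else -1.0 in
    sum := !sum +. sign /. float_of_int (2 * k + 1)
  done;
  4.0 *. !sum

let scaled = int_of_float (pi_leibniz 1000 *. 100.0)
```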
Both take an inner predicate / comparator and walk both lists in
lockstep:
equal eq a b short-circuits on first mismatch
compare cmp a b -1 if a is a strict prefix
1 if b is
0 if both empty
otherwise first non-zero element comparison
Mirrors real OCaml's signatures.
List.equal (=) [1;2;3] [1;2;3] = true
List.equal (=) [1;2;3] [1;2;4] = false
List.compare compare [1;2;3] [1;2;4] = -1
List.compare compare [1;2] [1;2;3] = -1
List.compare compare [] [] = 0
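One way to write the lockstep walk, as a sketch; the names list_equal and
list_compare are used here to avoid shadowing the stdlib, and the runtime.sx
definitions may be structured differently.

```ocaml
let rec list_equal eq a b =
  match a, b with
  | [], [] -> true
  | x :: xs, y :: ys -> eq x y && list_equal eq xs ys
  | _ -> false (* lengths differ *)

let rec list_compare cmp a b =
  match a, b with
  | [], [] -> 0
  | [], _ :: _ -> -1 (* a is a strict prefix of b *)
  | _ :: _, [] -> 1  (* b is a strict prefix of a *)
  | x :: xs, y :: ys ->
    let c = cmp x y in
    if c <> 0 then c else list_compare cmp xs ys
```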
Bool module:
equal a b = a = b
compare a b = 0 if equal, 1 if a, -1 if b (false < true)
to_string 'true' / 'false'
of_string s = s = 'true'
not_ wraps host not
to_int true=1, false=0
Option additions (take eq/cmp parameter for the inner value):
equal eq a b None=None, otherwise eq the inner values
compare cmp a b None < Some _; otherwise cmp inner
Option.equal (=) (Some 1) (Some 1) = true
Option.equal (=) (Some 1) None = false
Option.compare compare (Some 5) (Some 3) = 1
bag.ml: split a sentence on spaces, count each word in a Hashtbl,
return the maximum count via Hashtbl.fold.
count_words 'the quick brown fox jumps over the lazy dog the fox'
-> Hashtbl with 'the' = 3 as the max
-> 3
Exercises String.split_on_char + Hashtbl.find_opt/replace +
Hashtbl.fold (k v acc -> ...). Together with frequency.ml from
iter 84 we now have two Hashtbl-counting baselines exercising
slightly different idioms. 29 baseline programs total.
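A sketch of what bag.ml plausibly looks like, using only the calls named
above; the function name max_count is an assumption.

```ocaml
let max_count text =
  let t = Hashtbl.create 16 in
  List.iter
    (fun w ->
      let n = match Hashtbl.find_opt t w with Some n -> n | None -> 0 in
      Hashtbl.replace t w (n + 1))
    (String.split_on_char ' ' text);
  (* Fold over (key, value, acc) keeping the largest count. *)
  Hashtbl.fold (fun _ v acc -> max v acc) t 0

let result = max_count "the quick brown fox jumps over the lazy dog the fox"
```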
String additions:
equal a b = a = b
compare a b = -1 / 0 / 1 via host < / >
cat a b = a ^ b
empty = '' (constant)
Defines:
type frac = { num : int; den : int }
let rec gcd a b = if b = 0 then a else gcd b (a mod b)
let make n d = (* canonicalise: gcd-reduce and
force den > 0 *)
let add x y = make (x.num * y.den + y.num * x.den) (x.den * y.den)
let mul x y = make (x.num * y.num) (x.den * y.den)
Test:
let r = add (make 1 2) (make 1 3) in (* 5/6 *)
let s = mul (make 2 3) (make 3 4) in (* 1/2 *)
let t = add r s in (* 5/6 + 1/2 = 4/3 *)
t.num + t.den (* = 7 *)
Exercises records, recursive gcd, mod, abs, integer division (the
truncate-toward-zero semantics from iter 94 are essential here —
make would diverge from real OCaml's behaviour with float division).
28 baseline programs total.
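The body of make is only described in a comment above, so the version below
is a hypothetical reconstruction: it gcd-reduces and moves the sign onto the
numerator, and it assumes a non-zero denominator.

```ocaml
type frac = { num : int; den : int }

let rec gcd a b = if b = 0 then a else gcd b (a mod b)

(* Hypothetical canonicalisation: reduce by gcd, force den > 0. *)
let make n d =
  let g = gcd (abs n) (abs d) in
  let sign = if d < 0 then -1 else 1 in
  { num = sign * n / g; den = sign * d / g }

let add x y = make (x.num * y.den + y.num * x.den) (x.den * y.den)
let mul x y = make (x.num * y.num) (x.den * y.den)

let t = add (add (make 1 2) (make 1 3)) (mul (make 2 3) (make 3 4))
```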
Real OCaml's Seq.t is 'unit -> node' with 'node = Nil | Cons of elt *
Seq.t': a lazy thunk that lets you build infinite sequences. Ours is
just a list, which gives the right shape for everything in baseline
programs that don't rely on laziness (an infinite sequence would never
finish building, so taking from one cannot work).
API: empty, cons, return, is_empty, iter, iteri, map, filter,
filter_map, fold_left, length, take, drop, append, to_list,
of_list, init, unfold.
unfold takes a step fn 'acc -> Option (elt * acc)' and threads
through until it returns None:
Seq.fold_left (+) 0
(Seq.unfold (fun n -> if n > 4 then None
else Some (n, n + 1))
1)
= 1 + 2 + 3 + 4 = 10
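With a list as the backing store, unfold reduces to a plain recursion. This
is a sketch of that eager version (not the lazy stdlib one), and it only
terminates when the step function eventually returns None.

```ocaml
(* Eager unfold: builds the whole list up front. *)
let rec unfold step acc =
  match step acc with
  | None -> []
  | Some (x, acc') -> x :: unfold step acc'

let total =
  List.fold_left (+) 0
    (unfold (fun n -> if n > 4 then None else Some (n, n + 1)) 1)
```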
First baseline that exercises the functor pipeline end to end:
module IntOrd = struct
type t = int
let compare a b = a - b
end
module IntSet = Set.Make (IntOrd)
let unique_count xs =
let s = List.fold_left (fun s x -> IntSet.add x s) IntSet.empty xs in
IntSet.cardinal s
Counts unique elements in [3;1;4;1;5;9;2;6;5;3;5;8;9;7;9]:
{1,2,3,4,5,6,7,8,9} -> 9
The input has 15 elements with 9 unique values. The 'type t = int'
declaration in IntOrd is required by real OCaml; OCaml-on-SX is
dynamic and would accept it without, but we include it for source
fidelity. 27 baseline programs total.
Functors were already wired through ocaml-make-functor in eval.sx
(curried host closure consuming module dicts) but had no explicit
tests for the user-defined Ord application path. This commit adds
three smoke tests that confirm:
module IntOrd = struct let compare a b = a - b end
module S = Set.Make (IntOrd)
S.elements (fold-add [5;1;3;1;5]) sums to 9 (dedupe + sort)
S.mem 2 (S.add 1 (S.add 2 (S.add 3 S.empty))) = true
M.cardinal (M.add 1 'a' (M.add 2 'b' M.empty)) = 2
The Ord parameter is properly threaded through the functor body —
elements are sorted in compare order and dedupe works.
Three parser changes:
1. at-app-start? returns true on op '~' or '?' so the app loop
keeps consuming labeled args.
2. The app arg parser handles:
~name:VAL drop label, parse VAL as the arg
?name:VAL same
~name punning -- treat as (:var name)
?name same
3. try-consume-param! drops '~' or '?' and treats the following
ident as a regular positional param name.
Caveats:
- Order in the call must match definition order; we don't reorder
by label name.
- Optional args don't auto-wrap in Some, so the function body sees
the raw value for ?x:V.
Lets us write idiomatic-looking OCaml even though the runtime is
positional underneath:
let f ~x ~y = x + y in f ~x:3 ~y:7 = 10
let f ~x ~y = x * y in let x = 4 in let y = 5 in f ~x ~y = 20 (punning)
let f ?x ~y = x + y in f ?x:1 ~y:2 = 3
User-implemented mergesort that exercises features added across the
last few iterations:
let rec split lst = match lst with
| x :: y :: rest ->
let (a, b) = split rest in (* iter 98 let-tuple destruct *)
(x :: a, y :: b)
| ...
let rec merge xs ys = match xs with
| x :: xs' ->
match ys with (* nested match-in-match *)
| y :: ys' -> ...
...
List.fold_left (+) 0 (sort [...]) (* iter 89 (op) section *)
Sum of [3;1;4;1;5;9;2;6;5;3;5] = 44 regardless of order, so the
result is also a smoke test of the implementation correctness — if
merge_sort drops or duplicates an element the sum diverges. 26
baseline programs total.
parse-decl-let lives in the outer ocaml-parse-program scope and does
not have access to parse-pattern (which is local to ocaml-parse).
Source-slicing approach instead:
1. detect '(IDENT, ...)' in collect-params
2. scan tokens to the matching ')' (tracking nested parens)
3. slice the pattern source string from src
4. push (synth_name, pat_src) onto tuple-srcs
Then after collecting params, the rhs source string gets wrapped with
'match SN with PAT_SRC -> (RHS_SRC)' for each tuple-param,
innermost-first, and the final string is fed through ocaml-parse.
End result is the same AST shape as the iteration-102 inner-let
case: a function whose body destructures a synthetic name.
let f (a, b) = a + b ;; f (3, 7) = 10
let g x (a, b) = x + a + b ;; g 1 (2, 3) = 6
let h (a, b) (c, d) = a * b + c * d
;; h (1, 2) (3, 4) = 14
Mirrors iteration 101's parse-fun change inside parse-let's
parse-one!:
- same '(IDENT, ...)' detection on collect-params
- same __pat_N synth name for the function param
- same innermost-first match-wrapping
Difference: for inner-let the wrapping is applied to the rhs of the
let-binding (which is the function value), not directly to a fun
body.
let f (a, b) = a + b in f (3, 7) = 10
let g x (a, b) = x + a + b in g 1 (2, 3) = 6
let h (a, b) (c, d) = a * b + c * d
in h (1, 2) (3, 4) = 14
parse-fun's collect-params now detects '(IDENT, ...)' as a
tuple-pattern parameter (lookahead at peek-tok-at 1/2 distinguishes
from '(x : T)' and '()' cases that try-consume-param! already
handles). For each tuple param it:
1. parse-pattern to get the full pattern AST
2. generate a synthetic __pat_N name as the actual fun parameter
3. push (synth_name, pattern) onto tuple-binds
After parsing the body, wraps it innermost-first with one
'match __pat_N with PAT -> ...' per tuple-param. The user-visible
result is a (:fun (params...) body) where params are all simple
names but the body destructures.
Also retroactively simplifies Hashtbl.keys/values from
'fun pair -> match pair with (k, _) -> k' to plain
'fun (k, _) -> k', closing the iteration-99 workaround.
(fun (a, b) -> a + b) (3, 7) = 10
List.map (fun (a, b) -> a * b)
[(1, 2); (3, 4); (5, 6)] = [2; 12; 30]
List.map (fun (k, _) -> k)
[("a", 1); ("b", 2)] = ["a"; "b"]
(fun a (b, c) d -> a + b + c + d) 1 (2, 3) 4 = 10
Linear-congruential PRNG with mutable seed (_state ref). API:
init s seed the PRNG
self_init () default seed (1)
int bound 0 <= n < bound
bool () fair coin
float bound uniform in [0, bound)
bits () 30 bits
Stepping rule:
state := (state * 1103515245 + 12345) mod 2147483647
result := |state| mod bound
Same seed reproduces the same sequence. Real OCaml's Random used a
lagged-Fibonacci generator before 5.0 (LXM since); ours is simpler but
adequate for shuffles and Monte Carlo demos in baseline programs.
Random.init 42; Random.int 100 = 48
Random.init 1; Random.int 10 = 0
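The stepping rule can be checked in real OCaml with a few lines. This sketch
assumes 63-bit native ints (so the multiplication does not overflow) and
uses the illustrative names state, init, next and int.

```ocaml
let state = ref 1

let init s = state := s

(* One LCG step: state := (state * a + c) mod m. *)
let next () =
  state := (!state * 1103515245 + 12345) mod 2147483647;
  !state

let int bound = abs (next ()) mod bound

let first = (init 42; int 100)
let second = (init 1; int 10)
```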
Two new host primitives:
_hashtbl_remove t k -> dissoc the key from the underlying dict
_hashtbl_clear t -> reset the cell to {}
Nine new OCaml-syntax helpers in runtime.sx Hashtbl module:
bindings t = _hashtbl_to_list t
keys t = List.map (fun (k, _) -> k) (...)
values t = List.map (fun (_, v) -> v) (...)
to_seq t = bindings t
to_seq_keys / to_seq_values
remove / clear / reset
The keys/values implementations use a 'fun pair -> match pair with
(k, _) -> k' indirection because parse-fun does not currently allow
tuple patterns directly on parameters. Same restriction we worked
around in iteration 98's let-pattern desugaring.
Also: a detour attempting to add top-level 'let (a, b) = expr'
support was started but reverted — parse-decl-let in the outer
ocaml-parse-program scope does not have access to parse-pattern
(which is local to ocaml-parse). Will need a slice + re-parse trick
later.
When 'let' is followed by '(', parse-let now reads a full pattern
(via the existing parse-pattern used by match), expects '=', then
'in', and desugars to:
let PATTERN = EXPR in BODY => match EXPR with PATTERN -> BODY
This reuses the entire pattern-matching machinery, so any pattern
the match parser accepts works here too — paren-tuples, nested
tuples, cons patterns, list patterns. No 'rec' allowed for pattern
bindings (real OCaml's restriction).
let (a, b) = (1, 2) in a + b = 3
let (a, b, c) = (10, 20, 30) in a + b + c = 60
let pair = (5, 7) in
let (x, y) = pair in x * y = 35
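The desugaring is observable in real OCaml too: the two forms below are
equivalent, which is what lets the parser reuse the match machinery. The
binding names are only illustrative.

```ocaml
(* Source form. *)
let via_let = let (a, b) = (1, 2) in a + b

(* What the parser emits instead. *)
let via_match = match (1, 2) with (a, b) -> a + b

(* Nested patterns go through the same path. *)
let nested = let (x, (y, z)) = (1, (2, 3)) in x + y + z
```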
Also retroactively cleaned up Printf's iter-97 width-pos packing
hack ('width * 1000000 + spec_pos') — it's now
'let (width, spec_pos) = parse_width_loop after_flags in ...' like
real OCaml.
The Printf walker now parses optional flags + width digits between
'%' and the spec letter:
'-'    left-align (default is right-align)
'0'    zero-pad (default is space-pad; only honoured when not left-aligned)
Nd...  decimal width digits (any number)
After formatting the argument into a base string with the existing
spec dispatch (%d/%i/%u/%s/%f/%c/%b/%x/%X/%o), the result is padded
to the requested width.
Workaround: width and spec_pos are returned packed as
width * 1000000 + spec_pos
because the parser does not yet support tuple destructuring in let
('let (a, b) = expr in body' fails with 'expected ident'). TODO: lift
that limitation; for now the encoding round-trips losslessly for any
practical width.
Printf.sprintf '%5d' 42 = '   42'
Printf.sprintf '%-5d|' 42 = '42   |'
Printf.sprintf '%05d' 42 = '00042'
Printf.sprintf '%4s' 'hi' = '  hi'
Printf.sprintf 'hi=%-3d, hex=%04x' 9 15 = 'hi=9  , hex=000f'
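The padding step on its own might look like the sketch below. The helper
name pad and its labelled arguments are hypothetical; it pads an
already-formatted base string and does not special-case a leading minus
sign under zero-padding.

```ocaml
(* Pad s to width; left-align wins over zero-pad. *)
let pad ~left ~zero width s =
  let n = String.length s in
  if n >= width then s
  else if left then s ^ String.make (width - n) ' '
  else String.make (width - n) (if zero then '0' else ' ') ^ s

let right = pad ~left:false ~zero:false 5 "42"
let left = pad ~left:true ~zero:false 5 "42"
let zeros = pad ~left:false ~zero:true 5 "42"
```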
The previous List.sort was O(n^2) insertion sort. Replaced with a
straightforward mergesort:
split lst -> alternating-take into ([odd], [even])
merge xs ys -> classic two-finger merge under cmp
sort cmp xs -> base cases [], [x]; otherwise split + recursive
sort on each half + merge
Tuple destructuring on the split result is expressed via nested
match; let-tuple-destructuring would be cleaner, but the nested match
works today.
This benefits sort_uniq (which calls sort first), Set.Make.add via
sort etc., and any user program using List.sort. Stable_sort is
already aliased to sort.
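A sketch of the three pieces in real OCaml, written with let-tuple
destructuring rather than the nested match the runtime uses. Names follow
the outline above; details are assumptions.

```ocaml
(* Alternating take: elements go to the two halves in turn. *)
let rec split = function
  | x :: y :: rest ->
    let (a, b) = split rest in
    (x :: a, y :: b)
  | xs -> (xs, [])

(* Two-finger merge under cmp. *)
let rec merge cmp xs ys =
  match xs, ys with
  | [], l | l, [] -> l
  | x :: xs', y :: ys' ->
    if cmp x y <= 0 then x :: merge cmp xs' ys
    else y :: merge cmp xs ys'

let rec sort cmp = function
  | ([] | [_]) as l -> l
  | l ->
    let (a, b) = split l in
    merge cmp (sort cmp a) (sort cmp b)
```

One property of this sketch worth knowing: an alternating split interleaves
positions, so it does not by itself keep equal elements in their original
relative order.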
Three things in this commit:
1. Integer / is now truncate-toward-zero on ints, IEEE on floats. The
eval-op handler for '/' checks (number? + (= (round x) x)) on both
sides; if both integral, applies host floor/ceil based on sign;
otherwise falls through to host '/'.
2. Fixes Int.rem, which was returning 0 because (a - b * (a / b))
was using float division: 17 - 5 * 3.4 = 0.0. Now Int.rem 17 5 = 2.
3. Int module fleshed out:
max_int / min_int / zero / one / minus_one,
succ / pred / neg, add / sub / mul / div / rem,
equal, compare.
Also adds globals: max_int, min_int, abs_float, float_of_int,
int_of_float (the latter two are identity in our dynamic runtime).
17 / 5 = 3
-17 / 5 = -3 (trunc toward zero)
Int.rem 17 5 = 2
Int.compare 5 3 = 1
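The sign-based floor/ceil rule can be illustrated over floats, which is
roughly the situation on a host whose '/' is float division. The helper
names trunc_div and rem are hypothetical.

```ocaml
(* Truncate toward zero: ceil for negative quotients, floor otherwise. *)
let trunc_div a b =
  let q = a /. b in
  if q < 0.0 then ceil q else floor q

(* Remainder defined via the truncated quotient. *)
let rem a b = a -. b *. trunc_div a b
```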
Eight new Array functions, all in OCaml syntax inside runtime.sx,
delegating to the corresponding List operation on the cell's
underlying list:
sort cmp a -> a := List.sort cmp !a (* mutates the cell *)
stable_sort = sort
fast_sort = sort
append a b -> ref (List.append !a !b)
sub a pos n -> ref (take n (drop pos !a))
exists p -> List.exists p !a
for_all p -> List.for_all p !a
mem x a -> List.mem x !a
Round-trip:
let a = Array.of_list [3;1;4;1;5;9;2;6] in
Array.sort compare a;
Array.to_list a = [1;1;2;3;4;5;6;9]
Five '+++++.' groups, cumulative accumulator 5+10+15+20+25 = 75.
This is a brainfuck *subset* — only > < + - . (no [ ] looping). That's
intentional: the goal is to stress imperative idioms that the recently
added Array module + array indexing syntax + s.[i] make ergonomic, all
in one program.
Exercises:
Array.make 256 0
arr.(!ptr)
arr.(!ptr) <- arr.(!ptr) + 1
prog.[!pc]
ref / ! / :=
while + nested if/else if/else if for op dispatch
25 baseline programs total.
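A sketch of such an interpreter loop. It assumes '.' adds the current cell
to a running accumulator (matching the 5+10+15+20+25 description) and that
the program never moves the pointer off the tape; the name run is
illustrative.

```ocaml
let run prog =
  let tape = Array.make 256 0 in
  let ptr = ref 0 and pc = ref 0 and acc = ref 0 in
  while !pc < String.length prog do
    let c = prog.[!pc] in
    (* Op dispatch via an if / else-if chain; other chars are ignored. *)
    if c = '>' then incr ptr
    else if c = '<' then decr ptr
    else if c = '+' then tape.(!ptr) <- tape.(!ptr) + 1
    else if c = '-' then tape.(!ptr) <- tape.(!ptr) - 1
    else if c = '.' then acc := !acc + tape.(!ptr);
    incr pc
  done;
  !acc

let result = run "+++++.+++++.+++++.+++++.+++++."
```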
Counts primes <= 50, expected 15.
Stresses the recently-added Array module + the new array-indexing
syntax together with nested control flow:
let sieve = Array.make (n + 1) true in
sieve.(0) <- false;
sieve.(1) <- false;
for i = 2 to n do
if sieve.(i) then begin
let j = ref (i * i) in
while !j <= n do
sieve.(!j) <- false;
j := !j + i
done
end
done;
...
Exercises: Array.make, arr.(i), arr.(i) <- v, nested for/while,
begin..end blocks, ref/!/:=, integer arithmetic. 24 baseline
programs total.
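The excerpt elides the counting step; one plausible completion is sketched
below. The function name count_primes and the Array.fold_left count are
assumptions.

```ocaml
let count_primes n =
  let sieve = Array.make (n + 1) true in
  sieve.(0) <- false;
  if n >= 1 then sieve.(1) <- false;
  for i = 2 to n do
    if sieve.(i) then begin
      (* Cross out multiples of i, starting at i * i. *)
      let j = ref (i * i) in
      while !j <= n do
        sieve.(!j) <- false;
        j := !j + i
      done
    end
  done;
  Array.fold_left (fun acc p -> if p then acc + 1 else acc) 0 sieve
```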
parse-atom-postfix's '.()' branch now disambiguates between let-open
and array-get based on whether the head is a module path (':con' or
':field' chain rooted in ':con'). Module paths still emit
(:let-open M EXPR); everything else emits (:array-get ARR I).
Eval handles :array-get by reading the cell's underlying list at
index. The '<-' assignment handler now also accepts :array-get lhs
and rewrites the cell with one position changed.
Idiomatic OCaml array code now works:
let a = Array.make 5 0 in
for i = 0 to 4 do a.(i) <- i * i done;
a.(3) + a.(4) = 25
let a = Array.init 4 (fun i -> i + 1) in
a.(0) + a.(1) + a.(2) + a.(3) = 10
List.(length [1;2;3]) = 3 (* unchanged: List is a module *)
Array module (runtime.sx, OCaml syntax):
Backed by a 'ref of list'. make/init build the cell, length/get read
it, and set rewrites the underlying list with one element changed
(O(n), but fine for short arrays in baseline programs). Includes
iter/iteri/map/mapi/fold_left/to_list/of_list/copy/blit/fill.
(op) operator sections (parser.sx, parse-atom):
When the token after '(' is a binop (any op with non-zero
precedence in the binop table) and the next token is ')', emit
(:fun ('a' 'b') (:op OP a b)) — i.e. (+) becomes fun a b -> a + b.
Recognises every binop including 'mod', 'land', '^', '@', '::',
etc.
Lets us write:
List.fold_left (+) 0 [1;2;3;4;5] = 15
let f = ( * ) in f 6 7 = 42
List.map ((-) 10) [1;2;3] = [9;8;7]
let a = Array.make 5 7 in
Array.set a 2 99;
Array.fold_left (+) 0 a = 127
Inline CSV-like text:
a,1,extra
b,2,extra
c,3,extra
d,4,extra
Two-stage String.split_on_char: first on '\n' for rows, then on ','
for fields per row. List.fold_left accumulates int_of_string of the
second field across rows. Result = 1+2+3+4 = 10.
Exercises char escapes inside string literals ('\n'), nested
String.split_on_char, List.fold_left with a non-trivial closure body,
and int_of_string. 23 baseline programs total.
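A sketch of the two-stage split; the binding names csv and total are
assumptions, and rows with fewer than two fields are simply skipped.

```ocaml
let csv = "a,1,extra\nb,2,extra\nc,3,extra\nd,4,extra"

let total =
  List.fold_left
    (fun acc row ->
      (* Second stage: split the row and read its second field. *)
      match String.split_on_char ',' row with
      | _ :: n :: _ -> acc + int_of_string n
      | _ -> acc)
    0
    (String.split_on_char '\n' csv)
```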
Tokenizer already classified backtick-uppercase as a ctor identical
to a nominal one, but it had never been exercised by the suite. This
commit adds three smoke tests confirming that nullary, n-ary, and
list-of-polyvariant patterns all match:
let x = polyvar(Foo) in match x with polyvar(Foo) -> 1 | polyvar(Bar) -> 2
let x = polyvar(Pair) (5, 7) in
match x with polyvar(Pair) (a, b) -> a + b | _ -> 0
List.map (fun x -> match x with polyvar(On) -> 1 | polyvar(Off) -> 0)
[polyvar(On); polyvar(Off); polyvar(On)]
(In the actual SX, polyvar(X) is the literal backtick-X — backticks
in this commit message are escaped to avoid shell interpretation.)
Since OCaml-on-SX is dynamic, there's no structural row inference,
but matching by tag works.
sort_uniq:
Sort with the user comparator, then walk the sorted list dropping
any element equal to its predecessor. Output is sorted and unique.
List.sort_uniq compare [3;1;2;1;3;2;4] = [1;2;3;4]
find_map:
Walk until the user fn returns Some v; return that. If all None,
return None.
List.find_map (fun x -> if x > 5 then Some (x * 2) else None)
[1;2;3;6;7]
= Some 12
Both defined in OCaml syntax in runtime.sx — no host primitive
needed since they're pure list traversals over existing operations.
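Both traversals are short enough to sketch in full. These are illustrative
definitions in real OCaml and may not match runtime.sx line for line.

```ocaml
let sort_uniq cmp xs =
  (* After sorting, duplicates are adjacent. *)
  let rec dedup = function
    | x :: (y :: _ as rest) ->
      if cmp x y = 0 then dedup rest else x :: dedup rest
    | l -> l
  in
  dedup (List.sort cmp xs)

let rec find_map f = function
  | [] -> None
  | x :: rest ->
    (match f x with
     | Some _ as r -> r
     | None -> find_map f rest)
```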
Six new String functions, all in OCaml syntax inside runtime.sx:
iter : index-walk with side-effecting f
iteri : iter with index
fold_left : thread accumulator left-to-right
fold_right: thread accumulator right-to-left
to_seq : return a char list (lazy in real OCaml; eager here)
of_seq : concat a char list back to a string
Round-trip:
String.of_seq (List.rev (String.to_seq "hello")) = "olleh"
Note: real OCaml's Seq is lazy. We return a plain list because the
existing stdlib already provides exhaustive list operations and we
don't yet have lazy sequences. If a baseline needs Seq.unfold or
similar, we'll graduate to a proper Seq module then.
frequency.ml exercises the recently-added Hashtbl.iter / fold +
Hashtbl.find_opt + s.[i] indexing + for-loop together: build a
char-count table for 'abracadabra' then take the max via
Hashtbl.fold. Expected = 5 (a x 5). Total 25 baseline programs.
Format module added as a thin alias of Printf — sprintf, printf, and
asprintf all delegate to Printf.sprintf. The dynamic runtime doesn't
distinguish boxes/breaks, so format strings work the same as in
Printf and most Format-using OCaml programs now compile.
Tokenizer already had 'lazy' as a keyword. This commit wires it through:
parser : parse-prefix emits (:lazy EXPR), like the existing 'assert'
handler.
eval : creates a one-element cell with state ('Thunk' expr env).
host : _lazy_force flips the cell to ('Forced' v) on first call
and returns the cached value thereafter.
runtime : module Lazy = struct let force lz = _lazy_force lz end.
Memoisation confirmed by tracking a side-effect counter through two
forces of the same lazy:
let counter = ref 0 in
let lz = lazy (counter := !counter + 1; 42) in
let a = Lazy.force lz in
let b = Lazy.force lz in
(a + b) * 100 + !counter = 8401 (= 84*100 + 1)
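The Thunk / Forced cell can be modelled in plain OCaml as below. This is a
sketch of the mechanism only: the type state and the functions make_lazy
and force are hypothetical names, not the host primitives.

```ocaml
type 'a state = Thunk of (unit -> 'a) | Forced of 'a

let make_lazy f = ref (Thunk f)

(* First call runs the thunk and caches; later calls return the cache. *)
let force cell =
  match !cell with
  | Forced v -> v
  | Thunk f ->
    let v = f () in
    cell := Forced v;
    v

let counter = ref 0
let lz = make_lazy (fun () -> incr counter; 42)
let a = force lz
let b = force lz
let result = (a + b) * 100 + !counter
```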
New host primitive _hashtbl_to_list returns the entries as a list of
OCaml tuples — ('tuple' k v) form, matching the AST representation
that the pattern-match VM (:ptuple) expects. Without that exact
shape, '(k, v) :: rest' patterns fail to match.
Hashtbl.iter / Hashtbl.fold in runtime walk that list with the user
fn. This closes a long-standing gap: previously Hashtbl was opaque
once values were written (we could only find_opt one key at a time).
let t = Hashtbl.create 4 in
Hashtbl.add t "a" 1; Hashtbl.add t "b" 2; Hashtbl.add t "c" 3;
Hashtbl.fold (fun _ v acc -> acc + v) t 0 = 6
Replaces the stub sprintf in runtime.sx with a real implementation:
walk fmt char-by-char accumulating a prefix; on recognised %X return a
one-arg fn that formats the arg and recurses on the rest of fmt. The
function self-curries to the spec count — there's no separate arity
machinery, just a closure chain.
Specs: %d (int), %s (string), %f (float), %c (char/string in our model),
%b (bool), %% (literal). Unknown specs pass through.
Same expression returns a string (no specs) or a function (>=1 spec) —
OCaml proper would reject this; works fine in OCaml-on-SX's dynamic
runtime.
Also adds top-level aliases:
string_of_int = _string_of_int
string_of_float = _string_of_float
string_of_bool = if b then "true" else "false"
int_of_string = _int_of_string
Printf.sprintf "x=%d" 42 = "x=42"
Printf.sprintf "%s = %d" "answer" 42 = "answer = 42"
Printf.sprintf "%d%%" 50 = "50%"
Tokenizer already classified 'assert' as a keyword; this commit wires
it through:
parser : parse-prefix dispatches like 'not' — advance, recur, wrap
as (:assert EXPR).
eval : evaluate operand; nil on truthy, host-error 'Assert_failure'
on false. Caught cleanly by existing try/with.
assert true; 42 = 42
let x = 5 in assert (x = 5); x + 1 = 6
try (assert false; 0) with _ -> 99 = 99
Recursive Levenshtein edit distance with no memoization (the test
strings are short enough for the exponential-without-memo version to
fit in <2 minutes on contended hosts). Sums distances for five short
pairs:
('abc','abx') + ('ab','ba') + ('abc','axyc') + ('','abcd') + ('ab','')
= 1 + 2 + 2 + 4 + 2 = 11
Exercises:
* curried four-arg recursion
* s.[i] equality test (char comparison)
* min nested twice for the three-way recurrence
* mixed empty-string base cases
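A sketch of the un-memoised recurrence with the four curried arguments; the
names lev and dist and the index convention (i, j are prefix lengths) are
assumptions.

```ocaml
(* Distance between the first i chars of a and the first j chars of b. *)
let rec lev a b i j =
  if i = 0 then j
  else if j = 0 then i
  else
    let cost = if a.[i - 1] = b.[j - 1] then 0 else 1 in
    min
      (min (lev a b (i - 1) j + 1) (lev a b i (j - 1) + 1))
      (lev a b (i - 1) (j - 1) + cost)

let dist a b = lev a b (String.length a) (String.length b)

let total =
  dist "abc" "abx" + dist "ab" "ba" + dist "abc" "axyc"
  + dist "" "abcd" + dist "ab" ""
```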
Side-quests required to land caesar.ml:
1. Top-level 'let r = expr in body' is now an expression decl, not a
broken decl-let. ocaml-parse-program's dispatch now checks
has-matching-in? at every top-level let; if matched, slices via
skip-let-rhs-boundary (which already opens depth on a leading let
with matching in) and ocaml-parse on the slice, wrapping as :expr.
2. runtime.sx: added String.make / String.init / String.map. Used by
caesar.ml's encode = String.init n (fun i -> shift_char s.[i] k).
3. baseline run.sh per-program timeout 240->480s (under load on the
shared host, large baselines frequently take longer than 240s).
caesar.ml exercises:
* the new top-level let-in expression dispatch
* s.[i] string indexing
* Char.code / Char.chr round-trip math
* String.init with a closure that captures k
Test value: Char.code r.[0] + Char.code r.[4] after ROT13(ROT13('hello')) = 104 + 111 = 215.
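A sketch of the round-trip. Only the String.init shape of encode comes from
the description above; the body of shift_char is a hypothetical version that
rotates lowercase letters and leaves everything else alone.

```ocaml
(* Hypothetical shift: rotate within 'a'..'z' by k positions. *)
let shift_char c k =
  if c >= 'a' && c <= 'z' then
    Char.chr ((Char.code c - Char.code 'a' + k) mod 26 + Char.code 'a')
  else c

(* The closure passed to String.init captures both s and k. *)
let encode s k = String.init (String.length s) (fun i -> shift_char s.[i] k)

let r = encode (encode "hello" 13) 13
let check = Char.code r.[0] + Char.code r.[4]
```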
parse-atom-postfix now dispatches three cases after consuming '.':
.field -> existing field/module access
.(EXPR) -> existing local-open
.[EXPR] -> new string-get syntax (this commit)
Eval reduces (:string-get S I) to host (nth S I), which already returns
a one-character string for OCaml's char model.
Lets us write idiomatic OCaml string traversal:
let s = "hi" in
let n = ref 0 in
for i = 0 to String.length s - 1 do
n := !n + Char.code s.[i]
done;
!n (* = 209 *)
Side-quest emerged from adding roman.ml baseline (Roman numeral greedy
encoding): top-level 'let () = expr' was unsupported because
ocaml-parse-program's parse-decl-let consumed an ident strictly. Now
parse-decl-let recognises a leading '()' as a unit binding and
synthesises a __unit_NN name (matching how parse-let already handles
inner-let unit patterns).
roman.ml exercises:
* tuple list literal [(int * string); ...]
* recursive pattern match on tuple-cons
* String.length + List.fold_left
* the new top-level let () support (sanity in a comment, even though
the program ends with a bare expression for the test harness)
Bumped lib/ocaml/test.sh server timeout 180->360s: with the recent
surge in test count plus a CPU-contended host, the single epoch was
timing out before it reached the later tests.
In parse-atom-postfix, after consuming '.', if the next token is '(',
parse the inner expression and emit (:let-open M EXPR) instead of
:field. Cleanly composes with the existing :let-open evaluator and
loops to allow chained dot postfixes.
List.(length [1;2;3]) = 3
List.(map (fun x -> x + 1) [1;2;3]) = [2;3;4]
Option.(map (fun x -> x * 10) (Some 4)) = Some 40
Parser detects 'let open' as a separate let-form, parses M as a path
(Ctor(.Ctor)*) directly via inline AST construction (no source slicing
since cur-pos is only available in ocaml-parse-program), and emits
(:let-open PATH BODY).
Eval resolves the path to a module dict and merges its bindings into
the env for body evaluation. Now:
let open List in map (fun x -> x * 2) [1;2;3] = [2;4;6]
let open Option in map (fun x -> x + 1) (Some 5) = Some 6
ocaml-eval-module now handles :def-mut and :def-rec-mut decls so
'module M = struct let rec a n = ... and b n = ... end' works. The
def-rec-mut version uses cell-based mutual recursion exactly as the
top-level version.
Graph BFS using Queue + Hashtbl visited-set + List.assoc_opt + List.iter.
Returns 6 for a graph where A reaches B/C/D/E/F. Demonstrates 3 stdlib
modules (Queue, Hashtbl, List) cooperating in a real algorithm.
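A sketch of the BFS shape in real OCaml. The edge list below is a
hypothetical graph chosen so that A reaches all of B to F; the baseline's
actual edges are not given here.

```ocaml
(* Hypothetical adjacency list; every node is reachable from "A". *)
let graph =
  [ ("A", ["B"; "C"]); ("B", ["D"]); ("C", ["E"]);
    ("D", ["F"]); ("E", []); ("F", []) ]

(* Count the nodes reachable from start, including start itself. *)
let bfs_count start =
  let visited = Hashtbl.create 16 in
  let q = Queue.create () in
  Hashtbl.replace visited start ();
  Queue.add start q;
  let count = ref 0 in
  while not (Queue.is_empty q) do
    let node = Queue.pop q in
    incr count;
    let neighbours =
      match List.assoc_opt node graph with Some ns -> ns | None -> []
    in
    List.iter
      (fun n ->
        if not (Hashtbl.mem visited n) then begin
          Hashtbl.replace visited n ();
          Queue.add n q
        end)
      neighbours
  done;
  !count
```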