Parser now reads 'expr, `expr, ,expr, ,@expr as the four standard
shorthands. Quote uses existing $quote operative; quasiquote /
unquote / unquote-splicing recognised but not yet expanded at runtime
(left for first consumer to drive). 218 tests total across six suites.
Hygiene-by-default was already present: user operatives close over
static-env and bind formals + body $define!s in (extend STATIC-ENV),
caller's env untouched. $let evaluates values in caller env, binds
in fresh child env, runs body there. $define-in! explicitly targets
an env. Full scope-set / frame-stamp hygiene is research-grade
and documented as deferred future work in the reflective API notes.
Previously dl-magic-query always pre-saturated the source db so it
gave correct results for stratified programs (where the rewriter
doesn't propagate magic to aggregate inner-goals or negated rels).
Pure positive programs paid the full bottom-up cost twice.
Add dl-rules-need-presaturation? — checks whether any rule body
contains an aggregate or negation. Only pre-saturate in that case.
Pure positive programs (the common case for magic-sets) keep their
full goal-directed efficiency.
276/276; identical answers on the existing aggregate-of-IDB test.
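
A minimal sketch of the pre-saturation check described above, written in
OCaml for illustration. The `lit` and `rule` types and the function name
are hypothetical stand-ins; the real dl-rules-need-presaturation? works on
the implementation's own rule representation.

```ocaml
(* Hypothetical AST, for illustration only. *)
type lit =
  | Pos of string * string list             (* relation name, args *)
  | Neg of lit                              (* negated literal *)
  | Agg of string * string * string * lit   (* op, result var, agg var, goal *)

type rule = { head : lit; body : lit list }

(* True when some rule body contains an aggregate or a negation. *)
let rules_need_presaturation (rules : rule list) : bool =
  List.exists
    (fun r ->
      List.exists (function Neg _ | Agg _ -> true | Pos _ -> false) r.body)
    rules

(* A pure positive program and a stratified one. *)
let positive =
  [ { head = Pos ("path", [ "X"; "Y" ]); body = [ Pos ("edge", [ "X"; "Y" ]) ] } ]

let stratified =
  [ { head = Pos ("active", [ "X" ]);
      body = [ Pos ("u", [ "X" ]); Neg (Pos ("banned", [ "X" ])) ] } ]
```

Under this sketch only `stratified` would trigger the pre-saturation pass.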
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`dl-set-strategy!` accepted any keyword silently — typos like
`:semi_naive` or `:semiNaive` were stored uninspected and the
saturator then used the default. The user never learned their
setting was wrong.
Validator added: strategy must be one of `:semi-naive`, `:naive`,
`:magic` (the values currently recognised by the saturator and
magic-sets driver). Unknown values raise with a clear message that
lists the accepted set.
1 regression test; conformance 276/276.
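
The validation idea can be sketched as follows in OCaml. The accepted
names are taken from the text above; the function name, the use of plain
strings for keywords, and the exception type are assumptions of this
sketch, not the actual dl-set-strategy! code.

```ocaml
(* Accepted strategy names, as listed in the commit message. *)
let accepted_strategies = [ "semi-naive"; "naive"; "magic" ]

(* Return the strategy unchanged when it is known; otherwise raise with a
   message that lists the accepted set. *)
let validate_strategy (s : string) : string =
  if List.mem s accepted_strategies then s
  else
    invalid_arg
      (Printf.sprintf "unknown strategy %S; expected one of: %s" s
         (String.concat ", " accepted_strategies))
```

A typo such as "semi_naive" now fails at the call site instead of silently
falling back to the default.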
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The renamer for anonymous `_` variables started at counter 0 and
produced `_anon1, _anon2, ...` unconditionally. A user writing the
same naming convention would see their variables shadowed:
    (dl-eval "p(a, b). p(c, d). q(_anon1) :- p(_anon1, _)."
             "?- q(X).")
    => ()   ; should be ({:X a} {:X c})
The `_` got renamed to `_anon1` too, collapsing the two positions
of `p` to a single var (forcing args to be equal — which neither
tuple satisfies).
Fix: scan each rule (and query goal) for the highest `_anon<N>`
already present and start the renamer past it. New helpers
`dl-max-anon-num` / `dl-max-anon-num-list` / `dl-try-parse-int`
walk the rule tree; `dl-make-anon-renamer` now takes a `start`
argument; `dl-rename-anon-rule` and the query-time renamer in
`dl-query` both compute the start from the input.
1 regression test; conformance 275/275.
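
A rough OCaml sketch of the "start past the highest existing _anon<N>"
idea. It works on a flat list of variable names rather than a rule tree,
and all three function names are illustrative, not the real helpers.

```ocaml
(* Parse the N out of "_anon<N>"; None for any other name. *)
let anon_num (name : string) : int option =
  let prefix = "_anon" in
  let pl = String.length prefix in
  if String.length name > pl && String.sub name 0 pl = prefix then
    int_of_string_opt (String.sub name pl (String.length name - pl))
  else None

(* Highest _anon<N> already present among the names (0 if none). *)
let max_anon_num (names : string list) : int =
  List.fold_left
    (fun acc n -> match anon_num n with Some k -> max acc k | None -> acc)
    0 names

(* Renamer whose counter starts at `start`, so the first fresh name is
   _anon<start+1>. *)
let make_anon_renamer (start : int) : unit -> string =
  let counter = ref start in
  fun () ->
    incr counter;
    "_anon" ^ string_of_int !counter
```

For the rule in the repro, the user's `_anon1` gives a start of 1, so the
anonymous `_` becomes `_anon2` and no longer collides.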
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
dl-magic-query could silently diverge from dl-query when an
aggregate's inner-goal relation was IDB. The rewriter passes
aggregate body lits through unchanged (no magic propagation
generated for them), so the inner relation was empty in the magic
db and the aggregate returned 0. Repro:
    (dl-eval-magic
      "u(a). u(b). u(c). u(d). banned(b). banned(d).
       active(X) :- u(X), not(banned(X)).
       n(N) :- count(N, X, active(X))."
      "?- n(N).")
    => ({:N 0})   ; should be ({:N 2})
dl-magic-query now pre-saturates the source db before copying facts
into the magic db. This guarantees equivalence with dl-query for
every stratified program; the magic benefit still comes from
goal-directed re-derivation of the query relation under the seed
(which matters for large recursive joins). The existing test cases
happened to dodge this because their aggregate inner-goals were all
EDB.
1 new regression test; conformance 274/274.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The canonical Datalog idiom for "no X has any Y":
orphan(X) :- person(X), not(parent(X, _)).
was rejected by the safety check with "negation refers to unbound
variable(s) (\"_anon1\")". The parser renames each anonymous `_`
to a fresh `_anon*` symbol so multiple `_` occurrences don't unify
with each other, and the negation safety walk then demanded all
free vars in the negated lit be bound by an earlier positive body
lit — including the renamed anonymous vars.
Anonymous vars in a negation are existentially quantified within
the negation, not requirements from outside. Added dl-non-anon-vars
to strip `_anon*` names from the `needed` set before the binding
check in dl-process-neg!. Real vars (like `X` in the orphan idiom)
still must be bound by an earlier positive body lit, just as before.
2 new regression tests (orphan idiom + multi-anon "solo" pattern);
conformance 273/273.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Datalog has no function symbols in argument positions, but the
existing dl-add-fact! / dl-add-rule! validators only checked that
literals were ground (no free variables). A compound like `+(1, 2)`
contains no variables, so:
p(+(1, 2)).
=> stored as the unreduced tuple `(p (+ 1 2))`
double(*(X, 2)) :- n(X). n(3).
=> saturates `double((* 3 2))` instead of `double(6)`
Added dl-simple-term? (number / string / symbol) and an
args-simple? walker, used by:
- dl-add-fact!: all args must be simple terms
- dl-add-rule!: rule head args must be simple terms (variables
are symbols, so they pass)
Compounds remain legal in body literals where they encode `is` /
arithmetic / aggregate sub-goals. Error messages name the offending
literal and point the user at the body-only mechanism.
2 new regression tests; conformance 271/271.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Quoted atoms with uppercase- or underscore-leading names were
misclassified as variables. `p('Hello World').` flowed through the
tokenizer's "atom" branch and through the parser's string->symbol,
producing a symbol named "Hello World". dl-var? inspects the first
character — "H" is uppercase, so the fact was rejected as non-ground
("expected ground literal").
Tokenizer now emits "string" for any '...' quoted form. Quoted atoms
become opaque string constants — matching how Datalog idiomatically
treats them, and avoiding a per-symbol "quoted" marker that would
have rippled through unification and dl-var?. The trade-off is that
'a' and a are no longer the same value (string vs symbol); for
Datalog this is the safer default.
Updated the existing "quoted atom" tokenize test, added a regression
case for an uppercase-named quoted atom, and a parse-level test that
verifies the AST. Conformance 269/269.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Type-mixed comparisons were silently inconsistent:
<("hello", 5) => no result, no error (silent false)
<(a, 5) => raises "Expected number, got symbol"
Both should fail loudly with a comprehensible message. Added
dl-compare-typeok?: <, <=, >, >= now require both operands to share
a primitive type (both numbers or both strings) and raise a clear
"comparison <op> requires same-type operands" error otherwise.
`!=` is exempted because it's the polymorphic inequality test
built on dl-tuple-equal? — cross-type pairs are legitimately unequal
and the existing semantics for that case match user intuition.
2 new regression tests; conformance 267/267.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A dict in a rule body that isn't `{:neg <positive-lit>}` (the only
recognised dict shape) used to silently fall through every dispatch
clause in dl-rule-check-safety, contributing zero bound variables.
The user would then see a confusing "head variable(s) X do not
appear in any positive body literal" pointing at the head — not at
the actual bug in the body. Typos like `{:negs ...}` are the typical
trigger.
dl-process-lit! now flags both:
- a dict that lacks :neg
- a bare number / string / symbol used as a body lit
with a clear error naming the offending literal.
1 new regression test; conformance 265/265.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`is(R, /(X, 0))` was silently producing IEEE infinity:
(dl-eval "p(10). q(R) :- p(X), is(R, /(X, 0))." "?- q(R).")
=> ({:R inf})
That value then flowed through comparisons (anything < inf, anything
> inf) and aggregations (sum of inf, max of inf) producing nonsense
results downstream. `dl-eval-arith` now checks the divisor before
the host `/` and raises "division by zero in <expr>" — surfacing
the bug at its source rather than letting infinity propagate.
1 new test; conformance 264/264.
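
To illustrate checking the divisor before calling the host division, here
is a small OCaml sketch over a hypothetical arithmetic AST. The `expr`
type, `show`, and `eval` are assumptions for this example; only the error
wording follows the text above.

```ocaml
(* Hypothetical arithmetic expression type. *)
type expr =
  | Num of float
  | Add of expr * expr
  | Mul of expr * expr
  | Div of expr * expr

(* Render an expression in prefix form for error messages. *)
let rec show = function
  | Num n -> Printf.sprintf "%g" n
  | Add (a, b) -> Printf.sprintf "+(%s, %s)" (show a) (show b)
  | Mul (a, b) -> Printf.sprintf "*(%s, %s)" (show a) (show b)
  | Div (a, b) -> Printf.sprintf "/(%s, %s)" (show a) (show b)

(* Evaluate, raising on a zero divisor instead of returning infinity. *)
let rec eval = function
  | Num n -> n
  | Add (a, b) -> eval a +. eval b
  | Mul (a, b) -> eval a *. eval b
  | Div (a, b) as e ->
    let d = eval b in
    if d = 0.0 then failwith ("division by zero in " ^ show e)
    else eval a /. d
```

Without the guard, `10.0 /. 0.0` evaluates to `infinity` in OCaml as well,
which is the same silent behaviour the commit removes.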
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`count(N, Y, p(X))` silently returned `N = 1` because `Y` was never
bound by the goal — every match contributed the same unbound symbol
which dl-val-member? deduped to a single entry. Similarly:
sum(S, Y, p(X)) => raises "expected number, got symbol"
findall(L, Y, p(X)) => L = (Y) (a list containing the unbound symbol)
count(N, Y, p(X)) => N = 1 (silent garbage)
Added a third validator in dl-eval-aggregate: the agg-var must
syntactically appear among the goal's variables. Error names the
variable and the goal and explains why the result would be
meaningless.
1 new test; conformance 263/263.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A "mixed" relation has both user-asserted facts AND rules with the
same head. Previously dl-retract! wiped every rule-head relation
wholesale before re-saturating — the saturator only re-derives the
IDB portion, so explicit EDB facts vanished even for a no-op retract
of a non-existent tuple. Repro:
    (let ((db (dl-program "p(a). p(b). p(X) :- q(X). q(c).")))
      (dl-retract! db (quote (p z)))
      (dl-query db (quote (p X))))
went from {a, b, c} to just {c}.
Fix: track :edb-keys provenance in the db.
- dl-make-db now allocates an :edb-keys dict.
- dl-add-fact! (public) marks (rel-key, tuple-key) in :edb-keys.
- New internal dl-add-derived! does the append without marking.
- Saturator (semi-naive + naive driver) now calls dl-add-derived!.
- dl-retract! strips only the IDB-derived portion of rule-head
relations (anything not in :edb-keys) and preserves the EDB
portion through the re-saturate pass.
2 new regression tests; conformance 262/262.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nested `not(not(P))` silently misparsed: outer `not(...)` is
recognised as negation, but the inner `not(banned(X))` was parsed
as a positive call to a relation called `not`. With no `not`
relation present, the inner match was empty, the outer negation
succeeded vacuously, and `vip(X) :- u(X), not(not(banned(X))).`
collapsed to `vip(X) :- u(X).` — a silent double-negation = identity
fallacy.
Fix in `dl-rule-check-safety`: the positive-literal branch and
`dl-process-neg!` both reject any body literal whose relation
name is in `dl-reserved-rel-names`. Error message names the
relation and points the user at stratified negation through an
intermediate relation.
1 regression test; conformance 260/260.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bug: dl-eval-aggregate accepted non-variable agg-vars and non-
literal goals silently, producing weird/incorrect counts:
- `count(N, 5, p(X))` would compute count over the single
constant 5 (always 1), ignoring p entirely.
- `count(N, X, 42)` would crash with "unknown body-literal
shape" at saturation time rather than at rule-add time.
Fix: dl-eval-aggregate now validates up front that the second
arg is a variable (the value to aggregate) and the third arg is
a positive literal (the goal). Errors are descriptive and
include the offending argument.
2 new aggregate tests.
Bug: dl-walk would infinite-loop on a circular substitution
(e.g. A→B and B→A simultaneously). The walk endlessly chased
the cycle. This couldn't be produced through dl-unify (which has
cycle-safe behavior via existing bindings), but raw dl-bind calls
or external manipulation of the subst dict could create it.
Fix: dl-walk now threads a visited-names list through the
recursion. If a variable name is already in the list, the walk
stops and returns the current term unchanged. Normal chained
walks are unaffected (A→B→C→42 still resolves to 42).
1 new unify test verifies circular substitutions don't hang.
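
The visited-list walk can be sketched like this in OCaml. The `term` type
and the association-list substitution are assumptions chosen to keep the
example short; they are not the representation dl-walk uses.

```ocaml
type term = Var of string | Const of int

(* Follow variable bindings until reaching a constant, an unbound
   variable, or a variable name that was already visited (a cycle). *)
let walk (subst : (string * term) list) (t : term) : term =
  let rec go visited t =
    match t with
    | Const _ -> t
    | Var name ->
      if List.mem name visited then t
      else
        match List.assoc_opt name subst with
        | None -> t
        | Some next -> go (name :: visited) next
  in
  go [] t
```

A chain A -> B -> C -> 42 still resolves to 42, while A -> B, B -> A stops
and returns a variable instead of looping.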
Classic O(n) greedy gas-station algorithm:
walk once, tracking
total = sum of (gas[i] - cost[i]) -- if negative, no answer
curr = running tank since start -- on negative, move start
to i+1 and reset curr to 0
if total < 0 then -1 else start
For gas = [1;2;3;4;5], cost = [3;4;5;1;2], unique start = 3.
Tests `total` + `curr` parallel accumulators, reset-on-failure
pattern.
202 baseline programs total.
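
A self-contained sketch of the greedy pass described above, assuming
equal-length `gas` and `cost` arrays. The function name is illustrative and
the baseline program itself may be structured differently.

```ocaml
(* One pass with two accumulators: `total` decides feasibility, `curr`
   decides where the candidate start must move. *)
let can_complete_circuit (gas : int array) (cost : int array) : int =
  let total = ref 0 and curr = ref 0 and start = ref 0 in
  Array.iteri
    (fun i g ->
      let diff = g - cost.(i) in
      total := !total + diff;
      curr := !curr + diff;
      if !curr < 0 then begin
        (* No start in [start, i] can work; try i + 1 next. *)
        start := i + 1;
        curr := 0
      end)
    gas;
  if !total < 0 then -1 else !start

let () =
  (* Prints 3 for the example in the text. *)
  Printf.printf "%d\n"
    (can_complete_circuit [| 1; 2; 3; 4; 5 |] [| 3; 4; 5; 1; 2 |])
```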
Greedy BFS-frontier style — track the farthest reach within the
current jump's reachable range, and bump the jump counter when i
runs into the current frontier:
    while !i < n - 1 do
      farthest := max !farthest (!i + arr.(!i));
      if !i = !cur_end then begin
        jumps := !jumps + 1;
        cur_end := !farthest
      end;
      i := !i + 1
    done
For [2; 3; 1; 1; 2; 4; 2; 0; 1; 1] (n = 10), the optimal jump
sequence 0 -> 1 -> 4 -> 5 -> 9 uses 4 jumps.
Tests greedy-with-frontier pattern, three parallel refs
(jumps, cur_end, farthest), mixed for-style index loop using ref.
201 baseline programs total.
Pascal-recursion combination enumerator:
    let rec choose k xs =
      if k = 0 then [[]]
      else match xs with
        | [] -> []
        | h :: rest ->
          List.map (fun c -> h :: c) (choose (k - 1) rest)
          @ choose k rest
C(9, 4) = |choose 4 [1; ...; 9]| = 126
Tests pure-functional enumeration with List.map + closure over h,
@ append, [] | h :: rest pattern match on shrinking input.
200 baseline programs total -- milestone.
Monotonic decreasing stack — for each day i, pop entries from
the stack whose temperature is strictly less than today's; their
answer is (i - popped_index).
temps = [73; 74; 75; 71; 69; 72; 76; 73]
answer = [ 1; 1; 4; 2; 1; 1; 0; 0]
sum = 10
Complementary to next_greater.ml (iter 256) — same monotonic-stack
skeleton but stores the distance to the next greater element
rather than its value.
Tests `match !stack with | top :: rest when …` pattern with
guard inside a while-cont-flag loop.
198 baseline programs total.
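
The monotonic-stack pass can be sketched as below. This is an illustrative
version with an assumed function name; the stack holds indices whose
temperatures decrease from bottom to top.

```ocaml
let daily_temperatures (temps : int array) : int array =
  let n = Array.length temps in
  let answer = Array.make n 0 in
  let stack = ref [] in
  for i = 0 to n - 1 do
    (* Pop every earlier day that is strictly colder than day i. *)
    let go = ref true in
    while !go do
      match !stack with
      | top :: rest when temps.(top) < temps.(i) ->
        answer.(top) <- i - top;
        stack := rest
      | _ -> go := false
    done;
    stack := i :: !stack
  done;
  answer
```

Days left on the stack at the end never see a warmer day and keep the
default answer 0.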
DP recurrence for popcount that avoids host bitwise operations:
result[i] = result[i / 2] + (i mod 2)
Drops the low bit (i / 2 stands in for i lsr 1) and adds it back
if it was 1 (i mod 2 stands in for i land 1).
sum over 0..100 of popcount(i) = 319
Tests pure-arithmetic popcount, accumulating ref + DP array,
classic look-back to half-index pattern.
197 baseline programs total.
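
As a sketch of the recurrence above (function name assumed), the table is
filled left to right so that `result.(i / 2)` is always ready when index
`i` is computed:

```ocaml
(* Sum of popcount(i) for i in 0..limit, using only / and mod. *)
let popcount_sum (limit : int) : int =
  let result = Array.make (limit + 1) 0 in
  let sum = ref 0 in
  for i = 1 to limit do
    result.(i) <- result.(i / 2) + (i mod 2);
    sum := !sum + result.(i)
  done;
  !sum

let () = Printf.printf "%d\n" (popcount_sum 100)  (* prints 319 *)
```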
Binary search in a rotated sorted array. Standard sorted-half
test at each step:
if arr.(lo) <= arr.(mid) then
left half [lo, mid] is sorted -> check whether target is in it
else
right half [mid, hi] is sorted -> check whether target is in it
For [4; 5; 6; 7; 0; 1; 2]:
search 0 -> index 4
search 7 -> index 3
search 3 -> -1 (absent)
Encoded fingerprint: 4 + 3*10 + (-1)*100 = -66.
First baseline returning a negative top-level value; the runner
uses literal grep -qF so leading minus parses fine.
196 baseline programs total.
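
One way to write the sorted-half test in full is sketched below. It
assumes distinct elements, and the function name is illustrative.

```ocaml
let search (arr : int array) (target : int) : int =
  let lo = ref 0 and hi = ref (Array.length arr - 1) in
  let found = ref (-1) in
  while !found < 0 && !lo <= !hi do
    let mid = (!lo + !hi) / 2 in
    if arr.(mid) = target then found := mid
    else if arr.(!lo) <= arr.(mid) then begin
      (* Left half [lo, mid] is sorted. *)
      if arr.(!lo) <= target && target < arr.(mid) then hi := mid - 1
      else lo := mid + 1
    end else begin
      (* Right half [mid, hi] is sorted. *)
      if arr.(mid) < target && target <= arr.(!hi) then lo := mid + 1
      else hi := mid - 1
    end
  done;
  !found
```

With the array from the text, the three lookups give 4, 3 and -1, which
combine into the fingerprint -66.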
Task-scheduler closed-form min total intervals:
m = max letter frequency
k = number of letters tied at frequency m
answer = max((m - 1) * (n + 1) + k, total_tasks)
For "AAABBC" with cooldown n = 2:
freq A = 3, freq B = 2, freq C = 1 -> m = 3, k = 1
formula = (3 - 1) * (2 + 1) + 1 = 7
total tasks = 6
answer = 7
Witness schedule: A, B, C, A, B, idle, A.
Tests String.iter with side-effecting count update via
Char.code arithmetic, fixed-size 26-bucket histogram.
195 baseline programs total.
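
The closed form translates fairly directly into OCaml. This sketch assumes
a non-empty string of uppercase letters A..Z; the function name is
illustrative.

```ocaml
let least_interval (tasks : string) (n : int) : int =
  (* 26-bucket histogram indexed by letter. *)
  let freq = Array.make 26 0 in
  String.iter
    (fun c ->
      let k = Char.code c - Char.code 'A' in
      freq.(k) <- freq.(k) + 1)
    tasks;
  let m = Array.fold_left max 0 freq in
  let k =
    Array.fold_left (fun acc f -> if f = m then acc + 1 else acc) 0 freq
  in
  max (((m - 1) * (n + 1)) + k) (String.length tasks)

let () = Printf.printf "%d\n" (least_interval "AAABBC" 2)  (* prints 7 *)
```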
Classic two-pointer / sliding window: expand right, then shrink
left while the window still satisfies the >= constraint, recording
the smallest valid length.
    for r = 0 to n - 1 do
      sum := !sum + arr.(r);
      while !sum >= target do
        ... record (r - !l + 1) if smaller ...
        sum := !sum - arr.(!l);
        l := !l + 1
      done
    done
For [2; 3; 1; 2; 4; 3], target 7 -> window [4, 3] of length 2.
Sentinel n+1 marks "not found"; final guard reduces to 0.
Tests for + inner while shrinking loop, ref-tracked sum updated
on both expansion and contraction.
194 baseline programs total.
Sweep-line algorithm via separately-sorted starts / ends arrays:
    while !i < n do
      if starts.(!i) < ends.(!j) then begin
        busy := !busy + 1; rooms := max !rooms !busy; i := !i + 1
      end else begin
        busy := !busy - 1; j := !j + 1
      end
    done
intervals: (0,30) (5,10) (15,20) (10,25) (5,12)
(20,35) (0,5) (8,18)
At time 8, meetings (0,30), (5,10), (5,12), (8,18) are all active
simultaneously -> answer = 4.
Tests local helper bound via let (`let bubble a = ...`) for
in-place sort, dual-pointer sweep on parallel ordered event streams.
193 baseline programs total.
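
A runnable sketch of the sweep is given below. Note one swap: it sorts with
the standard library's `Array.sort` rather than the local bubble-sort
helper the commit mentions, purely to keep the example short. The function
name is assumed.

```ocaml
let min_rooms (intervals : (int * int) list) : int =
  let starts = Array.of_list (List.map fst intervals) in
  let ends = Array.of_list (List.map snd intervals) in
  Array.sort compare starts;
  Array.sort compare ends;
  let n = Array.length starts in
  let i = ref 0 and j = ref 0 in
  let busy = ref 0 and rooms = ref 0 in
  while !i < n do
    if starts.(!i) < ends.(!j) then begin
      (* Next event is a meeting starting: one more room in use. *)
      busy := !busy + 1;
      rooms := max !rooms !busy;
      i := !i + 1
    end else begin
      (* Next event is a meeting ending: free a room. *)
      busy := !busy - 1;
      j := !j + 1
    end
  done;
  !rooms
```

A meeting that starts exactly when another ends reuses the room, because
the comparison is strict.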
Two-pass partition DP for max profit with at most 2 transactions:
left[i] = max single-trans profit in prices[0..i]
(forward scan tracking running min)
right[i] = max single-trans profit in prices[i..n-1]
(backward scan tracking running max)
answer = max over i of (left[i] + right[i])
For [3; 3; 5; 0; 0; 3; 1; 4]:
optimal partition i = 2:
left[2] = sell@5 after buy@3 = 2
right[2] = sell@4 after buy@0 in [2..7] = 4
total = 6
Tests parallel forward + backward passes on parallel DP arrays,
mixed ref + array state, for downto + for ascending scans on
the same data.
190 baseline programs total.
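
The two passes can be sketched as follows (function name assumed). A sale
and a purchase on the same split day are allowed, which matches
left[i] + right[i] sharing index i.

```ocaml
let max_profit_two (prices : int array) : int =
  let n = Array.length prices in
  if n = 0 then 0
  else begin
    let left = Array.make n 0 and right = Array.make n 0 in
    (* Forward scan: best single trade within prices[0..i]. *)
    let lowest = ref prices.(0) in
    for i = 1 to n - 1 do
      lowest := min !lowest prices.(i);
      left.(i) <- max left.(i - 1) (prices.(i) - !lowest)
    done;
    (* Backward scan: best single trade within prices[i..n-1]. *)
    let highest = ref prices.(n - 1) in
    for i = n - 2 downto 0 do
      highest := max !highest prices.(i);
      right.(i) <- max right.(i + 1) (!highest - prices.(i))
    done;
    let best = ref 0 in
    for i = 0 to n - 1 do
      best := max !best (left.(i) + right.(i))
    done;
    !best
  end
```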
Expand-around-center palindrome counting, O(n^2) in the worst case:
    for c = 0 to 2*n - 2 do
      let l = ref (c / 2) in
      let r = ref ((c + 1) / 2) in
      while !l >= 0 && !r < n && s.[!l] = s.[!r] do
        count := !count + 1; l := !l - 1; r := !r + 1
      done
    done
The 2n-1 centers cover both odd (c even -> l = r) and even
(c odd -> l = r - 1) palindromes.
For "aabaa":
5 singletons + 2 "aa" + 1 "aba" + 1 "aabaa" = 9
Complements lps_dp.ml (longest subsequence) and manacher.ml
(longest substring); this one *counts* all palindromic substrings.
187 baseline programs total.
Floyd's cycle detection on a numeric function f(x) = (2x + 5) mod 17.
Three phases:
Phase 1: advance slow/fast until collision inside the cycle
(fast double-steps, slow single-steps)
Phase 2: restart slow from x0; advance both by 1 until they
meet — count is mu (length of tail before cycle)
Phase 3: advance fast around cycle once until it meets slow
— count is lam (cycle length)
For x0 = 1, the orbit visits 1, 7, 2, 9, 6, 0, 5, 15 then returns
to 1 — pure cycle of length 8, mu = 0, lam = 8. Encoded as
mu*100 + lam = 8.
Tests three sequential while loops sharing ref state,
double-step `fast := f (f !fast)`, meeting-condition flag.
185 baseline programs total.
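
The three phases fit in one function, sketched here with an assumed name
and a (mu, lam) pair as the return value:

```ocaml
let floyd (f : int -> int) (x0 : int) : int * int =
  (* Phase 1: slow single-steps, fast double-steps, until they collide. *)
  let slow = ref (f x0) and fast = ref (f (f x0)) in
  while !slow <> !fast do
    slow := f !slow;
    fast := f (f !fast)
  done;
  (* Phase 2: restart slow at x0; steps until they meet give mu. *)
  let mu = ref 0 in
  slow := x0;
  while !slow <> !fast do
    slow := f !slow;
    fast := f !fast;
    mu := !mu + 1
  done;
  (* Phase 3: walk fast once around the cycle to measure lam. *)
  let lam = ref 1 in
  fast := f !slow;
  while !slow <> !fast do
    fast := f !fast;
    lam := !lam + 1
  done;
  (!mu, !lam)

let () =
  let mu, lam = floyd (fun x -> ((2 * x) + 5) mod 17) 1 in
  Printf.printf "%d\n" ((mu * 100) + lam)  (* prints 8 *)
```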
Two-pointer merge advancing the smaller-head pointer k times,
without materializing the merged array:
    while !count < k do
      let pick_a =
        if !i = m then false       (* a exhausted, take from b *)
        else if !j = n then true   (* b exhausted, take from a *)
        else a.(!i) <= b.(!j)
      in
      if pick_a then ... else ...;
      count := !count + 1
    done
For a = [1;3;5;7;9;11;13], b = [2;4;6;8;10;12]:
merged order: 1,2,3,4,5,6,7,8,9,10,11,12,13
8th element = 8.
Tests nested if/else if/else flowing into a bool, dual-ref
two-pointer loop, separate count counter for k-th constraint.
184 baseline programs total.
Recursive permutation generator via fold-pick-recurse:
    let rec permutations xs = match xs with
      | [] -> [[]]
      | _ ->
        List.fold_left (fun acc x ->
          let rest = List.filter (fun y -> y <> x) xs in
          let subs = permutations rest in
          acc @ List.map (fun p -> x :: p) subs
        ) [] xs
For permutations of [1; 2; 3; 4] (24 total), count those whose
first element is less than the last:
    match p with
    | [a; _; _; b] when a < b -> count := !count + 1
    | _ -> ()
By symmetry, exactly half of them (12) satisfy a < b.
Tests List.filter, recursive fold with append, fixed-length
list pattern [a; _; _; b] with multiple wildcards + when guard.
183 baseline programs total.
Recursive regex matcher with Leetcode-style semantics:
. matches any single character
<c>* matches zero or more of <c>
    let rec is_match s i p j =
      if j = String.length p then i = String.length s
      else
        let first = i < String.length s
                    && (p.[j] = '.' || p.[j] = s.[i])
        in
        if j + 1 < String.length p && p.[j+1] = '*' then
          is_match s i p (j + 2)                  (* skip * group *)
          || (first && is_match s (i + 1) p j)    (* consume one *)
        else
          first && is_match s (i + 1) p (j + 1)
Patterns vs texts:
    .a.b   | aabb axb "" abcd abc aaabbbc x -> 1 match
    a.*b   | aabb axb "" abcd abc aaabbbc x -> 2 matches
    x*     | aabb axb "" abcd abc aaabbbc x -> 2 matches
    a*b*c  | aabb axb "" abcd abc aaabbbc x -> 2 matches
total = 7
Complements wildcard_match.ml which uses LIKE-style * / ?.
182 baseline programs total.
Two-phase palindrome-partition DP for the minimum-cuts variant:
Phase 1: is_pal[i][j] palindrome table via length-major fill
(single chars, then pairs, then expand inward).
Phase 2: cuts[i] = 0 if s[0..i] is itself a palindrome,
= min over j of (cuts[j-1] + 1)
where s[j..i] is a palindrome.
min_cut "aabba" = 1 ("a" | "abba")
Tests two sequential 2D DPs sharing the same is_pal matrix,
inline begin/end branches inside the length-major fill, mixed
bool and int 2D arrays.
181 baseline programs total.
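
Both phases are sketched below with an assumed function name. Unlike the
description above, this version keeps `cuts` as a plain 1D array, which is
enough for the recurrence.

```ocaml
let min_cut (s : string) : int =
  let n = String.length s in
  if n = 0 then 0
  else begin
    (* Phase 1: is_pal.(i).(j) is true when s[i..j] is a palindrome. *)
    let is_pal = Array.make_matrix n n false in
    for len = 1 to n do
      for i = 0 to n - len do
        let j = i + len - 1 in
        is_pal.(i).(j) <-
          s.[i] = s.[j] && (len <= 2 || is_pal.(i + 1).(j - 1))
      done
    done;
    (* Phase 2: cuts.(i) is the minimum number of cuts for s[0..i]. *)
    let cuts = Array.make n 0 in
    for i = 0 to n - 1 do
      if is_pal.(0).(i) then cuts.(i) <- 0
      else begin
        cuts.(i) <- i;  (* worst case: cut between every character *)
        for j = 1 to i do
          if is_pal.(j).(i) then cuts.(i) <- min cuts.(i) (cuts.(j - 1) + 1)
        done
      end
    done;
    cuts.(n - 1)
  end
```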
Classic word-break DP — for each position i, check whether any
dictionary word ends at i with a prior reachable position:
dp[i] = exists w in dict (wl = String.length w) with wl <= i and
dp[i - wl] && String.sub s (i - wl) wl = w
Dictionary: apple, pen, pine, pineapple, cats, cat, and, sand, dog
Inputs:
applepenapple yes (apple pen apple)
pineapplepenapple yes (pineapple pen apple)
catsanddog yes (cats and dog)
catsandog no (no segmentation reaches the end)
applesand yes (apple sand)
Tests bool-typed Array, String.sub primitive, nested List.iter
over the dict inside for-loop over end positions, closure capture
of the outer dp.
179 baseline programs total.
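
A compact sketch of this DP, with an assumed function name and the
dictionary passed as a string list:

```ocaml
let word_break (dict : string list) (s : string) : bool =
  let n = String.length s in
  (* dp.(i) is true when the prefix of length i can be segmented. *)
  let dp = Array.make (n + 1) false in
  dp.(0) <- true;
  for i = 1 to n do
    List.iter
      (fun w ->
        let wl = String.length w in
        if wl <= i && dp.(i - wl) && String.sub s (i - wl) wl = w then
          dp.(i) <- true)
      dict
  done;
  dp.(n)

let dict =
  [ "apple"; "pen"; "pine"; "pineapple"; "cats"; "cat"; "and"; "sand"; "dog" ]
```

The `wl <= i` guard runs before `String.sub`, so the substring call never
sees a negative offset.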
Linear-time stack algorithm for largest rectangle in histogram:
    for i = 0 to n do
      let h = if i = n then 0 else heights.(i) in
      while top-of-stack's height > h do
        pop the top, compute its max-width rectangle:
          width = (no-stack ? i : i - prev_top - 1)
          area = height * width
          update best
      done;
      if i < n then push i
    done
Sentinel pass at i=n with h=0 flushes the remaining stack.
For [2; 1; 5; 6; 2; 3], bars at indices 2 (h=5) and 3 (h=6) form
a width-2 rectangle of height 5, giving area 10.
Tests guarded patterns with `when` inside while-cont-flag, nested
`match !stack with | [] -> i | t :: _ -> i - t - 1` for width
computation.
178 baseline programs total.
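
The pseudocode above could be made concrete roughly as follows. The
function name is assumed; the stack is a list of indices with increasing
heights.

```ocaml
let largest_rectangle (heights : int array) : int =
  let n = Array.length heights in
  let stack = ref [] and best = ref 0 in
  for i = 0 to n do
    (* Sentinel height 0 at i = n flushes whatever is left on the stack. *)
    let h = if i = n then 0 else heights.(i) in
    let go = ref true in
    while !go do
      match !stack with
      | top :: rest when heights.(top) > h ->
        stack := rest;
        let width = (match rest with [] -> i | t :: _ -> i - t - 1) in
        best := max !best (heights.(top) * width)
      | _ -> go := false
    done;
    if i < n then stack := i :: !stack
  done;
  !best

let () =
  Printf.printf "%d\n" (largest_rectangle [| 2; 1; 5; 6; 2; 3 |])  (* 10 *)
```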
Standard O(n^2) length-major DP for longest palindromic
subsequence:
dp[i][j] = dp[i+1][j-1] + 2 if s[i] = s[j]
= max(dp[i+1][j], dp[i][j-1]) otherwise
lps "BBABCBCAB" = 7 (witness "BABCBAB" etc.)
Complementary to manacher.ml (longest palindromic *substring*,
also length 7 on that input by coincidence) — this is the
subsequence variant which doesn't require contiguity.
Tests length-major fill order, inline if for the length-2 base
case, double-nested for with derived j = i + len - 1.
177 baseline programs total.
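
A sketch of the length-major fill, with an assumed function name. The
length-2 base case is handled inline so that dp.(i + 1).(j - 1) is never
read with i + 1 > j - 1.

```ocaml
let lps (s : string) : int =
  let n = String.length s in
  if n = 0 then 0
  else begin
    let dp = Array.make_matrix n n 0 in
    for i = 0 to n - 1 do dp.(i).(i) <- 1 done;
    for len = 2 to n do
      for i = 0 to n - len do
        let j = i + len - 1 in
        dp.(i).(j) <-
          (if s.[i] = s.[j] then
             (if len = 2 then 2 else dp.(i + 1).(j - 1) + 2)
           else max dp.(i + 1).(j) dp.(i).(j - 1))
      done
    done;
    dp.(0).(n - 1)
  end

let () = Printf.printf "%d\n" (lps "BBABCBCAB")  (* prints 7 *)
```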
Classic distinct-subsequences 2D DP:
dp[i][j] = dp[i-1][j] + (s[i-1] = t[j-1] ? dp[i-1][j-1] : 0)
dp[i][0] = 1 (empty t is a subseq of any prefix of s)
count_subseq "rabbbit" "rabbit" = 3
The three witnesses correspond to which 'b' in "rabbbit" is
dropped (positions 2, 3, or 4 zero-indexed of the run of bs).
Complements subseq_check.ml (just tests presence); this one counts
distinct embeddings.
Tests 2D DP with Array.init n (fun _ -> Array.make m 0), base row
initialization, mixed string + array indexing.
175 baseline programs total.
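
The recurrence above can be sketched as below (function name taken from
the example line; the table has one extra row and column for the empty
prefixes).

```ocaml
let count_subseq (s : string) (t : string) : int =
  let n = String.length s and m = String.length t in
  let dp = Array.init (n + 1) (fun _ -> Array.make (m + 1) 0) in
  (* The empty t embeds exactly once in every prefix of s. *)
  for i = 0 to n do dp.(i).(0) <- 1 done;
  for i = 1 to n do
    for j = 1 to m do
      dp.(i).(j) <-
        dp.(i - 1).(j)
        + (if s.[i - 1] = t.[j - 1] then dp.(i - 1).(j - 1) else 0)
    done
  done;
  dp.(n).(m)

let () = Printf.printf "%d\n" (count_subseq "rabbbit" "rabbit")  (* prints 3 *)
```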