Files
rose-ash/plans/datalog-on-sx.md
giles 408fc27366
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 54s
datalog: dl-query accepts conjunctive goal lists (167/167)
dl-query now auto-dispatches on the first element's shape:
- positive literal (head is a symbol) or {:neg ...} dict → wrap
- list of literals → conjunctive query

dl-query-coerce normalizes; dl-query-user-vars collects the union
of user-named vars (deduped, '_' filtered) for projection. Old
single-literal callers unchanged.

  (dl-query db '(p X))                   ; single
  (dl-query db '((p X) (q X)))           ; conjunction
  (dl-query db (list '(n X) '(> X 2)))   ; with comparison

2 new api tests cover multi-goal AND and conjunction with comparison.
2026-05-08 09:17:15 +00:00

442 lines
25 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Datalog-on-SX: Datalog on the CEK/VM
Datalog is a declarative query language: a restricted subset of Prolog with no function
symbols, only relations. Programs are sets of facts and rules; queries ask what follows.
Evaluation is bottom-up (fixpoint iteration) rather than Prolog's top-down DFS — which
means no infinite loops, guaranteed termination, and efficient incremental updates.
The unique angle: Datalog is a natural companion to the Prolog implementation already in
progress (`lib/prolog/`). The parser and term representation can share infrastructure;
the evaluator is an entirely different fixpoint engine rather than a DFS solver.
End-state goal: **full core Datalog** (facts, rules, stratified negation, aggregation,
recursion) with a clean SX query API, and a demonstration of Datalog as a query engine
for rose-ash data (e.g. federation graph, content relationships).
## Ground rules
- **Scope:** only touch `lib/datalog/**` and `plans/datalog-on-sx.md`. Do **not** edit
`spec/`, `hosts/`, `shared/`, `lib/prolog/**`, or other `lib/<lang>/`.
- **Shared-file issues** go under "Blockers" below with a minimal repro; do not fix here.
- **SX files:** use `sx-tree` MCP tools only.
- **Architecture:** Datalog source → term AST → fixpoint evaluator. No transpiler to SX AST —
the evaluator is written in SX and works directly on term structures.
- **Reference:** Ramakrishnan & Ullman "A Survey of Deductive Database Systems";
Dalmau "Datalog and Constraint Satisfaction".
- **Commits:** one feature per commit. Keep `## Progress log` updated and tick boxes.
## Architecture sketch
```
Datalog source text
lib/datalog/tokenizer.sx — atoms, variables, numbers, strings, punct (?- :- , . ( ) [ ])
lib/datalog/parser.sx — facts: atom(args). rules: head :- body. queries: ?- goal.
│ No function symbols (only constants and variables in args).
lib/datalog/db.sx — extensional DB (EDB): ground facts; IDB: derived relations;
│ clause index by relation name/arity
lib/datalog/eval.sx — bottom-up fixpoint: semi-naive evaluation with delta sets;
│ stratification for negation; incremental update API
lib/datalog/query.sx — query API: (datalog-query db goal) → list of substitutions;
SX embedding: define facts/rules as SX data directly
```
Key differences from Prolog:
- **No function symbols** — args are atoms, numbers, strings, or variables only. No `f(a,b)`.
- **No cuts** — no procedural control.
- **Bottom-up** — derive all consequences of all rules before answering; no search tree.
- **Termination guaranteed** — no infinite derivation chains (no function symbols → finite Herbrand base).
- **Stratified negation** — `not(P)` legal iff P does not recursively depend on its own negation.
- **Aggregation** — `count`, `sum`, `min`, `max` over derived tuples (Datalog+).
## Roadmap
### Phase 1 — tokenizer + parser
- [x] Tokenizer: atoms (lowercase/quoted), variables (uppercase/`_`), numbers, strings,
punct (`( )`, `,`, `.`), operators (`:-`, `?-`, `<=`, `>=`, `!=`, `<`, `>`, `=`,
`+`, `-`, `*`, `/`), comments (`%`, `/* */`)
Note: no function symbol syntax (no nested `f(...)` in arg position) — but the
parser permits nested compounds for arithmetic; safety analysis (Phase 3) rejects
non-arithmetic nesting.
- [x] Parser:
- Facts: `parent(tom, bob).``{:head (parent tom bob) :body ()}`
- Rules: `ancestor(X,Z) :- parent(X,Y), ancestor(Y,Z).`
`{:head (ancestor X Z) :body ((parent X Y) (ancestor Y Z))}`
- Queries: `?- ancestor(tom, X).``{:query ((ancestor tom X))}`
(`:query` value is always a list of literals; `?- p, q.``{:query ((p) (q))}`)
- Negation: `not(parent(X,Y))` in body position → `{:neg (parent X Y)}`
- [x] Tests in `lib/datalog/tests/parse.sx` (18) and `lib/datalog/tests/tokenize.sx` (26).
Conformance harness: `bash lib/datalog/conformance.sh` → 44 / 44 passing.
### Phase 2 — unification + substitution
- [x] Ported (not shared) from `lib/prolog/` — term walk, no occurs check.
- [x] `dl-unify t1 t2 subst` → extended subst dict, or `nil` on failure.
- [x] `dl-walk`, `dl-bind`, `dl-apply-subst`, `dl-ground?`, `dl-vars-of`.
- [x] Substitutions are immutable dicts keyed by variable name (string).
Lists/tuples unify element-wise (used for arithmetic compounds too).
- [x] Tests in `lib/datalog/tests/unify.sx` (28). 72 / 72 conformance.
### Phase 3 — extensional DB + naive evaluation + safety analysis
- [x] EDB+IDB combined: `{:facts {<rel-name-string> -> (literal ...)}}`
relations indexed by name; tuples stored as full literals so they
unify directly. Dedup on insert via `dl-tuple-equal?`.
- [x] `dl-add-fact! db lit` (rejects non-ground) and `dl-add-rule! db rule`
(rejects unsafe). `dl-program source` parses + loads in one step.
- [x] Naive evaluation `dl-saturate! db`: iterate rules until no new tuples.
`dl-find-bindings` recursively joins body literals; `dl-match-positive`
unifies a literal against every tuple in the relation.
- [x] `dl-query db goal` → list of substitutions over `goal`'s vars,
deduplicated. `dl-relation db name` for derived tuples.
- [x] Safety analysis at `dl-add-rule!` time: every head variable except
`_` must appear in some positive body literal. Built-ins and negated
literals do not satisfy safety. Helpers `dl-positive-body-vars`,
`dl-rule-unsafe-head-vars` exposed for later phases.
- [x] Negation and arithmetic built-ins error cleanly at saturate time
(Phase 4 / Phase 7 will swap in real semantics).
- [x] Tests in `lib/datalog/tests/eval.sx` (15): transitive closure,
sibling, same-generation, grandparent, cyclic graph reach, six
safety cases. 87 / 87 conformance.
### Phase 4 — built-in predicates + body arithmetic
Almost every real query needs `<`, `=`, simple arithmetic, and string
comparisons in body position. These are not EDB lookups — they're
constraints that filter bindings.
- [x] Recognise built-in predicates in body: `(< X Y)`, `(<= X Y)`, `(> X Y)`,
`(>= X Y)`, `(= X Y)`, `(!= X Y)` and arithmetic forms `(is Z (+ X Y))`,
`(is Z (- X Y))`, `(is Z (* X Y))`, `(is Z (/ X Y))`. Live in
`lib/datalog/builtins.sx`.
- [x] `dl-eval-builtin` dispatches; `dl-eval-arith` recursively evaluates
`(+ a b)` etc. with full nesting. `=` unifies; `!=` rejects equal
ground terms.
- [x] Order-aware safety analysis (`dl-rule-check-safety`): walks body
left-to-right tracking which vars are bound. `is`'s RHS vars must
be already bound; LHS becomes bound. Comparisons require both
sides bound. `=` is special-cased — at least one side bound binds
the other. Negation vars must be bound (will be enforced fully in
Phase 7).
- [x] Wired through SX numeric primitives — no separate number tower.
- [x] Tests in `lib/datalog/tests/builtins.sx` (19): range filters,
arithmetic derivations, equality binding, eight safety violations
and three safe-shape tests. Conformance 106 / 106.
### Phase 5 — semi-naive evaluation (performance)
- [x] Delta sets `{rel-name -> tuples}` track newly derived tuples per iter.
`dl-snapshot-facts` builds the initial delta from the EDB.
- [x] Semi-naive rule: for each rule, walk every positive body literal
position; substitute that one with the per-relation delta and join
the rest against the previous-iteration DB (`dl-find-bindings-semi`).
Candidates are collected before mutating the DB so the "full" sides
see a consistent snapshot.
- [x] `dl-collect-rule-candidates` falls back to a naive single pass when
a rule has no positive body literal (e.g. `(p X) :- (= X 5).`).
- [x] `dl-saturate!` is now semi-naive by default; `dl-saturate-naive!`
kept for differential testing and a reference implementation.
- [x] Tests in `lib/datalog/tests/semi_naive.sx` (8) — every recursive
program from earlier suites is run under both saturators with
per-relation tuple counts compared (cheap, robust under bundled
conformance session). A chain-5 differential exercises multiple
semi-naive iterations against the recursive ancestor rule.
Larger chains hit prohibitive wall-clock under conformance CPU
contention with other agents — a future Blocker tracks switching
`dl-tuple-member?` from O(n²) list scan to a hash-set per relation.
### Phase 6 — magic sets (goal-directed bottom-up, opt-in)
Naive bottom-up derives **all** consequences before answering. Magic sets
rewrite the program so the fixpoint only derives tuples relevant to the
goal — a major perf win for "what's reachable from node X" queries on
large graphs.
- [ ] Adornments: annotate rule predicates with bound (`b`) / free (`f`)
patterns based on how they're called.
- [ ] Magic transformation: for each adorned predicate, generate a
`magic_<pred>` relation and rewrite rule bodies to filter through it.
- [ ] Sideways information passing strategy (SIPS): left-to-right by
default; pluggable.
- [ ] Optional pass — `(dl-set-strategy! db :magic)`; default semi-naive.
- [ ] Tests: equivalence vs naive on small inputs; perf win on a 10k-node
reachability query from a single root.
### Phase 7 — stratified negation
- [x] Dependency graph: `dl-build-dep-graph db` returns `{head -> ({:rel
:neg} ...)}`. Built-ins drop out (they're not relations).
- [x] Reachability via Floyd-Warshall in `dl-build-reach`; cycles
detected by `reach[A][B] && reach[B][A]`. Programs are
non-stratifiable iff any negative dependency falls inside an SCC.
`dl-check-stratifiable` returns nil on success or a clear message.
- [x] `dl-compute-strata` propagates stratum numbers iteratively:
`stratum(R) = max over deps of (stratum(dep) + (1 if negated else 0))`.
- [x] Saturator refactor: `dl-saturate-rules! db rules` is the semi-
naive worker; `dl-saturate! db` rejects non-stratifiable programs,
groups rules by head's stratum, and runs the worker on each
stratum in increasing order.
- [x] `not(P)` in body: `dl-match-negation` walks the inner literal
under the current subst and uses `dl-match-positive` — succeeds
iff zero matches. Order-aware safety in `dl-rule-check-safety`
(already present from Phase 4) requires negation vars to be
bound by an earlier positive literal.
- [x] Tests in `lib/datalog/tests/negation.sx` (10): EDB and IDB
negation, two-step strata, multi-level strata, with-arithmetic,
empty-result and always-fail cases, non-stratifiability
rejection, and a negation safety violation.
### Phase 8 — aggregation (Datalog+)
- [x] `(count R V Goal)`, `(sum R V Goal)`, `(min R V Goal)`,
`(max R V Goal)`, `(findall L V Goal)` — first arg is the result
variable, second is the aggregated variable, third is the goal
literal. `findall` returns the distinct-value list itself; the
others reduce. Live in `lib/datalog/aggregates.sx`.
- [x] `dl-eval-aggregate`: runs `dl-find-bindings` on the goal under the
current subst (which provides outer-context bindings), collects
distinct values of the aggregated var, applies the aggregate.
`count`/`sum` produce 0 when no matches; `min`/`max` produce no
binding (rule fails) when empty.
- [x] Group-by emerges naturally: outer-context vars in the goal are
substituted from the current subst, so `popular(P) :- post(P),
count(N, U, liked(U, P)), >=(N, 3).` correctly counts per-post.
- [x] Stratification: `dl-aggregate-dep-edge` returns a negation-like
edge so the aggregate's goal relation is fully derived before the
aggregate fires. Non-monotonicity respected.
- [x] Safety: aggregate body lit binds the result var; goal-internal
vars are existentially quantified and don't need outer binding.
- [x] Tests in `lib/datalog/tests/aggregates.sx` (10): count siblings,
sum prices, min/max scores, count over derived relation,
empty-input cases for each operator, popularity threshold with
group-by, distinct-counted-once.
### Phase 9 — SX embedding API
- [x] `(dl-program-data facts rules)` builds a db from SX data —
`facts` is a list of literals, `rules` is a list of either
dicts `{:head … :body …}` or lists `(<head…> <- <body…>)`.
Variables are SX symbols whose first char is uppercase or `_`,
matching the parser's convention.
```
(dl-program-data
'((parent tom bob) (parent bob ann))
'((ancestor X Y <- (parent X Y))
(ancestor X Z <- (parent X Y) (ancestor Y Z))))
```
- [x] `(dl-rule head body)` constructor for the dict form.
- [x] `(dl-query db '(ancestor tom X))` already worked — same query API
consumes the SX-data goal. Now also accepts a *list* of body
literals for conjunctive queries:
`(dl-query db '((p X) (q X)))`,
`(dl-query db (list '(n X) '(> X 2)))`. Auto-dispatched via
`dl-query-coerce` on first-element shape.
- [x] `(dl-assert! db '(parent ann pat))` → adds the fact and re-saturates.
- [x] `(dl-retract! db '(parent bob ann))` → drops matching tuples from
the EDB list, wipes every relation that has a rule (those are IDB),
and re-saturates from the surviving EDB.
- [x] Tests in `lib/datalog/tests/api.sx` (9): closure via data API,
dict-rule form, dl-rule constructor, dl-assert! incremental,
dl-retract! removes derived, cyclic-graph reach via data,
assert into empty db, fact-style rule (no arrow), coerce dict.
- [ ] Integration demo: federation graph query — `(ancestor actor1 actor2)` over
rose-ash ActivityPub follow relationships (Phase 10).
### Phase 10 — Datalog as a query language for rose-ash
- [x] Schema sketches in `lib/datalog/demo.sx`:
- **Federation**: `(follows A B)` → `(mutual A B)`, `(reachable A B)`,
`(foaf A C)` (friend-of-a-friend, distinct).
- **Content**: `(authored A P)`, `(liked U P)`, `(tagged P T)` →
`(post-likes P N)` via aggregation, `(popular P)` for likes ≥ 3,
`(interesting Me P)` joining follows + authored + popular.
- **Permissions**: `(member A G)`, `(subgroup C P)`, `(allowed G R)`
→ `(in-group A G)` over transitive subgroups, `(can-access A R)`.
- **Cooking-posts** (the canonical example): `(reach Me Them)` over
the follow graph, then `(cooking-post-by-network Me P)` joining
reach + authored + `(tagged P cooking)`.
- [ ] Loader `dl-load-from-db!` — out of scope for this loop
(would need to edit `shared/services/` outside `lib/datalog/`).
Programs in `demo.sx` already document the EDB shape expected
from such a loader. `dl-program-data` consumes the same shape.
- [x] Query examples covered by `lib/datalog/tests/demo.sx` (10):
mutuals, transitive reach, FOAF, popular posts, interesting feed,
post likes count, direct/subgroup/transitive group access, no
access without grant.
- [ ] Service endpoint `POST /internal/datalog` — out of scope as above.
Once exposed, server-side handler would be `dl-program-data` +
`dl-query`, returning JSON-encoded substitutions.
## Blockers
- **Saturation perf** improving but not free. Resolved hash-set
membership in `dl-add-fact!` and replaced recursive `(rest lits)` in
`dl-find-bindings` with indexed iteration. chain-15 drops from ~25s
to ~16s and chain-25 saturates in ~33s real / 11s user — most CPU
now in unification (assoc-based subst dict copies) and dict
lookups during walks. Future: a per-rule "compiled" body that
pre-resolves arg positions and intern variable indices, then
unification can use array slots instead of dict assoc.
## Progress log
_Newest first._
- 2026-05-08 — `dl-query` accepts a list of body literals for
conjunctive queries, in addition to a single positive literal.
`dl-query-coerce` dispatches based on the first element's shape:
positive lit (head is a symbol) or `:neg` dict → wrap as singleton;
list of lits → use as-is. `dl-query-user-vars` collects the union
of vars across all goals (deduped, `_` filtered) for projection.
2 new api tests: multi-goal AND, and conjunction with comparison.
- 2026-05-08 — Bug fix: `dl-check-stratifiable` now rejects recursion
through aggregation (e.g., `q(N) :- count(N, X, q(X))`). The
stratifier was already adding negation-like edges for aggregates,
but the cycle scan only looked at explicit `:neg` literals. Added
the matching aggregate branch to the body iteration. Also adds
doc-only `lib/datalog/datalog.sx` with the public-API surface
(since `load` is an epoch command and can't recurse from within an
`.sx` file). 3 new aggregate tests cover recursion-rejection,
negation-and-aggregation coexistence, and min-over-empty-derived.
- 2026-05-08 — Phase 10 demo + canonical query. Added the "cooking
posts by people I follow (transitively)" example from the plan:
`dl-demo-cooking-rules` defines `reach` over the follow graph
(recursive transitive closure) and `cooking-post-by-network` that
joins reach with `authored` and `(tagged P cooking)`. 3 demo
tests cover transitive network, direct-only follow, and
empty-network cases.
- 2026-05-08 — Phase 8 extension: `findall L V Goal` aggregate. Bind
L to the list of distinct V values for which Goal holds (or the
empty list when no matches). Implemented as a one-line case in
`dl-do-aggregate`. 3 new tests: EDB, derived relation, empty.
Useful for "give me all the X such that …" queries without
scalar reduction.
- 2026-05-08 — Phase 5d semantic fix: anonymous `_` variables are
renamed per occurrence at `dl-add-rule!` and `dl-query` time so
`(p X _) (p _ Y)` no longer unifies the two `_`s. New helpers
`dl-rename-anon-term`, `dl-rename-anon-lit`, `dl-make-anon-renamer`,
`dl-rename-anon-rule` in db.sx; eval.sx's dl-query renames the goal
before search and projects only user-named vars (`_` is filtered
out of the projection list). The "underscore in head" test now
correctly rejects `(p X _) :- q(X).` — after renaming, the head's
fresh anon var has no body binder. Two new eval tests verify
rule-level and goal-level independence. 155/155 expected.
- 2026-05-08 — Phase 5c perf: indexed `dl-find-bindings`. Replaced
the recursive `(rest lits)` walk with `dl-fb-aux lits db subst i n`
using `nth lits i`. Eliminates O(N²) list-copy per body of length
N. chain-15 saturation 25s → 16s; chain-25 finishes in 33s real
(vs. timeout previously). Bumped semi_naive tests: differential
on chain-10, semi-only count on chain-15 (was chain-5/chain-5).
153/153.
- 2026-05-08 — Phase 10 syntactic demo. New `lib/datalog/demo.sx`
with three programs over rose-ash-shaped data: federation
(`mutual`, `reachable`, `foaf`), content recommendation
(`post-likes` via count aggregation, `popular`, `interesting`),
and role-based permissions (`in-group` over transitive subgroups,
`can-access`). 10 demo tests pass against synthetic EDB tuples.
Postgres loader and `/internal/datalog` HTTP endpoint remain
out of scope for this loop (they need service-tree edits beyond
`lib/datalog/**`). Conformance now 153/153.
- 2026-05-08 — Phase 5b perf: hash-set membership in `dl-add-fact!`.
db gains a parallel `:facts-keys {<rel>: {<tuple-string>: true}}`
index alongside `:facts`. `dl-tuple-key` derives a stable string
key via `(str lit)` — `(p 30)` and `(p 30.0)` collide correctly
because SX prints them identically. Insertion is O(1) instead of
O(n). chain-7 saturation drops from ~12s to ~6s; chain-15 from
~50s to ~25s under shared CPU. Larger chains are still slow due
to body-join overhead in dl-find-bindings (Blocker updated).
`dl-retract!` updated to keep both indices consistent. 143/143.
- 2026-05-08 — Phase 9 done. New `lib/datalog/api.sx` exposes a
parser-free embedding: `dl-program-data facts rules` accepts SX
data lists, with rules in either dict form or list form using
`<-` as the rule arrow (since SX parses `:-` as a keyword).
`dl-rule head body` constructs the dict. `dl-assert! db lit` adds
a fact and re-saturates; `dl-retract! db lit` drops the fact from
EDB, wipes all rule-headed (IDB) relations, and re-saturates from
scratch — the simplest correct semantics until provenance tracking
arrives in a later phase. 9 API tests; conformance now 143/143.
- 2026-05-08 — Phase 8 done. New `lib/datalog/aggregates.sx` (~110
LOC): count / sum / min / max. Each is a body literal of shape
`(op R V Goal)` — `dl-eval-aggregate` runs `dl-find-bindings` on
the goal under the outer subst (so outer vars in the goal get
substituted, giving group-by-style aggregation), collects the
distinct values of `V`, and binds `R`. Empty input: count/sum
return 0; min/max produce no binding (rule fails). Stratifier
extended via `dl-aggregate-dep-edge` so the aggregate's goal
relation is fully derived before the aggregate fires. Safety check
treats goal-internal vars as existentials (no outer binding
required); only the result var becomes bound. Conformance now
134 / 134.
- 2026-05-08 — Phase 7 done (Phase 6 magic sets deferred — opt-in,
semi-naive default suffices for current test suite). New
`lib/datalog/strata.sx` (~290 LOC): dep graph build, Floyd-Warshall
reachability, SCC-via-mutual-reachability for non-stratifiability
detection, iterative stratum computation, rule grouping by head
stratum. eval.sx split: `dl-saturate-rules!` is the per-rule-set
semi-naive worker, `dl-saturate!` is now the stratified driver
(errors out on non-stratifiable programs). `dl-match-negation` in
eval.sx: succeeds iff inner positive match is empty. Stratum-keyed
dicts use `(str s)` since SX dicts only accept string/keyword keys.
10 negation tests cover EDB/IDB negation, multi-level strata,
non-stratifiability rejection, and a negation safety violation.
- 2026-05-08 — Phase 5 done. `lib/datalog/eval.sx` rewritten to
semi-naive default. `dl-saturate!` tracks a per-relation delta and
on each iteration walks every positive body position substituting
delta for that one literal — joining the rest against the full DB
snapshot. `dl-saturate-naive!` retained as the reference. Rules
with no positive body literal (e.g. `(p X) :- (= X 5).`) fall back
to a naive one-shot via `dl-collect-rule-candidates`. 8 tests
differentially compare the two saturators using per-relation tuple
counts (cheap). Chain-5 differential exercises multi-iteration
recursive saturation. Larger chains made conformance.sh time out
due to O(n) `dl-tuple-member?` × CPU sharing with other loop
agents — added a Blocker to swap to a hash-set for membership.
Also tightened `dl-tuple-member?` to use indexed iteration instead
of recursive `rest` (was creating a fresh list per step).
- 2026-05-07 — Phase 4 done. `lib/datalog/builtins.sx` (~280 LOC) adds
`(< X Y)`, `(<= X Y)`, `(> X Y)`, `(>= X Y)`, `(= X Y)`, `(!= X Y)`,
and `(is X expr)` with `+ - * /`. `dl-eval-builtin` dispatches;
`dl-eval-arith` recursively evaluates nested compounds. Safety
check is now order-aware — it walks body literals left-to-right
tracking the bound set, requires comparison/`is` inputs to be
already bound, and special-cases `=` (binds the var-side; both
sides must include at least one bound to bind the other). Phase 3's
simple safety check stays in db.sx as a forward-reference fallback;
builtins.sx redefines `dl-rule-check-safety` to the comprehensive
version. eval.sx's `dl-match-lit` now dispatches built-ins through
`dl-eval-builtin`. 19 builtins tests; conformance 106 / 106.
- 2026-05-07 — Phase 3 done. `lib/datalog/db.sx` (~250 LOC) holds facts
indexed by relation name plus the rules list, with `dl-add-fact!` /
`dl-add-rule!` (rejects non-ground facts and unsafe rules);
`lib/datalog/eval.sx` (~150 LOC) implements the naive bottom-up
fixpoint via `dl-find-bindings`/`dl-match-positive`/`dl-saturate!`
and `dl-query` (deduped projected substitutions). Safety analysis
rejects unsafe head vars at load time. Negation and arithmetic
built-ins raise clean errors (lifted in later phases). 15 eval
tests cover transitive closure, sibling, same-generation, cyclic
graph reach, and six safety violations. Conformance 87 / 87.
- 2026-05-07 — Phase 2 done. `lib/datalog/unify.sx` (~140 LOC):
`dl-var?` (case + underscore), `dl-walk`, `dl-bind`, `dl-unify` (returns
extended dict subst or `nil`), `dl-apply-subst`, `dl-ground?`, `dl-vars-of`.
Substitutions are immutable dicts; `assoc` builds extended copies. 28
unify tests; conformance now 72 / 72.
- 2026-05-07 — Phase 1 done. `lib/datalog/tokenizer.sx` (~190 LOC) emits
`{:type :value :pos}` tokens; `lib/datalog/parser.sx` (~150 LOC) produces
`{:head … :body …}` / `{:query …}` clauses, with nested compounds
permitted for arithmetic and `not(...)` desugared to `{:neg …}`. 44 / 44
via `bash lib/datalog/conformance.sh` (26 tokenize + 18 parse). Local
helpers namespace-prefixed (`dl-emit!`, `dl-peek`) after a host-primitive
shadow clash. Test harness uses a custom `dl-deep-equal?` that handles
out-of-order dict keys and number repr (`equal?` fails on dict key order
and on `30` vs `30.0`).