# Datalog-on-SX: Datalog on the CEK/VM

Datalog is a declarative query language: a restricted subset of Prolog with no function
symbols, only relations. Programs are sets of facts and rules; queries ask what follows.
Evaluation is bottom-up (fixpoint iteration) rather than Prolog's top-down DFS. With no
function symbols the set of derivable facts is finite, so the fixpoint always terminates
(no left-recursion loops), and delta-driven evaluation lends itself to incremental updates.

The unique angle: Datalog is a natural companion to the Prolog implementation already in
progress (`lib/prolog/`). The parser and term representation can share infrastructure;
the evaluator is an entirely different fixpoint engine rather than a DFS solver.

End-state goal: **full core Datalog** (facts, rules, stratified negation, aggregation,
recursion) with a clean SX query API, and a demonstration of Datalog as a query engine
for rose-ash data (e.g. federation graph, content relationships).

## Status (rolling)

`bash lib/datalog/conformance.sh` → **274/274 across 11 suites**
(tokenize, parse, unify, eval, builtins, semi_naive, negation, aggregates,
api, magic, demo). Source is ~3100 LOC, tests ~2900 LOC, public API
documented in `lib/datalog/datalog.sx`.

Phases 1–9 are functionally complete; Phase 10 covers the rose-ash
domain demos (in `lib/datalog/demo.sx` — federation, content,
permissions, cooking-posts, tag co-occurrence, shortest path, org chart).
The PostgreSQL loader and `/internal/datalog` HTTP endpoint listed in
Phase 10 require service-tree edits outside `lib/datalog/**` and are
flagged as out-of-scope for this loop.

## Ground rules

- **Scope:** only touch `lib/datalog/**` and `plans/datalog-on-sx.md`. Do **not** edit
  `spec/`, `hosts/`, `shared/`, `lib/prolog/**`, or other `lib/<lang>/`.
- **Shared-file issues** go under "Blockers" below with a minimal repro; do not fix here.
- **SX files:** use `sx-tree` MCP tools only.
- **Architecture:** Datalog source → term AST → fixpoint evaluator. No transpiler to SX AST —
  the evaluator is written in SX and works directly on term structures.
- **Reference:** Ramakrishnan & Ullman "A Survey of Deductive Database Systems";
  Dalmau "Datalog and Constraint Satisfaction".
- **Commits:** one feature per commit. Keep `## Progress log` updated and tick boxes.

## Architecture sketch

```
Datalog source text
    │
    ▼
lib/datalog/tokenizer.sx   — atoms, variables, numbers, strings, punct (?- :- , . ( ) [ ])
    │
    ▼
lib/datalog/parser.sx      — facts: atom(args). rules: head :- body. queries: ?- goal.
    │                        No function symbols (only constants and variables in args).
    ▼
lib/datalog/db.sx          — extensional DB (EDB): ground facts; IDB: derived relations;
    │                        clause index by relation name/arity
    ▼
lib/datalog/eval.sx        — bottom-up fixpoint: semi-naive evaluation with delta sets;
    │                        stratification for negation; incremental update API
    ▼
lib/datalog/query.sx       — query API: (datalog-query db goal) → list of substitutions;
                             SX embedding: define facts/rules as SX data directly
```

Key differences from Prolog:
- **No function symbols** — args are atoms, numbers, strings, or variables only. No `f(a,b)`.
- **No cuts** — no procedural control.
- **Bottom-up** — derive all consequences of all rules before answering; no search tree.
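- **Termination guaranteed** — no infinite derivation chains in pure Datalog (no function symbols → finite Herbrand base). `is` arithmetic can mint new constants, so recursive rules that use it must bound themselves.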
- **Stratified negation** — `not(P)` legal iff P does not recursively depend on its own negation.
- **Aggregation** — `count`, `sum`, `min`, `max` over derived tuples (Datalog+).

## Roadmap

### Phase 1 — tokenizer + parser
- [x] Tokenizer: atoms (lowercase/quoted), variables (uppercase/`_`), numbers, strings,
      punct (`( )`, `,`, `.`), operators (`:-`, `?-`, `<=`, `>=`, `!=`, `<`, `>`, `=`,
      `+`, `-`, `*`, `/`), comments (`%`, `/* */`)
      Note: no function symbol syntax (no nested `f(...)` in arg position) — but the
      parser permits nested compounds for arithmetic; safety analysis (Phase 3) rejects
      non-arithmetic nesting.
- [x] Parser:
      - Facts: `parent(tom, bob).` → `{:head (parent tom bob) :body ()}`
      - Rules: `ancestor(X,Z) :- parent(X,Y), ancestor(Y,Z).`
              → `{:head (ancestor X Z) :body ((parent X Y) (ancestor Y Z))}`
      - Queries: `?- ancestor(tom, X).` → `{:query ((ancestor tom X))}`
        (`:query` value is always a list of literals; `?- p, q.` → `{:query ((p) (q))}`)
      - Negation: `not(parent(X,Y))` in body position → `{:neg (parent X Y)}`
- [x] Tests in `lib/datalog/tests/parse.sx` (18) and `lib/datalog/tests/tokenize.sx` (26).
      Conformance harness: `bash lib/datalog/conformance.sh` → 44 / 44 passing.

### Phase 2 — unification + substitution
- [x] Ported (not shared) from `lib/prolog/` — term walk, no occurs check.
- [x] `dl-unify t1 t2 subst` → extended subst dict, or `nil` on failure.
- [x] `dl-walk`, `dl-bind`, `dl-apply-subst`, `dl-ground?`, `dl-vars-of`.
- [x] Substitutions are immutable dicts keyed by variable name (string).
      Lists/tuples unify element-wise (used for arithmetic compounds too).
- [x] Tests in `lib/datalog/tests/unify.sx` (28). 72 / 72 conformance.
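
The unification contract in Phase 2 can be illustrated with a minimal Python sketch. This is not the SX code in `lib/datalog/`: the names `is_var`, `walk`, `unify` and `apply_subst` are hypothetical stand-ins for `dl-walk`, `dl-unify` and `dl-apply-subst`, terms are modelled as Python tuples, and only the behaviour stated above is assumed (immutable substitutions keyed by variable name, element-wise unification, no occurs check).

```python
def is_var(t):
    """Assumed convention: a variable is a string starting with uppercase or '_'."""
    return isinstance(t, str) and t != "" and (t[0].isupper() or t[0] == "_")


def walk(t, subst):
    """Follow bindings until a non-variable or an unbound variable is reached."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t


def unify(t1, t2, subst):
    """Return an extended copy of subst, or None on failure. No occurs check."""
    t1, t2 = walk(t1, subst), walk(t2, subst)
    if is_var(t1):
        return subst if t1 == t2 else {**subst, t1: t2}
    if is_var(t2):
        return {**subst, t2: t1}
    if isinstance(t1, tuple) and isinstance(t2, tuple):
        if len(t1) != len(t2):
            return None
        for a, b in zip(t1, t2):  # element-wise, threading the substitution
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return subst if t1 == t2 else None


def apply_subst(t, subst):
    """Replace every bound variable in t by its walked value."""
    t = walk(t, subst)
    if isinstance(t, tuple):
        return tuple(apply_subst(x, subst) for x in t)
    return t


s = unify(("parent", "X", "bob"), ("parent", "tom", "Y"), {})
print(s)                                            # {'X': 'tom', 'Y': 'bob'}
print(apply_subst(("parent", "X", "Y"), s))         # ('parent', 'tom', 'bob')
print(unify(("p", "X", "X"), ("p", "a", "b"), {}))  # None
```

Binding builds a new dict (`{**subst, ...}`) instead of mutating, so a failed branch never corrupts the substitution held by its caller.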

### Phase 3 — extensional DB + naive evaluation + safety analysis
- [x] EDB+IDB combined: `{:facts {<rel-name-string> -> (literal ...)}}` —
      relations indexed by name; tuples stored as full literals so they
      unify directly. Dedup on insert via `dl-tuple-equal?`.
- [x] `dl-add-fact! db lit` (rejects non-ground) and `dl-add-rule! db rule`
      (rejects unsafe). `dl-program source` parses + loads in one step.
- [x] Naive evaluation `dl-saturate! db`: iterate rules until no new tuples.
      `dl-find-bindings` recursively joins body literals; `dl-match-positive`
      unifies a literal against every tuple in the relation.
- [x] `dl-query db goal` → list of substitutions over `goal`'s vars,
      deduplicated. `dl-relation db name` for derived tuples.
- [x] Safety analysis at `dl-add-rule!` time: every head variable except
      `_` must appear in some positive body literal. Built-ins and negated
      literals do not satisfy safety. Helpers `dl-positive-body-vars`,
      `dl-rule-unsafe-head-vars` exposed for later phases.
- [x] Arithmetic built-ins and negation error cleanly at saturate time
      (Phase 4 and Phase 7 respectively swap in real semantics).
- [x] Tests in `lib/datalog/tests/eval.sx` (15): transitive closure,
      sibling, same-generation, grandparent, cyclic graph reach, six
      safety cases. 87 / 87 conformance.
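
As a rough illustration of the Phase 3 loop, here is a Python sketch of naive saturation with a head-variable safety check. It is a sketch under assumptions, not the SX implementation: `match`, `find_bindings`, `unsafe_head_vars` and `saturate_naive` are hypothetical names, all facts live in one Python set instead of a per-relation dict, body literals are positive only, and the `_` exemption is omitted.

```python
def is_var(t):
    return isinstance(t, str) and t != "" and (t[0].isupper() or t[0] == "_")


def match(lit, fact, subst):
    """Match a body literal against a ground fact; return extended subst or None."""
    if lit[0] != fact[0] or len(lit) != len(fact):
        return None
    s = dict(subst)
    for arg, val in zip(lit[1:], fact[1:]):
        if is_var(arg):
            if s.setdefault(arg, val) != val:
                return None
        elif arg != val:
            return None
    return s


def find_bindings(body, facts, subst):
    """Join body literals left to right, yielding every satisfying substitution."""
    if not body:
        yield subst
        return
    for fact in facts:
        s = match(body[0], fact, subst)
        if s is not None:
            yield from find_bindings(body[1:], facts, s)


def unsafe_head_vars(head, body):
    """Head variables that no positive body literal binds."""
    bound = {a for lit in body for a in lit[1:] if is_var(a)}
    return [a for a in head[1:] if is_var(a) and a not in bound]


def saturate_naive(facts, rules):
    for head, body in rules:
        bad = unsafe_head_vars(head, body)
        if bad:
            raise ValueError(f"unsafe rule {head}: unbound head vars {bad}")
    facts = set(facts)
    while True:
        new = set()  # collect first, so joins see a stable fact set
        for head, body in rules:
            for s in find_bindings(body, facts, {}):
                derived = (head[0],) + tuple(s.get(a, a) for a in head[1:])
                if derived not in facts:
                    new.add(derived)
        if not new:
            return facts
        facts |= new


edb = {("parent", "tom", "bob"), ("parent", "bob", "ann"), ("parent", "ann", "pat")}
rules = [
    (("ancestor", "X", "Y"), [("parent", "X", "Y")]),
    (("ancestor", "X", "Z"), [("parent", "X", "Y"), ("ancestor", "Y", "Z")]),
]
db = saturate_naive(edb, rules)
print(sorted(f for f in db if f[0] == "ancestor"))
```

Every round re-joins every rule against the whole fact set, which is the redundancy that semi-naive evaluation removes.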

### Phase 4 — built-in predicates + body arithmetic
Almost every real query needs `<`, `=`, simple arithmetic, and string
comparisons in body position. These are not EDB lookups — they're
constraints that filter bindings.
- [x] Recognise built-in predicates in body: `(< X Y)`, `(<= X Y)`, `(> X Y)`,
      `(>= X Y)`, `(= X Y)`, `(!= X Y)` and arithmetic forms `(is Z (+ X Y))`,
      `(is Z (- X Y))`, `(is Z (* X Y))`, `(is Z (/ X Y))`. Live in
      `lib/datalog/builtins.sx`.
- [x] `dl-eval-builtin` dispatches; `dl-eval-arith` recursively evaluates
      `(+ a b)` etc. with full nesting. `=` unifies; `!=` rejects equal
      ground terms.
- [x] Order-aware safety analysis (`dl-rule-check-safety`): walks body
      left-to-right tracking which vars are bound. `is`'s RHS vars must
      be already bound; LHS becomes bound. Comparisons require both
      sides bound. `=` is special-cased — at least one side bound binds
      the other. Negation vars must be bound (will be enforced fully in
      Phase 7).
- [x] Wired through SX numeric primitives — no separate number tower.
- [x] Tests in `lib/datalog/tests/builtins.sx` (19): range filters,
      arithmetic derivations, equality binding, eight safety violations
      and three safe-shape tests. Conformance 106 / 106.
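
The order-aware safety walk described above might look roughly like this in Python. The function name `check_safety`, the tuple encoding of literals, the `("not", lit)` shape for negation, and the choice to treat `!=` like the other comparisons are all assumptions made for illustration; the real check is `dl-rule-check-safety`.

```python
COMPARISONS = {"<", "<=", ">", ">=", "!="}


def is_var(t):
    return isinstance(t, str) and t != "" and (t[0].isupper() or t[0] == "_")


def vars_of(t):
    """Variables in a term; for a compound, element 0 is the functor/operator."""
    if isinstance(t, tuple):
        out = set()
        for x in t[1:]:
            out |= vars_of(x)
        return out
    return {t} if is_var(t) else set()


def check_safety(head, body):
    """Return None when the rule is safe, else a message naming the problem."""
    bound = set()
    for lit in body:
        op = lit[0]
        if op == "is":  # RHS must already be bound; LHS becomes bound
            missing = vars_of(lit[2]) - bound
            if missing:
                return f"is: unbound {sorted(missing)}"
            bound |= vars_of(lit[1])
        elif op in COMPARISONS:  # both sides must already be bound
            missing = vars_of(lit) - bound
            if missing:
                return f"{op}: unbound {sorted(missing)}"
        elif op == "=":  # one bound side binds the other
            left, right = vars_of(lit[1]), vars_of(lit[2])
            if left <= bound:
                bound |= right
            elif right <= bound:
                bound |= left
            else:
                return "=: neither side is bound"
        elif op == "not":  # negation never binds; it only reads
            missing = vars_of(lit[1]) - bound
            if missing:
                return f"not: unbound {sorted(missing)}"
        else:  # positive literal binds all of its variables
            bound |= vars_of(lit)
    missing = vars_of(head) - bound
    return f"head: unbound {sorted(missing)}" if missing else None


print(check_safety(("big", "X"), [("n", "X"), (">", "X", 5)]))  # None
print(check_safety(("big", "X"), [(">", "X", 5), ("n", "X")]))  # >: unbound ['X']
```

The second call shows why the walk is order-aware: the same two literals are rejected when the comparison precedes the literal that binds `X`.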

### Phase 5 — semi-naive evaluation (performance)
- [x] Delta sets `{rel-name -> tuples}` track newly derived tuples per iteration.
      `dl-snapshot-facts` builds the initial delta from the EDB.
- [x] Semi-naive rule: for each rule, walk every positive body literal
      position; substitute that one with the per-relation delta and join
      the rest against the previous-iteration DB (`dl-find-bindings-semi`).
      Candidates are collected before mutating the DB so the "full" sides
      see a consistent snapshot.
- [x] `dl-collect-rule-candidates` falls back to a naive single pass when
      a rule has no positive body literal (e.g. `(p X) :- (= X 5).`).
- [x] `dl-saturate!` is now semi-naive by default; `dl-saturate-naive!`
      kept for differential testing and a reference implementation.
- [x] Tests in `lib/datalog/tests/semi_naive.sx` (8): every recursive
      program from earlier suites is run under both saturators and the
      per-relation tuple counts are compared (cheap, and robust when run
      inside the bundled conformance session). A chain-5 differential
      exercises multiple semi-naive iterations against the recursive
      ancestor rule. Larger chains take prohibitive wall-clock time under
      conformance CPU contention with other agents; see "Blockers" for the
      switch of `dl-tuple-member?` from an O(n) list scan per insert
      (O(n²) per relation overall) to a hash-set per relation.
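
The delta-substitution idea can be sketched in Python as follows. This is an illustrative model only: `join` and `saturate_semi_naive` are hypothetical names, one global delta set stands in for the per-relation deltas, and the fallback for rules without a positive body literal is left out.

```python
def is_var(t):
    return isinstance(t, str) and t != "" and (t[0].isupper() or t[0] == "_")


def match(lit, fact, subst):
    if lit[0] != fact[0] or len(lit) != len(fact):
        return None
    s = dict(subst)
    for arg, val in zip(lit[1:], fact[1:]):
        if is_var(arg):
            if s.setdefault(arg, val) != val:
                return None
        elif arg != val:
            return None
    return s


def join(body, delta_pos, delta, full, subst, k=0):
    """Join body literals; position delta_pos reads the delta, the rest read full."""
    if k == len(body):
        yield subst
        return
    source = delta if k == delta_pos else full
    for fact in source:
        s = match(body[k], fact, subst)
        if s is not None:
            yield from join(body, delta_pos, delta, full, s, k + 1)


def saturate_semi_naive(edb, rules):
    full = set(edb)
    delta = set(edb)  # initial delta is the whole EDB
    while delta:
        candidates = set()  # collected before mutating, for a consistent snapshot
        for head, body in rules:
            for i in range(len(body)):
                for s in join(body, i, delta, full, {}):
                    candidates.add((head[0],) + tuple(s.get(a, a) for a in head[1:]))
        delta = candidates - full  # only genuinely new tuples drive the next round
        full |= delta
    return full


edb = {("parent", i, i + 1) for i in range(5)}  # chain of 5 edges
rules = [
    (("ancestor", "X", "Y"), [("parent", "X", "Y")]),
    (("ancestor", "X", "Z"), [("parent", "X", "Y"), ("ancestor", "Y", "Z")]),
]
db = saturate_semi_naive(edb, rules)
print(len([f for f in db if f[0] == "ancestor"]))  # 15
```

Each new derivation must use at least one tuple from the last round's delta, so joins that touch only older tuples are never repeated.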

### Phase 6 — magic sets (goal-directed bottom-up, opt-in)
Naive bottom-up derives **all** consequences before answering. Magic sets
rewrite the program so the fixpoint only derives tuples relevant to the
goal — a major perf win for "what's reachable from node X" queries on
large graphs.
- [x] Adornments: `dl-adorn-goal goal` and `dl-adorn-lit lit bound` in
      `lib/datalog/magic.sx`. Per-arg `b`/`f` based on whether the arg
      is a constant or a variable already in the bound set.
- [x] Magic transformation: `dl-magic-rewrite rules query-rel adn args`
      generates `{:rules <rewritten-rules> :seed <magic-seed>}`. Each
      original rule is gated with a `magic_<rel>^<adn>(bound)` filter,
      and propagation rules are emitted for each positive non-builtin
      body literal. Worklist over `(rel, adn)` pairs starts from the
      query and stops when no new pairs appear. EDB facts pass through
      unchanged.
- [x] Sideways information passing strategy (SIPS): left-to-right
      `dl-rule-sips rule head-adornment` walks body literals tracking
      the bound set, returning `({:lit :adornment} ...)`. Recognises
      `is`/aggregate result-vars as new binders; comparisons and
      negation pass through with computed adornments. (Pluggable
      strategies are future work.)
- [x] `dl-set-strategy! db strategy` hook + `dl-get-strategy db`. Default
      `:semi-naive`. `:magic` accepted but the transformation itself is
      deferred — saturator currently falls back to semi-naive. Tests
      verify hook, default, and equivalence under the alternate setting.
- [x] Equivalence test: rewritten ancestor program over the same EDB
      derives the same number of `ancestor` tuples and returns the
      same query answers as the unrewritten program (chain-3 case).
- [x] `dl-magic-query db query-goal` — top-level driver. Builds a
      fresh internal db with the caller's EDB facts, the magic seed,
      and the rewritten rules; saturates and queries. Caller's db is
      untouched. Equivalent to `dl-query` for fully-stratifiable
      programs (sole motivation is a perf alternative on goal-shaped
      queries against large recursive relations).
- [ ] Perf test: 10k-node reachability with magic vs semi-naive.
      Left to a future iteration — would need a benchmarking harness
      for large graphs and the conformance budget can't afford it.
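
To make the adornment and SIPS steps concrete, here is a small Python sketch. It assumes positive literals only and uses hypothetical names (`adorn_lit`, `rule_sips`, `magic_seed`) that mirror, but are not, the SX functions listed above; the seed name follows the `magic_<rel>^<adn>` pattern quoted in the rewriter item.

```python
def is_var(t):
    return isinstance(t, str) and t != "" and (t[0].isupper() or t[0] == "_")


def adorn_lit(lit, bound):
    """'b' for constants and already-bound variables, 'f' for free variables."""
    return "".join("f" if is_var(a) and a not in bound else "b" for a in lit[1:])


def rule_sips(head, body, head_adornment):
    """Left-to-right SIPS: each literal sees variables bound by the head and by
    every earlier body literal."""
    bound = {a for a, m in zip(head[1:], head_adornment) if m == "b" and is_var(a)}
    plan = []
    for lit in body:
        plan.append({"lit": lit, "adornment": adorn_lit(lit, bound)})
        bound |= {a for a in lit[1:] if is_var(a)}
    return plan


def magic_seed(goal):
    """Seed fact carrying the goal's bound (constant) arguments."""
    adn = adorn_lit(goal, set())
    bound_args = tuple(a for a, m in zip(goal[1:], adn) if m == "b")
    return (f"magic_{goal[0]}^{adn}",) + bound_args


head = ("ancestor", "X", "Z")
body = [("parent", "X", "Y"), ("ancestor", "Y", "Z")]
for step in rule_sips(head, body, "bf"):
    print(step["lit"], step["adornment"])
print(magic_seed(("ancestor", "tom", "X")))  # ('magic_ancestor^bf', 'tom')
```

With head adornment `bf`, both body literals come out as `bf`: `X` arrives bound from the head, and `parent(X, Y)` binds `Y` before the recursive call.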

### Phase 7 — stratified negation
- [x] Dependency graph: `dl-build-dep-graph db` returns `{head -> ({:rel
      :neg} ...)}`. Built-ins drop out (they're not relations).
- [x] Reachability via Floyd-Warshall in `dl-build-reach`; cycles
      detected by `reach[A][B] && reach[B][A]`. Programs are
      non-stratifiable iff any negative dependency falls inside an SCC.
      `dl-check-stratifiable` returns nil on success or a clear message.
- [x] `dl-compute-strata` propagates stratum numbers iteratively:
      `stratum(R) = max over deps of (stratum(dep) + (1 if negated else 0))`.
- [x] Saturator refactor: `dl-saturate-rules! db rules` is the
      semi-naive worker; `dl-saturate! db` rejects non-stratifiable programs,
      groups rules by head's stratum, and runs the worker on each
      stratum in increasing order.
- [x] `not(P)` in body: `dl-match-negation` walks the inner literal
      under the current subst and uses `dl-match-positive` — succeeds
      iff zero matches. Order-aware safety in `dl-rule-check-safety`
      (already present from Phase 4) requires negation vars to be
      bound by an earlier positive literal.
- [x] Tests in `lib/datalog/tests/negation.sx` (10): EDB and IDB
      negation, two-step strata, multi-level strata, with-arithmetic,
      empty-result and always-fail cases, non-stratifiability
      rejection, and a negation safety violation.
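
A Python sketch of the stratification steps, under stated assumptions: the dependency graph is given as a flat list of `(head, dep, negated)` edges instead of the dict returned by `dl-build-dep-graph`, and the function names are hypothetical. The closure loop is the boolean (Warshall) form of the algorithm named above.

```python
def build_reach(nodes, edges):
    """Transitive closure of the dependency graph."""
    reach = {a: {b: False for b in nodes} for a in nodes}
    for head, dep, _neg in edges:
        reach[head][dep] = True
    for k in nodes:
        for i in nodes:
            if reach[i][k]:
                for j in nodes:
                    if reach[k][j]:
                        reach[i][j] = True
    return reach


def check_stratifiable(nodes, edges):
    """None on success, else a message. A negative edge head -> dep lies on a
    cycle exactly when dep can reach head again."""
    reach = build_reach(nodes, edges)
    for head, dep, neg in edges:
        if neg and reach[dep][head]:
            return f"non-stratifiable: {head} depends negatively on {dep} inside a cycle"
    return None


def compute_strata(nodes, edges):
    """stratum(R) = max over deps of stratum(dep) + (1 if negated else 0).
    Only call this after check_stratifiable has returned None."""
    strata = {n: 0 for n in nodes}
    changed = True
    while changed:
        changed = False
        for head, dep, neg in edges:
            need = strata[dep] + (1 if neg else 0)
            if strata[head] < need:
                strata[head] = need
                changed = True
    return strata


# active(X) :- u(X), not(banned(X)).   n(N) :- count(N, X, active(X)).
nodes = ["u", "banned", "active", "n"]
edges = [("active", "u", False), ("active", "banned", True), ("n", "active", True)]
print(check_stratifiable(nodes, edges))  # None
print(compute_strata(nodes, edges))      # {'u': 0, 'banned': 0, 'active': 1, 'n': 2}
```

The aggregate edge `n -> active` is marked negated here, matching the negation-like edge that Phase 8 adds for aggregates.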

### Phase 8 — aggregation (Datalog+)
- [x] `(count R V Goal)`, `(sum R V Goal)`, `(min R V Goal)`,
      `(max R V Goal)`, `(findall L V Goal)` — first arg is the result
      variable, second is the aggregated variable, third is the goal
      literal. `findall` returns the distinct-value list itself; the
      others reduce. Live in `lib/datalog/aggregates.sx`.
- [x] `dl-eval-aggregate`: runs `dl-find-bindings` on the goal under the
      current subst (which provides outer-context bindings), collects
      distinct values of the aggregated var, applies the aggregate.
      `count`/`sum` produce 0 when no matches; `min`/`max` produce no
      binding (rule fails) when empty.
- [x] Group-by emerges naturally: outer-context vars in the goal are
      substituted from the current subst, so `popular(P) :- post(P),
      count(N, U, liked(U, P)), >=(N, 3).` correctly counts per-post.
- [x] Stratification: `dl-aggregate-dep-edge` returns a negation-like
      edge so the aggregate's goal relation is fully derived before the
      aggregate fires. Non-monotonicity respected.
- [x] Safety: aggregate body lit binds the result var; goal-internal
      vars are existentially quantified and don't need outer binding.
- [x] Tests in `lib/datalog/tests/aggregates.sx` (10): count siblings,
      sum prices, min/max scores, count over derived relation,
      empty-input cases for each operator, popularity threshold with
      group-by, distinct-counted-once.
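
The aggregate semantics above (distinct values, `count`/`sum` yielding 0 on empty input, `min`/`max` yielding no binding, group-by via the outer substitution) can be sketched in Python. `eval_aggregate` is a hypothetical stand-in for `dl-eval-aggregate`, and the goal is restricted to a single positive literal matched against a flat fact list.

```python
def is_var(t):
    return isinstance(t, str) and t != "" and (t[0].isupper() or t[0] == "_")


def match(lit, fact, subst):
    if lit[0] != fact[0] or len(lit) != len(fact):
        return None
    s = dict(subst)
    for arg, val in zip(lit[1:], fact[1:]):
        if is_var(arg):
            if s.setdefault(arg, val) != val:
                return None
        elif arg != val:
            return None
    return s


def eval_aggregate(op, agg_var, goal, facts, subst):
    """Aggregate distinct values of agg_var over matches of goal under subst.
    Returns None for min/max over an empty set (the rule would fail)."""
    if agg_var not in goal[1:]:
        raise ValueError(f"aggregated var {agg_var} does not appear in goal {goal}")
    values = []
    for fact in facts:
        s = match(goal, fact, subst)  # outer bindings in subst act as group-by keys
        if s is not None and s[agg_var] not in values:
            values.append(s[agg_var])
    if op == "count":
        return len(values)
    if op == "sum":
        return sum(values)
    if op == "findall":
        return values
    if not values:
        return None
    return min(values) if op == "min" else max(values)


facts = [
    ("post", "p1"), ("post", "p2"),
    ("liked", "u1", "p1"), ("liked", "u2", "p1"), ("liked", "u3", "p1"),
    ("liked", "u1", "p2"),
]
# popular(P) :- post(P), count(N, U, liked(U, P)), >=(N, 3).
posts = [f[1] for f in facts if f[0] == "post"]
popular = [p for p in posts
           if eval_aggregate("count", "U", ("liked", "U", "P"), facts, {"P": p}) >= 3]
print(popular)  # ['p1']
```

Because `P` is already bound when the aggregate runs, the count is computed per post without any explicit group-by syntax.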

### Phase 9 — SX embedding API
- [x] `(dl-program-data facts rules)` builds a db from SX data —
      `facts` is a list of literals, `rules` is a list of either
      dicts `{:head … :body …}` or lists `(<head…> <- <body…>)`.
      Variables are SX symbols whose first char is uppercase or `_`,
      matching the parser's convention.
      ```
      (dl-program-data
        '((parent tom bob) (parent bob ann))
        '((ancestor X Y <- (parent X Y))
          (ancestor X Z <- (parent X Y) (ancestor Y Z))))
      ```
- [x] `(dl-rule head body)` constructor for the dict form.
- [x] `(dl-query db '(ancestor tom X))` already worked — same query API
      consumes the SX-data goal. Now also accepts a *list* of body
      literals for conjunctive queries:
      `(dl-query db '((p X) (q X)))`,
      `(dl-query db (list '(n X) '(> X 2)))`. Auto-dispatched via
      `dl-query-coerce` on first-element shape.
- [x] `(dl-assert! db '(parent ann pat))` → adds the fact and re-saturates.
- [x] `(dl-retract! db '(parent bob ann))` → drops matching tuples from
      the EDB, strips the derived (IDB) portion of every relation that has
      a rule while keeping its EDB-marked tuples, and re-saturates from the
      surviving EDB.
- [x] Tests in `lib/datalog/tests/api.sx` (9): closure via data API,
      dict-rule form, dl-rule constructor, dl-assert! incremental,
      dl-retract! removes derived, cyclic-graph reach via data,
      assert into empty db, fact-style rule (no arrow), coerce dict.
- [x] Integration demo: federation graph query — `(reachable A B)` /
      `(mutual A B)` / `(foaf A C)` over `(follows ACTOR-A ACTOR-B)` in
      `lib/datalog/demo.sx`. Tests in `lib/datalog/tests/demo.sx`.
      Wiring this to actual rose-ash ActivityPub data is Phase 10
      service work and is out of scope for this loop.
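
The assert/retract behaviour can be modelled with a small Python class that keeps EDB tuples separate from derived ones. This is a simplification under assumptions: the hypothetical `Db` class rebuilds everything from the EDB on each change (no incremental work), retracts only an exact ground tuple, and uses a naive fixpoint.

```python
def is_var(t):
    return isinstance(t, str) and t != "" and (t[0].isupper() or t[0] == "_")


def match(lit, fact, subst):
    if lit[0] != fact[0] or len(lit) != len(fact):
        return None
    s = dict(subst)
    for arg, val in zip(lit[1:], fact[1:]):
        if is_var(arg):
            if s.setdefault(arg, val) != val:
                return None
        elif arg != val:
            return None
    return s


def find_bindings(body, facts, subst):
    if not body:
        yield subst
        return
    for fact in facts:
        s = match(body[0], fact, subst)
        if s is not None:
            yield from find_bindings(body[1:], facts, s)


class Db:
    def __init__(self, edb, rules):
        self.edb = set(edb)  # user-asserted tuples only
        self.rules = list(rules)
        self.facts = set()   # EDB plus everything derived
        self._resaturate()

    def _resaturate(self):
        facts = set(self.edb)  # derived tuples are dropped, EDB survives
        changed = True
        while changed:
            changed = False
            for head, body in self.rules:
                # list() exhausts the join before facts is mutated
                for s in list(find_bindings(body, facts, {})):
                    derived = (head[0],) + tuple(s.get(a, a) for a in head[1:])
                    if derived not in facts:
                        facts.add(derived)
                        changed = True
        self.facts = facts

    def assert_fact(self, fact):
        self.edb.add(fact)
        self._resaturate()

    def retract(self, fact):
        self.edb.discard(fact)
        self._resaturate()

    def relation(self, name):
        return sorted(f[1:] for f in self.facts if f[0] == name)


# p(a). p(b). p(X) :- q(X). q(c).   (p is a "mixed" relation)
db = Db({("p", "a"), ("p", "b"), ("q", "c")}, [(("p", "X"), [("q", "X")])])
db.retract(("p", "z"))   # no-op retract must not lose p(a), p(b)
print(db.relation("p"))  # [('a',), ('b',), ('c',)]
db.retract(("q", "c"))   # derived p(c) disappears with its support
print(db.relation("p"))  # [('a',), ('b',)]
```

Keeping the EDB as its own set is what lets a mixed relation such as `p` survive a retract: only tuples that cannot be re-derived or re-read from the EDB vanish.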

### Phase 10 — Datalog as a query language for rose-ash
- [x] Schema sketches in `lib/datalog/demo.sx`:
      - **Federation**: `(follows A B)` → `(mutual A B)`, `(reachable A B)`,
        `(foaf A C)` (friend-of-a-friend, distinct).
      - **Content**: `(authored A P)`, `(liked U P)`, `(tagged P T)` →
        `(post-likes P N)` via aggregation, `(popular P)` for likes ≥ 3,
        `(interesting Me P)` joining follows + authored + popular.
      - **Permissions**: `(member A G)`, `(subgroup C P)`, `(allowed G R)`
        → `(in-group A G)` over transitive subgroups, `(can-access A R)`.
      - **Cooking-posts** (the canonical example): `(reach Me Them)` over
        the follow graph, then `(cooking-post-by-network Me P)` joining
        reach + authored + `(tagged P cooking)`.
- [ ] Loader `dl-load-from-db!` — out of scope for this loop
      (would need to edit `shared/services/` outside `lib/datalog/`).
      Programs in `demo.sx` already document the EDB shape expected
      from such a loader. `dl-program-data` consumes the same shape.
- [x] Query examples covered by `lib/datalog/tests/demo.sx` (10):
      mutuals, transitive reach, FOAF, popular posts, interesting feed,
      post likes count, direct/subgroup/transitive group access, no
      access without grant.
- [ ] Service endpoint `POST /internal/datalog` — out of scope as above.
      Once exposed, server-side handler would be `dl-program-data` +
      `dl-query`, returning JSON-encoded substitutions.

## Blockers

- **Saturation perf**: three rounds done.
  - hash-set membership in `dl-add-fact!` (Phase 5b)
  - indexed iteration in `dl-find-bindings` (Phase 5c)
  - first-arg index per relation (Phase 5e) — when a body literal's
    first arg walks to a non-variable, dl-match-positive looks up
    by `(str arg)` instead of scanning the full relation.

  chain-25 saturation drops from ~33s to ~18s real (10s user).
  chain-50 is still slow (~120s+) due to dict-copy overhead in
  unification subst threading. Future: per-rule "compiled" body
  with pre-resolved var positions, slot-based subst representation
  to avoid `assoc` per binding.
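
The first two perf rounds (hash-set membership and the first-arg index) can be pictured with a small Python sketch. The `Relation` class and its method names are hypothetical; only the idea of keying the index by the string form of the first argument comes from the notes above.

```python
from collections import defaultdict


def is_var(t):
    return isinstance(t, str) and t != "" and (t[0].isupper() or t[0] == "_")


class Relation:
    """Tuples of one relation with O(1) membership and a first-argument index."""

    def __init__(self):
        self.tuples = []                    # insertion-ordered storage
        self.keys = set()                   # hash-set membership (dedup on insert)
        self.by_first = defaultdict(list)   # str(first arg) -> tuples

    def add(self, tup):
        if tup in self.keys:
            return False
        self.keys.add(tup)
        self.tuples.append(tup)
        self.by_first[str(tup[0])].append(tup)
        return True

    def candidates(self, first_arg):
        """Tuples worth unifying against, given the walked first argument."""
        if is_var(first_arg):
            return self.tuples  # unbound: fall back to a full scan
        return self.by_first.get(str(first_arg), [])


parent = Relation()
for t in [("tom", "bob"), ("tom", "liz"), ("bob", "ann")]:
    parent.add(t)
print(parent.add(("tom", "bob")))      # False (duplicate)
print(parent.candidates("tom"))        # [('tom', 'bob'), ('tom', 'liz')]
print(len(parent.candidates("X")))     # 3
```

The index only narrows the candidate list; each candidate still has to be unified against the literal, so a key collision such as `1` versus `"1"` costs time but not correctness.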

## Progress log

_Newest first._

- 2026-05-11 — `dl-magic-query` could silently diverge from
  `dl-query` when an aggregate's inner-goal relation was IDB. The
  rewriter passes aggregate body lits through unchanged (no magic
  propagation for them), so the inner relation was empty in the
  magic db and the aggregate returned 0. Probe:
  `dl-eval-magic "u(a). u(b). u(c). u(d). banned(b). banned(d).
                  active(X) :- u(X), not(banned(X)).
                  n(N) :- count(N, X, active(X))." "?- n(N)."`
  returned `N=0` instead of `N=2`. Fix: `dl-magic-query` now
  pre-saturates the source db before copying facts into the magic
  db. This guarantees equivalence with `dl-query` for every
  stratified program; the magic benefit comes from goal-directed
  re-derivation of the query relation under the seed (which still
  matters for large recursive joins). The existing test suite's
  aggregate cases happened to dodge this because the inner goals
  were all EDB. 1 new regression test; 274/274.

- 2026-05-11 — Anonymous `_` in a negated literal was incorrectly
  flagged by the safety check. The canonical idiom
  `orphan(X) :- person(X), not(parent(X, _))` was rejected with
  "negation refers to unbound variable(s) (\"_anon1\")" because the
  parser renames each `_` to a fresh `_anon*` symbol and the negation
  safety walk demanded all vars in the negated lit be bound by an
  earlier positive body literal. Anonymous vars in negation are
  existentially quantified — they shouldn't need outer binding.
  Added `dl-non-anon-vars` filter; `dl-process-neg!` now strips
  `_anon*` names from `needed` before the binding check. 2 new
  regression tests; 273/273.

- 2026-05-11 — Compound terms in fact-arg / rule-head positions were
  silently stored as unreduced expressions. `p(+(1, 2)).` resulted
  in a tuple `(p (+ 1 2))` (dl-ground? sees no free variables, so it
  passed). `double(*(X, 2)) :- n(X).` saturated to `double((* 3 2))`
  rather than `double(6)`. Datalog has no function symbols in arg
  positions — `dl-add-fact!` and `dl-add-rule!` now reject compound
  args via a new `dl-simple-term?` (number / string / symbol).
  Compounds remain legal in body literals where they encode `is` /
  arithmetic / aggregate sub-goals. 2 new regression tests; 271/271.

- 2026-05-11 — Quoted atoms with uppercase-or-underscore-leading
  names were misclassified as variables. `p('Hello World').` ran
  through the tokenizer's `"atom"` branch and through the parser's
  `string->symbol`, producing a symbol named "Hello World". dl-var?
  checks the first character — "H" is uppercase, so the fact was
  rejected as non-ground. Fix: tokenizer emits `"string"` for any
  `'...'` quoted form, so quoted atoms become opaque string constants
  (matching how Datalog idiomatically treats them — the alternative
  was a per-symbol "quoted" marker which would have rippled through
  unification and dl-var?). Updated the existing tokenize test and
  added one for `'Hello'`; also added a parse-level regression. 269/269.

- 2026-05-11 — Type-mixed comparisons were silently inconsistent:
  `<(X, 5)` with `X` bound to a string returned `()` (no result, no
  error), while `X` bound to a symbol raised "Expected number, got
  symbol". Both should fail loudly. Added `dl-compare-typeok?` —
  `<`, `<=`, `>`, `>=` now require both operands to share a primitive
  type (both numbers or both strings) and raise otherwise. `!=` is
  exempted since it's a polymorphic inequality test built on
  `dl-tuple-equal?`. 2 new regression tests; 267/267.

- 2026-05-11 — Body literal shape validation in
  `dl-rule-check-safety`: a dict that isn't `{:neg ...}` (e.g. typo'd
  `{:negs ...}`) used to silently fall through every dispatch clause,
  contributing zero bound vars; the user would then see a confusing
  "head var X unbound" error pointing at the head, not the malformed
  body. Same for body lits that are bare numbers / strings / symbols.
  Both shapes now raise a clear error naming the offending lit. 1 new
  regression test; 265/265.

- 2026-05-11 — Division by zero in `is` silently produced IEEE
  infinity instead of raising. `is(R, /(X, 0))` returned `R = inf`,
  which then flowed through comparisons and aggregations to produce
  nonsense results. `dl-eval-arith` now raises with a clear
  "division by zero in <expr>" message. 1 new test; 264/264.

- 2026-05-11 — Aggregate variable validation: `count(N, Y, p(X))`
  silently returned `N = 1` because `Y` was never bound in `p(X)` —
  every match contributed the same unbound symbol, which dl-val-member?
  deduped to a single entry. Similarly `sum(S, Y, p(X))` raised a
  confusing "expected number" error from the underlying `+`. Added
  a third validator in `dl-eval-aggregate`: the agg-var must appear
  in the goal literal. Error names the variable and the goal and
  explains the consequence. 1 new test; 263/263.

- 2026-05-11 — `dl-retract!` was silently destroying EDB facts in
  "mixed" relations (those with BOTH user-asserted facts AND a rule
  defining the same head). The retract pass wiped every rule-head
  relation wholesale and then re-saturated — but the saturator only
  re-derives the IDB portion, so explicit EDB facts vanished even
  for a no-op retract of a non-existent tuple. Probe:
  `(let ((db (dl-program "p(a). p(b). p(X) :- q(X). q(c).")))
     (dl-retract! db (quote (p z))) (dl-query db (quote (p X))))`
  went from `{a,b,c}` to just `{c}`.
  Fix: tracked `:edb-keys` provenance in the db. `dl-add-fact!` (public
  API) marks the tuple as EDB; saturator calls new internal
  `dl-add-derived!` which doesn't mark it. `dl-retract!` now strips
  only the IDB-derived portion of rule-head relations and preserves
  EDB-marked tuples through the re-saturate pass. 2 new regression
  tests; 262/262.

- 2026-05-11 — Eval-semantics bug-hunt: nested `not(not(P))` was
  silently misinterpreted. Outer-level `not(...)` is parsed as
  negation, but the inner `not(banned(X))` was parsed as a regular
  positive literal naming a relation called `not`. Since no `not`
  relation existed, the inner match was empty and the outer
  negation succeeded vacuously, making `vip(X) :- u(X), not(not(banned(X))).`
  equivalent to `vip(X) :- u(X).` (a silent double-negation = identity
  fallacy). Fix in `dl-rule-check-safety`: both the positive-literal
  branch and `dl-process-neg!` now flag any body literal whose head
  is in `dl-reserved-rel-names`. Error message names the relation and
  points the user at intermediate-relation stratified negation. 1 new
  regression test; 260/260.

- 2026-05-10 — Bug-hunt round on parser/safety surfaced 7 real
  bugs, each fixed with regression tests:
  - Reserved relation names (`not`, `count`, `<`, `is`, ...) were
    accepted as rule/fact heads — would silently shadow built-ins.
  - Negative number literals (`n(-1).`) failed to parse — users
    had to express them as `(- 0 1)` or via `is`.
  - Unterminated block comment `/* ...` silently consumed the
    rest of the input. Now raises with the position.
  - Same silent-consume bug in unterminated string / quoted-atom.
  - Empty-list rule head and non-list rule body weren't validated;
    they'd crash later in `rest`. dl-add-rule! now checks shape.
  - dl-magic-query with non-list / non-dict goal crashed cryptically.
  - Tokenizer silently swallowed unrecognised characters (`?`, `!`,
    `#`, `@`, etc.) — typos produced confusing downstream errors.

- 2026-05-08 — Phase 6 driver: `dl-magic-query db query-goal`.
  Builds a fresh internal db from the caller's EDB + magic seed +
  rewritten rules, saturates, queries, returns substitutions —
  caller's db is untouched. Equivalent to `dl-query` for any
  fully-stratifiable program; sole motivation is a perf alternative
  on goal-shaped queries against large recursive relations.
  2 new tests cover equivalence and non-mutation.

- 2026-05-08 — Phase 6 magic-sets rewriter. `dl-magic-rewrite rules
  query-rel adn args` returns `{:rules <rewritten> :seed <seed-fact>}`.
  Worklist over `(rel, adn)` pairs starts from the query, gates each
  original rule with a `magic_<rel>^<adn>(bound)` filter, and emits
  propagation rules for each positive non-builtin body literal so
  that magic spreads to body relations. EDB facts pass through.
  3 new tests cover seed structure, equivalence on chain-3 by
  ancestor-relation tuple count, and same-query-answers under
  the rewritten program. The plumbing for a `dl-saturate-magic!`
  driver and large-graph perf benchmarks is still future work.

- 2026-05-08 — Phase 6 building blocks for the magic-sets
  transformation: `dl-magic-rel-name`, `dl-magic-lit`,
  `dl-bound-args`. The rewriter that generates magic seed and
  propagation rules is still future work; with these primitives
  in place it's a straightforward worklist algorithm. 4 new tests.

- 2026-05-08 — Phase 6 adornments + SIPS in
  `lib/datalog/magic.sx`. Inspection helpers — `dl-adorn-goal` and
  `dl-adorn-lit` compute per-arg `b`/`f` patterns under a bound
  set; `dl-rule-sips rule head-adornment` walks body literals
  left-to-right propagating the bound set, recognising `is` and
  aggregate result-vars as new binders. Lays groundwork for a
  later magic-sets transformation. 10 new tests cover pure
  adornment, SIPS over a chain rule, head-fully-bound rules,
  comparisons, and `is`. Saturator does not yet consume these.

- 2026-05-08 — Comprehensive integration test in api suite: a
  single program exercising recursion (`reach` transitive closure)
  + stratified negation (`safe X Y :- reach X Y, not banned Y`) +
  aggregation (`reach_count` via count) + comparison (`>= N 2`)
  composed end-to-end via `dl-eval source query-source`. Confirms
  the full pipeline (parser → safety → stratifier → semi-naive +
  aggregate post-pass → query) on a non-trivial program.

- 2026-05-08 — Bug fix: aggregates work as top-level query goals.
  `dl-match-lit` (the naive matcher used by `dl-find-bindings`) was
  missing the `dl-aggregate?` dispatch — it was only present in
  `dl-fbs-aux` (semi-naive). Symptom: `(dl-query db '(count N X (p X)))`
  silently returned `()`. Also updated `dl-query-user-vars` to project
  only the result var (first arg) of an aggregate goal — the
  aggregated var and inner-goal vars are existentials and should not
  appear in the projected substitution. 2 new aggregate tests cover
  the regression.

- 2026-05-08 — Convenience: `dl-eval source query-source`. Parses
  both strings, builds a db, saturates, runs the query, returns
  the substitution list. Single-call user-friendly entry. 2 new
  api tests cover ancestor and multi-goal queries.

- 2026-05-08 — Phase 6 stub: `dl-set-strategy! db strategy` and
  `dl-get-strategy db` user-facing hooks. Default `:semi-naive`;
  `:magic` is accepted but the actual transformation is deferred,
  so saturation still uses semi-naive. Lets us tick the
  "Optional pass — guarded behind dl-set-strategy!" Phase 6 box.
  3 new eval tests.

- 2026-05-08 — Demo: weighted-DAG shortest path.
  `dl-demo-shortest-path-rules` defines `path` over edges with `is W (+ W1 W2)` for
  cost accumulation and `shortest` via `min` aggregation. 3 demo
  tests cover direct/multi-hop choice, multi-hop wins on cheaper
  route, and unreachable-empty. Added `dl-summary db` inspection
  helper returning `{<rel>: count}` (4 eval tests).

- 2026-05-08 — Phase 5e perf: first-arg index per relation. db gains
  `:facts-index {<rel>: {<first-arg-key>: tuples}}` mirroring the
  existing `:facts-keys` membership index. `dl-add-fact!` populates
  it; `dl-match-positive` walks the body literal's first arg under
  the current subst — if it's bound to a non-var, look up by
  `(str arg)` and iterate only the matching subset. chain-25
  saturation 33s → 18s real (~2x). chain-50 still slow (~120s+)
  but tractable; next bottleneck is subst dict copies during
  unification. Differential test bumped to chain-12, semi-only
  count to chain-25.

- 2026-05-08 — Demo: tag co-occurrence. `(cotagged P T1 T2)` — post
  has both T1 and T2 with T1 != T2 — and `(tag-pair-count T1 T2 N)`
  counting posts per distinct tag pair. Demonstrates count
  aggregation grouped by outer-context vars. 2 new demo tests.

- 2026-05-08 — `dl-query` accepts a list of body literals for
  conjunctive queries, in addition to a single positive literal.
  `dl-query-coerce` dispatches based on the first element's shape:
  positive lit (head is a symbol) or `:neg` dict → wrap as singleton;
  list of lits → use as-is. `dl-query-user-vars` collects the union
  of vars across all goals (deduped, `_` filtered) for projection.
  2 new api tests: multi-goal AND, and conjunction with comparison.

- 2026-05-08 — Bug fix: `dl-check-stratifiable` now rejects recursion
  through aggregation (e.g., `q(N) :- count(N, X, q(X))`). The
  stratifier was already adding negation-like edges for aggregates,
  but the cycle scan only looked at explicit `:neg` literals. Added
  the matching aggregate branch to the body iteration. Also adds
  doc-only `lib/datalog/datalog.sx` with the public-API surface
  (since `load` is an epoch command and can't recurse from within an
  `.sx` file). 3 new aggregate tests cover recursion-rejection,
  negation-and-aggregation coexistence, and min-over-empty-derived.

- 2026-05-08 — Phase 10 demo + canonical query. Added the "cooking
  posts by people I follow (transitively)" example from the plan:
  `dl-demo-cooking-rules` defines `reach` over the follow graph
  (recursive transitive closure) and `cooking-post-by-network` that
  joins reach with `authored` and `(tagged P cooking)`. 3 demo
  tests cover transitive network, direct-only follow, and
  empty-network cases.

- 2026-05-08 — Phase 8 extension: `findall L V Goal` aggregate. Bind
  L to the list of distinct V values for which Goal holds (or the
  empty list when no matches). Implemented as a one-line case in
  `dl-do-aggregate`. 3 new tests: EDB, derived relation, empty.
  Useful for "give me all the X such that …" queries without
  scalar reduction.

- 2026-05-08 — Phase 5d semantic fix: anonymous `_` variables are
  renamed per occurrence at `dl-add-rule!` and `dl-query` time so
  `(p X _) (p _ Y)` no longer unifies the two `_`s. New helpers
  `dl-rename-anon-term`, `dl-rename-anon-lit`, `dl-make-anon-renamer`,
  `dl-rename-anon-rule` in db.sx; eval.sx's `dl-query` renames the goal
  before search and projects only user-named vars (`_` is filtered
  out of the projection list). The "underscore in head" test now
  correctly rejects `(p X _) :- q(X).` — after renaming, the head's
  fresh anon var has no body binder. Two new eval tests verify
  rule-level and goal-level independence. 155/155 expected.
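
The per-occurrence renaming can be illustrated with a small Python model. The helper names echo the ones listed in this entry, but the bodies, the `_anon` prefix, and the tuple term representation are assumptions for the sketch, not the code in db.sx.

```python
import itertools

def make_anon_renamer(prefix="_anon"):
    # Each call to the returned function produces a fresh variable name.
    counter = itertools.count(1)
    return lambda: f"{prefix}{next(counter)}"

def rename_anon_term(term, fresh):
    # Terms are modeled as atoms (strings, numbers) or tuples (compounds).
    if term == "_":
        return fresh()
    if isinstance(term, tuple):
        return tuple(rename_anon_term(sub, fresh) for sub in term)
    return term

def rename_anon_rule(head, body):
    # One renamer per rule, so every `_` occurrence gets a distinct name.
    fresh = make_anon_renamer()
    new_head = rename_anon_term(head, fresh)
    new_body = [rename_anon_term(lit, fresh) for lit in body]
    return new_head, new_body

head, body = rename_anon_rule(
    ("r", "X", "Y"),
    [("p", "X", "_"), ("p", "_", "Y")],
)
# body is now [("p", "X", "_anon1"), ("p", "_anon2", "Y")]
```

Because the two placeholders end up with different names, unifying the body literals can no longer force them to the same value. A head such as `("p", "X", "_")` would receive a fresh variable that no body literal binds, which is why the safety check then rejects it.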

- 2026-05-08 — Phase 5c perf: indexed `dl-find-bindings`. Replaced
  the recursive `(rest lits)` walk with `dl-fb-aux lits db subst i n`
  using `nth lits i`. Eliminates O(N²) list-copy per body of length
  N. chain-15 saturation 25s → 16s; chain-25 finishes in 33s real
  (vs. timeout previously). Bumped semi_naive tests: differential
  on chain-10, semi-only count on chain-15 (was chain-5/chain-5).
  153/153.

- 2026-05-08 — Phase 10 syntactic demo. New `lib/datalog/demo.sx`
  with three programs over rose-ash-shaped data: federation
  (`mutual`, `reachable`, `foaf`), content recommendation
  (`post-likes` via count aggregation, `popular`, `interesting`),
  and role-based permissions (`in-group` over transitive subgroups,
  `can-access`). 10 demo tests pass against synthetic EDB tuples.
  Postgres loader and `/internal/datalog` HTTP endpoint remain
  out of scope for this loop (they need service-tree edits beyond
  `lib/datalog/**`). Conformance now 153/153.

- 2026-05-08 — Phase 5b perf: hash-set membership in `dl-add-fact!`.
  db gains a parallel `:facts-keys {<rel>: {<tuple-string>: true}}`
  index alongside `:facts`. `dl-tuple-key` derives a stable string
  key via `(str lit)`; `(p 30)` and `(p 30.0)` map to the same key,
  as intended, because SX prints them identically. Insertion is O(1) instead of
  O(n). chain-7 saturation drops from ~12s to ~6s; chain-15 from
  ~50s to ~25s under shared CPU. Larger chains are still slow due
  to body-join overhead in `dl-find-bindings` (Blocker updated).
  `dl-retract!` updated to keep both indices consistent. 143/143.
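
The parallel key index can be sketched in Python as follows. The dict layout imitates `:facts` and `:facts-keys`, and `tuple_key` imitates `(str lit)` under the assumption that integral floats print like integers; the function names are hypothetical.

```python
def tuple_key(lit):
    # Stand-in for `(str lit)`. Assumption: 30 and 30.0 print identically.
    def show(x):
        if isinstance(x, float) and x.is_integer():
            return str(int(x))
        return str(x)
    return "(" + " ".join(show(x) for x in lit) + ")"

def new_db():
    return {"facts": {}, "facts_keys": {}}

def add_fact(db, lit):
    """Insert lit unless already present; return True if it was new."""
    rel = lit[0]
    keys = db["facts_keys"].setdefault(rel, {})
    key = tuple_key(lit)
    if key in keys:  # O(1) membership test instead of a list scan
        return False
    keys[key] = True
    db["facts"].setdefault(rel, []).append(lit)
    return True

def retract(db, lit):
    """Remove lit from both indices; return True if it was present."""
    rel = lit[0]
    key = tuple_key(lit)
    if db["facts_keys"].get(rel, {}).pop(key, None):
        db["facts"][rel] = [f for f in db["facts"][rel] if tuple_key(f) != key]
        return True
    return False

db = new_db()
```

Adding `("p", 30)` and then `("p", 30.0)` stores one tuple, since both produce the key `(p 30)`.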

- 2026-05-08 — Phase 9 done. New `lib/datalog/api.sx` exposes a
  parser-free embedding: `dl-program-data facts rules` accepts SX
  data lists, with rules in either dict form or list form using
  `<-` as the rule arrow (since SX parses `:-` as a keyword).
  `dl-rule head body` constructs the dict. `dl-assert! db lit` adds
  a fact and re-saturates; `dl-retract! db lit` drops the fact from
  EDB, wipes all rule-headed (IDB) relations, and re-saturates from
  scratch — the simplest correct semantics until provenance tracking
  arrives in a later phase. 9 API tests; conformance now 143/143.
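
The retract semantics (drop from EDB, wipe IDB, re-saturate) can be shown on a toy program. In this Python sketch the single hard-coded rule set (`path` as the transitive closure of `edge`), the `edb`/`idb` dict layout, and the function names are all assumptions; the toy `saturate` also recomputes from scratch on assert, which is simpler than what the entry describes.

```python
def saturate(db):
    # Toy program: path(X,Y) :- edge(X,Y).  path(X,Z) :- path(X,Y), edge(Y,Z).
    edges = db["edb"].get("edge", set())
    path = set(edges)
    delta = set(edges)
    while delta:
        delta = {(x, z) for (x, y) in delta for (y2, z) in edges if y == y2} - path
        path |= delta
    db["idb"]["path"] = path

def assert_fact(db, rel, tup):
    db["edb"].setdefault(rel, set()).add(tup)
    saturate(db)

def retract_fact(db, rel, tup):
    db["edb"].get(rel, set()).discard(tup)
    db["idb"] = {}  # wipe every derived relation
    saturate(db)    # rebuild from the remaining EDB

db = {"edb": {}, "idb": {}}
assert_fact(db, "edge", (1, 2))
assert_fact(db, "edge", (2, 3))
derived_before = set(db["idb"]["path"])
retract_fact(db, "edge", (2, 3))
derived_after = set(db["idb"]["path"])
```

After the retraction, `(2, 3)` and the derived `(1, 3)` are both gone. Wiping is what guarantees that `(1, 3)` does not linger, at the cost of recomputing everything.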

- 2026-05-08 — Phase 8 done. New `lib/datalog/aggregates.sx` (~110
  LOC): count / sum / min / max. Each is a body literal of shape
  `(op R V Goal)` — `dl-eval-aggregate` runs `dl-find-bindings` on
  the goal under the outer subst (so outer vars in the goal get
  substituted, giving group-by-style aggregation), collects the
  distinct values of `V`, and binds `R`. Empty input: count/sum
  return 0; min/max produce no binding (rule fails). Stratifier
  extended via `dl-aggregate-dep-edge` so the aggregate's goal
  relation is fully derived before the aggregate fires. Safety check
  treats goal-internal vars as existentials (no outer binding
  required); only the result var becomes bound. Conformance now
  134 / 134.
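
A Python sketch of the reduction step and its empty-input behavior, under the assumption that the goal's bindings for `V` have already been collected. The `aggregate` function and the `likes` data are hypothetical and only mirror the rules stated in this entry.

```python
def aggregate(op, values):
    """Return (True, result), or (False, None) when the rule should fail."""
    distinct = list(dict.fromkeys(values))  # dedupe, keep first-seen order
    if op == "count":
        return True, len(distinct)
    if op == "sum":
        return True, sum(distinct)
    if op in ("min", "max"):
        if not distinct:
            return False, None  # no binding for R: the rule fails
        return True, (min if op == "min" else max)(distinct)
    raise ValueError(f"unknown aggregate: {op}")

# Group-by-style use: the outer variable (the post) is substituted into
# the goal before the values of the aggregated variable are collected.
likes = [("alice", "p1"), ("bob", "p1"), ("alice", "p2")]

def post_likes(post):
    return aggregate("count", [user for (user, p) in likes if p == post])
```

`post_likes("p1")` gives `(True, 2)` and `post_likes("p3")` gives `(True, 0)`, while `aggregate("min", [])` reports failure instead of inventing a value.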

- 2026-05-08 — Phase 7 done (Phase 6 magic sets deferred — opt-in,
  semi-naive default suffices for current test suite). New
  `lib/datalog/strata.sx` (~290 LOC): dep graph build, Floyd-Warshall
  reachability, SCC-via-mutual-reachability for non-stratifiability
  detection, iterative stratum computation, rule grouping by head
  stratum. eval.sx split: `dl-saturate-rules!` is the per-rule-set
  semi-naive worker, `dl-saturate!` is now the stratified driver
  (errors out on non-stratifiable programs). `dl-match-negation` in
  eval.sx: succeeds iff inner positive match is empty. Stratum-keyed
  dicts use `(str s)` since SX dicts only accept string/keyword keys.
  10 negation tests cover EDB/IDB negation, multi-level strata,
  non-stratifiability rejection, and a negation safety violation.
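
The stratification pipeline (dependency edges, Warshall-style reachability, cycle-through-negation detection, iterative stratum assignment) can be sketched in Python. The `stratify` function and its edge-triple input format are assumptions for illustration; strata.sx may organize the same steps differently.

```python
def stratify(rels, edges):
    """edges: (head, body_rel, is_negative) triples.

    Returns {rel: stratum}; raises ValueError if a negative edge lies on
    a dependency cycle.
    """
    reach = {a: {b: False for b in rels} for a in rels}
    for head, dep, _ in edges:
        reach[head][dep] = True
    for k in rels:  # Warshall-style transitive closure
        for i in rels:
            for j in rels:
                if reach[i][k] and reach[k][j]:
                    reach[i][j] = True
    for head, dep, neg in edges:
        if neg and reach[dep][head]:  # dep also depends on head: a cycle
            raise ValueError(f"not stratifiable: {head} negates {dep} in a cycle")
    strata = {r: 0 for r in rels}
    changed = True
    while changed:  # raise strata until every edge constraint holds
        changed = False
        for head, dep, neg in edges:
            need = strata[dep] + (1 if neg else 0)
            if strata[head] < need:
                strata[head] = need
                changed = True
    return strata

strata = stratify(
    ["edge", "node", "reach", "unreach"],
    [
        ("reach", "edge", False),
        ("reach", "reach", False),
        ("unreach", "node", False),
        ("unreach", "reach", True),
    ],
)
```

Here `unreach` lands in stratum 1 and everything else in stratum 0, so `reach` is fully computed before its negation is consulted. An aggregate dependency can be fed in as a negative edge to get the same ordering guarantee.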

- 2026-05-08 — Phase 5 done. `lib/datalog/eval.sx` rewritten to
  semi-naive default. `dl-saturate!` tracks a per-relation delta and
  on each iteration walks every positive body position substituting
  delta for that one literal — joining the rest against the full DB
  snapshot. `dl-saturate-naive!` retained as the reference. Rules
  with no positive body literal (e.g. `(p X) :- (= X 5).`) fall back
  to a naive one-shot via `dl-collect-rule-candidates`. 8 tests
  differentially compare the two saturators using per-relation tuple
  counts (cheap). Chain-5 differential exercises multi-iteration
  recursive saturation. Larger chains made conformance.sh time out
  because the O(n) `dl-tuple-member?` scan compounds with CPU sharing
  among loop agents; added a Blocker to swap to a hash-set for membership.
  Also tightened `dl-tuple-member?` to use indexed iteration instead
  of recursive `rest` (was creating a fresh list per step).
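
The difference between the two saturators can be seen on transitive closure. This Python sketch hard-codes one recursive rule, so it has a single positive body position to substitute the delta into; the general engine described above does this for every position. Function names are hypothetical.

```python
def tc_naive(edges):
    # Re-join the whole path relation against edge on every round.
    path = set(edges)
    while True:
        new = {(x, z) for (x, y) in path for (y2, z) in edges if y == y2}
        if new <= path:
            return path
        path |= new

def tc_semi_naive(edges):
    # Join only the tuples that were new in the previous round.
    path = set(edges)
    delta = set(edges)
    while delta:
        derived = {(x, z) for (x, y) in delta for (y2, z) in edges if y == y2}
        delta = derived - path
        path |= delta
    return path

chain = [(i, i + 1) for i in range(5)]  # a 5-edge chain: 0 -> 1 -> ... -> 5
```

Both functions return the same 15 pairs for this chain. A differential test along these lines can compare full sets or, more cheaply, just the tuple counts.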

- 2026-05-07 — Phase 4 done. `lib/datalog/builtins.sx` (~280 LOC) adds
  `(< X Y)`, `(<= X Y)`, `(> X Y)`, `(>= X Y)`, `(= X Y)`, `(!= X Y)`,
  and `(is X expr)` with `+ - * /`. `dl-eval-builtin` dispatches;
  `dl-eval-arith` recursively evaluates nested compounds. Safety
  check is now order-aware — it walks body literals left-to-right
  tracking the bound set, requires comparison/`is` inputs to be
  already bound, and special-cases `=` (binds the variable side; at
  least one side must already be bound so the other can be bound). Phase 3's
  simple safety check stays in db.sx as a forward-reference fallback;
  builtins.sx redefines `dl-rule-check-safety` to the comprehensive
  version. eval.sx's `dl-match-lit` now dispatches built-ins through
  `dl-eval-builtin`. 19 builtins tests; conformance 106 / 106.
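
One way to read the order-aware safety check is the left-to-right walk below. This Python sketch covers positive literals, comparisons, `is`, and `=` only (negation came in a later phase), and the tuple representation, error strings, and function names are assumptions.

```python
COMPARISONS = {"<", "<=", ">", ">=", "!="}

def is_var(term):
    # Assumption: variables are strings starting with uppercase or underscore.
    return isinstance(term, str) and (term[:1].isupper() or term[:1] == "_")

def vars_of(term):
    if is_var(term):
        return {term}
    if isinstance(term, tuple):  # compound: skip the functor at index 0
        return set().union(*(vars_of(sub) for sub in term[1:]))
    return set()

def check_safety(head, body):
    """Return None if the rule is safe, else a short error string."""
    bound = set()
    for lit in body:
        op, args = lit[0], lit[1:]
        if op in COMPARISONS:
            missing = vars_of(lit) - bound
            if missing:
                return f"unbound in {op}: {sorted(missing)}"
        elif op == "is":
            missing = vars_of(args[1]) - bound
            if missing:
                return f"unbound in is: {sorted(missing)}"
            bound |= vars_of(args[0])
        elif op == "=":
            left, right = vars_of(args[0]), vars_of(args[1])
            if left <= bound:
                bound |= right
            elif right <= bound:
                bound |= left
            else:
                return "= needs at least one bound side"
        else:
            bound |= vars_of(lit)  # a positive literal binds its variables
    missing = vars_of(head) - bound
    return f"unbound head vars: {sorted(missing)}" if missing else None
```

Under this reading `(p X) :- (< X 5), (q X).` is rejected because the comparison runs before `q` binds `X`, while the same body in the opposite order is accepted.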

- 2026-05-07 — Phase 3 done. `lib/datalog/db.sx` (~250 LOC) holds facts
  indexed by relation name plus the rules list, with `dl-add-fact!` /
  `dl-add-rule!` (rejects non-ground facts and unsafe rules);
  `lib/datalog/eval.sx` (~150 LOC) implements the naive bottom-up
  fixpoint via `dl-find-bindings`/`dl-match-positive`/`dl-saturate!`
  and `dl-query` (deduped projected substitutions). Safety analysis
  rejects unsafe head vars at load time. Negation and arithmetic
  built-ins raise clean errors (lifted in later phases). 15 eval
  tests cover transitive closure, sibling, same-generation, cyclic
  graph reach, and six safety violations. Conformance 87 / 87.

- 2026-05-07 — Phase 2 done. `lib/datalog/unify.sx` (~140 LOC):
  `dl-var?` (leading uppercase letter or underscore), `dl-walk`, `dl-bind`,
  `dl-unify` (returns extended dict subst or `nil`), `dl-apply-subst`,
  `dl-ground?`, `dl-vars-of`.
  Substitutions are immutable dicts; `assoc` builds extended copies. 28
  unify tests; conformance now 72 / 72.
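
Unification over immutable dict substitutions can be sketched in Python as below. The functions loosely mirror `dl-walk`, `dl-unify`, and `dl-apply-subst`, but the bodies are an illustration under the assumption that terms are atoms or tuples; `None` plays the role of `nil`.

```python
def is_var(term):
    # Assumption: variables are strings starting with uppercase or underscore.
    return isinstance(term, str) and (term[:1].isupper() or term[:1] == "_")

def walk(term, subst):
    # Follow variable bindings until reaching a non-variable or a free variable.
    while is_var(term) and term in subst:
        term = subst[term]
    return term

def unify(a, b, subst):
    """Return an extended substitution, or None on failure.

    The input dict is never mutated; each binding builds a new dict.
    """
    if subst is None:
        return None
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        return {**subst, a: b}
    if is_var(b):
        return {**subst, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None

def apply_subst(term, subst):
    term = walk(term, subst)
    if isinstance(term, tuple):
        return tuple(apply_subst(sub, subst) for sub in term)
    return term

s = unify(("parent", "X", "bob"), ("parent", "alice", "Y"), {})
# s == {"X": "alice", "Y": "bob"}
```

Since a failed branch never touches the caller's dict, backtracking in the join loop needs no undo step; the trade-off is one dict copy per binding.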

- 2026-05-07 — Phase 1 done. `lib/datalog/tokenizer.sx` (~190 LOC) emits
  `{:type :value :pos}` tokens; `lib/datalog/parser.sx` (~150 LOC) produces
  `{:head … :body …}` / `{:query …}` clauses, with nested compounds
  permitted for arithmetic and `not(...)` desugared to `{:neg …}`. 44 / 44
  via `bash lib/datalog/conformance.sh` (26 tokenize + 18 parse). Local
  helpers namespace-prefixed (`dl-emit!`, `dl-peek`) after a host-primitive
  shadow clash. Test harness uses a custom `dl-deep-equal?` that handles
  out-of-order dict keys and number repr (`equal?` fails on dict key order
  and on `30` vs `30.0`).