datalog-plan: align roadmap with briefing — insert built-ins (P4) and magic sets (P6)
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 48s

This commit is contained in:
2026-05-07 23:42:34 +00:00
parent 9bc70fd2a9
commit 6457eb668c

View File

@@ -103,23 +103,59 @@ Key differences from Prolog:
sibling, same-generation, grandparent, cyclic graph reach, six
safety cases. 87 / 87 conformance.
### Phase 4 — semi-naive evaluation (performance)
### Phase 4 — built-in predicates + body arithmetic
Almost every real query needs `<`, `=`, simple arithmetic, and string
comparisons in body position. These are not EDB lookups — they're
constraints that filter bindings.
- [ ] Recognise built-in predicates in body: `(< X Y)`, `(<= X Y)`, `(> X Y)`,
`(>= X Y)`, `(= X Y)`, `(!= X Y)` and arithmetic forms `(is Z (+ X Y))`,
`(is Z (- X Y))`, `(is Z (* X Y))`, `(is Z (/ X Y))`.
- [ ] Built-in evaluation: at the join step, after binding variables from
EDB lookups, evaluate built-ins as constraints. If any built-in fails
or has unbound inputs, drop the candidate substitution.
- [ ] **Safety extension**: `is` binds its left operand iff right operand is
fully ground. `(< X Y)` requires both X and Y bound by some prior body
literal — reject unsafe at `dl-add-rule!` time.
- [ ] Wire arithmetic operators through to SX numeric primitives — no
separate Datalog number tower.
- [ ] Tests: range filters, arithmetic derivations, comparison-based
queries, safety violation on `(p X) :- (< X 5).`
### Phase 5 — semi-naive evaluation (performance)
- [ ] Delta sets: track newly derived tuples per iteration
- [ ] Semi-naive rule: only join against delta tuples from last iteration, not full relation
- [ ] Significant speedup for recursive rules — avoids re-deriving known tuples
- [ ] `dl-stratify` `db` → dependency graph + SCC analysis → stratum ordering
- [ ] Tests: verify semi-naive produces same results as naive; benchmark on large ancestor chain
### Phase 5stratified negation
### Phase 6magic sets (goal-directed bottom-up, opt-in)
Naive bottom-up derives **all** consequences before answering. Magic sets
rewrite the program so the fixpoint only derives tuples relevant to the
goal — a major perf win for "what's reachable from node X" queries on
large graphs.
- [ ] Adornments: annotate rule predicates with bound (`b`) / free (`f`)
patterns based on how they're called.
- [ ] Magic transformation: for each adorned predicate, generate a
`magic_<pred>` relation and rewrite rule bodies to filter through it.
- [ ] Sideways information passing strategy (SIPS): left-to-right by
default; pluggable.
- [ ] Optional pass — `(dl-set-strategy! db :magic)`; default semi-naive.
- [ ] Tests: equivalence vs naive on small inputs; perf win on a 10k-node
reachability query from a single root.
### Phase 7 — stratified negation
- [ ] Dependency graph analysis: which relations depend on which (positively or negatively)
- [ ] Stratification check: error if negation is in a cycle (non-stratifiable program)
- [ ] Evaluation: process strata in order — lower stratum fully computed before using its
complement in a higher stratum
- [ ] `not(P)` in rule body: at evaluation time, check P is NOT in the derived EDB
- [ ] Tests: non-member (`not(member(X,L))`), colored-graph (`not(same-color(X,Y))`),
stratification error detection
- [ ] `dl-stratify db` → SCC analysis → stratum ordering
- [ ] Evaluation: process strata in order — lower stratum fully computed
before using its complement in a higher stratum
- [ ] `not(P)` in rule body: at evaluation time, check P is NOT in the
derived EDB
- [ ] Safety extension: head vars in negative literals must also appear in
some positive body literal of the same rule
- [ ] Tests: non-member (`not(member(X,L))`), colored-graph
(`not(same-color(X,Y))`), stratification error detection
### Phase 6 — aggregation (Datalog+)
### Phase 8 — aggregation (Datalog+)
- [ ] `count(X, Goal)` → number of distinct X satisfying Goal
- [ ] `sum(X, Goal)` → sum of X values satisfying Goal
- [ ] `min(X, Goal)` / `max(X, Goal)` → min/max of X satisfying Goal
@@ -127,7 +163,7 @@ Key differences from Prolog:
- [ ] Aggregation breaks stratification — evaluate in a separate post-fixpoint pass
- [ ] Tests: social network statistics, grade aggregation, inventory sums
### Phase 7 — SX embedding API
### Phase 9 — SX embedding API
- [ ] `(dl-program facts rules)` → database from SX data directly (no parsing required)
```
(dl-program
@@ -141,7 +177,7 @@ Key differences from Prolog:
- [ ] Integration demo: federation graph query — `(ancestor actor1 actor2)` over
rose-ash ActivityPub follow relationships
### Phase 8 — Datalog as a query language for rose-ash
### Phase 10 — Datalog as a query language for rose-ash
- [ ] Schema: map SQLAlchemy model relationships to Datalog EDB facts
(e.g. `(follows user1 user2)`, `(authored user post)`, `(tagged post tag)`)
- [ ] Loader: `dl-load-from-db!` — query PostgreSQL, populate Datalog EDB