rose-ash/plans/relations-on-sx.md
giles f1d65c0953
relations: weakly-connected components (component, components partition, count) + 11 tests
tree.sx, reuses ureach-bfs. 158/158 across 9 suites.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 13:43:20 +00:00

# relations-on-sx: Cross-domain relationship graph on Datalog
rose-ash's internal `relations` service tracks parent/child and peer relationships
*across* domains — a blog post's comments, a thread's replies, a product's
variants, an order's line items, a resource's containment tree, a federated
content's origin. The questions are graph questions: who are X's children? its
ancestors? is A reachable from B? what's the chain that connects them? is there a
cycle?

That is recursive Datalog (a base clause plus one recursive rule), the same
bottom-up reachability `acl-on-sx`
uses for group/resource inheritance. Decisions come with a **trace**: not just
"yes, related," but the path that proves it. relations is an **internal-only**
service (no public URL); other domains call it to resolve hierarchy and linkage.

End-state: a Datalog-on-SX layer for typed relationship facts, with reachability,
path explanation, cycle detection, and a federation extension for cross-instance
links. Reuses `lib/datalog/` — does not reimplement the engine.
## Status (rolling)
`bash lib/relations/conformance.sh` → **158/158** (Phases 1-4 complete + extensions)
## Ground rules
- **Scope:** only `lib/relations/**` and `plans/relations-on-sx.md`. Do **not** edit
`spec/`, `hosts/`, `shared/`, `lib/datalog/**`, or other `lib/<lang>/`. You may
**import** from `lib/datalog/` (public API in `lib/datalog/datalog.sx`); do not
copy or modify Datalog.
- **Shared-file issues** → "Blockers" with a minimal repro; do not fix here.
- **SX files:** `sx-tree` MCP tools only; `sx_validate` after every edit.
- **Architecture:** relationships are `rel(Src, Dst, Kind)` Datalog facts;
reachability/ancestry are recursive rules; the proof tree is the connecting path;
the lifecycle (assert/retract) is an SX layer over the db. Keep relations
*content-agnostic* — a node is an opaque id string; domains own what ids mean.
- **Shared with acl-sx:** both run on Datalog and both lean on the same
recursive-reachability shape (`reach(X,Y) :- edge(X,Y).` / `reach(X,Y) :- edge(X,Z), reach(Z,Y).`).
Watch for it; flag convergence in the Progress log, but **do not extract**:
`plans/mod-on-sx.md` records why cross-subsystem extraction waits for the
architecture integrator with all consumers in view.
- **Commits:** one feature per commit. Keep Progress log updated and tick boxes.
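
The shared reachability shape named in the ground rules can be modelled in a few lines. This is a hedged Python sketch of naive bottom-up evaluation of the two `reach` clauses, not the `lib/datalog/` engine; `reach_closure` is a hypothetical name, and a real engine would likely use a semi-naive strategy rather than re-joining the full set each round.

```python
def reach_closure(edges):
    """Naive bottom-up fixpoint for:
         reach(X,Y) :- edge(X,Y).
         reach(X,Y) :- edge(X,Z), reach(Z,Y).
    `edges` is a set of (src, dst) pairs; returns the set of reach pairs.
    """
    reach = set(edges)  # base clause: every edge is a reach fact
    while True:
        # recursive clause: join edge(X,Z) with reach(Z,Y)
        derived = {(x, y) for (x, z) in edges for (z2, y) in reach if z == z2}
        new = derived - reach
        if not new:      # nothing new derived: the fixpoint is reached
            return reach
        reach |= new


edges = {("a", "b"), ("b", "c"), ("c", "d")}
closure = reach_closure(edges)
```

On the three-edge chain above, the closure adds `a→c`, `b→d` and `a→d` to the original edges and never derives a reverse pair.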
## Architecture sketch
```
relate(src, dst, kind)              query
        │                             │
        ▼                             ▼
lib/relations/schema.sx         lib/relations/engine.sx
 — rel(Src,Dst,Kind) facts       — children/parents/ancestors/descendants
 — kind vocabulary               — reachable?(A,B), cycle?(X)
        │                             ▲
        ▼                             │
lib/relations/api.sx            lib/relations/explain.sx
 — relate / unrelate             — path(A,B): the connecting chain
 — registry over a live db         (from the Datalog derivation)

lib/relations/federation.sx
 — cross-instance links via fed-sx (replicated rel facts, peer-trust gated)
```
## Phase 1 — Schema + direct relations
- [x] `lib/relations/schema.sx`: `rel(Src, Dst, Kind)` fact projection; a small
kind vocabulary (`parent`, `member`, `reply`, `variant`, `origin`, …) kept open
- [x] `lib/relations/api.sx`: `(relations/relate src dst kind)` / `(unrelate …)`
over a live Datalog db (assert/retract); `(children-of db node kind)`,
`(parents-of db node kind)`, `(related db node kind)`
- [x] `lib/relations/tests/direct.sx` — assert/retract, direct children/parents,
kind filtering, unknown node → empty
- [x] `lib/relations/conformance.sh` + scoreboard
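
To make the Phase 1 surface concrete, here is a hedged Python model of the direct queries over `rel(Src, Dst, Kind)` facts. It is not the SX API: `RelDb` and its method names are hypothetical, a plain set stands in for the live Datalog db, and it assumes an edge points from `Src` (the parent side) to `Dst` (the child side).

```python
class RelDb:
    """Toy in-memory stand-in for the live db (hypothetical, not the SX API)."""

    def __init__(self):
        self.facts = set()  # each fact is a (src, dst, kind) triple

    def relate(self, src, dst, kind):
        self.facts.add((src, dst, kind))

    def unrelate(self, src, dst, kind):
        # retracting a fact that was never asserted is a no-op
        self.facts.discard((src, dst, kind))

    def children_of(self, node, kind):
        return {d for (s, d, k) in self.facts if s == node and k == kind}

    def parents_of(self, node, kind):
        return {s for (s, d, k) in self.facts if d == node and k == kind}

    def related(self, node, kind):
        # assumption: "related" means direct neighbours in either direction
        return self.children_of(node, kind) | self.parents_of(node, kind)


db = RelDb()
db.relate("post-1", "comment-1", "parent")
db.relate("post-1", "comment-2", "parent")
db.relate("comment-1", "comment-3", "reply")
```

Node ids stay opaque strings throughout, and results are sets, which matches the order-insensitive comparison the tests rely on.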
## Phase 2 — Reachability + cycles
- [x] recursive reachability rules: `ancestors`, `descendants`, `reachable?(A,B)`
(transitive closure over a kind, the acl inheritance shape)
- [x] `roots` / `leaves` (no parents / no children) for a kind
- [x] cycle detection: `cycle?(X)` ⇔ `reachable(X, X)`; `acyclic?(db, kind)`
- [x] `lib/relations/tests/reach.sx` — deep chains, diamonds, disconnected nodes,
self-loops, multi-kind isolation
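
The Phase 2 queries can be illustrated with a small Python model. This is a sketch under assumptions, not the Datalog ruleset: all function names are hypothetical, and an explicit graph walk stands in for the recursive rules. It keeps the two properties the phase calls out: closures are computed per kind, and a cycle is just `reachable(X, X)`.

```python
from collections import defaultdict


def successors(facts, kind):
    """Adjacency map for one kind only, so closures never cross kinds."""
    succ = defaultdict(set)
    for src, dst, k in facts:
        if k == kind:
            succ[src].add(dst)
    return succ


def descendants(facts, node, kind):
    """Nodes reachable from `node` in one or more hops of `kind`."""
    succ = successors(facts, kind)
    seen, stack = set(), [node]
    while stack:
        for nxt in succ.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def reachable(facts, a, b, kind):
    return b in descendants(facts, a, kind)


def cycle(facts, node, kind):
    return reachable(facts, node, node, kind)


def nodes(facts, kind):
    return {n for s, d, k in facts if k == kind for n in (s, d)}


def acyclic(facts, kind):
    return not any(cycle(facts, n, kind) for n in nodes(facts, kind))


def roots(facts, kind):
    has_parent = {d for s, d, k in facts if k == kind}
    return nodes(facts, kind) - has_parent


def leaves(facts, kind):
    has_child = {s for s, d, k in facts if k == kind}
    return nodes(facts, kind) - has_child


facts = {
    ("a", "b", "parent"), ("b", "c", "parent"),
    ("c", "x", "reply"),
    ("l", "l", "parent"),  # self-loop: a one-node cycle
}
```

In this data `x` is not reachable from `a` under `parent` because the last hop is a `reply` edge, and the self-loop on `l` makes the `parent` kind cyclic without raising an error.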
## Phase 3 — Typed relations + path explanation
- [x] multiple kinds coexisting; mixed-kind vs single-kind reachability
- [x] `lib/relations/explain.sx`: `(path db a b kind)` returns the connecting
chain (the relationship equivalent of acl's proof tree), nil if unreachable
- [x] `(distance db a b kind)` (hops) + shortest-path selection
- [x] `lib/relations/tests/path.sx` — path correctness, shortest among many, no-path
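
A breadth-first search is one way to get the shortest connecting chain described in Phase 3. The Python sketch below is an illustration only: `path` and `distance` are hypothetical stand-ins for the SX functions, and the convention that a node's path to itself is the one-element chain is an assumption chosen to agree with a distance of 0 for `a = b`.

```python
from collections import deque


def path(facts, a, b, kind):
    """Shortest chain [a, ..., b] over edges of `kind`, or None if unreachable."""
    if a == b:
        return [a]
    succ = {}
    for src, dst, k in facts:
        if k == kind:
            succ.setdefault(src, []).append(dst)
    came_from = {a: None}  # doubles as the visited set, so cycles terminate
    queue = deque([a])
    while queue:
        cur = queue.popleft()
        for nxt in sorted(succ.get(cur, ())):
            if nxt in came_from:
                continue
            came_from[nxt] = cur
            if nxt == b:
                # walk the back-pointers to rebuild the chain
                chain = [b]
                while chain[-1] != a:
                    chain.append(came_from[chain[-1]])
                return chain[::-1]
            queue.append(nxt)
    return None


def distance(facts, a, b, kind):
    """Hop count along the shortest path, or None if there is none."""
    chain = path(facts, a, b, kind)
    return None if chain is None else len(chain) - 1


facts = {
    ("a", "b", "parent"), ("b", "c", "parent"), ("c", "d", "parent"),
    ("a", "c", "parent"),  # shortcut: a→c→d beats a→b→c→d
    ("d", "e", "reply"),
}
```

Every consecutive pair in a returned chain is an asserted edge of the requested kind, so the chain can be read as evidence for the reachability answer.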
## Phase 4 — Federation
- [x] cross-instance relationships — a peer asserts `rel(local, remote, kind)`;
replicate rel facts via fed-sx (mock the transport in tests)
- [x] trust gating — a peer's link binds locally only under a local trust fact
(mirror acl's non-transitive `trust`/gate-in-engine model; do NOT copy acl code,
re-derive the shape)
- [x] revocation — retract a replicated link; reachability re-saturates
- [x] `lib/relations/tests/fed.sx` — federated reachability chains, trust gating,
revocation
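
The trust gate in Phase 4 can be sketched as a filter applied before any reachability is computed. This Python model is hypothetical (`effective_edges` and `reachable` are illustrative names, and sets stand in for Datalog relations); it assumes a peer link is a `(peer, src, dst, kind)` tuple that only counts while the local `trust` set contains that peer.

```python
def effective_edges(rel, peer_rel, trust):
    """Local edges always bind; a peer's edge binds only under local trust."""
    return set(rel) | {(s, d, k) for (p, s, d, k) in peer_rel if p in trust}


def reachable(edges, a, b, kind):
    """True if b can be reached from a in one or more hops of `kind`."""
    succ = {}
    for s, d, k in edges:
        if k == kind:
            succ.setdefault(s, set()).add(d)
    seen, stack = set(), [a]
    while stack:
        for nxt in succ.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return b in seen


rel = {("a", "b", "parent")}
peer_rel = {
    ("peerA", "b", "remote-1", "parent"),
    ("peerB", "remote-1", "remote-2", "parent"),
}
trust = {"peerA"}
```

Because the gate is re-applied each time the effective edge set is built, dropping `peerA` from `trust` removes its link from every later answer while the local `a→b` edge survives. `peerB`'s link stays inert throughout, since trusting `peerA` says nothing about `peerB`.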
## Extensions (post-roadmap)
- [x] **shape queries**: `siblings` (nodes sharing a parent), `out-degree`/
`in-degree`, weakly-connected `connected?` (undirected reachability). Computed in
SX over the fast direct `erel` queries (BFS) — deliberately NOT added as Datalog
closures, to keep the per-query saturation cheap. `lib/relations/tests/shape.sx`.
- [x] **tree/DAG queries**: `common-ancestors` (ancestor-set intersection), `lca`
(lowest common ancestors — a set; tree → singleton, DAG → may be several),
`topo-order` (Kahn-style; nil for cyclic kinds). New `lib/relations/tree.sx`,
computed in SX over `reach`/`ancestors`/`rnode`. `lib/relations/tests/tree.sx`.
- [x] **route enumeration**: `all-paths` (all simple directed paths a→b, not just
the shortest; cycle-safe DFS) in explain.sx. `lib/relations/tests/routes.sx`.
- [x] **bulk lifecycle**: `relate-many!` (batch assert) + `unrelate-node!` (cascade
cleanup: retract every local edge touching a node, all kinds, both directions —
for domain object deletion; leaves federated peer links alone). api.sx,
`lib/relations/tests/bulk.sx`.
- [x] **weakly-connected components**: `component` (the undirected cluster of a
node), `components` (partition of all nodes for a kind), `component-count`. In
tree.sx, reusing `ureach-bfs`. `lib/relations/tests/comp.sx`.
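
Two of the extensions above lend themselves to a short illustration: the greedy component partition and the Kahn-style topological order. The Python sketch below works on plain `(src, dst)` pairs for a single kind; all names are hypothetical, and picking `min(remaining)` as the next seed is an assumption made only to keep the output deterministic.

```python
def undirected_component(edges, node):
    """Undirected BFS: the cluster containing `node`, ignoring edge direction."""
    nbrs = {}
    for s, d in edges:
        nbrs.setdefault(s, set()).add(d)
        nbrs.setdefault(d, set()).add(s)
    seen, frontier = {node}, [node]
    while frontier:
        nxt = []
        for n in frontier:
            for m in nbrs.get(n, ()):
                if m not in seen:
                    seen.add(m)
                    nxt.append(m)
        frontier = nxt
    return seen


def components(edges):
    """Greedy partition: take a remaining node, remove its component, repeat."""
    remaining = {n for e in edges for n in e}
    parts = []
    while remaining:
        comp = undirected_component(edges, min(remaining))
        parts.append(comp)
        remaining -= comp
    return parts


def topo_order(edges):
    """Level by level: place every node whose parents are all placed.
    Returns None when a cycle leaves no node placeable."""
    nodes = {n for e in edges for n in e}
    parents = {n: {s for s, d in edges if d == n} for n in nodes}
    placed, order = set(), []
    while len(order) < len(nodes):
        level = sorted(n for n in nodes if n not in placed and parents[n] <= placed)
        if not level:
            return None
        order.extend(level)
        placed.update(level)
    return order


clusters = components({("a", "b"), ("c", "b"), ("x", "y"), ("l", "l")})
diamond = topo_order({("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")})
```

Here `clusters` holds three components (the `a`/`b`/`c` cluster, the self-looping `l`, and the `x`/`y` pair), and `diamond` lists `a` before `b` and `c`, which both precede `d`.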
## Progress log
- **Extension: weakly-connected components** (158/158). `relations-component`
(the undirected cluster containing a node = `ureach-bfs` from it),
`relations-components` (greedy partition: pop a remaining node, take its
component, repeat) and `relations-component-count`, in tree.sx, + `relations/...`
wrappers. `lib/relations/tests/comp.sx` (11 tests: cluster from either end,
self-loop as its own component, partition contents, count, kind isolation, api).
Engine surface now feels SATURATED — base roadmap + 5 graph-algorithm extensions
cover direct/transitive/undirected reach, paths (shortest + all routes), cycles,
roots/leaves, siblings/degree, ancestors/LCA/topo, components, federation, and
bulk lifecycle. Pacing down.
- **Extension: bulk lifecycle** (147/147). `relations-relate-many!` (batch
`dl-assert!` over a list of (src dst kind) triples) and `relations-unrelate-node!`
(query `rel` for every edge with the node as src or dst, across all kinds, then
`dl-retract!` each — the cascade-cleanup a domain needs when it deletes the
object a node id names). Federated `peer_rel` links are a peer's assertion and
are deliberately left untouched. + `relations/relate-many!`/`unrelate-node!`
wrappers, `lib/relations/tests/bulk.sx` (12 tests: batch assert, cascade across
kinds/both directions, unrelated edges preserved, unknown-node no-op, api layer).
- **Extension: route enumeration** (135/135). `relations-all-paths(db,a,b,kind)`
in explain.sx — every simple (no repeated node) directed path a→b, not just the
shortest one `relations-path` returns; DFS that skips nodes already on the
current path so cyclic data terminates; a=b → `((a))`, no route → `()`. Reuses
engine's `relations-concat-map`/`-eng-member?`/`children-of`. + `relations/all-paths`
wrapper, `lib/relations/tests/routes.sx` (9 tests: two-route diamond, single
route, no route, self, route-through-cycle, route count, kind isolation).
- **Extension: tree/DAG queries** (126/126). New `lib/relations/tree.sx`:
`relations-common-ancestors` (intersection of the two ancestor sets),
`relations-lca` (common ancestors with no other common ancestor reachable below
them — a SET, since a DAG can have several lowest common ancestors; a tree gives
one), `relations-topo-order` (Kahn-style level-by-level: place every node whose
parents are all placed; nil for a cyclic kind) + `relations-nodes` (the `rnode`
set) and `relations/...` wrappers. All in SX over the engine's fast queries —
again no new Datalog closures. `tree.sx` (16 tests) covers diamond common
ancestors, LCA on tree vs converging-DAG, no-common-ancestor, topo validity
(parents precede children), and cyclic-kind → nil.
- **Extension: shape queries** (110/110). Added `relations-siblings`,
`relations-out-degree`/`-in-degree`, `relations-connected?` (+ `relations/...`
current-db wrappers) and `shape.sx` (18 tests). Design note: an earlier attempt
added `sibling`/`uedge`/`ureach` as Datalog rules in the global `relations-rules`;
because every `dl-query` re-saturates the whole program, the extra recursive
undirected closure taxed EVERY query in EVERY suite and the full run blew past
10 min. Reverted the ruleset to the Phase-4 set and compute these in SX instead:
siblings = children-of(parents-of(node)) ∖ node; connected? = undirected BFS
expanding `relations-related` (children ∪ parents) per frontier with a visited
set. No new saturation cost; other suites unaffected. NB: the full 110-test
conformance takes several minutes under shared-machine contention (sibling loops)
— run with `timeout 1200` in the background; individual suites run in seconds.
- **Phase 4 — federation** (92/92). Re-derived acl's trust-gate shape (not
copied). engine.sx now derives the whole engine from an EFFECTIVE relation
`erel` rather than raw `rel`: `erel(S,D,K) :- rel(S,D,K)` (local, always) and
`erel(S,D,K) :- peer_rel(P,S,D,K), trust(P)` (peer link, gated by a local trust
fact). reach/reach_any/rnode/has_parent/has_child all read `erel`, and the
direct-query fns moved into engine.sx to query `erel` too — so with no
peer_rel/trust facts `erel ≡ rel` and Phases 1-3 are unchanged. Trust is a body
literal, re-checked every saturation, so it is non-transitive (only a peer's own
links bind, only under local trust(P)) and revocation is immediate. New
federation.sx: `relations-peer-rel`/`relations-trust` constructors, a mock
fed-sx transport (`relations-fed-fetch`/`-collect` over a peer→links dict),
`relations-fed-build-db` (local facts + pulled peer links), and
`relations-fed-assert!`/`relations-revoke!` over a live db. fed.sx covers
untrusted-link-doesn't-bind, trusted-link-binds (child + federated reachability
+ connecting path through the federated edge), non-transitive trust (peerB's
link inert without trust(peerB)), link revocation, trust revocation (local edge
survives), transport pull with selective trust, and live fed-assert!. The shared
recursive-reachability shape with acl is flagged (Phase 2 note); the trust-gate
is the same convergence — still NOT extracted, per ground rules.
- **Phase 3 — typed relations + path explanation** (70/70). New `explain.sx`:
`relations-path(db,a,b,kind)` is relations' answer to acl's proof tree — the
`reach(K,a,b)` derivation read off as the node chain. lib/datalog/ keeps no
provenance, so the chain is re-derived breadth-first over the saturated edge set
(`relations-children-of` per frontier node), which makes the returned path a SHORTEST
derivation; every consecutive pair is a real `rel` fact (no invented edges) and
a visited set makes cyclic data terminate. `relations-distance` = hops (0 for
a=b, nil if unreachable). Mixed-kind reachability added to engine.sx as a
kind-agnostic `reach_any` closure (`relations-descendants-any`,
`relations-reachable-any?`) — distinct from single-kind `reach`, so tests show a
parent+member graph where a→m is reachable cross-kind but not under `parent`
alone. api/explain grew `relations/path`, `/distance`, `/descendants-any`,
`/reachable-any?` current-db wrappers. path.sx covers shortest-among-many
(a→c→d beats a→b→c→d), direct edge, self path, no-path/disconnected, kind
isolation in paths, mixed vs single kind, and path-out-of-a-cycle. Note: the
dict-mode conformance driver has no per-suite timeout and the shared machine is
contended by sibling loops — a full run can take a few minutes; the path suite
alone runs in <1s.
- **Phase 2 — reachability + cycles** (46/46). New `engine.sx` is one Datalog
ruleset. Reachability is kind-parameterised — `reach(K,X,Y)` carries the kind as
its first arg so a transitive walk over `parent` never leaks through `reply`
edges (the acl inheritance shape, but closures can't cross kinds). `rnode`
collects touched nodes; `root`/`leaf` use stratified negation over
`has_parent`/`has_child`. Cycles are data, not errors: `cycle?(node,kind)` ⇔
`reach(K,node,node)` holds, `acyclic?(kind)` ⇔ no `reach(K,X,X)`. Engine fns:
`relations-descendants/-ancestors/-reachable?/-roots/-leaves/-cycle?/-acyclic?`;
api.sx grew matching `relations/...` current-db wrappers. `relations-rules` and
`relations-pluck` moved from api.sx into engine.sx (engine now loads before api
in conformance.conf). reach.sx covers diamonds, deep chains, disconnected
components, self-loops, c1<->c2 cycles, and multi-kind isolation. acl
convergence: the `reach(X,Y):-edge(X,Y)` / `reach(X,Y):-edge(X,Z),reach(Z,Y)`
closure shape is shared with acl's eff_grant/eff_deny inheritance — flagged, not
extracted (per ground rules).
- **Phase 1 — schema + direct relations** (22/22). `schema.sx`: `rel(Src,Dst,Kind)`
fact constructor + accessors, open kind vocabulary (`parent member reply variant
origin link`), `relations-fact-valid?`/`relations-known-kind?`. `api.sx`: db built
via `dl-program-data facts relations-rules` (Phase 1 rules empty — direct queries
need none); `relations-children-of`/`-parents-of`/`-related` are plain `dl-query`
on the `rel` relation, plucking the bound column from substitution dicts;
current-db convenience layer (`relations/load!`, `relations/relate`,
`relations/unrelate`, `relations/children`/`parents`/`related`) over `dl-assert!`/
`dl-retract!`, mirroring lib/acl/api.sx. Tests cover direct children/parents, leaf/
root empties, kind isolation (parent query skips reply edge), retract, the api
layer, and schema/constructor shape. Note: query result order is nondeterministic
— tests use an order-insensitive `set=?`.
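
Two of the log entries above describe algorithms precisely enough to model: route enumeration (a DFS that skips nodes already on the current path) and LCA as a set (common ancestors with no other common ancestor below them). The Python sketch is an illustration under assumptions, working on `(src, dst)` pairs of one kind; `all_paths`, `ancestors` and `lca` are hypothetical names, not the SX functions.

```python
def all_paths(edges, a, b):
    """Every simple directed path a→b; [[a]] when a == b, [] when no route."""
    succ = {}
    for s, d in edges:
        succ.setdefault(s, []).append(d)

    def walk(node, trail):
        if node == b:
            return [trail]
        found = []
        for nxt in sorted(succ.get(node, ())):
            if nxt not in trail:  # skip nodes already on this path
                found.extend(walk(nxt, trail + [nxt]))
        return found

    return walk(a, [a])


def ancestors(edges, node):
    """Nodes that reach `node` in one or more hops."""
    pred = {}
    for s, d in edges:
        pred.setdefault(d, set()).add(s)
    seen, stack = set(), [node]
    while stack:
        for p in pred.get(stack.pop(), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def lca(edges, x, y):
    """Common ancestors that are not an ancestor of another common ancestor."""
    common = ancestors(edges, x) & ancestors(edges, y)
    return {
        c for c in common
        if not any(c in ancestors(edges, o) for o in common if o != c)
    }


diamond = {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")}
tree = {("r", "p"), ("p", "x"), ("p", "y")}
converging = {("p1", "x"), ("p1", "y"), ("p2", "x"), ("p2", "y")}
```

The diamond yields two routes from `a` to `d`; the tree gives a single lowest common ancestor for `x` and `y`, while the converging DAG gives two, which is why the result is a set.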
## Blockers
(loop fills this in)