# relations-on-sx: Cross-domain relationship graph on Datalog

rose-ash's internal `relations` service tracks parent/child and peer relationships
*across* domains — a blog post's comments, a thread's replies, a product's variants,
an order's line items, a resource's containment tree, federated content's origin.
The questions are graph questions: who are X's children? its ancestors? is A
reachable from B? what's the chain that connects them? is there a cycle?

That is recursive Datalog in one rule — the same bottom-up reachability `acl-on-sx`
uses for group/resource inheritance. Decisions come with a **trace**: not just "yes,
related," but the path that proves it.

relations is an **internal-only** service (no public URL); other domains call it to
resolve hierarchy and linkage.

End-state: a Datalog-on-SX layer for typed relationship facts, with reachability,
path explanation, cycle detection, and a federation extension for cross-instance
links. Reuses `lib/datalog/` — does not reimplement the engine.

## Status (rolling)

`bash lib/relations/conformance.sh` → **158/158** (Phases 1–4 complete + extensions)

## Ground rules

- **Scope:** only `lib/relations/**` and `plans/relations-on-sx.md`. Do **not** edit
  `spec/`, `hosts/`, `shared/`, `lib/datalog/**`, or any other `lib/` subdirectory.
  You may **import** from `lib/datalog/` (public API in `lib/datalog/datalog.sx`);
  do not copy or modify Datalog.
- **Shared-file issues** → "Blockers" with a minimal repro; do not fix here.
- **SX files:** `sx-tree` MCP tools only; `sx_validate` after every edit.
- **Architecture:** relationships are `rel(Src, Dst, Kind)` Datalog facts;
  reachability/ancestry are recursive rules; the proof tree is the connecting path;
  the lifecycle (assert/retract) is an SX layer over the db. Keep relations
  *content-agnostic* — a node is an opaque id string; domains own what ids mean.
- **Shared with acl-sx:** both run on Datalog and both lean on the same
  recursive-reachability shape (`reach(X,Y) :- edge(X,Y).` /
  `reach(X,Y) :- edge(X,Z), reach(Z,Y).`). Watch for it; flag convergence in the
  Progress log, but **do not extract** — `plans/mod-on-sx.md` records why
  cross-subsystem extraction waits for the architecture integrator with all
  consumers in view.
- **Commits:** one feature per commit. Keep Progress log updated and tick boxes.

## Architecture sketch

```
relate(src, dst, kind)              query
        │                             │
        ▼                             ▼
lib/relations/schema.sx         lib/relations/engine.sx
 — rel(Src,Dst,Kind) facts       — children/parents/ancestors/descendants
 — kind vocabulary               — reachable?(A,B), cycle?(X)
        │                             ▲
        ▼                             │
lib/relations/api.sx            lib/relations/explain.sx
 — relate / unrelate             — path(A,B): the connecting chain
 — registry over a live db         (from the Datalog derivation)
        │
        ▼
lib/relations/federation.sx
 — cross-instance links via fed-sx (replicated rel facts, peer-trust gated)
```

## Phase 1 — Schema + direct relations

- [x] `lib/relations/schema.sx` — `rel(Src, Dst, Kind)` fact projection; a small kind
  vocabulary (`parent`, `member`, `reply`, `variant`, `origin`, …) kept open
- [x] `lib/relations/api.sx` — `(relations/relate src dst kind)` / `(unrelate …)` over
  a live Datalog db (assert/retract); `(children-of db node kind)`,
  `(parents-of db node kind)`, `(related db node kind)`
- [x] `lib/relations/tests/direct.sx` — assert/retract, direct children/parents, kind
  filtering, unknown node → empty
- [x] `lib/relations/conformance.sh` + scoreboard

## Phase 2 — Reachability + cycles

- [x] recursive reachability rules: `ancestors`, `descendants`, `reachable?(A,B)`
  (transitive closure over a kind, the acl inheritance shape)
- [x] `roots` / `leaves` (no parents / no children) for a kind
- [x] cycle detection: `cycle?(X)` ⇔ `reachable(X, X)`; `acyclic?(db, kind)`
- [x] `lib/relations/tests/reach.sx` — deep chains, diamonds, disconnected nodes,
  self-loops, multi-kind isolation

## Phase 3 — Typed relations + path explanation

- [x] multiple kinds coexisting; mixed-kind vs single-kind reachability
- [x] `lib/relations/explain.sx` — `(path db a b kind)` returns the connecting chain
  (the relationship equivalent of acl's proof tree), nil if unreachable
- [x] `(distance db a b kind)` (hops) + shortest-path selection
- [x] `lib/relations/tests/path.sx` — path correctness, shortest among many, no-path

## Phase 4 — Federation

- [x] cross-instance relationships — a peer asserts `rel(local, remote, kind)`;
  replicate rel facts via fed-sx (mock the transport in tests)
- [x] trust gating — a peer's link binds locally only under a local trust fact
  (mirror acl's non-transitive `trust`/gate-in-engine model; do NOT copy acl code,
  re-derive the shape)
- [x] revocation — retract a replicated link; reachability re-saturates
- [x] `lib/relations/tests/fed.sx` — federated reachability chains, trust gating,
  revocation

## Extensions (post-roadmap)

- [x] **shape queries** — `siblings` (nodes sharing a parent),
  `out-degree`/`in-degree`, weakly-connected `connected?` (undirected reachability).
  Computed in SX over the fast direct `erel` queries (BFS) — deliberately NOT added
  as Datalog closures, to keep the per-query saturation cheap.
  `lib/relations/tests/shape.sx`.
- [x] **tree/DAG queries** — `common-ancestors` (ancestor-set intersection), `lca`
  (lowest common ancestors — a set; tree → singleton, DAG → may be several),
  `topo-order` (Kahn-style; nil for cyclic kinds). New `lib/relations/tree.sx`,
  computed in SX over `reach`/`ancestors`/`rnode`. `lib/relations/tests/tree.sx`.
- [x] **route enumeration** — `all-paths` (all simple directed paths a→b, not just
  the shortest; cycle-safe DFS) in explain.sx. `lib/relations/tests/routes.sx`.
- [x] **bulk lifecycle** — `relate-many!` (batch assert) + `unrelate-node!` (cascade
  cleanup: retract every local edge touching a node, all kinds, both directions —
  for domain object deletion; leaves federated peer links alone). api.sx,
  `lib/relations/tests/bulk.sx`.
- [x] **weakly-connected components** — `component` (the undirected cluster of a
  node), `components` (partition of all nodes for a kind), `component-count`. In
  tree.sx, reusing `ureach-bfs`. `lib/relations/tests/comp.sx`.

## Progress log

- **Extension: weakly-connected components** (158/158). `relations-component` (the
  undirected cluster containing a node = `ureach-bfs` from it),
  `relations-components` (greedy partition: pop a remaining node, take its
  component, repeat) and `relations-component-count`, in tree.sx, + `relations/...`
  wrappers. `lib/relations/tests/comp.sx` (11 tests: cluster from either end,
  self-loop as its own component, partition contents, count, kind isolation, api).
  Engine surface now feels SATURATED — base roadmap + 5 graph-algorithm extensions
  cover direct/transitive/undirected reach, paths (shortest + all routes), cycles,
  roots/leaves, siblings/degree, ancestors/LCA/topo, components, federation, and
  bulk lifecycle. Pacing down.
- **Extension: bulk lifecycle** (147/147). `relations-relate-many!` (batch
  `dl-assert!` over a list of (src dst kind) triples) and `relations-unrelate-node!`
  (query `rel` for every edge with the node as src or dst, across all kinds, then
  `dl-retract!` each — the cascade-cleanup a domain needs when it deletes the object
  a node id names). Federated `peer_rel` links are a peer's assertion and are
  deliberately left untouched. + `relations/relate-many!`/`unrelate-node!` wrappers,
  `lib/relations/tests/bulk.sx` (12 tests: batch assert, cascade across kinds/both
  directions, unrelated edges preserved, unknown-node no-op, api layer).
- **Extension: route enumeration** (135/135). `relations-all-paths(db,a,b,kind)` in
  explain.sx — every simple (no repeated node) directed path a→b, not just the
  shortest one `relations-path` returns; DFS that skips nodes already on the current
  path so cyclic data terminates; a=b → `((a))`, no route → `()`. Reuses engine's
  `relations-concat-map`/`-eng-member?`/`children-of`.
  Plus a `relations/all-paths` wrapper and `lib/relations/tests/routes.sx` (9 tests:
  two-route diamond, single route, no route, self, route-through-cycle, route count,
  kind isolation).
- **Extension: tree/DAG queries** (126/126). New `lib/relations/tree.sx`:
  `relations-common-ancestors` (intersection of the two ancestor sets),
  `relations-lca` (common ancestors with no other common ancestor reachable below
  them — a SET, since a DAG can have several lowest common ancestors; a tree gives
  one), `relations-topo-order` (Kahn-style level-by-level: place every node whose
  parents are all placed; nil for a cyclic kind) + `relations-nodes` (the `rnode`
  set) and `relations/...` wrappers. All in SX over the engine's fast queries —
  again no new Datalog closures. `tree.sx` (16 tests) covers diamond common
  ancestors, LCA on tree vs converging-DAG, no-common-ancestor, topo validity
  (parents precede children), and cyclic-kind → nil.
- **Extension: shape queries** (110/110). Added `relations-siblings`,
  `relations-out-degree`/`-in-degree`, `relations-connected?` (+ `relations/...`
  current-db wrappers) and `shape.sx` (18 tests). Design note: an earlier attempt
  added `sibling`/`uedge`/`ureach` as Datalog rules in the global `relations-rules`;
  because every `dl-query` re-saturates the whole program, the extra recursive
  undirected closure taxed EVERY query in EVERY suite and the full run blew past
  10 min. Reverted the ruleset to the Phase-4 set and compute these in SX instead:
  siblings = children-of(parents-of(node)) − node; connected? = undirected BFS
  expanding `relations-related` (children ∪ parents) per frontier with a visited
  set. No new saturation cost; other suites unaffected. NB: the full 110-test
  conformance takes several minutes under shared-machine contention (sibling
  loops) — run with `timeout 1200` in the background; individual suites run in
  seconds.
- **Phase 4 — federation** (92/92). Re-derived acl's trust-gate shape (not copied).
  engine.sx now derives the whole engine from an EFFECTIVE relation `erel` rather
  than raw `rel`: `erel(S,D,K) :- rel(S,D,K)` (local, always) and
  `erel(S,D,K) :- peer_rel(P,S,D,K), trust(P)` (peer link, gated by a local trust
  fact). reach/reach_any/rnode/has_parent/has_child all read `erel`, and the
  direct-query fns moved into engine.sx to query `erel` too — so with no
  peer_rel/trust facts `erel ≡ rel` and Phases 1–3 are unchanged. Trust is a body
  literal, re-checked every saturation, so it is non-transitive (only a peer's own
  links bind, only under local trust(P)) and revocation is immediate. New
  federation.sx: `relations-peer-rel`/`relations-trust` constructors, a mock fed-sx
  transport (`relations-fed-fetch`/`-collect` over a peer→links dict),
  `relations-fed-build-db` (local facts + pulled peer links), and
  `relations-fed-assert!`/`relations-revoke!` over a live db. fed.sx covers
  untrusted-link-doesn't-bind, trusted-link-binds (child + federated reachability +
  connecting path through the federated edge), non-transitive trust (peerB's link
  inert without trust(peerB)), link revocation, trust revocation (local edge
  survives), transport pull with selective trust, and live fed-assert!. The shared
  recursive-reachability shape with acl is flagged (Phase 2 note); the trust-gate is
  the same convergence — still NOT extracted, per ground rules.
- **Phase 3 — typed relations + path explanation** (70/70). New `explain.sx`:
  `relations-path(db,a,b,kind)` is relations' answer to acl's proof tree — the
  `reach(K,a,b)` derivation read off as the node chain. lib/datalog/ keeps no
  provenance, so the chain is re-derived breadth-first over the saturated edge set
  (`relations-children-of` per frontier node), which makes the returned path a
  SHORTEST derivation; every consecutive pair is a real `rel` fact (no invented
  edges) and a visited set makes cyclic data terminate. `relations-distance` = hops
  (0 for a=b, nil if unreachable).
  Mixed-kind reachability added to engine.sx as a kind-agnostic `reach_any` closure
  (`relations-descendants-any`, `relations-reachable-any?`) — distinct from
  single-kind `reach`, so tests show a parent+member graph where a→m is reachable
  cross-kind but not under `parent` alone. api/explain grew `relations/path`,
  `/distance`, `/descendants-any`, `/reachable-any?` current-db wrappers. path.sx
  covers shortest-among-many (a→c→d beats a→b→c→d), direct edge, self path,
  no-path/disconnected, kind isolation in paths, mixed vs single kind, and
  path-out-of-a-cycle. Note: the dict-mode conformance driver has no per-suite
  timeout and the shared machine is contended by sibling loops — a full run can take
  a few minutes; the path suite alone runs in <1s.
- **Phase 2 — reachability + cycles** (46/46). New `engine.sx` is one Datalog
  ruleset. Reachability is kind-parameterised — `reach(K,X,Y)` carries the kind as
  its first arg so a transitive walk over `parent` never leaks through `reply` edges
  (the acl inheritance shape, but closures can't cross kinds). `rnode` collects
  touched nodes; `root`/`leaf` use stratified negation over
  `has_parent`/`has_child`. Cycles are data, not errors: `cycle?(node,kind)` ⇔
  `reach(K,node,node)` holds, `acyclic?(kind)` ⇔ no `reach(K,X,X)`. Engine fns:
  `relations-descendants/-ancestors/-reachable?/-roots/-leaves/-cycle?/-acyclic?`;
  api.sx grew matching `relations/...` current-db wrappers. `relations-rules` and
  `relations-pluck` moved from api.sx into engine.sx (engine now loads before api in
  conformance.conf). reach.sx covers diamonds, deep chains, disconnected components,
  self-loops, c1<->c2 cycles, and multi-kind isolation. acl convergence: the
  `reach(X,Y):-edge(X,Y)` / `reach(X,Y):-edge(X,Z),reach(Z,Y)` closure shape is
  shared with acl's eff_grant/eff_deny inheritance — flagged, not extracted (per
  ground rules).
- **Phase 1 — schema + direct relations** (22/22).
  `schema.sx`: `rel(Src,Dst,Kind)` fact constructor + accessors, open kind
  vocabulary (`parent member reply variant origin link`),
  `relations-fact-valid?`/`relations-known-kind?`. `api.sx`: db built via
  `dl-program-data facts relations-rules` (Phase 1 rules empty — direct queries need
  none); `relations-children-of`/`-parents-of`/`-related` are plain `dl-query` on the
  `rel` relation, plucking the bound column from substitution dicts; current-db
  convenience layer (`relations/load!`, `relations/relate`, `relations/unrelate`,
  `relations/children`/`parents`/`related`) over `dl-assert!`/`dl-retract!`,
  mirroring lib/acl/api.sx. Tests cover direct children/parents, leaf/root empties,
  kind isolation (parent query skips reply edge), retract, the api layer, and
  schema/constructor shape. Note: query result order is nondeterministic — tests use
  an order-insensitive `set=?`.

## Blockers

(loop fills this in)
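
## Appendix: reachability sketch (illustrative)

The Progress log describes the engine through three ideas: an effective relation
`erel` gated by local trust, a kind-parameterised closure `reach(K,X,Y)`, and a
breadth-first walk that yields a shortest connecting path. The Python below is a
minimal sketch of those ideas only. It is not the SX implementation and does not
use the `lib/datalog/` API. Every name in it (`effective_edges`, `saturate_reach`,
`has_cycle`, `shortest_path`, the sample data) is hypothetical, facts are modelled
as plain tuples, and the fixpoint is a naive one that may differ from the strategy
the real engine uses.

```python
from collections import deque


def effective_edges(rel, peer_rel, trust):
    """erel(S,D,K) :- rel(S,D,K).
    erel(S,D,K) :- peer_rel(P,S,D,K), trust(P)."""
    gated = {(s, d, k) for (p, s, d, k) in peer_rel if p in trust}
    return set(rel) | gated


def saturate_reach(erel):
    """Naive bottom-up fixpoint; the kind is carried as the first argument."""
    # Base rule: reach(K,X,Y) :- erel(X,Y,K).
    reach = {(k, s, d) for (s, d, k) in erel}
    changed = True
    while changed:
        changed = False
        # Recursive rule: reach(K,X,Y) :- erel(X,Z,K), reach(K,Z,Y).
        for (x, z, k) in erel:
            for (k2, z2, y) in list(reach):  # snapshot: reach grows in the loop
                if k2 == k and z2 == z and (k, x, y) not in reach:
                    reach.add((k, x, y))
                    changed = True
    return reach


def has_cycle(reach, node, kind):
    """cycle?(node, kind) holds exactly when reach(K, node, node) was derived."""
    return (kind, node, node) in reach


def shortest_path(erel, a, b, kind):
    """Breadth-first search over direct children of one kind.

    Returns the node chain from a to b, or None if b is unreachable.
    Treating a == b as the one-node path [a] is an assumption of this sketch.
    """
    if a == b:
        return [a]
    children = {}
    for (s, d, k) in erel:
        if k == kind:
            children.setdefault(s, []).append(d)
    visited = {a}  # the visited set makes cyclic data terminate
    queue = deque([[a]])
    while queue:
        path = queue.popleft()
        for child in sorted(children.get(path[-1], [])):
            if child in visited:
                continue
            if child == b:
                return path + [child]
            visited.add(child)
            queue.append(path + [child])
    return None


# Hypothetical sample data: a diamond-ish local graph plus two peer links.
REL = {("a", "b", "parent"), ("a", "c", "parent"), ("b", "c", "parent"),
       ("c", "d", "parent"), ("d", "m", "member")}
PEER_REL = {("peerA", "d", "x", "parent"), ("peerB", "d", "y", "parent")}
TRUST = {"peerA"}  # peerB is not trusted, so its link must stay inert

erel = effective_edges(REL, PEER_REL, TRUST)
reach = saturate_reach(erel)
print(shortest_path(erel, "a", "x", "parent"))  # ['a', 'c', 'd', 'x']
```

In this sample the `peerB` link never binds because only `peerA` is trusted, and
the `member` edge never enters the `parent` closure. Passing an empty trust set
makes `effective_edges` return the local facts unchanged, which mirrors the log's
observation that without trust facts `erel ≡ rel`.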