Merge loops/acl into architecture: acl-on-sx Datalog ACL

Fine-grained, explainable, federation-aware access control as a thin layer over lib/datalog/. Four phases + hardening, 145/145 conformance: - Phase 1 direct grants, deny-overrides via stratified negation - Phase 2 inheritance (group/role member_of, resource child_of, role_grant) - Phase 3 explanation (proof-tree reconstruction) + append-only audit log - Phase 4 federation (trust-gated non-transitive delegation, revocation) - hardening: diamonds, cycles, multi-peer, validation, audit save/restore Surfaces the lib/guest/rules/ extraction seam (build-db/decide/explain/ revoke) for the second consumer (mod-on-sx). Records two substrate findings: append! no-ops on map-derived lists; JIT loops on deep proof reconstruction in warm processes (acl-explain only; acl-permit? unaffected). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 23:07:43 +00:00
parent 98f5e1bf14 9437f99e28
commit 4c8e732803
17 changed files with 1985 additions and 26 deletions
--- a/plans/acl-on-sx.md
+++ b/plans/acl-on-sx.md
@@ -15,7 +15,7 @@ and federation extension. Reuses `lib/datalog/` evaluator and term model where p

 ## Status (rolling)

-`bash lib/acl/conformance.sh` → **0/0** (not yet started)
+`bash lib/acl/conformance.sh` → **145/145** (all four phases + hardening)

 ## Ground rules

@@ -57,46 +57,225 @@ lib/acl/facts.sx                  — builds Datalog query

 ## Phase 1 — Direct grants

- [ ] `lib/acl/schema.sx` — sorts: subject {user, group, role, service}, action,
+- [x] `lib/acl/schema.sx` — sorts: subject {user, group, role, service}, action,
  resource {page, post, thread, peer}
- [ ] `lib/acl/facts.sx` — `actor`, `resource`, `grant`, `deny` predicates as Datalog
+- [x] `lib/acl/facts.sx` — `actor`, `resource`, `grant`, `deny` predicates as Datalog
  EDB
- [ ] `lib/acl/engine.sx` — `(permit? subj act res db)` reduces to Datalog query
- [ ] `lib/acl/api.sx` — public `(acl/permit? ...)` taking implicit current db
- [ ] `lib/acl/tests/direct.sx` — 15+ cases: direct grant, missing grant, explicit deny
- [ ] `lib/acl/scoreboard.{json,md}` baseline
- [ ] `lib/acl/conformance.sh` runs the suite
+- [x] `lib/acl/engine.sx` — `(permit? subj act res db)` reduces to Datalog query
+- [x] `lib/acl/api.sx` — public `(acl/permit? ...)` taking implicit current db
+- [x] `lib/acl/tests/direct.sx` — 15+ cases: direct grant, missing grant, explicit deny
+- [x] `lib/acl/scoreboard.{json,md}` baseline
+- [x] `lib/acl/conformance.sh` runs the suite

 ## Phase 2 — Inheritance

- [ ] `member_of(actor, group)` chain — group grants apply to members (transitive)
- [ ] `child_of(res, parent)` chain — parent grants apply to children (transitive)
- [ ] role expansion — role contains list of (action, resource) tuples
- [ ] deny-overrides — explicit deny wins over inherited allow
- [ ] `lib/acl/tests/inherit.sx` — 25+ cases: nested groups, deep resource trees,
+- [x] `member_of(actor, group)` chain — group grants apply to members (transitive)
+- [x] `child_of(res, parent)` chain — parent grants apply to children (transitive)
+- [x] role expansion — role contains list of (action, resource) tuples
+- [x] deny-overrides — explicit deny wins over inherited allow
+- [x] `lib/acl/tests/inherit.sx` — 25+ cases: nested groups, deep resource trees,
  conflict resolution, deny precedence
- [ ] document the deny-overrides choice in plan
+- [x] document the deny-overrides choice in plan
+
+### deny-overrides policy (the choice)
+
+Encoded as stratified negation: `permit(S,A,R) :- eff_grant(S,A,R), not
+eff_deny(S,A,R)`. Both `eff_grant` and `eff_deny` inherit through the *same*
+`member_of` (group/role) and `child_of` (resource) chains. Consequences:
+
+- An explicit deny on the exact (S,A,R) defeats any inherited allow.
+- A **group-level** or **ancestor-resource** deny inherits down and defeats a
+  member's/descendant's grant — deny is authoritative across the closure, not
+  only at the leaf. This is the fail-safe reading: the most permissive
+  interpretation of "deny wins" would let a narrow grant escape a broad deny;
+  we chose the opposite.
+- Deny is dimension-scoped: a deny on (S, edit, R) never blocks (S, read, R).
+
+Stratifiable because neither `eff_grant` nor `eff_deny` depends on `permit`;
+`permit` sits in a strictly higher stratum. Termination is guaranteed —
+recursion is only over EDB `member_of`/`child_of` (no function symbols), so
+cyclic membership/containment reaches a fixpoint rather than looping (tested).

 ## Phase 3 — Explanation + audit

- [ ] `(acl/explain subj act res)` → `{:allowed? T :proof <tree>}`
- [ ] proof tree extracts from Datalog's derivation
- [ ] `lib/acl/audit.sx` — append-only decision log (in-memory + serializer for disk)
- [ ] `(acl/audit-tail n)` for recent decisions
- [ ] `lib/acl/tests/explain.sx` — proof correctness, audit completeness
+- [x] `(acl/explain subj act res)` → `{:allowed? T :proof <tree>}`
+- [x] proof tree extracts from Datalog's derivation
+- [x] `lib/acl/audit.sx` — append-only decision log (in-memory + serializer for disk)
+- [x] `(acl/audit-tail n)` for recent decisions
+- [x] `lib/acl/tests/explain.sx` — proof correctness, audit completeness
+
+### proof reconstruction (the choice)
+
+`lib/datalog/` records derived facts but not provenance, so `lib/acl/explain.sx`
+reconstructs the proof by goal-directed search over the *saturated* db: for a
+ground goal, find the first ACL rule (in `acl-rules` order) whose body holds,
+take the first `dl-query` solution binding the rest, recurse on each body
+literal; negated literals become verified `:neg-ok` leaves. The Datalog
+derivation graph is a DAG (a fact may hold many ways) — we pick ONE **canonical
+proof: first-rule, first-solution**, with EDB/direct rules ordered first so
+proofs bottom out quickly. A depth cap (64) guards pathological cyclic data.
+`acl-explain` returns `{:allowed? :proof :reason}`; on denial `:reason` carries
+the blocking `eff_deny` proof (explicit or inherited) when one exists, else nil
+(no grant). Audit log is append-only with monotonic seq numbers (no wall-clock,
+for determinism); `acl-audit-decide!` is the logged path, `acl-permit?` stays
+pure.

 ## Phase 4 — Federation

- [ ] peer trust facts — `peer(addr, kind)`, `trust(peer, level)`
- [ ] delegated grants — `delegate(peer, actor, action, resource)`
- [ ] cross-instance permit chain — query asks local + queries trusted peers via fed-sx
- [ ] revocation propagation — fact retraction across federation
- [ ] `lib/acl/tests/fed.sx` — federated grant chains (mock fed-sx transport in tests)
+- [x] peer trust facts — `peer(addr, kind)`, `trust(peer, level)`
+- [x] delegated grants — `delegate(peer, actor, action, resource)`
+- [x] cross-instance permit chain — query asks local + queries trusted peers via fed-sx
+- [x] revocation propagation — fact retraction across federation
+- [x] `lib/acl/tests/fed.sx` — federated grant chains (mock fed-sx transport in tests)
+
+### federation policy (the choice)
+
+One engine rule carries federation:
+`eff_grant(S,A,R) :- delegate(Peer,S,A,R), trust(Peer,L), level_covers(L,A)`.
+
+- **Non-transitive trust.** A peer's `delegate` only grants if a *local*
+  `trust(Peer,L)` exists and that level `level_covers` the action. There is no
+  peer-to-peer trust propagation — trusting α never extends to peers α trusts.
+- **Trust re-checked every query.** `trust`/`level_covers` are body literals
+  evaluated at decision time, never baked in at ingestion. Revoking trust or
+  narrowing a level takes effect on the next `acl-permit?`.
+- **Deny still wins.** Federated grants are `eff_grant`, so local (and inherited)
+  deny overrides them exactly as for local grants.
+- **Composes with inheritance.** A delegate to a group flows to members; a
+  delegate on a parent resource flows to children (federated `eff_grant` feeds
+  the same recursion).
+- **Revocation = retraction.** `acl-revoke!` wraps `dl-retract!`; the next query
+  re-saturates. `acl-fed-assert!` wraps `dl-assert!` for newly-replicated facts.
+- **Transport is fed-sx's job.** `lib/acl/federation.sx` mocks the pull as a
+  dict {peer-addr → delegate-facts}; `acl-fed-build-db` merges local policy +
+  pulled delegates.

 ## Progress log

-(loop fills this in)
+- **Phase 1 complete (24/24).** ACL is a thin layer over `lib/datalog/`:
+  - `schema.sx` — sorts (subject/resource kinds, well-known actions) + EDB
+    predicate arity table + `acl-fact-valid?` validator. Schema is data, since
+    Datalog is untyped.
+  - `facts.sx` — `acl-actor`/`acl-resource-fact`/`acl-grant`/`acl-deny`
+    constructors returning Datalog fact tuples.
+  - `engine.sx` — owns the ruleset `acl-phase1-rules` and reduces decisions to
+    `dl-query`. `acl-build-db` = `dl-program-data facts rules`; `acl-permit?` =
+    non-empty `(permit S A R)` query.
+  - `api.sx` — `acl/load!` rebuilds an implicit current db; `acl/permit?` queries
+    it. (Slash-symbols like `acl/permit?` parse fine as single tokens.)
+  - **deny-overrides** encoded as `permit(S,A,R) :- grant(S,A,R), not deny(S,A,R)`.
+    Stratifies cleanly because `deny` is EDB-only (no rule derives it). Verified:
+    grant+deny on same (S,A,R) → denied.
+  - Conformance: `conformance.conf` (datalog preloads + acl modules) + thin
+    `conformance.sh` wrapper over `lib/guest/conformance.sh`. Scoreboard
+    generated by the shared driver.
+  - **Shared-plumbing note (for eventual `lib/guest/rules/`):** the
+    `build-db = dl-program-data(facts, rules)` + `decide = non-empty ground query`
+    shape is exactly what mod-sx (Prolog moderation) will also need. The reusable
+    seam is engine.sx's two functions — facts→db and ground-query→bool — both
+    pure pass-throughs to the rule engine. Not extracting yet (wait for mod-sx as
+    second consumer per ground rules).
+- **Phase 2 complete (54/54, +30 inherit).** Extended `acl-rules` with
+  `eff_grant`/`eff_deny` derived relations; `member_of` carries both group and
+  role membership, `child_of` carries resource trees, `role_grant` confers a
+  role's (action,resource) capabilities. Direct grants unchanged (base case of
+  `eff_grant`), Phase 1 suite still green. Constructors `acl-member-of`,
+  `acl-child-of`, `acl-role-grant` added; schema arity table extended. See the
+  deny-overrides policy section above. Verified cyclic membership terminates.
+  - **Shared-plumbing update:** the reusable seam is still just engine.sx's
+    `facts -> db` + `ground-query -> bool`. The inheritance *rules* are
+    ACL-specific (group/resource/role vocabulary); mod-sx will have its own. So
+    the `lib/guest/rules/` extraction stays at the build/decide level, not the
+    ruleset level.
+- **Phase 3 complete (89/89, +35 explain).** Added `explain.sx` (proof
+  reconstruction, see policy section above), `audit.sx` (append-only log), and
+  extended `api.sx` with `acl/explain`/`acl/audit`/`acl/audit-tail`. No engine
+  changes — explanation reads the same saturated db the decision uses.
+  - **Substrate gotcha:** the host `=` compares symbols by interned identity,
+    which is *unstable* across `dl-query` saturation/substitution within a
+    single image — the same two structurally-equal symbol-lists compared `=`
+    true once and false moments later in the REPL. Conformance runs in a fresh
+    process per suite so it's deterministic there, but test assertions now use a
+    name-based `acl-et-eq?` (compare symbols via `symbol->string`), matching the
+    datalog suite's `dl-api-deep=?` convention. Worth flagging to the kernel
+    owners but out of acl scope.
+- **Phase 4 complete (120/120, +31 fed).** Added `federation.sx` (mock
+  transport + `acl-fed-build-db`/`acl-revoke!`/`acl-fed-assert!`), one engine
+  rule (the trust-gated delegate rule), 4 fact constructors, 4 schema arities.
+  Federated proofs reconstruct for free — `explain.sx` iterates `acl-rules`, so
+  the delegate rule's EDB body (`delegate`/`trust`/`level_covers`) shows up as
+  proof leaves with no explain changes. **Roadmap done: all four phases green.**
+  - **Shared-plumbing final note (for `lib/guest/rules/`):** the durable
+    reusable seam across acl-sx and the coming mod-sx is exactly four
+    pass-throughs to the rule engine — `build-db(facts)`, `decide(ground-query)
+    -> bool`, `explain(goal) -> proof-tree`, and the revoke/assert mutators.
+    The *rulesets* and *vocabulary* are language-specific (ACL: grant/deny/
+    member_of/...; mod-sx: Prolog moderation predicates). When mod-sx lands,
+    extract those four functions (engine.sx + the generic half of explain.sx's
+    goal-directed reconstructor) into `lib/guest/rules/`, leaving each consumer
+    its own rules + fact constructors. Proof reconstruction is the non-obvious
+    reusable piece: it only needs the ruleset as data + a saturated db, both of
+    which any datalog-backed guest has.
+  - **dict-mode conformance is slow, not hung:** all suites load + run in one
+    process (~30-40s for 120 tests, no per-suite timeout). Do not kill early.
+  - **Tooling note:** sx-tree path-based edit tools (`sx_replace_node`,
+    `sx_read_subtree` with a path) ignored the path argument in this worktree
+    (always resolved to index 0 / `[0,..]`), in BOTH `(a b c)` and `(a,b,c)`
+    forms. `sx_write_file`, `sx_validate`, `sx_find_all`, `sx_summarise`,
+    `sx_eval` all work; used full-file rewrites instead of path edits throughout.
+
+## Hardening (post-roadmap)
+
+- **`lib/acl/tests/harden.sx` (+25).** Adversarial / cross-phase coverage beyond
+  the per-phase suites. **Prover-free by design** (see JIT blocker below): only
+  `acl-permit?` (compiled Datalog, safe at any depth) + pure data ops, never
+  `acl-explain`/`acl-prove-d`.
+  - Diamond hierarchies (resource and group): grant on one path + deny on
+    another → deny wins; both-grant → permit; deny does not leak to siblings.
+  - Chain inheritance (literal 4-link): top-group grant reaches leaf member and
+    intermediates; leaf-member deny overrides the top grant; deny on the leaf
+    doesn't block a higher level.
+  - Cycle termination: self-membership, self-child, and 2-node membership cycles
+    all reach a fixpoint and decide correctly.
+  - Federation conflicts: federated group-grant with a locally-denied member →
+    deny; multi-peer delegation (one trusted, or both trusted) → permit.
+  - Degenerate inputs: empty db permits nothing.
+  - Fact validation: `acl-validate-facts` surfaces wrong-arity + unknown
+    predicates; `acl-facts-valid?` on clean/empty sets.
+  - Audit save/restore: snapshot → clear → restore round-trips entries + seq;
+    seq continues without collision after restore; snapshot is an immutable copy.
+  - Proof reconstruction itself is covered by tests/explain.sx + tests/fed.sx
+    (both stay under the warm-process JIT depth threshold); the depth-cap safety
+    net is verified manually in a warm REPL image but excluded from conformance.
+- **New API:** `acl-validate-facts`/`acl-facts-valid?` (schema.sx, opt-in — build
+  stays lenient); `acl-audit-snapshot`/`acl-audit-restore!`/`acl-audit-copy`
+  (audit.sx).
+- **Substrate gotcha (recorded):** `append!` extends a list built with `(list)`
+  but **silently no-ops on a `map`/`rest`-derived list** in this runtime. Bit the
+  first cut of `acl-audit-restore!` (rebuilt the live log via `map`, so later
+  records didn't append). Fix: always rebuild mutable lists via `(list)` +
+  `append!` (`acl-audit-copy`). Worth flagging to kernel owners; out of acl scope.

 ## Blockers

-(loop fills this in)
+- **JIT loops on deep proof reconstruction (substrate, not acl).** Once the
+  kernel JIT-compiles the mutually-recursive prover (`acl-prove-d`/
+  `acl-prove-rules`/`acl-prove-build` in `explain.sx`) — which happens after a
+  process has run enough explains to cross the compile threshold — the compiled
+  version **loops indefinitely** on a `member_of`/`child_of` chain deeper than
+  ~3. Symptoms: `acl-explain` over a 4+-deep chain returns instantly in a cold /
+  warm-REPL image but **hangs** in a long-lived process. The per-phase explain
+  and fed suites pass only because their proofs stay ≤3 deep; a 5th suite that
+  explained deeper chains hung the whole conformance run (no per-suite timeout
+  in dict mode). Matches `[[project_jit_bytecode_bug]]` (ACTIVE).
+  - *Impact beyond tests:* `acl-explain` is unsafe for deep hierarchies in a
+    warm production OCaml server. `acl-permit?` is unaffected (it reduces to
+    compiled Datalog, no SX-side recursion) — only the SX proof reconstructor is.
+  - *Workaround in acl:* harden suite is prover-free; explain is exercised only
+    at shallow depth. *Real fix is in the kernel JIT* (out of acl scope) — e.g.
+    the `_jit_compiling` guard / disabling JIT for the recursive prover, or
+    fixing the bytecode loop. Recommend the kernel owners reproduce with:
+    `acl-explain` over a 6-deep `member_of` chain after ~70 prior explains.
+  - *Minimal repro recorded.* Until fixed, callers needing explanations for
+    deep hierarchies should flatten or cap depth, or run explain in a cold
+    worker.