Files
rose-ash/plans/acl-on-sx.md
giles 9437f99e28
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 35s
acl: hardening suite (+25) — diamonds, cycles, validation, audit save/restore
New adversarial/cross-phase coverage: diamond resource+group hierarchies
(deny wins per path), chain inheritance + leaf deny, cycle termination,
multi-peer delegation, fact validation, audit snapshot/restore round-trip.
Adds acl-validate-facts/acl-facts-valid? (schema) and acl-audit-snapshot/
restore!/copy (audit). Fixed acl-audit-restore! rebuilding the live log via
map (append! silently no-ops on map-derived lists).

Suite is prover-free: a substrate JIT bug loops the recursive proof
reconstructor on deep chains in warm processes (documented in Blockers);
acl-permit? is unaffected. 145/145.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 22:44:28 +00:00

16 KiB
Raw Blame History

acl-on-sx: Access Control on Datalog

rose-ash needs fine-grained, explainable, federation-aware access control. Subjects (users, groups, roles, services) × actions (read, edit, comment, moderate, federate) × resources (pages, posts, threads, peers). Decisions must come with a trace — not just permit/deny, but why.

Datalog's bottom-up rule engine produces transparent permit/deny chains: the proof tree is the audit trail. Inheritance over groups + resource hierarchies is recursive Datalog in one rule. Federation extends naturally — fed-sx replicates ACL facts, peers reason over the union.

End-state: a Datalog-on-SX layer specifically for ACL, with explanation API, audit log, and federation extension. Reuses lib/datalog/ evaluator and term model where possible.

Status (rolling)

bash lib/acl/conformance.sh145/145 (all four phases + hardening)

Ground rules

  • Scope: only touch lib/acl/** and plans/acl-on-sx.md. Do not edit spec/, hosts/, shared/, lib/datalog/**, or other lib/<lang>/. You may import from lib/datalog/ (its public API in lib/datalog/datalog.sx); do not copy or modify Datalog code.
  • Shared-file issues go under "Blockers" with a minimal repro; do not fix here.
  • SX files: use sx-tree MCP tools only.
  • Architecture: thin layer on top of lib/datalog/. Define schema, surface API, audit + federation hooks. The rule engine itself is Datalog's.
  • Watch for shared patterns going into lib/guest/ — both acl-sx and mod-sx need rule-engine plumbing. If you find shared shape, flag it for extraction (don't extract yet — wait for mod-sx to start).
  • Commits: one feature per commit. Keep Progress log updated and tick boxes.

Architecture sketch

ACL declarations (SX)            User query
        │                            │
        ▼                            ▼
lib/acl/schema.sx             lib/acl/api.sx
  — subject sorts                — (acl/permit? subj act res)
  — resource sorts               — (acl/explain subj act res)
  — action sorts                 — (acl/audit subj act res :allowed?)
  — fact schema                       │
        │                             ▼
        ▼                       lib/acl/engine.sx
lib/acl/facts.sx                  — builds Datalog query
  — actor(id, kind)               — invokes lib/datalog/
  — resource(id, kind)            — extracts proof tree
  — member_of(actor, group)            │
  — child_of(res, parent)              ▼
  — grant(actor, act, res)        lib/acl/audit.sx
  — deny (actor, act, res)          — persistent decision log
                                    — query API

Phase 1 — Direct grants

  • lib/acl/schema.sx — sorts: subject {user, group, role, service}, action, resource {page, post, thread, peer}
  • lib/acl/facts.sxactor, resource, grant, deny predicates as Datalog EDB
  • lib/acl/engine.sx(permit? subj act res db) reduces to Datalog query
  • lib/acl/api.sx — public (acl/permit? ...) taking implicit current db
  • lib/acl/tests/direct.sx — 15+ cases: direct grant, missing grant, explicit deny
  • lib/acl/scoreboard.{json,md} baseline
  • lib/acl/conformance.sh runs the suite

Phase 2 — Inheritance

  • member_of(actor, group) chain — group grants apply to members (transitive)
  • child_of(res, parent) chain — parent grants apply to children (transitive)
  • role expansion — role contains list of (action, resource) tuples
  • deny-overrides — explicit deny wins over inherited allow
  • lib/acl/tests/inherit.sx — 25+ cases: nested groups, deep resource trees, conflict resolution, deny precedence
  • document the deny-overrides choice in plan

deny-overrides policy (the choice)

Encoded as stratified negation: permit(S,A,R) :- eff_grant(S,A,R), not eff_deny(S,A,R). Both eff_grant and eff_deny inherit through the same member_of (group/role) and child_of (resource) chains. Consequences:

  • An explicit deny on the exact (S,A,R) defeats any inherited allow.
  • A group-level or ancestor-resource deny inherits down and defeats a member's/descendant's grant — deny is authoritative across the closure, not only at the leaf. This is the fail-safe reading: the most permissive interpretation of "deny wins" would let a narrow grant escape a broad deny; we chose the opposite.
  • Deny is dimension-scoped: a deny on (S, edit, R) never blocks (S, read, R).

Stratifiable because neither eff_grant nor eff_deny depends on permit; permit sits in a strictly higher stratum. Termination is guaranteed — recursion is only over EDB member_of/child_of (no function symbols), so cyclic membership/containment reaches a fixpoint rather than looping (tested).

Phase 3 — Explanation + audit

  • (acl/explain subj act res){:allowed? T :proof <tree>}
  • proof tree extracts from Datalog's derivation
  • lib/acl/audit.sx — append-only decision log (in-memory + serializer for disk)
  • (acl/audit-tail n) for recent decisions
  • lib/acl/tests/explain.sx — proof correctness, audit completeness

proof reconstruction (the choice)

lib/datalog/ records derived facts but not provenance, so lib/acl/explain.sx reconstructs the proof by goal-directed search over the saturated db: for a ground goal, find the first ACL rule (in acl-rules order) whose body holds, take the first dl-query solution binding the rest, recurse on each body literal; negated literals become verified :neg-ok leaves. The Datalog derivation graph is a DAG (a fact may hold many ways) — we pick ONE canonical proof: first-rule, first-solution, with EDB/direct rules ordered first so proofs bottom out quickly. A depth cap (64) guards pathological cyclic data. acl-explain returns {:allowed? :proof :reason}; on denial :reason carries the blocking eff_deny proof (explicit or inherited) when one exists, else nil (no grant). Audit log is append-only with monotonic seq numbers (no wall-clock, for determinism); acl-audit-decide! is the logged path, acl-permit? stays pure.

Phase 4 — Federation

  • peer trust facts — peer(addr, kind), trust(peer, level)
  • delegated grants — delegate(peer, actor, action, resource)
  • cross-instance permit chain — query asks local + queries trusted peers via fed-sx
  • revocation propagation — fact retraction across federation
  • lib/acl/tests/fed.sx — federated grant chains (mock fed-sx transport in tests)

federation policy (the choice)

One engine rule carries federation: eff_grant(S,A,R) :- delegate(Peer,S,A,R), trust(Peer,L), level_covers(L,A).

  • Non-transitive trust. A peer's delegate only grants if a local trust(Peer,L) exists and that level level_covers the action. There is no peer-to-peer trust propagation — trusting α never extends to peers α trusts.
  • Trust re-checked every query. trust/level_covers are body literals evaluated at decision time, never baked in at ingestion. Revoking trust or narrowing a level takes effect on the next acl-permit?.
  • Deny still wins. Federated grants are eff_grant, so local (and inherited) deny overrides them exactly as for local grants.
  • Composes with inheritance. A delegate to a group flows to members; a delegate on a parent resource flows to children (federated eff_grant feeds the same recursion).
  • Revocation = retraction. acl-revoke! wraps dl-retract!; the next query re-saturates. acl-fed-assert! wraps dl-assert! for newly-replicated facts.
  • Transport is fed-sx's job. lib/acl/federation.sx mocks the pull as a dict {peer-addr → delegate-facts}; acl-fed-build-db merges local policy + pulled delegates.

Progress log

  • Phase 1 complete (24/24). ACL is a thin layer over lib/datalog/:
    • schema.sx — sorts (subject/resource kinds, well-known actions) + EDB predicate arity table + acl-fact-valid? validator. Schema is data, since Datalog is untyped.
    • facts.sxacl-actor/acl-resource-fact/acl-grant/acl-deny constructors returning Datalog fact tuples.
    • engine.sx — owns the ruleset acl-phase1-rules and reduces decisions to dl-query. acl-build-db = dl-program-data facts rules; acl-permit? = non-empty (permit S A R) query.
    • api.sxacl/load! rebuilds an implicit current db; acl/permit? queries it. (Slash-symbols like acl/permit? parse fine as single tokens.)
    • deny-overrides encoded as permit(S,A,R) :- grant(S,A,R), not deny(S,A,R). Stratifies cleanly because deny is EDB-only (no rule derives it). Verified: grant+deny on same (S,A,R) → denied.
    • Conformance: conformance.conf (datalog preloads + acl modules) + thin conformance.sh wrapper over lib/guest/conformance.sh. Scoreboard generated by the shared driver.
    • Shared-plumbing note (for eventual lib/guest/rules/): the build-db = dl-program-data(facts, rules) + decide = non-empty ground query shape is exactly what mod-sx (Prolog moderation) will also need. The reusable seam is engine.sx's two functions — facts→db and ground-query→bool — both pure pass-throughs to the rule engine. Not extracting yet (wait for mod-sx as second consumer per ground rules).
  • Phase 2 complete (54/54, +30 inherit). Extended acl-rules with eff_grant/eff_deny derived relations; member_of carries both group and role membership, child_of carries resource trees, role_grant confers a role's (action,resource) capabilities. Direct grants unchanged (base case of eff_grant), Phase 1 suite still green. Constructors acl-member-of, acl-child-of, acl-role-grant added; schema arity table extended. See the deny-overrides policy section above. Verified cyclic membership terminates.
    • Shared-plumbing update: the reusable seam is still just engine.sx's facts -> db + ground-query -> bool. The inheritance rules are ACL-specific (group/resource/role vocabulary); mod-sx will have its own. So the lib/guest/rules/ extraction stays at the build/decide level, not the ruleset level.
  • Phase 3 complete (89/89, +35 explain). Added explain.sx (proof reconstruction, see policy section above), audit.sx (append-only log), and extended api.sx with acl/explain/acl/audit/acl/audit-tail. No engine changes — explanation reads the same saturated db the decision uses.
    • Substrate gotcha: the host = compares symbols by interned identity, which is unstable across dl-query saturation/substitution within a single image — the same two structurally-equal symbol-lists compared = true once and false moments later in the REPL. Conformance runs in a fresh process per suite so it's deterministic there, but test assertions now use a name-based acl-et-eq? (compare symbols via symbol->string), matching the datalog suite's dl-api-deep=? convention. Worth flagging to the kernel owners but out of acl scope.
  • Phase 4 complete (120/120, +31 fed). Added federation.sx (mock transport + acl-fed-build-db/acl-revoke!/acl-fed-assert!), one engine rule (the trust-gated delegate rule), 4 fact constructors, 4 schema arities. Federated proofs reconstruct for free — explain.sx iterates acl-rules, so the delegate rule's EDB body (delegate/trust/level_covers) shows up as proof leaves with no explain changes. Roadmap done: all four phases green.
    • Shared-plumbing final note (for lib/guest/rules/): the durable reusable seam across acl-sx and the coming mod-sx is exactly four pass-throughs to the rule engine — build-db(facts), decide(ground-query) -> bool, explain(goal) -> proof-tree, and the revoke/assert mutators. The rulesets and vocabulary are language-specific (ACL: grant/deny/ member_of/...; mod-sx: Prolog moderation predicates). When mod-sx lands, extract those four functions (engine.sx + the generic half of explain.sx's goal-directed reconstructor) into lib/guest/rules/, leaving each consumer its own rules + fact constructors. Proof reconstruction is the non-obvious reusable piece: it only needs the ruleset as data + a saturated db, both of which any datalog-backed guest has.
    • dict-mode conformance is slow, not hung: all suites load + run in one process (~30-40s for 120 tests, no per-suite timeout). Do not kill early.
    • Tooling note: sx-tree path-based edit tools (sx_replace_node, sx_read_subtree with a path) ignored the path argument in this worktree (always resolved to index 0 / [0,..]), in BOTH (a b c) and (a,b,c) forms. sx_write_file, sx_validate, sx_find_all, sx_summarise, sx_eval all work; used full-file rewrites instead of path edits throughout.

Hardening (post-roadmap)

  • lib/acl/tests/harden.sx (+25). Adversarial / cross-phase coverage beyond the per-phase suites. Prover-free by design (see JIT blocker below): only acl-permit? (compiled Datalog, safe at any depth) + pure data ops, never acl-explain/acl-prove-d.
    • Diamond hierarchies (resource and group): grant on one path + deny on another → deny wins; both-grant → permit; deny does not leak to siblings.
    • Chain inheritance (literal 4-link): top-group grant reaches leaf member and intermediates; leaf-member deny overrides the top grant; deny on the leaf doesn't block a higher level.
    • Cycle termination: self-membership, self-child, and 2-node membership cycles all reach a fixpoint and decide correctly.
    • Federation conflicts: federated group-grant with a locally-denied member → deny; multi-peer delegation (one trusted, or both trusted) → permit.
    • Degenerate inputs: empty db permits nothing.
    • Fact validation: acl-validate-facts surfaces wrong-arity + unknown predicates; acl-facts-valid? on clean/empty sets.
    • Audit save/restore: snapshot → clear → restore round-trips entries + seq; seq continues without collision after restore; snapshot is an immutable copy.
    • Proof reconstruction itself is covered by tests/explain.sx + tests/fed.sx (both stay under the warm-process JIT depth threshold); the depth-cap safety net is verified manually in a warm REPL image but excluded from conformance.
  • New API: acl-validate-facts/acl-facts-valid? (schema.sx, opt-in — build stays lenient); acl-audit-snapshot/acl-audit-restore!/acl-audit-copy (audit.sx).
  • Substrate gotcha (recorded): append! extends a list built with (list) but silently no-ops on a map/rest-derived list in this runtime. Bit the first cut of acl-audit-restore! (rebuilt the live log via map, so later records didn't append). Fix: always rebuild mutable lists via (list) + append! (acl-audit-copy). Worth flagging to kernel owners; out of acl scope.

Blockers

  • JIT loops on deep proof reconstruction (substrate, not acl). Once the kernel JIT-compiles the mutually-recursive prover (acl-prove-d/ acl-prove-rules/acl-prove-build in explain.sx) — which happens after a process has run enough explains to cross the compile threshold — the compiled version loops indefinitely on a member_of/child_of chain deeper than ~3. Symptoms: acl-explain over a 4+-deep chain returns instantly in a cold / warm-REPL image but hangs in a long-lived process. The per-phase explain and fed suites pass only because their proofs stay ≤3 deep; a 5th suite that explained deeper chains hung the whole conformance run (no per-suite timeout in dict mode). Matches [[project_jit_bytecode_bug]] (ACTIVE).
    • Impact beyond tests: acl-explain is unsafe for deep hierarchies in a warm production OCaml server. acl-permit? is unaffected (it reduces to compiled Datalog, no SX-side recursion) — only the SX proof reconstructor is.
    • Workaround in acl: harden suite is prover-free; explain is exercised only at shallow depth. Real fix is in the kernel JIT (out of acl scope) — e.g. the _jit_compiling guard / disabling JIT for the recursive prover, or fixing the bytecode loop. Recommend the kernel owners reproduce with: acl-explain over a 6-deep member_of chain after ~70 prior explains.
    • Minimal repro recorded. Until fixed, callers needing explanations for deep hierarchies should flatten or cap depth, or run explain in a cold worker.