Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-06-06 17:18:02 +00:00

mod-on-sx loop agent (single agent, queue-driven)

Role: iterates plans/mod-on-sx.md forever. Moderation on Prolog: reports, policy rules, decisions as backtracking proof search, audit trails, escalation state machine, federation. Where acl-sx asks "may this happen?", mod-sx asks "should this stay?" Sits on lib/prolog/ (whose test suite is already green) and adds a moderation-shaped vocabulary on top.

description: mod-on-sx queue loop
subagent_type: general-purpose
run_in_background: true
isolation: worktree

Prompt

You are the sole background agent working plans/mod-on-sx.md. Isolated worktree /root/rose-ash-loops/mod on branch loops/mod, forever, one commit per feature. Push to origin/loops/mod after every commit. Never touch main or architecture.

Restart baseline — check before iterating

Read plans/mod-on-sx.md — roadmap + Progress log.
ls lib/mod/ — pick up from the most advanced file.
If lib/mod/tests/*.sx exist, run them via bash lib/mod/conformance.sh. Green before new work.
If lib/mod/scoreboard.md exists, that's your baseline.
Read the lib/prolog/ public API once; that's your substrate. The plan cites lib/prolog/prolog.sx, but that file does not exist; the real entry points are lib/prolog/runtime.sx, query.sx, compiler.sx, and parser.sx. Investigate them (sx_find_all, or grep for "(define" heads) to learn how to assert facts and run queries before writing any policy code.

The queue

Phase order per plans/mod-on-sx.md:

Phase 1 — report representation + simple policy (schema, defrule→clause, (decide id) query, api). Tests: spam keyword → hide, repeated reports → escalate, no rule → keep.
Phase 2 — evidence accumulation + audit trail (proof tree from derivation, append-only decision log, retrieval).
Phase 3 — escalation + lifecycle state machine (:open → :triaged → :decided → :appealed → :final), auto/human tiers, appeal.
Phase 4 — federation (cross-instance reports, decision sharing, trust model, revocation; mock fed-sx in tests).
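
As a rough illustration of the three Phase 1 test cases (spam keyword → hide, repeated reports → escalate, no rule → keep), here is a minimal sketch in Python rather than SX or Prolog. Every name in it (Report, Decision, RULES, decide), the "buy now" keyword, and the threshold of 3 reports are assumptions made for the example, not part of lib/mod/ or lib/prolog/. It only shows the shape: an ordered rule list where the first match wins, a decision that records which rule fired, and an explicit keep result when nothing matches.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Report:
    id: str
    text: str
    report_count: int


@dataclass(frozen=True)
class Decision:
    report_id: str
    action: str           # "hide", "escalate", or "keep"
    rule: Optional[str]   # name of the rule that fired; None when no rule matched


# Hypothetical rules in precedence order: (name, predicate, action).
# The keyword and the threshold are illustrative assumptions.
RULES = [
    ("spam-keyword", lambda r: "buy now" in r.text.lower(), "hide"),
    ("repeated-reports", lambda r: r.report_count >= 3, "escalate"),
]


def decide(report: Report) -> Decision:
    """Return the decision of the first matching rule, or an explicit keep."""
    for name, matches, action in RULES:
        if matches(report):
            return Decision(report.id, action, name)
    # "No rule matched" is a real result, not a failure.
    return Decision(report.id, "keep", None)


if __name__ == "__main__":
    print(decide(Report("r1", "BUY NOW cheap pills", 1)))
    print(decide(Report("r2", "ordinary post", 5)))
    print(decide(Report("r3", "ordinary post", 0)))
```

Returning the rule name alongside the action is what lets a later audit trail persist the reason without changing the return shape.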

Within a phase, pick the checkbox that unlocks the most tests per effort.

Every iteration: implement → test → commit → tick [ ] → Progress log → next.

Ground rules (hard)

Scope: only lib/mod/** and plans/mod-on-sx.md. Do not edit spec/, hosts/, shared/, other lib/<lang>/ dirs, lib/stdlib.sx, or lib/ root. May import from lib/prolog/ only (its public API). Do not modify Prolog.
NEVER call sx_build. 600s watchdog. If the sx_server binary is broken → Blockers entry, stop. Run tests by invoking the sx_server binary directly from a conformance.sh (see how lib/prolog/conformance.sh drives it), pointing SX_SERVER at /root/rose-ash/hosts/ocaml/_build/default/bin/sx_server.exe (fresh worktrees have no _build/).
Shared-file issues → plan's Blockers with minimal repro; don't fix here.
SX files: sx-tree MCP tools ONLY. They take file: not path: — a wrong key yields Yojson Type_error("Expected string, got null"), which looks like a broken binary but is just a param mismatch. sx_validate after edits. Path-based edits (sx_replace_node) count comment headers in their indices and can clobber the wrong node — re-read after, or prefer sx_write_file for small files.
Unicode in .sx: raw UTF-8 only, never \uXXXX escapes.
Commit granularity: one feature per commit. Short factual messages (mod: spam-keyword policy rule → :hide + 6 tests). Push to origin/loops/mod.
Plan file: update Progress log (newest first) + tick boxes every commit.

mod-specific gotchas

Decisions are proofs, not booleans. A decision should carry why — the matching rule / derivation — so Phase 2's audit trail can persist it. Design the Phase-1 decide return shape with that in mind (don't return a bare keyword you later have to retrofit).
Policy chains backtrack. Order matters: first matching rule wins. Make rule precedence explicit and deterministic (tests will depend on it). A "no rule matched" outcome must be a real, testable result (:keep), not a query failure you forget to handle.
Negative decisions need closed-world care. "Evidence of no violation" and "no evidence of violation" differ: negation-as-failure only establishes the second. Be explicit about negation-as-failure where you use it.
Lifecycle state is separate from policy. Keep the state machine (Phase 3) as an SX module over the engine, not tangled into Prolog rules.
Federation trust is advisory by default. A peer's decision only binds locally when (trust peer :mod) holds; otherwise it's a suggestion. Don't auto-apply.
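
The last two points can be sketched outside the policy engine. The Python below is an illustration only, not the SX module: TRANSITIONS, advance, and receive_peer_decision are hypothetical names, and the trust store is modeled as a plain set of (peer, scope) pairs standing in for (trust peer :mod). The transition table follows the linear chain :open → :triaged → :decided → :appealed → :final exactly as written; whether :decided may go straight to :final is not stated here, so it is not modeled.

```python
# Lifecycle states and their allowed next states, kept apart from policy rules.
TRANSITIONS = {
    "open": {"triaged"},
    "triaged": {"decided"},
    "decided": {"appealed"},
    "appealed": {"final"},
    "final": set(),
}


class IllegalTransition(Exception):
    """Raised when a lifecycle move is not in the transition table."""


def advance(state: str, target: str) -> str:
    """Return the new state, or raise if the move is not allowed."""
    if target not in TRANSITIONS.get(state, set()):
        raise IllegalTransition(f"{state} -> {target}")
    return target


def receive_peer_decision(trust: set, peer: str, action: str) -> dict:
    """Classify a peer's decision; it binds only if (peer, "mod") is trusted."""
    status = "binding" if (peer, "mod") in trust else "suggestion"
    # Nothing is applied here: the caller decides what to do with the result.
    return {"peer": peer, "action": action, "status": status}


if __name__ == "__main__":
    state = advance("open", "triaged")
    print(state)
    trust = {("a.example", "mod")}
    print(receive_peer_decision(trust, "a.example", "hide"))
    print(receive_peer_decision(trust, "b.example", "hide"))
```

Keeping the transition table as data makes illegal moves easy to test, and returning a status instead of applying the action keeps untrusted peer decisions advisory.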

General gotchas (all loops)

SX do = R7RS iteration. Use begin for multi-expr sequences.
cond/when/let clauses evaluate only the last expr — wrap multiples in begin.
let is parallel, not sequential — nest lets when a binding references an earlier one.
env-bind! creates a binding; env-set! mutates an existing one (walks scope chain).
sx_validate after every structural edit.
Namespace-prefix all guest helpers (mod/...) — short/host-colliding names (bind, conj, name) get silently shadowed or hang the runtime.

Style

No comments in .sx unless non-obvious.
No new planning docs — update plans/mod-on-sx.md inline.
Short, factual commit messages.
One feature per iteration. Commit. Log. Push. Next.

Go. Start by reading the plan; find the first unchecked [ ]; implement it.
