scripts: forge-driven fix-loop launcher (git→gitea→agentic→tmux)
First live test of the sx-forge technology driving a real work session:

- sx-fix-up.sh <forge-agent> <briefing.md>: reads the agent's briefing FROM the rose-ash/sx-review forge (agentic-sx branch), materialises a git worktree + branch (loops/sx-<slug>), and spins up a tmux+claude session briefed from the forge. Commits are LOCAL by default (no push).
- sx-fix-down.sh [--clean]: stop the sx-fix session; --clean removes worktrees.
- plans/agent-briefings/sx-gate-loop.md: W14 (test gate) briefing, the safe first payload (test-only, cannot regress the 5762p/274f baseline), scoped commit-no-push with hard guardrails.

Verified live: launcher read the W14 briefing from the forge, created worktree /root/rose-ash-loops/sx-ws-w14 on loops/sx-ws-w14, booted claude, and the agent picked up the briefing.

Watch: tmux a -t sx-fix. Note: MCP servers need /mcp auth in a fresh worktree (agent works via Bash meanwhile).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
57
plans/agent-briefings/sx-gate-loop.md
Normal file
@@ -0,0 +1,57 @@
# sx-gate loop: W14 test gate (first live test of git→gitea→agentic→tmux)

**Forge agent:** `agents/ws-W14` in the `rose-ash/sx-review` forge (git-sx/gitea-sx/agentic-sx).
**Goal (from the forge briefing):** make the verification infrastructure trustworthy: runner env
== production env, a WASM corpus runner, harness honesty, and pinning tests for the fixes already
landed. This is **W14** in `plans/sx-review/PLAN.md` (read that section; it lists the findings).
**Findings:** C0b C9 C21 C22 C23 C3 C4 C5 C6 C7 F2 F3 F4 F5 F6 F7 F8 F9 F10 F11 F12 K19 K104.

## Why this workstream first
The review's prime directive: no semantic fix should merge before its pinning test + a working
gate exist, because the verification infra currently can't tell you whether a fix works. W14
produces that infra. It changes **no language semantics**, so it cannot regress the 5762p/274f
baseline, which makes it the ideal first payload while we test the agentic launch technology.

## Hard guardrails (this is a monitored test loop)
- **Commit locally, do NOT push.** No `git push` at all. (This is a test; the maintainer reviews
  before anything reaches origin.)
- **Stay in W14 scope:** tests, runners, harness, gate tooling. Do NOT edit `spec/*.sx`,
  `hosts/ocaml/lib/*.ml`, or any language semantics. If a task tempts you toward semantics, skip it
  and note why in the Progress log.
- **Never `pkill sx_server`** (shared binary). Bound every `sx_server`/build/test with `timeout`.
- You are on branch `loops/sx-ws-w14` in worktree `/root/rose-ash-loops/sx-ws-w14`. Build/test here only.
- If the OCaml build or full suite is involved, compare against the recorded baseline
  **5762 passed / 274 failed** (fail set is the 273 hs-* + 1 r7rs radix; see PLAN W14/F10).

## One iteration per fire: pick the first unchecked `[ ]`, implement, test, commit (no push),
tick the box, prepend one dated line to the Progress log, then stop.

- [ ] **Pin the dc7aa709 quick-wins batch.** Add regression tests (spec/tests/ or a new suite) that
  lock in the fixes that currently have none: K09 `unquote-splicing` longhand splices; K11 guard
  re-raise sentinel is unforgeable (`(guard (e (true (list 'quoted x))) ...)` returns the list);
  K18 `(expt 2 100)` is a float not 0; K20 `(contains? {:a 1} :a)` is true; K39 `(do ((fn (x) x) 5) 99)`
  → 99; K49 the five void elements render. (K02 is already non-vacuously covered.) Confirm they pass
  on the current binary.
- [ ] **Pin C1/C1b/S4 at the host level** (a small OCaml or shell test): a malformed command line
  returns an error response and the process survives; an error page is not cached.
- [ ] **WASM corpus runner (F2).** Stand up a Node harness that runs a curated spec/tests subset
  against the shipped WASM kernel (seed: the conformance lane's `run_wasm.js` pattern, referenced in
  PLAN). Curated subset, not the full 6k (js_of_ocaml is ~24s/test; see F18). Wire it as a script.
- [ ] **Harness honesty (C22/K104):** make `spec/harness.sx` log the IO call *before* invoking the
  mock so a throwing mock is recorded. Add a test that a throwing mock leaves a log entry.
- [ ] **Runner-vs-prod env audit (F7/K42):** list every binding that exists only in `run_tests.ml`
  but not the production kernel env (`values`/`call-with-values` are the known ones). Write the audit
  to `plans/sx-review/runner-env-gap.md`. (Fixing them is later; the audit is the W14 task.)
- [ ] **Protocol fuzz suite (C3/C4/C5/C6):** a bounded test that feeds the epoch loop malformed
  lines (`(epoch)`, `(epoch foo)`, stray `(io-response …)`, two-exprs-per-line) and asserts the
  process never dies and responses stay correctly tagged.
- [ ] **hs-upstream skip-list (F10/F18):** make the native runner's 273 hs-* failures a skip-list so
  a red FAIL column means something. Record the count moved.

## Progress log (newest first)
<!-- prepend: `- YYYY-MM-DD <what landed, test result, commit sha>` -->
- (none yet; first fire will add the first entry)

## Recording back to the forge
After each commit, note the sha here; the maintainer (or a later step) records it as a
`test`-kind commit on `agents/ws-W14` in the forge so the program stays the system of record.
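The per-fire bookkeeping the briefing asks for (find the first unchecked `[ ]`, tick it, prepend a dated Progress log line) could be scripted roughly as below. This is only a sketch: the helper names `first_unchecked`, `tick_first`, and `prepend_progress` are hypothetical and not part of this commit, and it assumes task lines start with `- [ ] ` and that the log marker line starts with `<!-- prepend:`, as in the briefing above.

```shell
#!/usr/bin/env bash
# Hypothetical bookkeeping helpers for one loop iteration over a briefing file.
# None of these names exist in the commit; they only illustrate the steps.

# Print the first unchecked task line; returns non-zero if every box is ticked.
first_unchecked() {
  grep -m1 '^- \[ ] ' "$1"
}

# Tick only the first unchecked box, leaving later tasks untouched.
tick_first() {
  local file="$1" tmp
  tmp="$(mktemp)"
  awk '!ticked && /^- \[ ] / { sub(/^- \[ ] /, "- [x] "); ticked = 1 } { print }' \
    "$file" > "$tmp" && cat "$tmp" > "$file" && rm -f "$tmp"
}

# Insert "- YYYY-MM-DD <entry>" directly under the "<!-- prepend:" marker,
# so the newest entry is always first. The entry is passed through the
# environment to avoid awk's escape processing of -v values.
prepend_progress() {
  local file="$1" entry="$2" tmp
  tmp="$(mktemp)"
  ENTRY="- $(date +%F) $entry" awk '
    { print }
    !added && /^<!-- prepend:/ { print ENVIRON["ENTRY"]; added = 1 }
  ' "$file" > "$tmp" && cat "$tmp" > "$file" && rm -f "$tmp"
}
```

A fire would then call `first_unchecked plans/agent-briefings/sx-gate-loop.md` to choose the task, and after the local commit call `tick_first` and `prepend_progress` with the test result and commit sha as the entry text.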
23
scripts/sx-fix-down.sh
Executable file
@@ -0,0 +1,23 @@
#!/usr/bin/env bash
# sx-fix-down.sh: stop the sx-fix tmux session (and optionally remove worktrees).
# Usage: ./scripts/sx-fix-down.sh [--clean]
#   --clean   also `git worktree remove` every /root/rose-ash-loops/sx-* worktree
# Commits on loops/sx-* branches are preserved either way (branches are not deleted).
set -euo pipefail
ROOT="$(cd "$(dirname "$0")/.." && pwd)"; cd "$ROOT"
SESSION="sx-fix"

if tmux has-session -t "$SESSION" 2>/dev/null; then
  tmux kill-session -t "$SESSION"
  echo "killed tmux session '$SESSION'."
else
  echo "no tmux session '$SESSION'."
fi

if [ "${1:-}" = "--clean" ]; then
  for wt in /root/rose-ash-loops/sx-*; do
    [ -d "$wt" ] || continue
    git worktree remove --force "$wt" 2>/dev/null && echo "removed worktree $wt" || echo "skip $wt"
  done
  echo "branches loops/sx-* are kept (commits preserved)."
fi
115
scripts/sx-fix-up.sh
Executable file
@@ -0,0 +1,115 @@
#!/usr/bin/env bash
# sx-fix-up.sh: spin up ONE claude fix-loop in tmux, driven by an agentic-sx
# branch in the rose-ash/sx-review forge (git-sx / gitea-sx / agentic-sx).
#
# This tests the git→gitea→agentic→tmux wiring: the forge holds the agent
# branch + briefing; this launcher READS that briefing from the forge (proving
# the forge drives the session), then materialises a git worktree and a tmux
# window running claude on the corresponding real branch.
#
# Usage: ./scripts/sx-fix-up.sh <forge-agent> <briefing.md> [interval]
#   forge-agent   agentic-sx agent branch in the forge, e.g. ws-W14
#   briefing.md   file under plans/agent-briefings/ with the detailed work plan
#   interval      optional /loop interval (e.g. 15m); omit for self-paced
#
# Example: ./scripts/sx-fix-up.sh ws-W14 sx-gate-loop.md
#
# The real code branch is loops/sx-<slug> in worktree
# /root/rose-ash-loops/sx-<slug>. Commits are LOCAL by default; the briefing
# decides whether to push. Watch: tmux a -t sx-fix ; stop: ./scripts/sx-fix-down.sh
set -euo pipefail

ROOT="$(cd "$(dirname "$0")/.." && pwd)"; cd "$ROOT"
AGENT="${1:?usage: sx-fix-up.sh <forge-agent> <briefing.md> [interval]}"
BRIEFING_MD="${2:?briefing file under plans/agent-briefings/ required}"
INTERVAL="${3:-}"
SLUG="$(echo "$AGENT" | tr '[:upper:]' '[:lower:]' | tr -c 'a-z0-9' '-' | sed 's/-*$//')"
SESSION="sx-fix"
WT="/root/rose-ash-loops/sx-${SLUG}"
BRANCH="loops/sx-${SLUG}"
BIN="hosts/ocaml/_build/default/bin/sx_server.exe"
BOOT_WAIT=20

[ -f "plans/agent-briefings/$BRIEFING_MD" ] || { echo "no such briefing: plans/agent-briefings/$BRIEFING_MD" >&2; exit 1; }

# --- 1. read the briefing identity FROM the forge (the agentic-sx → launch link) ---
echo "Querying the forge for agent '$AGENT'..."
FORGE_OUT="$(
  {
    echo '(epoch 1)'
    for f in spec/stdlib.sx lib/r7rs.sx lib/persist/event.sx lib/persist/backend.sx \
             lib/persist/log.sx lib/persist/kv.sx lib/artdag/dag.sx \
             lib/datalog/tokenizer.sx lib/datalog/parser.sx lib/datalog/unify.sx \
             lib/datalog/db.sx lib/datalog/builtins.sx lib/datalog/aggregates.sx \
             lib/datalog/strata.sx lib/datalog/eval.sx lib/datalog/api.sx lib/datalog/magic.sx \
             lib/git/object.sx lib/git/ref.sx lib/git/dag.sx lib/git/worktree.sx \
             lib/git/diff.sx lib/git/merge.sx lib/git/porcelain.sx \
             lib/relations/schema.sx lib/relations/engine.sx lib/relations/api.sx \
             lib/relations/explain.sx lib/relations/federation.sx lib/relations/tree.sx \
             lib/agentic/schema.sx lib/agentic/branch.sx lib/gitea/repo.sx; do
      echo "(load \"$f\")"
    done
    echo '(epoch 2)'
    echo '(load "plans/sx-review/forge-build.sxsrc")'
    echo '(epoch 3)'
    echo "(eval \"(str (agentic/briefing-title (agentic/briefing-of fb-sp \\\"$AGENT\\\")) \\\" | \\\" (agentic/briefing-goal (agentic/briefing-of fb-sp \\\"$AGENT\\\")))\")"
  } | timeout 300 "$BIN" 2>/dev/null | grep -A1 '(ok-len 3' | tail -1 || true
)"
if [ -z "$FORGE_OUT" ]; then
  echo "FAILED: could not read briefing for '$AGENT' from the forge." >&2
  exit 1
fi
echo "Forge briefing: $FORGE_OUT"

# --- 2. materialise the worktree + branch off architecture ---
if [ -d "$WT/.git" ] || [ -f "$WT/.git" ]; then
  echo "worktree exists: $WT"
else
  if git show-ref --verify --quiet "refs/heads/$BRANCH"; then
    git worktree add "$WT" "$BRANCH" >/dev/null
  else
    git worktree add -b "$BRANCH" "$WT" architecture >/dev/null
  fi
  echo "worktree created: $WT on $BRANCH"
fi

# permissions so the loop doesn't stall on prompts (mirrors sx-loops-up.sh)
mkdir -p "$WT/.claude"
cat > "$WT/.claude/settings.local.json" <<'SETTINGS'
{
  "permissions": {
    "allow": [
      "mcp__sx-tree__sx_summarise","mcp__sx-tree__sx_read_tree","mcp__sx-tree__sx_read_subtree",
      "mcp__sx-tree__sx_find_all","mcp__sx-tree__sx_find_across","mcp__sx-tree__sx_validate",
      "mcp__sx-tree__sx_write_file","mcp__sx-tree__sx_eval","mcp__sx-tree__sx_harness_eval",
      "mcp__sx-tree__sx_build","mcp__sx-tree__sx_test","mcp__sx-tree__sx_diff_branch",
      "mcp__sx-tree__sx_changed","mcp__sx-tree__sx_comp_list","mcp__sx-tree__sx_comp_usage",
      "Bash(node *)","Bash(python3 *)","Bash(bash *)","Bash(cp *)","Bash(git *)","Bash(timeout *)"
    ]
  },
  "enabledMcpjsonServers": ["sx-tree","rose-ash-services","hs-test"]
}
SETTINGS

# --- 3. tmux window + claude + /loop briefing ---
if ! tmux has-session -t "$SESSION" 2>/dev/null; then
  tmux new-session -d -s "$SESSION" -n "$SLUG" -c "$WT"
else
  tmux new-window -t "$SESSION" -n "$SLUG" -c "$WT"
fi
tmux send-keys -t "$SESSION:$SLUG" "claude" C-m
echo "waiting ${BOOT_WAIT}s for claude to boot..."
sleep "$BOOT_WAIT"

PRE="/loop "; [ -n "$INTERVAL" ] && PRE="/loop $INTERVAL "
CMD="${PRE}Read plans/agent-briefings/$BRIEFING_MD (forge agent $AGENT: $FORGE_OUT) and do ONE iteration per fire: pick the first unchecked [ ], implement, test, commit LOCALLY (no push), tick the box, prepend one dated line to the Progress log, then stop. You are on branch $BRANCH in worktree $WT. Obey the briefing's hard guardrails: test-only, no semantics edits, no push."
tmux send-keys -t "$SESSION:$SLUG" "$CMD"
sleep 0.5
tmux send-keys -t "$SESSION:$SLUG" Enter

echo ""
echo "Launched fix-loop '$SLUG' (forge agent $AGENT) in tmux '$SESSION'."
echo " Attach: tmux a -t $SESSION"
echo " Watch: tmux capture-pane -t $SESSION:$SLUG -p | tail -40"
echo " Stop: tmux kill-window -t $SESSION:$SLUG (or ./scripts/sx-fix-down.sh)"
echo " Branch: $BRANCH Worktree: $WT (commits LOCAL; review before push)"
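To make the naming concrete, here is a minimal sketch of how the launcher turns a forge agent name into a branch and worktree path. The functions `slugify` and `names_for` are hypothetical wrappers introduced only for illustration; the pipeline inside `slugify` mirrors the `SLUG=` line in the script, and the path prefixes are the ones the script hard-codes.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not in the commit) that mirror the launcher's naming.

# Lowercase the agent name, turn every character outside a-z0-9 into '-',
# then strip trailing dashes.
slugify() {
  local s
  s="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -c 'a-z0-9' '-' | sed 's/-*$//')"
  printf '%s\n' "$s"
}

# Print the branch and worktree the launcher would use for an agent.
names_for() {
  local slug
  slug="$(slugify "$1")"
  echo "branch=loops/sx-${slug}"
  echo "worktree=/root/rose-ash-loops/sx-${slug}"
}

names_for ws-W14
# prints:
#   branch=loops/sx-ws-w14
#   worktree=/root/rose-ash-loops/sx-ws-w14
```

For `ws-W14` this gives the same branch and worktree reported in the commit message's live verification. Leading dashes are not stripped, so an agent name that begins with punctuation would yield a slug starting with `-`.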