From 65f274c5730196e2227398e6c51119fba533954b Mon Sep 17 00:00:00 2001
From: giles
Date: Sat, 6 Jun 2026 22:18:03 +0000
Subject: [PATCH] briefings: add host-persist loop briefing (durable storage host adapter)

Briefing for the loop that builds the host-side servicer for persist/* IO
ops, making lib/persist's durable backend actually durable. Points at the
Blocker spec in plans/persist-on-sx.md as the authoritative contract; hard
rules on build isolation (worktree _build only, never clobber the shared
binary) and not pkilling the shared sx_server.

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 plans/agent-briefings/host-persist-loop.md | 94 ++++++++++++++++++++++
 1 file changed, 94 insertions(+)
 create mode 100644 plans/agent-briefings/host-persist-loop.md

diff --git a/plans/agent-briefings/host-persist-loop.md b/plans/agent-briefings/host-persist-loop.md
new file mode 100644
index 00000000..9c90b8c9
--- /dev/null
+++ b/plans/agent-briefings/host-persist-loop.md
@@ -0,0 +1,94 @@
+# host-persist loop agent (single agent, builds the durable storage host)
+
+Role: make `lib/persist`'s durable backend **actually durable**. The persist
+substrate (`lib/persist/**`, 201/201 tests) performs `{:op "persist/..." :args}`
+IO requests for every storage op; under `sx_server.exe` today nothing services
+them, so writes silently vanish. You build the **host-side adapter** that answers
+those ops against real on-disk storage — the one piece standing between persist
+and "all subsystems share a durable substrate."
+
+```
+worktree: /root/rose-ash-loops/host-persist
+branch: loops/host-persist (push origin/loops/host-persist; NEVER main/architecture)
+```
+
+## The authoritative contract — read this first, every restart
+
+`plans/persist-on-sx.md` → **Blockers → "OPEN — host durable-storage adapter"**.
+That entry is the spec: the silent-data-loss repro, the full op contract table,
+the hard invariants (monotonic `last-seq`, etc.), the blob adapter shape, where
+to register in `sx_server.ml`, and the acceptance test. Do not restate it here —
+read it there and implement it. The reference implementation to mirror is
+`persist/serve` in `lib/persist/durable.sx` (same op names, same shapes).
+
+## Restart baseline — check before iterating
+
+1. Read the Blocker spec (above) + this briefing.
+2. `git log --oneline -8` on `loops/host-persist` to see what's done.
+3. Is there a worktree-local build? `ls hosts/ocaml/_build/default/bin/sx_server.exe`.
+   Fresh worktrees have none — the first build is the first task.
+4. If an acceptance suite exists (e.g. `hosts/ocaml/test/persist_durable_*` or a
+   `lib/persist/tests/durable-real.sx`), run it against the **worktree-built**
+   binary. Green before new work.
+
+## The queue (phases)
+
+- **Phase 0 — reproduce.** Confirm the silent-data-loss repro from the spec under
+  this worktree. Builds your mental model; costs one short run.
+- **Phase 1 — storage module.** A new OCaml module under `hosts/ocaml/` that
+  implements the op contract over **real persistent storage**. Start simple and
+  correct: a filesystem-backed store (one append-only file per stream + a kv
+  file + a per-stream seq high-water file), or SQLite if the toolchain has it.
+  Honour every invariant in the spec — especially: `last-seq` is a monotonic
+  counter stored separately from rows so it survives `truncate`; values
+  round-trip structurally.
+- **Phase 2 — register.** Wire a `"persist/..."` arm into the kernel's IO
+  resolver (`Sx_types._cek_io_resolver`, ~line 3864 of `hosts/ocaml/bin/sx_server.ml`)
+  and/or the `cek_run_with_io` bridge path (~528–576), dispatching to the storage
+  module. Op names are the contract — do not rename.
+- **Phase 3 — acceptance.** New tests that use `persist/durable-backend` (REAL
+  `perform`, not the mock) run under the freshly-built worktree binary: the
+  `durable` + `recovery` semantics must pass, and a **real process restart**
+  (start the built server, write, stop it, start again, replay) must recover
+  state from disk. Put host-owned tests under a host path (e.g.
+  `hosts/ocaml/test/`) — do not churn persist's existing suites.
+- **Phase 4 — blob adapter.** Same pattern for `blob/put|get|has?` backed by a
+  content-addressed directory; persist stores only the ref.
+
+Every iteration: implement → build → test → commit (short factual message) →
+push → update `plans/persist-on-sx.md` (tick the Blocker toward CLOSED, append a
+dated Progress-log line, newest first) → next.
+
+## Ground rules (hard)
+
+- **Build is your job** (unlike the persist loop). But build **only in this
+  worktree's `_build`** via `dune` from `/root/rose-ash-loops/host-persist`.
+  **NEVER overwrite the shared binary** at
+  `/root/rose-ash/hosts/ocaml/_build/default/bin/sx_server.exe` — every sibling
+  loop uses it; clobbering it breaks them all. Point acceptance tests at the
+  worktree binary (`hosts/ocaml/_build/default/bin/sx_server.exe` *inside this
+  worktree*).
+- **First build is slow** (full OCaml). The `sx_build` MCP tool has a ~600s
+  watchdog that may kill it — prefer `dune build bin/sx_server.exe` (or `@all`)
+  run via Bash with `run_in_background: true` and a long timeout, then poll.
+- **NEVER `pkill sx_server`** — siblings share the process/binary. Start your own
+  server on a throwaway path/port for restart tests and stop only that PID; bound
+  every run with `timeout`.
+- **Scope:** `hosts/**`, host-owned test files, and the Blocker entry +
+  Progress log in `plans/persist-on-sx.md`. Do **not** modify `lib/persist/**`
+  source (the persist loop owns it; its API is your contract, not your code) —
+  if you need an upstream change, leave a note in the Blocker entry.
+- **Determinism:** replay from disk must equal the in-memory semantics; same log
+  → same state.
+- **Commits:** one feature per commit; push to `origin/loops/host-persist`.
+- **SX files:** `sx-tree` MCP tools ONLY, `file:` not `path:`, `sx_validate`
+  after edits. (Most of your work is OCaml — edit those with normal tools.)
+
+## Definition of done
+
+The Blocker entry flips to **CLOSED**: `persist/durable-backend` writes land on
+disk, survive a real server restart, and the durable + recovery acceptance suites
+are green against the worktree-built binary. At that point a subsystem migrated
+per `lib/persist/examples/acl.sx` is genuinely durable.
+
+Go. Read the Blocker spec; reproduce the gap; build the storage module.
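
The Phase 1 invariant that `last-seq` is a monotonic counter stored separately from rows, so that it survives `truncate`, can be illustrated with a minimal OCaml sketch. Everything here is an assumption made for illustration: the function names (`append`, `truncate_stream`, `last_seq`), the `.log`/`.seq` file layout, and the tab-separated row format are hypothetical and are not the op contract, which lives only in the Blocker spec. The sketch uses the standard library alone and does not fsync, so it shows the shape of the invariant rather than a crash-safe store.

```ocaml
(* Hypothetical sketch only: names and file layout are illustrative
   assumptions, not the contract in plans/persist-on-sx.md. *)

let log_path dir stream = Filename.concat dir (stream ^ ".log")
let seq_path dir stream = Filename.concat dir (stream ^ ".seq")

(* Read the high-water mark; a missing or unreadable file counts as 0. *)
let read_seq path =
  if not (Sys.file_exists path) then 0
  else begin
    let ic = open_in path in
    let n = try int_of_string (String.trim (input_line ic)) with _ -> 0 in
    close_in ic;
    n
  end

(* Write via a temp file and rename, so readers never see a torn counter.
   No fsync here: a real host would need one (e.g. via the Unix library). *)
let write_seq path n =
  let tmp = path ^ ".tmp" in
  let oc = open_out tmp in
  output_string oc (string_of_int n ^ "\n");
  close_out oc;
  Sys.rename tmp path

(* Bump the counter before appending the row: a crash in between leaves a
   gap in the sequence but never reuses a number. *)
let append dir stream payload =
  let seq = read_seq (seq_path dir stream) + 1 in
  write_seq (seq_path dir stream) seq;
  let oc =
    open_out_gen [Open_wronly; Open_append; Open_creat] 0o644
      (log_path dir stream)
  in
  Printf.fprintf oc "%d\t%s\n" seq (String.escaped payload);
  close_out oc;
  seq

(* Truncate drops rows only; the separate .seq file is left untouched. *)
let truncate_stream dir stream = close_out (open_out (log_path dir stream))

let last_seq dir stream = read_seq (seq_path dir stream)

(* Demo in a throwaway directory (Sys.mkdir needs OCaml >= 4.12). *)
let () =
  let dir = Filename.temp_file "persist-sketch" ".d" in
  Sys.remove dir;
  Sys.mkdir dir 0o755;
  assert (append dir "acl" "a" = 1);
  assert (append dir "acl" "b" = 2);
  truncate_stream dir "acl";
  assert (last_seq dir "acl" = 2);
  assert (append dir "acl" "c" = 3)
```

Keeping the counter in its own file is what lets `truncate_stream` empty the row log without resetting sequence numbers; a SQLite-backed variant would presumably need the same separation, for example a dedicated counter table.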