Briefing for the loop that builds the host-side servicer for persist/* IO ops, making lib/persist's durable backend actually durable. Points at the Blocker spec in plans/persist-on-sx.md as the authoritative contract; hard rules on build isolation (worktree _build only, never clobber the shared binary) and not pkilling the shared sx_server. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
host-persist loop agent (single agent, builds the durable storage host)
Role: make lib/persist's durable backend actually durable. The persist
substrate (lib/persist/**, 201/201 tests) performs {:op "persist/..." :args}
IO requests for every storage op; under sx_server.exe today nothing services
them, so writes silently vanish. You build the host-side adapter that answers
those ops against real on-disk storage — the one piece standing between persist
and "all subsystems share a durable substrate."
worktree: /root/rose-ash-loops/host-persist
branch: loops/host-persist (push origin/loops/host-persist; NEVER main/architecture)
The authoritative contract — read this first, every restart
plans/persist-on-sx.md → Blockers → "OPEN — host durable-storage adapter".
That entry is the spec: the silent-data-loss repro, the full op contract table,
the hard invariants (monotonic last-seq, etc.), the blob adapter shape, where
to register in sx_server.ml, and the acceptance test. Do not restate it here —
read it there and implement it. The reference implementation to mirror is
persist/serve in lib/persist/durable.sx (same op names, same shapes).
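To make the "op names are the contract" point concrete, here is a minimal sketch of a resolver arm that dispatches on the exact op name and fails loudly on an unknown persist/ op rather than dropping it. Everything in it is an assumption for illustration: the request record, the reply type, the handlers table, and the op name "persist/demo-op" are hypothetical and do not come from the spec or from sx_server.ml. The Blocker entry remains the authority on real names and shapes.

```ocaml
(* Hypothetical request shape: an op name plus opaque string args.
   The real shape is whatever persist/serve in lib/persist/durable.sx uses. *)
type request = { op : string; args : string list }

type reply =
  | Handled of string      (* a handler produced a result *)
  | Not_ours               (* op is outside the persist/ namespace *)
  | Unknown_op of string   (* persist/ op with no handler: surface it *)

(* Handlers keyed by the exact op name. *)
let handlers : (string, string list -> string) Hashtbl.t = Hashtbl.create 16

let has_prefix ~prefix s =
  String.length s >= String.length prefix
  && String.sub s 0 (String.length prefix) = prefix

let resolve (r : request) : reply =
  if not (has_prefix ~prefix:"persist/" r.op) then Not_ours
  else
    match Hashtbl.find_opt handlers r.op with
    | Some handle -> Handled (handle r.args)
    | None -> Unknown_op r.op

let () =
  (* "persist/demo-op" is a placeholder, not a name from the contract. *)
  Hashtbl.replace handlers "persist/demo-op" (fun args -> String.concat "," args);
  assert (resolve { op = "persist/demo-op"; args = [ "a"; "b" ] } = Handled "a,b");
  assert (resolve { op = "persist/missing"; args = [] } = Unknown_op "persist/missing");
  assert (resolve { op = "http/get"; args = [] } = Not_ours)
```

Returning Unknown_op instead of a default value is the design choice that matters: an unserviced write becomes visible instead of vanishing.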
Restart baseline — check before iterating
- Read the Blocker spec (above) + this briefing.
- git log --oneline -8 on loops/host-persist to see what's done.
- Is there a worktree-local build? ls hosts/ocaml/_build/default/bin/sx_server.exe. Fresh worktrees have none, so the first build is the first task.
- If an acceptance suite exists (e.g. hosts/ocaml/test/persist_durable_* or a lib/persist/tests/durable-real.sx), run it against the worktree-built binary. Green before new work.
The queue (phases)
- Phase 0: reproduce. Confirm the silent-data-loss repro from the spec under this worktree. Builds your mental model; costs one short run.
- Phase 1: storage module. A new OCaml module under hosts/ocaml/ that implements the op contract over real persistent storage. Start simple and correct: a filesystem-backed store (one append-only file per stream + a kv file + a per-stream seq high-water file), or SQLite if the toolchain has it. Honour every invariant in the spec, especially: last-seq is a monotonic counter stored separately from rows so it survives truncate; values round-trip structurally.
- Phase 2: register. Wire a "persist/..." arm into the kernel's IO resolver (Sx_types._cek_io_resolver, ~line 3864 of hosts/ocaml/bin/sx_server.ml) and/or the cek_run_with_io bridge path (~528–576), dispatching to the storage module. Op names are the contract; do not rename.
- Phase 3: acceptance. New tests that use persist/durable-backend (REAL perform, not the mock) run under the freshly-built worktree binary: the durable + recovery semantics must pass, and a real process restart (start the built server, write, stop it, start again, replay) must recover state from disk. Put host-owned tests under a host path (e.g. hosts/ocaml/test/); do not churn persist's existing suites.
- Phase 4: blob adapter. Same pattern for blob/put|get|has? backed by a content-addressed directory; persist stores only the ref.
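The Phase 1 invariant (last-seq stored separately from rows so it survives truncate) can be sketched with Stdlib only. This is an illustration under stated assumptions, not the spec's layout: the file names (.log, .seq), the function names (append, rows, truncate_stream, last_seq), and string-only values are all hypothetical. It also skips fsync and atomic rename, which a real adapter would likely need.

```ocaml
(* Hypothetical layout: <dir>/<stream>.log holds rows,
   <dir>/<stream>.seq holds the high-water mark. *)
let log_path dir stream = Filename.concat dir (stream ^ ".log")
let seq_path dir stream = Filename.concat dir (stream ^ ".seq")

let read_file path =
  if not (Sys.file_exists path) then ""
  else begin
    let ic = open_in_bin path in
    let s = really_input_string ic (in_channel_length ic) in
    close_in ic;
    s
  end

let write_file path contents =
  let oc = open_out_bin path in
  output_string oc contents;
  close_out oc

(* The counter lives in its own file, so clearing rows cannot reset it. *)
let last_seq dir stream =
  match int_of_string_opt (String.trim (read_file (seq_path dir stream))) with
  | Some n -> n
  | None -> 0

let append dir stream value =
  let seq = last_seq dir stream + 1 in
  (* Reserve the number first: a crash here leaves a gap, never a reuse. *)
  write_file (seq_path dir stream) (string_of_int seq);
  let flags = [ Open_wronly; Open_append; Open_creat; Open_binary ] in
  let oc = open_out_gen flags 0o644 (log_path dir stream) in
  (* String.escaped keeps tabs and newlines inside the value off the wire. *)
  Printf.fprintf oc "%d\t%s\n" seq (String.escaped value);
  close_out oc;
  seq

let rows dir stream =
  read_file (log_path dir stream)
  |> String.split_on_char '\n'
  |> List.filter (fun line -> line <> "")
  |> List.map (fun line ->
         match String.index_opt line '\t' with
         | None -> failwith "corrupt row"
         | Some i ->
           let seq = int_of_string (String.sub line 0 i) in
           let raw = String.sub line (i + 1) (String.length line - i - 1) in
           (seq, Scanf.unescaped raw))

(* Drops rows only; the .seq file is deliberately untouched. *)
let truncate_stream dir stream = write_file (log_path dir stream) ""

let () =
  Random.self_init ();
  let dir = Filename.get_temp_dir_name () in
  let stream = Printf.sprintf "demo-%d" (Random.bits ()) in
  assert (append dir stream "a\tb" = 1);
  assert (append dir stream "second" = 2);
  assert (rows dir stream = [ (1, "a\tb"); (2, "second") ]);
  truncate_stream dir stream;
  assert (rows dir stream = []);
  assert (last_seq dir stream = 2);
  assert (append dir stream "third" = 3);
  Sys.remove (log_path dir stream);
  Sys.remove (seq_path dir stream)
```

Values here are plain strings; the spec's "round-trip structurally" requirement would need a real serializer for SX values in place of String.escaped.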
Every iteration: implement → build → test → commit (short factual message) →
push → update plans/persist-on-sx.md (tick the Blocker toward CLOSED, append a
dated Progress-log line, newest first) → next.
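For the Phase 4 blob adapter in the queue above, a content-addressed directory can be sketched as follows. The function names (blob_put, blob_get, blob_has) and the flat one-file-per-ref layout are hypothetical. Stdlib's Digest (MD5) is used only as a stand-in hash to keep the example dependency-free; which hash the adapter should really use is a question for the spec.

```ocaml
(* Hypothetical layout: one file per blob, named by the hex digest of its bytes. *)
let blob_path dir key = Filename.concat dir key

(* Store bytes and return their ref. Writing the same bytes twice is a no-op. *)
let blob_put dir bytes =
  let key = Digest.to_hex (Digest.string bytes) in (* MD5: stand-in only *)
  let path = blob_path dir key in
  if not (Sys.file_exists path) then begin
    let oc = open_out_bin path in
    output_string oc bytes;
    close_out oc
  end;
  key

let blob_has dir key = Sys.file_exists (blob_path dir key)

let blob_get dir key =
  if not (blob_has dir key) then None
  else begin
    let ic = open_in_bin (blob_path dir key) in
    let s = really_input_string ic (in_channel_length ic) in
    close_in ic;
    Some s
  end

let () =
  let dir = Filename.get_temp_dir_name () in
  let key = blob_put dir "hello blob" in
  (* Same content gives the same ref, so persist only needs to store the ref. *)
  assert (blob_put dir "hello blob" = key);
  assert (blob_has dir key);
  assert (blob_get dir key = Some "hello blob");
  Sys.remove (blob_path dir key);
  assert (not (blob_has dir key))
```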
Ground rules (hard)
- Build is your job (unlike the persist loop). But build only in this worktree's _build via dune from /root/rose-ash-loops/host-persist. NEVER overwrite the shared binary at /root/rose-ash/hosts/ocaml/_build/default/bin/sx_server.exe: every sibling loop uses it; clobbering it breaks them all. Point acceptance tests at the worktree binary (hosts/ocaml/_build/default/bin/sx_server.exe inside this worktree).
- First build is slow (full OCaml). The sx_build MCP tool has a ~600s watchdog that may kill it; prefer dune build bin/sx_server.exe (or @all) run via Bash with run_in_background: true and a long timeout, then poll.
- NEVER pkill sx_server: siblings share the process/binary. Start your own server on a throwaway path/port for restart tests and stop only that PID; bound every run with timeout.
- Scope: hosts/**, host-owned test files, and the Blocker entry + Progress log in plans/persist-on-sx.md. Do not modify lib/persist/** source (the persist loop owns it; its API is your contract, not your code). If you need an upstream change, leave a note in the Blocker entry.
- Determinism: replay from disk must equal the in-memory semantics; same log → same state.
- Commits: one feature per commit; push to origin/loops/host-persist.
- SX files: sx-tree MCP tools ONLY, file: not path:, sx_validate after edits. (Most of your work is OCaml; edit those with normal tools.)
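The determinism rule ("same log → same state") is easiest to keep when the live path and the recovery path share one transition function. A minimal sketch, assuming a hypothetical key/value entry type (Set, Delete) that is not taken from the spec:

```ocaml
module SMap = Map.Make (String)

(* Hypothetical log entries; the real row shape is defined by the spec. *)
type entry = Set of string * string | Delete of string

(* Single transition function used by both the live path and recovery. *)
let apply state = function
  | Set (k, v) -> SMap.add k v state
  | Delete k -> SMap.remove k state

(* Recovery: fold the on-disk log from empty state. *)
let replay log = List.fold_left apply SMap.empty log

let () =
  let log = [ Set ("a", "1"); Set ("b", "2"); Delete "a"; Set ("b", "3") ] in
  (* Live path: state is updated one op at a time as writes arrive. *)
  let live = ref SMap.empty in
  List.iter (fun e -> live := apply !live e) log;
  let recovered = replay log in
  assert (SMap.equal String.equal !live recovered);
  assert (SMap.bindings recovered = [ ("b", "3") ])
```

If recovery ever needs logic that the live path lacks (or the reverse), that is a sign the two can drift apart.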
Definition of done
The Blocker entry flips to CLOSED: persist/durable-backend writes land on
disk, survive a real server restart, and the durable + recovery acceptance suites
are green against the worktree-built binary. At that point a subsystem migrated
per lib/persist/examples/acl.sx is genuinely durable.
Go. Read the Blocker spec; reproduce the gap; build the storage module.