persist: crash/restart recovery integration + migration notes — Phase 4 complete
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 37s

recovery.sx: 6-test end-to-end crash/restart of an order ledger (log +
subscription kv read model + snapshot + compaction + invoice blob ref) on the
durable backend; everything survives a restart over the same disk + content
store, seq continues, two restarts converge. Migration notes (mem → durable
under a live subsystem) added to the plan. Roadmap done, 111/111.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-06-06 19:14:01 +00:00
parent 1c7b602978
commit 4be6988963
5 changed files with 180 additions and 8 deletions

View File

@@ -42,7 +42,7 @@ read models (feeds, indices, audit logs) update incrementally.
## Status (rolling)
`bash lib/persist/conformance.sh`**105/105** (Phases 13 done, Phase 4 in progress)
`bash lib/persist/conformance.sh`**111/111** (Phases 14 complete)
## Ground rules
@@ -105,14 +105,58 @@ lib/persist/backend.sx lib/persist/api.sx
## Phase 4 — Durable backends via kernel IO
- [x] file/log backend driven through `perform` (IO-suspension boundary)
- [x] blob backend interface (store ref/CID; bytes live in artdag/IPFS)
- [ ] crash/restart replay test (mock IO platform)
- [ ] migration notes for swapping mem → durable under a live subsystem
- [x] crash/restart replay test (mock IO platform)
- [x] migration notes for swapping mem → durable under a live subsystem
### Migration notes — mem → durable under a live subsystem
The facet API takes the backend as its first argument and never names a concrete
backend, so swapping storage is a one-line change at the open site:
```
(persist/open) ; in-memory (test / ephemeral)
(persist/mock-durable (persist/mem-backend)); durable protocol, in-process disk
(persist/durable-backend) ; production: ops cross perform → host
```
Everything above the backend — `append`/`read`/`project`/`subscribe`/`snapshot`
/`compact` — is byte-identical across all three. A subsystem migrates by:
1. **Pick the seam.** The subsystem holds one backend value (today an in-memory
list). Replace its construction with `persist/open`/`durable-backend`; leave
every call site untouched.
2. **Backfill.** For an existing in-memory store, replay its current state into
the durable backend once (append historical events / `kv-put` current
values) before cutting reads over. New writes go to durable from then on.
3. **Read models rebuild themselves.** A projection is pure `(fold step seed)`;
after cutover, `persist/replay` (snapshot + tail) reconstructs every read
model from the durable log — no bespoke migration of derived state.
4. **Blobs first, by reference.** Move large payloads into the content store and
store only `persist/blob-ref`s; the log/kv stay small, so the backfill in (2)
never copies bytes.
5. **Concurrency is already handled.** Two writers racing a stream get a
`persist/conflict?` result, not corruption — the same on mem or durable, so
no new code is needed at cutover.
The only behavioural difference durable introduces is that each op crosses the
kernel IO-suspension boundary (`perform`): under the real kernel the call
suspends and the host resumes it transparently, so the facet code is unaware.
Tests prove this by routing the identical request shapes through `persist/serve`
over an in-process disk (the mock-IO harness).
## Consumers (post-foundation, not in scope here)
feed/-log, flow store, mod/audit, search index, acl grants, identity sessions all
become `persist` log or kv. Track each migration in that subsystem's plan.
## Progress log
- **Phase 4c+4d (111/111) — Phase 4 complete, roadmap done.** `recovery.sx` — a
6-test crash/restart integration: an order ledger (event log + subscription
kv read model + snapshot + compaction + invoice blob ref) over the durable
backend, where "crash" drops every in-process object and "restart" rebuilds
over the same disk + content store. Log, read model, snapshot, compacted
replay, and blob ref all survive; seq continues; two restarts converge
(determinism). Migration notes (mem → durable under a live subsystem) added
inline above.
- **Phase 4b (105/105).** `blob.sx` — large objects stay out of persist. A blob
ref is `{:cid :size :mime}`; the blob store is a SEPARATE injected dependency
(`persist/blob-io` over an injectable transport, perform in prod / mock