persist: crash/restart recovery integration + migration notes — Phase 4 complete
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 37s
recovery.sx: 6-test end-to-end crash/restart of an order ledger (log + subscription kv read model + snapshot + compaction + invoice blob ref) on the durable backend; everything survives a restart over the same disk + content store, seq continues, two restarts converge. Migration notes (mem → durable under a live subsystem) added to the plan. Roadmap done, 111/111. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -13,7 +13,7 @@ if [ ! -x "$SX_SERVER" ]; then
   exit 1
 fi

-SUITES=(event log kv project subscribe concurrency snapshot compaction durable blob)
+SUITES=(event log kv project subscribe concurrency snapshot compaction durable blob recovery)

 OUT_JSON="lib/persist/scoreboard.json"
 OUT_MD="lib/persist/scoreboard.md"

@@ -9,9 +9,10 @@
     "snapshot": {"pass": 11, "fail": 0},
     "compaction": {"pass": 11, "fail": 0},
     "durable": {"pass": 15, "fail": 0},
-    "blob": {"pass": 14, "fail": 0}
+    "blob": {"pass": 14, "fail": 0},
+    "recovery": {"pass": 6, "fail": 0}
   },
-  "total_pass": 105,
+  "total_pass": 111,
   "total_fail": 0,
-  "total": 105
+  "total": 111
 }

@@ -14,4 +14,5 @@ _Generated by `lib/persist/conformance.sh`_
 | compaction | 11 | 0 | 11 |
 | durable | 15 | 0 | 15 |
 | blob | 14 | 0 | 14 |
-| **Total** | **105** | **0** | **105** |
+| recovery | 6 | 0 | 6 |
+| **Total** | **111** | **0** | **111** |

126
lib/persist/tests/recovery.sx
Normal file
@@ -0,0 +1,126 @@
+; Phase 4 — crash/restart integration. A whole subsystem (an order ledger:
+; event log + a kv read model kept by a subscription + a periodic snapshot + an
+; invoice blob ref) on the durable backend must survive a restart. "Crash" =
+; drop every in-process object (backend, hub, projections); "restart" = rebuild
+; them over the SAME disk + blob store. Nothing but the disk and content store
+; carries across, exactly as a real process restart.
+
+(define rec-count (fn (acc e) (+ acc 1)))
+
+(persist-test
+  "log survives restart and seq continues"
+  (let
+    ((disk (persist/mem-backend)))
+    (begin
+      (let
+        ((db (persist/mock-durable disk)))
+        (begin
+          (persist/append db "orders" "placed" 0 {:id "a"})
+          (persist/append db "orders" "placed" 1 {:id "b"})))
+      (let
+        ((db2 (persist/mock-durable disk)))
+        (list
+          (persist/project-fold db2 "orders" rec-count 0)
+          (persist/event-seq
+            (persist/append db2 "orders" "placed" 2 {:id "c"}))))))
+  (list 2 3))
+(persist-test
+  "subscription-driven kv read model survives restart"
+  (let
+    ((disk (persist/mem-backend)))
+    (begin
+      (let
+        ((h (persist/hub (persist/mock-durable disk))))
+        (begin
+          (persist/subscribe
+            h
+            "orders"
+            (fn
+              (bk s e)
+              (persist/kv-update
+                bk
+                "order-count"
+                0
+                (fn (n) (+ n 1)))))
+          (persist/publish h "orders" "placed" 0 {})
+          (persist/publish h "orders" "placed" 1 {})))
+      (let
+        ((db2 (persist/mock-durable disk)))
+        (persist/kv-get db2 "order-count"))))
+  2)
+(persist-test
+  "snapshot taken before crash drives replay after restart"
+  (let
+    ((disk (persist/mem-backend)))
+    (begin
+      (let
+        ((db (persist/mock-durable disk)))
+        (begin
+          (persist/append db "orders" "placed" 0 {})
+          (persist/append db "orders" "placed" 1 {})
+          (persist/checkpoint db "orders" "count" rec-count 0)
+          (persist/append db "orders" "placed" 2 {})))
+      (let
+        ((db2 (persist/mock-durable disk)))
+        (equal?
+          (persist/project-value
+            (persist/replay db2 "orders" "count" rec-count 0))
+          (persist/project-fold db2 "orders" rec-count 0)))))
+  true)
+(persist-test
+  "compacted log still replays correctly after restart"
+  (let
+    ((disk (persist/mem-backend)))
+    (begin
+      (let
+        ((db (persist/mock-durable disk)))
+        (begin
+          (persist/append db "orders" "placed" 0 {})
+          (persist/append db "orders" "placed" 1 {})
+          (persist/append db "orders" "placed" 2 {})
+          (persist/compact db "orders" "count" rec-count 0)
+          (persist/append db "orders" "placed" 3 {})))
+      (let
+        ((db2 (persist/mock-durable disk)))
+        (persist/project-value
+          (persist/replay db2 "orders" "count" rec-count 0)))))
+  4)
+(persist-test
+  "invoice blob ref survives restart, bytes fetched from content store"
+  (let
+    ((disk (persist/mem-backend)) (store (persist/mem-backend)))
+    (begin
+      (let
+        ((db (persist/mock-durable disk)) (blob (persist/mock-blob store)))
+        (persist/kv-put
+          db
+          "invoice"
+          (persist/blob-store blob "INVOICEPDF" "application/pdf")))
+      (let
+        ((db2 (persist/mock-durable disk))
+          (blob2 (persist/mock-blob store)))
+        (persist/blob-fetch blob2 (persist/kv-get db2 "invoice")))))
+  "INVOICEPDF")
+(persist-test
+  "two independent restarts converge to the same state (determinism)"
+  (let
+    ((disk (persist/mem-backend)))
+    (begin
+      (let
+        ((db (persist/mock-durable disk)))
+        (begin
+          (persist/append db "orders" "placed" 0 {})
+          (persist/append db "orders" "placed" 1 {})
+          (persist/append db "orders" "placed" 2 {})))
+      (equal?
+        (persist/project-fold
+          (persist/mock-durable disk)
+          "orders"
+          rec-count
+          0)
+        (persist/project-fold
+          (persist/mock-durable disk)
+          "orders"
+          rec-count
+          0))))
+  true)
@@ -42,7 +42,7 @@ read models (feeds, indices, audit logs) update incrementally.

 ## Status (rolling)

-`bash lib/persist/conformance.sh` → **105/105** (Phases 1–3 done, Phase 4 in progress)
+`bash lib/persist/conformance.sh` → **111/111** (Phases 1–4 complete)

 ## Ground rules

@@ -105,14 +105,58 @@ lib/persist/backend.sx lib/persist/api.sx
 ## Phase 4 — Durable backends via kernel IO
 - [x] file/log backend driven through `perform` (IO-suspension boundary)
 - [x] blob backend interface (store ref/CID; bytes live in artdag/IPFS)
-- [ ] crash/restart replay test (mock IO platform)
-- [ ] migration notes for swapping mem → durable under a live subsystem
+- [x] crash/restart replay test (mock IO platform)
+- [x] migration notes for swapping mem → durable under a live subsystem

+### Migration notes — mem → durable under a live subsystem
+
+The facet API takes the backend as its first argument and never names a concrete
+backend, so swapping storage is a one-line change at the open site:
+
+```
+(persist/open)                               ; in-memory (test / ephemeral)
+(persist/mock-durable (persist/mem-backend)) ; durable protocol, in-process disk
+(persist/durable-backend)                    ; production: ops cross perform → host
+```
+
+Everything above the backend — `append`/`read`/`project`/`subscribe`/`snapshot`
+/`compact` — is byte-identical across all three. A subsystem migrates by:
+
+1. **Pick the seam.** The subsystem holds one backend value (today an in-memory
+   list). Replace its construction with `persist/open`/`durable-backend`; leave
+   every call site untouched.
+2. **Backfill.** For an existing in-memory store, replay its current state into
+   the durable backend once (append historical events / `kv-put` current
+   values) before cutting reads over. New writes go to durable from then on.
+3. **Read models rebuild themselves.** A projection is pure `(fold step seed)`;
+   after cutover, `persist/replay` (snapshot + tail) reconstructs every read
+   model from the durable log — no bespoke migration of derived state.
+4. **Blobs first, by reference.** Move large payloads into the content store and
+   store only `persist/blob-ref`s; the log/kv stay small, so the backfill in (2)
+   never copies bytes.
+5. **Concurrency is already handled.** Two writers racing a stream get a
+   `persist/conflict?` result, not corruption — the same on mem or durable, so
+   no new code is needed at cutover.
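
Reviewer note: the migration steps above (seam, backfill, self-rebuilding read models, conflict on racing writers) can be modelled with a small Python sketch. This is not the sx `persist` API. `MemBackend`, `Conflict`, `append`, `project_fold`, and `backfill` are hypothetical names chosen for illustration, and the sequence numbering (the expected seq equals the current stream length, and the new event gets that length plus one) is an assumption inferred from the first `recovery.sx` test.

```python
# Hypothetical Python model of the migration steps; NOT the sx `persist` API.

class MemBackend:
    """Stand-in backend: an append-only event list plus a kv dict."""
    def __init__(self):
        self.events = []  # tuples of (seq, stream, kind, data)
        self.kv = {}


class Conflict:
    """Returned (not raised) when expected_seq does not match the stream head."""
    def __init__(self, expected, head):
        self.expected = expected
        self.head = head


def stream_events(backend, stream):
    return [e for e in backend.events if e[1] == stream]


def append(backend, stream, kind, expected_seq, data):
    """Facet-style call: backend first, optimistic check on expected_seq."""
    head = len(stream_events(backend, stream))
    if expected_seq != head:
        return Conflict(expected_seq, head)
    event = (head + 1, stream, kind, data)
    backend.events.append(event)
    return event


def project_fold(backend, stream, step, seed):
    """A projection is a pure fold over the stream, so it rebuilds from any log."""
    acc = seed
    for event in stream_events(backend, stream):
        acc = step(acc, event)
    return acc


def backfill(src, dst):
    """Step 2: replay the old store into the new one once, before cutover."""
    for seq, stream, kind, data in src.events:
        result = append(dst, stream, kind, seq - 1, data)
        assert not isinstance(result, Conflict), "destination must start empty"
    dst.kv.update(src.kv)


# Usage: the call sites are identical before and after the swap.
old = MemBackend()
append(old, "orders", "placed", 0, {"id": "a"})
append(old, "orders", "placed", 1, {"id": "b"})
new = MemBackend()  # stands in for the durable backend
backfill(old, new)
count = lambda acc, event: acc + 1
assert project_fold(new, "orders", count, 0) == project_fold(old, "orders", count, 0)
```

Because the facet functions take the backend as their first argument, only the line that constructs `new` would change in a real cutover; the derived count is recomputed from the log rather than migrated.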
+
+The only behavioural difference durable introduces is that each op crosses the
+kernel IO-suspension boundary (`perform`): under the real kernel the call
+suspends and the host resumes it transparently, so the facet code is unaware.
+Tests prove this by routing the identical request shapes through `persist/serve`
+over an in-process disk (the mock-IO harness).
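
Reviewer note: the suspension boundary described above can be illustrated with Python generators, under stated assumptions. This is a model of the idea only, not how the sx kernel implements `perform`. The request dictionaries and the names `durable_append`, `durable_count`, `serve`, and `run` are hypothetical. Each op yields a request instead of touching storage, and a mock host answers it against an in-process disk and resumes the op with the reply.

```python
# Hypothetical model of an IO-suspension boundary; NOT the sx kernel's `perform`.

def durable_append(stream, data):
    """An op that suspends with a request instead of touching storage itself."""
    seq = yield {"op": "append", "stream": stream, "data": data}
    return seq


def durable_count(stream):
    """An op that asks the host how many events the stream holds."""
    count = yield {"op": "count", "stream": stream}
    return count


def serve(disk, request):
    """Mock host: answers one request against an in-process disk (a dict)."""
    log = disk.setdefault(request["stream"], [])
    if request["op"] == "append":
        log.append(request["data"])
        return len(log)
    if request["op"] == "count":
        return len(log)
    raise ValueError(f"unknown op: {request['op']}")


def run(disk, op):
    """Drive a suspended op: pass each request to the host, resume with the reply."""
    try:
        request = next(op)
        while True:
            request = op.send(serve(disk, request))
    except StopIteration as done:
        return done.value


# Usage: only `disk` carries across the simulated restart.
disk = {}
run(disk, durable_append("orders", {"id": "a"}))
run(disk, durable_append("orders", {"id": "b"}))
# "Crash": every generator above is gone. "Restart": new ops over the same disk.
assert run(disk, durable_count("orders")) == 2
```

Swapping `serve` for a function that talks to a real host would leave the op definitions unchanged, which is the property the mock-IO harness is described as checking.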
+
 ## Consumers (post-foundation, not in scope here)
 feed/-log, flow store, mod/audit, search index, acl grants, identity sessions all
 become `persist` log or kv. Track each migration in that subsystem's plan.

 ## Progress log
+- **Phase 4c+4d (111/111) — Phase 4 complete, roadmap done.** `recovery.sx` — a
+  6-test crash/restart integration: an order ledger (event log + subscription
+  kv read model + snapshot + compaction + invoice blob ref) over the durable
+  backend, where "crash" drops every in-process object and "restart" rebuilds
+  over the same disk + content store. Log, read model, snapshot, compacted
+  replay, and blob ref all survive; seq continues; two restarts converge
+  (determinism). Migration notes (mem → durable under a live subsystem) added
+  inline above.
 - **Phase 4b (105/105).** `blob.sx` — large objects stay out of persist. A blob
   ref is `{:cid :size :mime}`; the blob store is a SEPARATE injected dependency
   (`persist/blob-io` over an injectable transport, perform in prod / mock
