persist: Blocker spec for the host durable-storage adapter
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 36s
Document the one gap to real durability: a hosts/ servicer for the persist/* IO ops. Includes the silent-data-loss repro (durable-backend currently no-ops under sx_server's default resolver), the full op contract table, hard invariants (monotonic last-seq, etc.), the blob adapter shape, where to register in sx_server.ml, and an acceptance test (swap transport, run durable + recovery suites against real storage, survive a real restart). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -308,6 +308,90 @@ real `lib/acl`.
compared lists with list/nth.

## Blockers

### OPEN: host durable-storage adapter (the only gap to real durability)

**Owner:** a `hosts/` loop (NOT this one: `lib/persist/**` is the scope fence,
and `sx_build` is forbidden here). **Without it, durable persistence silently
drops all writes.**

**Symptom / minimal repro.** `persist/durable-backend` performs
`{:op "persist/..." :args (...)}` for every storage op. Under `sx_server.exe`
the kernel's default IO resolver answers unknown ops with `nil`, so the durable
backend does not error; it *silently no-ops*:

```
; load event/backend/log/durable, then:
(let ((b (persist/durable-backend)))
  (begin (persist/append b "s" "x" 0 {})
         (persist/append b "s" "x" 0 {})
         (list (persist/event-seq (persist/append b "s" "x" 0 {}))
               (persist/count b "s")
               (persist/read b "s"))))
; => (1 0 nil)   ; every append gets seq 1, nothing stored, reads empty: DATA LOSS
```

The in-memory backend (`persist/open`) is correct and complete; this gap is
*only* the production transport.

**What to build.** A host servicer that answers the `persist/*` IO ops against a
real store (sqlite/files/pg). It is the production twin of `persist/serve`
(`lib/persist/durable.sx`), with the same op names and the same
request/response shapes, so mirror that function and back it with durable
storage instead of a mem-backend.

**Op contract** (request `{:op :args}` → response). `args` is a positional list;
events are dicts `{:stream :seq :type :at :data}`:

| op | args | returns | semantics |
|----|------|---------|-----------|
| `persist/append` | `(stream event)` | (ignored) | store `event` in `stream` |
| `persist/read` | `(stream)` | event list (oldest-first) | currently-stored events |
| `persist/last-seq` | `(stream)` | number | **monotonic high-water mark** (see below) |
| `persist/streams` | `()` | stream-name list | every stream ever appended to |
| `persist/truncate` | `(stream n)` | (ignored) | drop events with `seq <= n` |
| `persist/kv-get` | `(key)` | value or nil | look up `key` |
| `persist/kv-put` | `(key val)` | (ignored) | upsert |
| `persist/kv-delete` | `(key)` | (ignored) | remove key |
| `persist/kv-has?` | `(key)` | boolean | whether `key` is present |
| `persist/kv-keys` | `()` | key list | all stored keys |

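The request/response shape in the table can be sketched as a small dispatcher. This is a minimal illustration for the Python-bridge option, not the real bridge API: `make_servicer`, the plain-dict store, and the string-keyed request dicts are all hypothetical stand-ins, and only the key/value ops are shown. Raising on an unknown op is an assumption made here so that a missing handler is loud rather than silent.

```python
from typing import Any, Callable, Dict, List


def make_servicer(kv: Dict[str, Any]) -> Callable[[Dict[str, Any]], Any]:
    """Build a servicer over `kv`, a plain dict standing in for durable storage."""

    def kv_put(key: str, val: Any) -> None:
        kv[key] = val  # upsert: insert or overwrite

    def kv_delete(key: str) -> None:
        kv.pop(key, None)  # removing a missing key is not an error here

    # Op names are the contract; everything else in this sketch is hypothetical.
    handlers: Dict[str, Callable[..., Any]] = {
        "persist/kv-get": lambda key: kv.get(key),
        "persist/kv-put": kv_put,
        "persist/kv-delete": kv_delete,
        "persist/kv-has?": lambda key: key in kv,
        "persist/kv-keys": lambda: list(kv),
    }

    def serve(request: Dict[str, Any]) -> Any:
        op = request["op"]
        args: List[Any] = list(request.get("args", []))  # positional list
        handler = handlers.get(op)
        if handler is None:
            # Assumption: fail loudly instead of answering nil for unknown ops.
            raise LookupError(f"unhandled IO op: {op}")
        return handler(*args)

    return serve
```

A caller would send `{"op": "persist/kv-put", "args": ["k", {"a": [1, 2]}]}` and later read the same structure back with `persist/kv-get`.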
**Hard invariants** (the facets above rely on these; mem-backend + `persist/serve`
are the reference):

1. **`last-seq` is a per-stream monotonic counter, NOT the row count.** It must
   keep climbing after `truncate`, so a compacted stream never reassigns a seq.
   Store the counter separately from the rows.
2. Seqs are assigned upstream, and only by `append` (`log.sx` does
   `last-seq + 1`); the host must store each event's `:seq` as given and must
   not renumber.
3. `read` returns events in append order with `:seq` intact (post-truncate it
   returns only the surviving tail).
4. `streams` is the set of streams that ever had an append (it survives full
   compaction). Key it off the seq counters, like mem-backend's `seqs`.
5. Values round-trip structurally: dicts, lists, numbers, strings, nil, and
   booleans come out exactly as they went in (event `:data`, kv values, blob
   refs).

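Invariants 1 to 4 can be illustrated with a small in-memory model that keeps the seq counters apart from the rows. This is a sketch under assumptions, not the mem-backend itself: `StreamStore` and `log_append` are hypothetical names, event keys are plain strings, and an unknown stream is assumed to report a `last-seq` of 0.

```python
from typing import Any, Dict, List


class StreamStore:
    """Hypothetical model: rows and seq counters live in separate maps."""

    def __init__(self) -> None:
        self.rows: Dict[str, List[Dict[str, Any]]] = {}
        self.seqs: Dict[str, int] = {}  # high-water marks, never lowered

    def append(self, stream: str, event: Dict[str, Any]) -> None:
        # Store the event as given; the seq was assigned upstream.
        self.rows.setdefault(stream, []).append(event)
        self.seqs[stream] = max(self.seqs.get(stream, 0), event["seq"])

    def read(self, stream: str) -> List[Dict[str, Any]]:
        return list(self.rows.get(stream, []))  # append order, oldest first

    def last_seq(self, stream: str) -> int:
        return self.seqs.get(stream, 0)  # assumption: 0 for unknown streams

    def streams(self) -> List[str]:
        return list(self.seqs)  # keyed off counters, so it survives compaction

    def truncate(self, stream: str, n: int) -> None:
        # Drop rows with seq <= n; the counter in self.seqs is left untouched.
        if stream in self.rows:
            self.rows[stream] = [e for e in self.rows[stream] if e["seq"] > n]


def log_append(store: StreamStore, stream: str, data: Any) -> Dict[str, Any]:
    """Stand-in for the upstream log layer, which assigns last-seq + 1."""
    event = {"stream": stream, "seq": store.last_seq(stream) + 1,
             "type": "x", "at": 0, "data": data}
    store.append(stream, event)
    return event
```

Because `truncate` never touches `seqs`, three appends followed by `truncate("s", 2)` leave only seq 3 readable, and the next append is assigned seq 4 rather than reusing an earlier number.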
**Blobs** are a *separate* adapter with the same pattern: ops `blob/put`
`(bytes mime)` → cid, `blob/get` `(cid)` → bytes, `blob/has?` `(cid)` → bool
(see `lib/persist/blob.sx` / `persist/blob-serve`). Back it with the
content-addressed store (artdag/IPFS); persist only ever stores the returned ref.

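The content-addressed pattern behind the three blob ops can be sketched as follows. Everything here is illustrative: `BlobStore` is a hypothetical in-memory stand-in, and the `sha256:` prefixed hex digest is an assumed id format, not the CID scheme that artdag or IPFS actually use. Returning `None` for an unknown cid is also an assumption.

```python
import hashlib
from typing import Dict, Optional, Tuple


class BlobStore:
    """Hypothetical in-memory stand-in for a content-addressed store."""

    def __init__(self) -> None:
        self._blobs: Dict[str, Tuple[bytes, str]] = {}  # cid -> (bytes, mime)

    def put(self, data: bytes, mime: str) -> str:
        # The id is derived from the content, so identical bytes share one id.
        cid = "sha256:" + hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(cid, (data, mime))
        return cid

    def get(self, cid: str) -> Optional[bytes]:
        entry = self._blobs.get(cid)
        return entry[0] if entry is not None else None

    def has(self, cid: str) -> bool:
        return cid in self._blobs
```

The persist layer would keep only the string returned by `put`, for example inside an event's `:data`, and fetch the bytes on demand with `get`.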
**Where to register.** `hosts/ocaml/bin/sx_server.ml`:

- the in-process resolver `Sx_types._cek_io_resolver` (~line 3864): add a
  `"persist/..."` match arm dispatching to the new storage module (used by
  SSR/`eval_with_io`); and/or
- the bridge path in `cek_run_with_io` (~line 528–576), which currently forwards
  unknown ops via `io_request op args` to the external bridge; a Python-bridge
  handler is the alternative home if storage lives Python-side.

Pick one home; the op names are the contract, not the location.

**Acceptance test.** Swap the transport: point a `persist/io-backend` at the new
host servicer (instead of `persist/serve` over a mem disk) and run the existing
`durable` + `recovery` suites. They must stay green, and state must survive an
actual process restart (kill the server, restart, replay → recovered). That is
exactly what `lib/persist/tests/durable.sx` and `recovery.sx` already assert
against the mock; the host adapter just makes the disk real.

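The restart half of the acceptance test can be approximated with a file-backed store that is closed and reopened. This is a sketch only: `SqliteStreams`, its two-table schema, JSON-encoded event bodies, and `restart_roundtrip` are assumptions for illustration, and reopening the file within one process stands in for a real kill and restart of the server.

```python
import json
import os
import sqlite3
import tempfile
from typing import Any, Dict, List, Tuple


class SqliteStreams:
    """Hypothetical durable stream store; counters live in their own table."""

    def __init__(self, path: str) -> None:
        self.db = sqlite3.connect(path)
        with self.db:  # commits on success
            self.db.execute(
                "CREATE TABLE IF NOT EXISTS events "
                "(stream TEXT, seq INTEGER, body TEXT, PRIMARY KEY (stream, seq))")
            self.db.execute(
                "CREATE TABLE IF NOT EXISTS seqs "
                "(stream TEXT PRIMARY KEY, last_seq INTEGER)")

    def last_seq(self, stream: str) -> int:
        row = self.db.execute(
            "SELECT last_seq FROM seqs WHERE stream = ?", (stream,)).fetchone()
        return row[0] if row else 0  # assumption: 0 for unknown streams

    def append(self, stream: str, event: Dict[str, Any]) -> None:
        high = max(self.last_seq(stream), event["seq"])
        with self.db:  # row and counter are written in one transaction
            self.db.execute("INSERT INTO events VALUES (?, ?, ?)",
                            (stream, event["seq"], json.dumps(event)))
            self.db.execute("INSERT OR REPLACE INTO seqs VALUES (?, ?)",
                            (stream, high))

    def read(self, stream: str) -> List[Dict[str, Any]]:
        rows = self.db.execute(
            "SELECT body FROM events WHERE stream = ? ORDER BY seq", (stream,))
        return [json.loads(body) for (body,) in rows]

    def truncate(self, stream: str, n: int) -> None:
        with self.db:  # only rows are deleted; the counter table is untouched
            self.db.execute(
                "DELETE FROM events WHERE stream = ? AND seq <= ?", (stream, n))

    def close(self) -> None:
        self.db.close()


def restart_roundtrip() -> Tuple[int, List[int]]:
    path = os.path.join(tempfile.mkdtemp(), "persist.db")
    first = SqliteStreams(path)
    for seq in (1, 2, 3):
        first.append("s", {"stream": "s", "seq": seq, "type": "x",
                           "at": 0, "data": {}})
    first.truncate("s", 2)
    first.close()  # stands in for killing the server process
    second = SqliteStreams(path)  # stands in for the restarted server
    result = (second.last_seq("s"), [e["seq"] for e in second.read("s")])
    second.close()
    return result
```

After the simulated restart, the counter still reads 3 and only the surviving tail (seq 3) comes back, which is the state the `durable` and `recovery` suites would need to observe against a real host adapter.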
---

- **Phase 4 perform-suspension not exercised end-to-end under sx_server.exe (by
  design, not a bug).** The CEK suspension primitives (`cek-step-loop`,
  `cek-resume`, `cek-suspended?`, `cek-io-request`) and a settable SX-level IO