Merge loops/persist into architecture: persist-on-sx durable substrate
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 58s
The shared durable-state substrate (lib/persist) other subsystems build on: log + kv facets over an injectable backend, projections, subscriptions, snapshots + compaction, optimistic concurrency, a durable backend over the kernel perform IO boundary (blobs by reference), plus extensions (materialized views, kv CAS, stream catalog, query helpers, atomic batch, schema-evolution upcasters, exactly-once append, global commit ordering) and a worked ACL reference migration.

201/201 tests across 20 suites. Durability awaits the host-side storage adapter (tracked in the plan's Blockers; loops/host-persist).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
115
plans/agent-briefings/persist-loop.md
Normal file
@@ -0,0 +1,115 @@
# persist-on-sx loop agent (single agent, queue-driven)

Role: iterates `plans/persist-on-sx.md` forever. **Durable state on the SX kernel**
— the foundation substrate every other subsystem currently fakes with an in-memory
mutable list. Event log (append-only streams) + kv (current-state) over one
injectable backend; pure projections; snapshots; durable IO at the kernel's
`perform` boundary. This is **substrate-level**, not a guest language.

```
description: persist-on-sx queue loop
subagent_type: general-purpose
run_in_background: true
isolation: worktree
```

## Prompt

You are the sole background agent working `plans/persist-on-sx.md`. Isolated
worktree `/root/rose-ash-loops/persist` on branch `loops/persist`, forever, one
commit per feature. Push to `origin/loops/persist` after every commit. Never touch
`main` or `architecture`.

## Restart baseline — check before iterating

1. Read `plans/persist-on-sx.md` — roadmap + Progress log. Note the scope table:
   persist owns the **log** + **kv** facets; blobs are delegated (store the CID,
   not the bytes); cache is out of scope. Do not event-source everything.
2. `ls lib/persist/` — pick up from the most advanced file.
3. If `lib/persist/tests/*.sx` exist, run them via `bash lib/persist/conformance.sh`.
   Green before new work.
4. If `lib/persist/scoreboard.md` exists, that's your baseline.
5. **Learn the substrate before writing durable code.** persist sits on the kernel's
   IO-suspension surface — the third CEK phase: `perform`, `cek-step-loop`,
   `cek-resume`, `make-cek-suspended`. Study how IO is requested and resumed, and
   how `spec/harness.sx` mocks an IO platform for tests (assert-io-*). Phases 1–3
   need NO real IO — the in-memory backend is pure SX. Real durable IO (Phase 4)
   goes through `perform` and is tested against the mock-IO harness, not a real disk.
   Verify the actual exported names with sx_find_all / grep before relying on them.

## The queue

Phase order per `plans/persist-on-sx.md`:

- **Phase 1** — log + kv + in-memory backend (event record, injectable backend
  protocol, append/read, kv get/put/delete, api).
- **Phase 2** — projections (`fold step seed`) + subscriptions; concurrency
  conflict as a real result.
- **Phase 3** — snapshots + replay (checkpoint, replay = snapshot + tail,
  determinism).
- **Phase 4** — durable backend via kernel IO (`perform`), blob-ref interface,
  crash/restart replay against the mock-IO harness.

Within a phase, pick the checkbox that unlocks the most tests per effort.

Every iteration: implement → test → commit → tick `[ ]` → Progress log → next.

## Ground rules (hard)

- **Scope:** only `lib/persist/**` and `plans/persist-on-sx.md`. Do **not** edit
  `spec/`, `hosts/`, `shared/`, or any `lib/<lang>/`. You may **import** the
  kernel's IO-suspension + platform-IO surface only. **Do NOT add host primitives.**
  If a durable IO op you need doesn't exist, it belongs in `hosts/` (out of scope) →
  Blockers entry with a minimal repro, and stop on that item.
- **NEVER call `sx_build`.** 600s watchdog. If the sx_server binary is broken →
  Blockers entry, stop. Run tests by invoking the sx_server binary directly from a
  conformance.sh (model it on an existing one, e.g. `lib/apl/conformance.sh`),
  pointing `SX_SERVER` at `/root/rose-ash/hosts/ocaml/_build/default/bin/sx_server.exe`
  — fresh worktrees have no `_build/`.
- **Determinism:** replay must be pure — same log → same state. No clocks/randomness
  inside projections; timestamps live on the event, passed in.
- **Shared-file issues** → plan's Blockers with minimal repro; don't fix here.
- **SX files:** `sx-tree` MCP tools ONLY. **They take `file:` not `path:`** — a
  wrong key yields `Yojson Type_error("Expected string, got null")`, which looks
  like a broken binary but is just a param mismatch. `sx_validate` after edits.
  Path-based edits (`sx_replace_node`) count comment headers in their indices and
  can clobber the wrong node — re-read after, or prefer `sx_write_file` for small
  files.
- **Unicode in `.sx`:** raw UTF-8 only, never `\uXXXX` escapes.
- **Commit granularity:** one feature per commit. Short factual messages
  (`persist: kv facet get/put/delete + 6 tests`). Push to `origin/loops/persist`.
- **Plan file:** update Progress log (newest first) + tick boxes every commit.

## persist-specific gotchas

- **Two facets, not one.** Don't force current-state values (a stock count, a
  config value, a session blob) through the event log — that's the kv facet. Event
  log is for things whose *history* matters.
- **Backend is injected.** The in-memory backend is the test default; never hardwire
  it. Every op goes through the backend protocol so file/pg/ipfs swap in unchanged.
- **Optimistic concurrency is a real result.** A conflicting append returns a
  conflict value the caller can retry on — not a crash, not a silent overwrite.
- **Blobs by reference only.** persist stores a content-address/CID + metadata. The
  bytes live in a content-addressed store (artdag/IPFS). Never put large payloads in
  the log.
- **Replay determinism is the headline property.** Snapshot + tail must equal full
  replay. Test it explicitly, both directions.
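
The "conflict as a real result" rule can be illustrated outside SX. The following is a minimal Python sketch, not the `lib/persist` implementation: `MemLog`, `append_expect`, and `is_conflict` are hypothetical stand-ins that only mirror the idea of returning a conflict value (with `expected` and `actual`) instead of raising or overwriting.

```python
class MemLog:
    """Hypothetical in-memory log; `seqs` is the per-stream high-water mark."""

    def __init__(self):
        self.rows = {}  # stream -> list of stored events
        self.seqs = {}  # stream -> last assigned seq

    def last_seq(self, stream):
        return self.seqs.get(stream, 0)

    def append_expect(self, stream, expected, event_type, data):
        """Append only if the stream is still at `expected`.

        On a mismatch, return a conflict value rather than raising.
        """
        actual = self.last_seq(stream)
        if actual != expected:
            return {"conflict": True, "expected": expected, "actual": actual}
        event = {"stream": stream, "seq": actual + 1,
                 "type": event_type, "data": data}
        self.rows.setdefault(stream, []).append(event)
        self.seqs[stream] = event["seq"]
        return event


def is_conflict(result):
    """True when an append was refused because the stream had moved on."""
    return result.get("conflict") is True


# Two writers read the same position, then race.
log = MemLog()
seen_by_a = log.last_seq("orders")
seen_by_b = log.last_seq("orders")

first = log.append_expect("orders", seen_by_a, "placed", {"id": 1})
second = log.append_expect("orders", seen_by_b, "placed", {"id": 2})

# The loser re-reads the actual position from the conflict value and retries.
if is_conflict(second):
    retried = log.append_expect("orders", second["actual"], "placed", {"id": 2})
```

The losing writer never corrupts the stream: it gets a plain value it can inspect, and the retry lands at the next seq.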

## General gotchas (all loops)

- SX `do` = R7RS iteration. Use `begin` for multi-expr sequences.
- `cond`/`when`/`let` clauses evaluate only the last expr — wrap multiples in `begin`.
- `let` is parallel, not sequential — nest `let`s when a binding references an earlier one.
- `env-bind!` creates a binding; `env-set!` mutates an existing one (walks scope chain).
- `sx_validate` after every structural edit.
- Namespace-prefix all helpers (`persist/...`) — short/host-colliding names get
  silently shadowed or hang the runtime.

## Style

- No comments in `.sx` unless non-obvious.
- No new planning docs — update `plans/persist-on-sx.md` inline.
- Short, factual commit messages.
- One feature per iteration. Commit. Log. Push. Next.

Go. Start by reading the plan; find the first unchecked `[ ]`; implement it.
@@ -42,7 +42,7 @@ read models (feeds, indices, audit logs) update incrementally.

## Status (rolling)

`bash lib/persist/conformance.sh` → **0/0** (not yet started)
`bash lib/persist/conformance.sh` → **201/201** (Phases 1–4 complete + extensions + a reference migration)

## Ground rules

@@ -87,33 +87,325 @@ lib/persist/backend.sx lib/persist/api.sx
```

## Phase 1 — Log + kv + in-memory backend
- [ ] `event.sx` — event record, stream/seq helpers
- [ ] `backend.sx` — injectable protocol + in-memory impl (log + kv)
- [ ] `log.sx` — `append` (optimistic seq), `read`, `read-from`
- [ ] `kv.sx` — `get`/`put`/`delete` current-state
- [ ] `api.sx` + tests + scoreboard + conformance.sh
- [x] `event.sx` — event record, stream/seq helpers
- [x] `backend.sx` — injectable protocol + in-memory impl (log + kv)
- [x] `log.sx` — `append` (optimistic seq), `read`, `read-from`
- [x] `kv.sx` — `get`/`put`/`delete` current-state
- [x] `api.sx` + tests + scoreboard + conformance.sh

## Phase 2 — Projections + subscriptions
- [ ] `project.sx` — `(project stream step seed)`, incremental fold
- [ ] subscription hook — projection / kv read model re-runs on append
- [ ] concurrency conflict surfaced as a real result, not a crash
- [x] `project.sx` — `(project stream step seed)`, incremental fold
- [x] subscription hook — projection / kv read model re-runs on append
- [x] concurrency conflict surfaced as a real result, not a crash

## Phase 3 — Snapshots + replay
- [ ] `snapshot.sx` — checkpoint a projection; replay = snapshot + tail
- [ ] compaction policy; replay-determinism tests
- [x] `snapshot.sx` — checkpoint a projection; replay = snapshot + tail
- [x] compaction policy; replay-determinism tests

## Phase 4 — Durable backends via kernel IO
- [ ] file/log backend driven through `perform` (IO-suspension boundary)
- [ ] blob backend interface (store ref/CID; bytes live in artdag/IPFS)
- [ ] crash/restart replay test (mock IO platform)
- [ ] migration notes for swapping mem → durable under a live subsystem
- [x] file/log backend driven through `perform` (IO-suspension boundary)
- [x] blob backend interface (store ref/CID; bytes live in artdag/IPFS)
- [x] crash/restart replay test (mock IO platform)
- [x] migration notes for swapping mem → durable under a live subsystem

### Migration notes — mem → durable under a live subsystem

The facet API takes the backend as its first argument and never names a concrete
backend, so swapping storage is a one-line change at the open site:

```
(persist/open)                                ; in-memory (test / ephemeral)
(persist/mock-durable (persist/mem-backend))  ; durable protocol, in-process disk
(persist/durable-backend)                     ; production: ops cross perform → host
```

Everything above the backend — `append`/`read`/`project`/`subscribe`/`snapshot`
/`compact` — is byte-identical across all three. A subsystem migrates by:

1. **Pick the seam.** The subsystem holds one backend value (today an in-memory
   list). Replace its construction with `persist/open`/`durable-backend`; leave
   every call site untouched.
2. **Backfill.** For an existing in-memory store, replay its current state into
   the durable backend once (append historical events / `kv-put` current
   values) before cutting reads over. New writes go to durable from then on.
3. **Read models rebuild themselves.** A projection is pure `(fold step seed)`;
   after cutover, `persist/replay` (snapshot + tail) reconstructs every read
   model from the durable log — no bespoke migration of derived state.
4. **Blobs first, by reference.** Move large payloads into the content store and
   store only `persist/blob-ref`s; the log/kv stay small, so the backfill in (2)
   never copies bytes.
5. **Concurrency is already handled.** Two writers racing a stream get a
   `persist/conflict?` result, not corruption — the same on mem or durable, so
   no new code is needed at cutover.

The only behavioural difference durable introduces is that each op crosses the
kernel IO-suspension boundary (`perform`): under the real kernel the call
suspends and the host resumes it transparently, so the facet code is unaware.
Tests prove this by routing the identical request shapes through `persist/serve`
over an in-process disk (the mock-IO harness).
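
To make the "swap the transport, keep the facet code" idea concrete, here is a minimal Python sketch under stated assumptions. It is not the SX code: `serve`, `make_backend`, `append`, and `read` are hypothetical analogues of `persist/serve` and the facet functions, and only three of the ops are modeled. The request shape `{"op": ..., "args": [...]}` follows the op contract described in this plan.

```python
def new_disk():
    """A stand-in for durable storage: rows plus a separate seq counter."""
    return {"rows": {}, "seqs": {}}


def serve(disk, request):
    """Hypothetical reference servicer answering persist/* requests."""
    op, args = request["op"], request["args"]
    if op == "persist/append":
        stream, event = args
        disk["rows"].setdefault(stream, []).append(event)
        disk["seqs"][stream] = event["seq"]
        return None
    if op == "persist/read":
        return list(disk["rows"].get(args[0], []))
    if op == "persist/last-seq":
        return disk["seqs"].get(args[0], 0)
    raise ValueError(f"unknown op: {op}")


def make_backend(transport):
    """A backend is only a transport: request dict in, response out."""
    return {"transport": transport}


def append(backend, stream, event_type, at, data):
    """Facet code: assigns last-seq + 1 and never names a concrete store."""
    send = backend["transport"]
    seq = send({"op": "persist/last-seq", "args": [stream]}) + 1
    event = {"stream": stream, "seq": seq, "type": event_type,
             "at": at, "data": data}
    send({"op": "persist/append", "args": [stream, event]})
    return event


def read(backend, stream):
    return backend["transport"]({"op": "persist/read", "args": [stream]})


# "Crash": drop the in-process backend but keep the disk, then rebuild.
disk = new_disk()
before_crash = make_backend(lambda req: serve(disk, req))
append(before_crash, "s", "x", 0, {})
append(before_crash, "s", "x", 1, {})
del before_crash

after_restart = make_backend(lambda req: serve(disk, req))
recovered = read(after_restart, "s")
third = append(after_restart, "s", "x", 2, {})
```

Because `append` and `read` only see the transport, pointing the same functions at a different servicer would not change them, which is the property the migration steps rely on.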

## Extensions (post-roadmap)
- [x] `view.sx` — materialized views: bundle stream + fold + snapshot name;
  `view-attach` keeps the snapshot current on every publish so `view-peek` is an
  O(1) read. The consumer-facing read-model abstraction (feed indices, audit
  rollups, search counters).

- [x] `kv.sx` CAS — `persist/kv-cas` (compare-and-swap) + `persist/kv-put-new`
  (create-only): atomic current-state updates, conflict as a real value (kv
  analogue of log `append-expect`). For sessions, acl grants, stock counts.

- [x] `catalog.sx` — stream catalog: `persist/streams`/`stream-count`/
  `stream-exists?`/`total-events`. Backend `:streams` op (from seq high-water
  marks, so compacted streams still list), threaded through mem + durable.

- [x] `query.sx` — read-side scans: `read-between` (seq range), `read-since`/
  `read-window` (by `:at`), `read-by-type`, `read-where`, `count-where`. Pure
  reads for audit windows / type filters / since-cursors.

- [x] `batch.sx` — `persist/append-batch` commits a list of `(type at data)`
  specs as one contiguous block; `persist/append-batch-expect` is transactional
  (all-or-nothing guarded by optimistic concurrency). For an order + its line
  items as one commit.

- [x] `upcast.sx` — event schema evolution: register a pure `(event -> event)`
  upcaster per type; `read-upcast`/`project-upcast` lift old events to the
  current shape on read so projections see one shape. Immutable registry;
  `upcast-data` helper merges new `:data` fields. Addresses the schema-evolution
  trap without rewriting history.

- [x] `idempotency.sx` — exactly-once append under retries: `persist/append-once`
  keyed by a caller idempotency key (per stream), returning the same event on a
  repeat. Marker lives in kv, so idempotency holds across restart. `seen?` check.

- [x] `global.sx` — global commit ordering across streams (the primitive feed's
  unified timeline needs). `persist/gappend` records a pointer in a reserved
  `$global` index whose seq is the commit position; `read-global`/
  `project-global` replay every event in commit order; `global-from` for
  incremental consumers. Opt-in (plain `append` never touches it); reserved
  index hidden from the public catalog. Deterministic across restart.
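
As one illustration of the extensions above, the exactly-once append can be sketched in a few lines of Python. This is an assumption-laden model, not `idempotency.sx`: the `Store` class, the `idempotency/<stream>/<key>` marker layout, and storing the whole event as the marker value are all hypothetical choices made for the example. The sketch also does not make the append and the marker write atomic.

```python
class Store:
    """Hypothetical store with both facets: an event log and a kv map."""

    def __init__(self):
        self.log = {}  # stream -> list of events
        self.kv = {}   # current-state facet


def append(store, stream, event_type, data):
    events = store.log.setdefault(stream, [])
    event = {"stream": stream, "seq": len(events) + 1,
             "type": event_type, "data": data}
    events.append(event)
    return event


def marker_key(stream, key):
    # Assumed key layout, chosen only for this sketch.
    return f"idempotency/{stream}/{key}"


def seen(store, stream, key):
    return marker_key(stream, key) in store.kv


def append_once(store, stream, key, event_type, data):
    """Append at most once per (stream, key); repeats return the first event."""
    marker = marker_key(stream, key)
    if marker in store.kv:
        return store.kv[marker]
    event = append(store, stream, event_type, data)
    store.kv[marker] = event  # the marker lives in kv, next to other state
    return event


store = Store()
first = append_once(store, "payments", "req-1", "charged", {"amount": 5})
retry = append_once(store, "payments", "req-1", "charged", {"amount": 5})
```

Keeping the marker in the kv facet is what lets the guarantee outlive a restart once the kv itself is durable.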

## Consumers (post-foundation, not in scope here)
feed/-log, flow store, mod/audit, search index, acl grants, identity sessions all
become `persist` log or kv. Track each migration in that subsystem's plan.

**Reference migration:** `lib/persist/examples/acl.sx` is a worked, tested
template — an ACL-grants store rebuilt on persist (grants/revokes as events,
current set as a projection, O(1) checks via a materialized view, an audit-window
query). It carries an explicit BEFORE (hand-rolled ephemeral map) → AFTER
diff in its header and proves the headline win (grants survive restart) on the
durable backend. Other subsystem loops copy this pattern; it does not touch the
real `lib/acl`.
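
The shape of that migration (grants and revokes as events, the current grant set as a pure fold) can be sketched in Python. This is illustrative only and is not the contents of `acl.sx`: the event types `granted`/`revoked` and the `who`/`what` data fields are hypothetical names chosen for the example.

```python
from functools import reduce


def acl_step(grants, event):
    """Pure step function: (value, event) -> value, with no clocks or IO."""
    pair = (event["data"]["who"], event["data"]["what"])
    if event["type"] == "granted":
        return grants | {pair}
    if event["type"] == "revoked":
        return grants - {pair}
    return grants  # unknown event types leave the state unchanged


def project(events, step, seed):
    """Fold the whole stream from the seed."""
    return reduce(step, events, seed)


events = [
    {"seq": 1, "type": "granted", "data": {"who": "alice", "what": "read"}},
    {"seq": 2, "type": "granted", "data": {"who": "bob", "what": "write"}},
    {"seq": 3, "type": "revoked", "data": {"who": "bob", "what": "write"}},
]

current = project(events, acl_step, frozenset())
allowed = ("alice", "read") in current
```

Because the step function is pure, replaying the same events always yields the same grant set, which is why the read model needs no migration of its own.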

## Progress log
(loop fills this in)
- **Reference migration: acl grants (201/201).** `lib/persist/examples/acl.sx` —
  a worked, in-scope template migrating an ACL-grants store from a hand-rolled
  ephemeral map to persist: grants/revokes as events, current set as a
  projection, O(1) checks via a materialized view, audit via `read-window`.
  Header carries the BEFORE→AFTER diff. 10 tests, incl. grants surviving restart
  on the durable backend (the capability the BEFORE version lacked). The pattern
  other subsystem loops copy.
- **Ext: global commit ordering (191/191).** `global.sx` — `persist/gappend`
  records a pointer in a reserved `$global` index (its seq = global commit
  position); `read-global`/`project-global` resolve pointers to events in commit
  order; `global-from` for incremental global consumers. Opt-in; `$`-streams are
  now reserved + hidden from the public catalog (`streams-all` reveals them).
  Gives feed its cross-stream timeline. 11 tests incl. durable + restart
  determinism.
- **Ext: exactly-once append (180/180).** `idempotency.sx` —
  `persist/append-once` appends at most once per (stream, idempotency key),
  returning the same event on a repeat; the marker lives in kv so it survives
  restart (verified on durable). `persist/seen?` check. 9 tests.
- **Ext: event schema evolution (171/171).** `upcast.sx` — per-type pure
  `(event -> event)` upcasters in an immutable registry; `read-upcast`/
  `project-upcast` lift legacy events to the current shape on read so
  projections never branch on version. `upcast-data` merges new `:data` fields
  keeping stream/seq/type/at. 9 tests incl. mixed old/new + durable.
- **Ext: atomic batch append (162/162).** `batch.sx` — `persist/append-batch`
  commits `(type at data)` specs as one contiguous block (real cons-list, in
  order); `persist/append-batch-expect` checks the stream is still at expected
  before writing any event, so the batch is all-or-nothing under a concurrent
  writer. 10 tests incl. conflict-writes-nothing + durable.
- **Ext: read-side query helpers (152/152).** `query.sx` — `read-between` (seq
  range), `read-since`/`read-window` (by `:at`), `read-by-type`, `read-where`,
  `count-where`. Pure scans over `persist/read`; for ad-hoc relational queries
  consumers still project into a kv read model. 9 tests incl. durable.
- **Ext: stream catalog (143/143).** New backend op `:streams` (keys of the seq
  high-water-mark dict, threaded through mem-backend + durable serve/io-backend)
  so fully-compacted streams still enumerate. `catalog.sx`:
  `persist/streams`/`stream-count`/`stream-exists?`/`total-events`. 10 tests
  incl. durable + restart.
- **Ext: kv compare-and-swap (133/133).** `persist/kv-cas` sets a key only if
  its current value equals expected, else returns `{:conflict :expected
  :actual}`; `persist/kv-put-new` is create-only. The kv analogue of log
  `append-expect` — atomic current-state for sessions/acl/stock. 11 tests incl.
  racer + retry + durable backend.
- **Ext: materialized views (122/122).** `view.sx` — `persist/view` bundles
  stream + step + seed + snapshot name; `view-attach` subscribes it to a hub so
  every publish refreshes the snapshot incrementally; `view-peek` is then an
  O(1) current read (no fold), `view-value` always folds the tail so it's never
  stale. 11 tests incl. on durable backend + a sum-over-data view.
- **Phase 4c+4d (111/111) — Phase 4 complete, roadmap done.** `recovery.sx` — a
  6-test crash/restart integration: an order ledger (event log + subscription
  kv read model + snapshot + compaction + invoice blob ref) over the durable
  backend, where "crash" drops every in-process object and "restart" rebuilds
  over the same disk + content store. Log, read model, snapshot, compacted
  replay, and blob ref all survive; seq continues; two restarts converge
  (determinism). Migration notes (mem → durable under a live subsystem) added
  inline above.
- **Phase 4b (105/105).** `blob.sx` — large objects stay out of persist. A blob
  ref is `{:cid :size :mime}`; the blob store is a SEPARATE injected dependency
  (`persist/blob-io` over an injectable transport, perform in prod / mock
  content store in tests). `persist/blob-store` puts bytes and returns ONLY the
  ref; `persist/blob-fetch` retrieves bytes via the ref. Mock store is
  content-addressed (same bytes dedupe). 14 tests assert the invariant: a ref in
  the log/kv carries the CID, never the bytes (`has-key? :bytes` is false).
- **Phase 4a (91/91).** `durable.sx` — a backend whose every op crosses the
  kernel IO boundary via `(perform {:op "persist/..." :args (...)})`. The
  transport is injectable: `persist/durable-backend` uses the kernel's
  `perform` (suspends; host resumes); `persist/mock-durable` uses
  `persist/serve` over an in-memory disk. `persist/serve` is the reference host
  + the mock-IO harness. Because the request shapes are identical, the ENTIRE
  facet stack (log/kv/project/snapshot/compaction) runs unchanged on
  mock-durable — verified. Crash/restart (drop backend, keep disk) recovers log
  + kv + snapshot by replay; seq counter continues. 15 tests. See Blockers for
  why end-to-end perform suspension isn't exercised under sx_server.exe.
- **Phase 3b (76/76) — Phase 3 complete.** Backend refactor: `last-seq` is now
  a monotonic per-stream high-water mark (backend `seqs` dict), not physical
  length, so a compacted log keeps assigning climbing seqs. Added backend
  `:truncate-through` + `persist/truncate`. `compaction.sx` — `persist/compact`
  checkpoints then drops events with seq <= snapshot seq; `should-compact?`/
  `maybe-compact` give an explicit "compact every N tail events" policy. 11
  tests: post-compaction replay value == uncompacted full replay (determinism),
  seq continuity after truncation, idempotence. `persist/count` = physical
  stored count (shrinks on compaction) vs `persist/last-seq` = logical.
- **Phase 3a (65/65).** `snapshot.sx` — a snapshot is a projection state
  `{:value :seq}` stored in the kv facet under `snapshot/<name>`.
  `persist/checkpoint` replays + saves; `persist/replay` = snapshot + tail.
  11 tests assert the headline both ways: snapshot+tail == full replay (value
  and whole state), plus replay determinism.
- **Phase 2c (54/54) — Phase 2 complete.** `concurrency.sx` — optimistic
  concurrency: `persist/append-expect b stream expected ...` refuses the append
  if the stream advanced past `expected`, returning a conflict VALUE
  `{:conflict true :expected :actual}` (never a crash, never a silent
  overwrite). `persist/conflict?` + accessors; caller re-reads actual and
  retries. 8 tests incl. two-writer race + retry.
- **Phase 2b (46/46).** `subscribe.sx` — `persist/hub` wraps a backend with
  per-stream callbacks. `persist/publish` appends then fires subscribers
  `(backend stream event)`; direct `persist/append` bypasses them by design
  (bulk load/replay). Canonical use: callback re-runs `project-resume` or bumps
  a kv counter so read models update on write. 9 tests.
- **Phase 2a (37/37).** `project.sx` — projection state `{:value :seq}`;
  `persist/project` folds whole stream from seed, `persist/project-resume`
  folds only the tail (seq > prior seq) so read models update incrementally.
  step is pure `(value event) -> value`. 9 tests incl. resume==full-from-zero.
- **Phase 1 complete (28/28).** `event.sx` (event record + accessors),
  `backend.sx` (injectable protocol + in-memory log/kv impl, closure state via
  set!), `log.sx` (append/read/read-from, sequential per-stream seq, stream
  isolation), `kv.sx` (get/put/delete/has?/keys/get-or/update), `api.sx`
  (`persist/open` — mem default, backend injectable). conformance.sh + three
  suites (event/log/kv). Gotcha logged in Blockers: `map` returns an
  array-backed list not `equal?` to a `(list ...)` literal — assertions build
  compared lists with list/nth.
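
The headline property recorded in the log entries above (snapshot + tail equals full replay, also after compaction) can be checked with a small Python model. This is a sketch under assumptions, not `project.sx` or `snapshot.sx`: `project` and `project_resume` are hypothetical analogues that use the same `{value, seq}` projection state described in the log.

```python
def project_resume(events, step, state):
    """Fold only events newer than the state's seq (the tail)."""
    value, seq = state["value"], state["seq"]
    for event in events:
        if event["seq"] > seq:
            value = step(value, event)
            seq = event["seq"]
    return {"value": value, "seq": seq}


def project(events, step, seed):
    """Full replay: resume from the seed at seq 0."""
    return project_resume(events, step, {"value": seed, "seq": 0})


def add_amount(total, event):
    """Pure step: depends only on the prior value and the event."""
    return total + event["data"]["amount"]


events = [{"seq": i, "data": {"amount": i * 10}} for i in range(1, 6)]

full = project(events, add_amount, 0)

# Checkpoint after three events, then resume over the whole log.
snapshot = project(events[:3], add_amount, 0)
from_snapshot = project_resume(events, add_amount, snapshot)

# Compaction drops events at or below the snapshot seq; replay is unchanged.
compacted = [e for e in events if e["seq"] > snapshot["seq"]]
from_compacted = project_resume(compacted, add_amount, snapshot)
```

All three routes end in the same state, which is the equality the determinism tests assert.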

## Blockers
(loop fills this in)

### OPEN — host durable-storage adapter (the only gap to real durability)

**Owner:** a `hosts/` loop (NOT this one — `lib/persist/**` is the scope fence,
and `sx_build` is forbidden here). **Without it, durable persistence silently
drops all writes.**

**Symptom / minimal repro.** `persist/durable-backend` performs
`{:op "persist/..." :args (...)}` for every storage op. Under `sx_server.exe`
the kernel's default IO resolver answers unknown ops with `nil` — so the durable
backend does not error, it *silently no-ops*:

```
; load event/backend/log/durable, then:
(let ((b (persist/durable-backend)))
  (begin (persist/append b "s" "x" 0 {})
         (persist/append b "s" "x" 0 {})
         (list (persist/event-seq (persist/append b "s" "x" 0 {}))
               (persist/count b "s")
               (persist/read b "s"))))
; => (1 0 nil) ; every append gets seq 1, nothing stored, reads empty — DATA LOSS
```

The in-memory backend (`persist/open`) is correct and complete; this gap is
*only* the production transport.

**What to build.** A host servicer that answers the `persist/*` IO ops against a
real store (sqlite/files/pg). It is the production twin of `persist/serve`
(`lib/persist/durable.sx`) — same op names, same request/response shapes — so
mirror that function and back it with durable storage instead of a mem-backend.

**Op contract** (request `{:op :args}` → response). `args` is a positional list;
events are dicts `{:stream :seq :type :at :data}`:

| op | args | returns | semantics |
|----|------|---------|-----------|
| `persist/append` | `(stream event)` | (ignored) | store `event` in `stream` |
| `persist/read` | `(stream)` | event list (oldest-first) | currently-stored events |
| `persist/last-seq` | `(stream)` | number | **monotonic high-water mark** (see below) |
| `persist/streams` | `()` | stream-name list | every stream ever appended to |
| `persist/truncate` | `(stream n)` | (ignored) | drop events with `seq <= n` |
| `persist/kv-get` | `(key)` | value or nil | |
| `persist/kv-put` | `(key val)` | (ignored) | upsert |
| `persist/kv-delete` | `(key)` | (ignored) | remove key |
| `persist/kv-has?` | `(key)` | boolean | |
| `persist/kv-keys` | `()` | key list | |

**Hard invariants** (the facets above rely on these; mem-backend + `persist/serve`
are the reference):
1. **`last-seq` is a per-stream monotonic counter, NOT the row count.** It must
   keep climbing after `truncate`, so a compacted stream never reassigns a seq.
   Store the counter separately from the rows.
2. `append` is the only seq-assigner upstream (`log.sx` does `last-seq + 1`); the
   host must not renumber.
3. `read` returns events in append order with `:seq` intact (post-truncate it
   returns only the surviving tail).
4. `streams` is the set of streams that ever had an append (survives full
   compaction) — keep it keyed off the seq counters, like mem-backend's `seqs`.
5. Values round-trip structurally: dicts/lists/numbers/strings/nil/booleans in =
   same out (event `:data`, kv values, blob refs).
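
Invariants 1 to 4 can be modeled in a few lines of Python to show why the counter has to live apart from the rows. This is a hedged sketch, not the host adapter: `StreamStore` and its method names are hypothetical, and a real servicer would back `rows` and `seqs` with durable storage.

```python
class StreamStore:
    """Hypothetical servicer state: rows and seq counters kept separately."""

    def __init__(self):
        self.rows = {}  # stream -> surviving events, in append order
        self.seqs = {}  # stream -> monotonic high-water mark

    def append(self, stream, event):
        # The caller assigned event["seq"]; the store must not renumber it.
        self.rows.setdefault(stream, []).append(event)
        self.seqs[stream] = max(self.seqs.get(stream, 0), event["seq"])

    def last_seq(self, stream):
        return self.seqs.get(stream, 0)

    def truncate(self, stream, n):
        """Drop events with seq <= n; the counter is left untouched."""
        kept = [e for e in self.rows.get(stream, []) if e["seq"] > n]
        self.rows[stream] = kept

    def read(self, stream):
        return list(self.rows.get(stream, []))

    def streams(self):
        # Keyed off the counters, so fully compacted streams still list.
        return sorted(self.seqs)


store = StreamStore()
for seq in (1, 2, 3):
    store.append("orders", {"stream": "orders", "seq": seq})

store.truncate("orders", 3)  # full compaction
after_compaction = store.last_seq("orders")

next_seq = store.last_seq("orders") + 1  # what the log facet would assign
store.append("orders", {"stream": "orders", "seq": next_seq})
```

If `last_seq` were derived from the row count instead, the append after compaction would reuse seq 1, which is exactly the reassignment invariant 1 forbids.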

**Blobs** are a *separate* adapter with the same pattern: ops `blob/put`
`(bytes mime)` → cid, `blob/get` `(cid)` → bytes, `blob/has?` `(cid)` → bool
(see `lib/persist/blob.sx` / `persist/blob-serve`). Back it with the
content-addressed store (artdag/IPFS); persist only ever stores the returned ref.
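
A mock of that blob adapter is easy to sketch in Python. The following is illustrative only: `MockBlobStore` and `blob_store` are hypothetical names, and using a SHA-256 hex digest as the CID is an assumption made for the example (real IPFS CIDs use a different encoding). The ref shape with `cid`, `size`, and `mime` follows the blob ref described in this plan.

```python
import hashlib


class MockBlobStore:
    """Hypothetical content-addressed store: identical bytes share one entry."""

    def __init__(self):
        self.objects = {}  # cid -> bytes

    def put(self, data, mime):
        # Assumption: a SHA-256 hex digest stands in for a real CID.
        cid = hashlib.sha256(data).hexdigest()
        self.objects[cid] = data
        return cid

    def get(self, cid):
        return self.objects[cid]

    def has(self, cid):
        return cid in self.objects


def blob_store(store, data, mime):
    """Put the bytes in the content store and return ONLY a small ref."""
    cid = store.put(data, mime)
    return {"cid": cid, "size": len(data), "mime": mime}


blobs = MockBlobStore()
ref_a = blob_store(blobs, b"invoice", "application/pdf")
ref_b = blob_store(blobs, b"invoice", "application/pdf")

# What goes into the log is the ref, never the payload.
event = {"stream": "orders", "seq": 1, "type": "invoiced",
         "data": {"invoice": ref_a}}
fetched = blobs.get(event["data"]["invoice"]["cid"])
```

The event stays a few dozen bytes regardless of payload size, and the same bytes stored twice resolve to one object.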

**Where to register.** `hosts/ocaml/bin/sx_server.ml`:
- the in-process resolver `Sx_types._cek_io_resolver` (~line 3864) — add a
  `"persist/..."` match arm dispatching to the new storage module (used by
  SSR/`eval_with_io`); and/or
- the bridge path in `cek_run_with_io` (~line 528–576), which currently forwards
  unknown ops via `io_request op args` to the external bridge — a Python-bridge
  handler is the alternative home if storage lives Python-side.
Pick one home; the op names are the contract, not the location.

**Acceptance test.** Swap the transport: point a `persist/io-backend` at the new
host servicer (instead of `persist/serve` over a mem disk) and run the existing
`durable` + `recovery` suites — they must stay green, and state must survive an
actual process restart (kill the server, restart, replay → recovered). That is
exactly what `lib/persist/tests/durable.sx` and `recovery.sx` already assert
against the mock; the host adapter just makes the disk real.

---

- **Phase 4 perform-suspension not exercised end-to-end under sx_server.exe (by
  design, not a bug).** The CEK suspension primitives (`cek-step-loop`,
  `cek-resume`, `cek-suspended?`, `cek-io-request`) and a settable SX-level IO
  hook are only bound by the `run_tests` OCaml binary (out of scope: hosts/, and
  sx_build is forbidden). Under `sx_server.exe`, an unhandled `perform` resolves
  through the OCaml io-request/io-response stdin bridge (production path) — not
  callable from the pure-eval conformance harness. Resolution: the durable
  backend's transport is injectable, so the production path is one line
  `(perform req)` (kernel-handled) and ALL durable logic is tested through the
  mock transport (`persist/serve` over an in-memory disk). The single untested
  line is the kernel primitive itself. No host primitive needed; nothing to fix.
- **Not a blocker, a testing convention:** `map` returns an array-backed list
  that is NOT `equal?` to a `(list ...)` cons-literal (two `map` results do
  compare equal to each other). When asserting list-shaped results against a
  `(list ...)` literal, build the compared value with `list`/`nth`/`cons`, not
  `map`. `into`/list-coercion needs the IO bridge and is unusable in the
  pure-eval harness.