flow: replication + handoff across instances + 6 tests (Phase 4 complete)
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 42s

flow-replicate-to copies the plain-data store export to a peer's replica slot;
flow-restore-from imports it. Handoff = replicate, local instance dies, peer
restores and resumes by id. The replay log survives the move, so all resolved
suspends carry over. Same durable-data mechanism as crash recovery, across
instances. All four phases complete: 93/93.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 17:48:39 +00:00
parent f8722b3b08
commit 16cb727406
7 changed files with 70 additions and 15 deletions

@@ -16,7 +16,7 @@ federation extension via fed-sx for remote-node execution.
## Status (rolling)
-`bash lib/flow/conformance.sh`**87/87** (Phases 1-3 done; Phase 4 in progress: remote-node + failover done)
+`bash lib/flow/conformance.sh`**93/93** (all four phases complete)
## Ground rules
@@ -123,9 +123,15 @@ lib/flow/spec.sx lib/flow/runtime.sx lib/flow/store.sx
- [x] failure semantics — `(remote-failover addrs fn local)` tries each peer in
order, moves to the next on any raised error, and runs the `local` node if every
peer fails. 6 tests.
-- [ ] persistence across instances — flow state replicates via fed-sx
-- [ ] handoff — flow started here can resume on a peer if the local instance is down
-- [ ] `lib/flow/tests/distributed.sx` — federated flow scenarios (mock fed-sx in tests)
+- [x] persistence across instances — `(flow-replicate-to addr)` copies this
+instance's store (the plain-data export) to a peer's replica slot;
+`(flow-restore-from addr)` imports it. Same mechanism as crash recovery, across
+instances.
+- [x] handoff — a flow started here resumes on a peer after the local instance dies:
+replicate → wipe local store → restore on peer → `flow/resume`. The replay log
+(and thus all resolved suspends) survives the move.
+- [x] `lib/flow/tests/distributed.sx` — 19 cases: remote-node, failover,
+replication, handoff (including replay-log survival across the move)
## Progress log
@@ -139,6 +145,15 @@ lib/flow/spec.sx lib/flow/runtime.sx lib/flow/store.sx
combinators use `(lambda args ...)` variadics + top-level recursion. Scheme
strings come back boxed as `{:scm-string "..."}` — unwrap with `(get s :scm-string)`.
+- **Phases 2-4.** Control flow (branch/retry/timeout/try-catch + fail-value error
+model), then the showcase: durable suspend/resume. Guest call/cc is escape-only
+(re-entry hangs), so resume uses **deterministic replay** — re-run the flow,
+replaying resolved suspends from a `(tag value)` log; only plain data persists, so
+flows survive a wiped store (crash recovery) and a move to another instance
+(replication + handoff). Phase 4 models the fed-sx boundary with a mock peer
+registry. Timeout is a cooperative step budget (no wall clock in pure SX). Test
+harness reuses one env with a per-test reset for speed.
## Blockers
(none)
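
Reviewer sketch (not part of the diff). The handoff sequence this commit describes (replicate, wipe the local store, restore on the peer, resume by id) rests on deterministic replay over a `(tag value)` log. The Python model below is only an illustration under stated assumptions: it is not the SX implementation, and every name in it (`Replay`, `start`, `resume`, `replicate_to`, `restore_from`, `approval_flow`, the `"peer-a"` address, the JSON export) is hypothetical. It assumes a flow is a deterministic function and that the store holds nothing but plain data.

```python
import json


class Suspended(Exception):
    """Raised when a flow reaches a suspend that has no logged value yet."""

    def __init__(self, tag):
        super().__init__(tag)
        self.tag = tag


class Replay:
    """Hands out logged values in order; suspends at the first unresolved tag."""

    def __init__(self, log):
        self.log = log
        self.pos = 0

    def suspend(self, tag):
        if self.pos < len(self.log):
            logged_tag, value = self.log[self.pos]
            if logged_tag != tag:
                raise RuntimeError(f"replay mismatch: {logged_tag!r} != {tag!r}")
            self.pos += 1
            return value
        raise Suspended(tag)


def run(flow, log):
    """Re-run the flow from the top, replaying resolved suspends from the log."""
    try:
        return ("done", flow(Replay(log)))
    except Suspended as s:
        return ("suspended", s.tag)


def step(store, flow_id, flow):
    """Run the flow and record which tag (if any) it is now waiting on."""
    record = store[flow_id]
    status, payload = run(flow, record["log"])
    record["pending"] = payload if status == "suspended" else None
    return status, payload


def start(store, flow_id, flow):
    store[flow_id] = {"log": [], "pending": None}
    return step(store, flow_id, flow)


def resume(store, flow_id, flow, value):
    """Resolve the pending suspend with `value`, then replay from the top."""
    record = store[flow_id]
    if record["pending"] is None:
        raise ValueError(f"flow {flow_id!r} is not suspended")
    record["log"].append([record["pending"], value])
    return step(store, flow_id, flow)


def replicate_to(peers, addr, store):
    """Copy the plain-data export into the peer's replica slot (mock registry)."""
    peers[addr] = json.dumps(store)


def restore_from(peers, addr):
    """Import the replica as a fresh store on the peer side."""
    return json.loads(peers[addr])


def approval_flow(ctx):
    """Hypothetical flow with two suspends."""
    name = ctx.suspend("ask-name")
    ok = ctx.suspend("ask-approval")
    return f"{name}: {'approved' if ok else 'rejected'}"


def handoff_demo():
    peers, local = {}, {}
    first = start(local, "f1", approval_flow)
    second = resume(local, "f1", approval_flow, "ada")
    replicate_to(peers, "peer-a", local)    # replicate
    local.clear()                           # local instance dies
    remote = restore_from(peers, "peer-a")  # peer restores
    final = resume(remote, "f1", approval_flow, True)  # resume by id
    return first, second, final
```

In this model the JSON round trip stands in for the "only plain data persists" constraint: no continuation or closure crosses the instance boundary, so the peer can rebuild the flow's position purely by re-running it against the restored log.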