# flow-on-sx: Durable DAG Workflows on Scheme

rose-ash needs workflows that survive restarts: content pipelines (write → review → publish → federate), scheduled jobs (digest emails), multi-step user flows (signup, confirm, onboard). art-dag is the precedent — DAG-of-tasks with pause/resume at IO boundaries. Scheme's `call/cc` + delimited continuations make pause/resume natural: a `suspend` captures the continuation, serializes it as part of the flow record, and `resume` re-enters at exactly that point. No state-machine bookkeeping by hand. R7RS-small is already at 2644/2644 (see kernel/architecture status).

End-state: a Scheme-on-SX layer over the existing scheme runtime, with combinators for sequence/parallel/branch/retry/timeout/suspend, a persistent flow store, and a federation extension via fed-sx for remote-node execution.

## Status (rolling)

`bash lib/flow/conformance.sh` → **0/0** (not yet started)

## Ground rules

- **Scope:** only touch `lib/flow/**` and `plans/flow-on-sx.md`. Do **not** edit `spec/`, `hosts/`, `shared/`, `lib/scheme/**`, or other `lib//`. You may **import** from `lib/scheme/` (public API via `lib/scheme/scheme.sx`); do **not** modify Scheme.
- **Shared-file issues** go under "Blockers" with a minimal repro; do not fix here.
- **SX files:** use `sx-tree` MCP tools only.
- **Architecture:** flow combinators are Scheme macros + procedures. The runtime is a driver loop that walks the flow graph and invokes `call/cc` at `suspend` points. The persistence layer serializes the continuation; open file/socket placeholders are forbidden (continuations must be resumable across process restart).
- **art-dag awareness:** read `plans/art-dag*` if it exists for design lineage; do not import code.
- **Commits:** one feature per commit. Keep Progress log updated and tick boxes.

## Architecture sketch

```
(defflow publish
  (sequence
    (write-content)
    (parallel (review) (spell-check))
    (cond approved?
      (sequence (publish) (federate))
      (notify-author))))
        │
        ▼
lib/flow/spec.sx         lib/flow/runtime.sx      lib/flow/store.sx
— defflow                — driver loop            — append-only flow log
— sequence/parallel      — node dispatch          — checkpoint serialize
— cond/retry/timeout     — call/cc at suspend     — restart loader
— suspend/resume
        │                        │
        ▼                        ▼
lib/flow/api.sx              lib/flow/remote.sx
— (flow/start name args)     — fed-sx adapter
— (flow/resume id value)     — node-on-peer execution
— (flow/cancel id)           — failure handling
```

## Phase 1 — Declarative DAG + sequential execution

- [ ] `lib/flow/spec.sx` — `defflow` macro, `sequence` combinator
- [ ] node = Scheme thunk; output threads to next node (data flow)
- [ ] `parallel` combinator (sequential semantics for now — TRUE parallelism in Phase 3)
- [ ] runtime executes a flow synchronously, returns final value
- [ ] `lib/flow/api.sx` — `(flow/start name args)` entry point
- [ ] `lib/flow/tests/basic.sx` — 15+ cases: linear sequence, nested sequences, data flow between nodes, parallel-with-join
- [ ] `lib/flow/scoreboard.{json,md}`
- [ ] `lib/flow/conformance.sh`

## Phase 2 — Control flow + error handling

- [ ] `cond` combinator — predicate selects branch
- [ ] `retry n [backoff]` — re-runs node up to n times on exception
- [ ] `timeout ms` — bounds node execution
- [ ] `try-catch` — exception handler with reified error
- [ ] error model — exceptions vs explicit `(fail :reason ...)` results
- [ ] `lib/flow/tests/control.sx` — 25+ cases: each combinator + composition

## Phase 3 — Suspend / resume (the showcase)

- [ ] `(suspend reason)` — `call/cc` captures continuation, returns flow-id to caller
- [ ] `lib/flow/store.sx` — serialize flow state (continuation + open vars)
- [ ] `(flow/resume id value)` — load continuation, inject value, re-enter
- [ ] `(flow/cancel id)` — explicit termination
- [ ] crash recovery — on restart, scan store for paused flows, mark resumable
- [ ] `lib/flow/tests/suspend.sx` — pause-resume scenarios, cancellation, "restart" scenarios (simulated by re-loading store)

## Phase 4 — Distributed nodes via fed-sx

- [ ] `(remote-node addr fn args)` — execute node on a federation peer
- [ ] failure semantics — retry on different peer, fall through to local
- [ ] persistence across instances — flow state replicates via fed-sx
- [ ] handoff — flow started here can resume on a peer if the local instance is down
- [ ] `lib/flow/tests/distributed.sx` — federated flow scenarios (mock fed-sx in tests)

## Progress log

(loop fills this in)

## Blockers

(loop fills this in)
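
## Appendix: suspend/resume sketch (illustrative)

The Phase 3 idea (a `suspend` that captures the continuation and hands a flow-id back to the caller, and a `flow/resume` that re-enters at that point with an injected value) can be sketched in plain R7RS as below. This is a minimal sketch under stated assumptions, not the planned implementation: the store is an in-memory association list rather than `lib/flow/store.sx`, nothing is serialized (portable R7RS has no way to serialize a continuation, so restart survival is out of scope here), and the helper names `paused`, `next-id`, `return-to-driver`, `drop-flow!`, `run-flow`, and `publish-flow` are hypothetical. Only the names `suspend` and `flow/resume` come from the plan.

```scheme
(import (scheme base) (scheme write))

;; Hypothetical in-memory store: list of (flow-id . continuation).
(define paused '())
(define next-id 0)

;; Escape continuation of whoever is currently driving the flow.
;; Re-set on every run-flow / flow/resume call.
(define return-to-driver #f)

;; Remove a flow from the store so each continuation is resumed once.
(define (drop-flow! id)
  (let loop ((rest paused) (kept '()))
    (cond ((null? rest) (set! paused (reverse kept)))
          ((eqv? (caar rest) id) (loop (cdr rest) kept))
          (else (loop (cdr rest) (cons (car rest) kept))))))

;; Capture the rest of the flow, record it, and escape to the driver
;; with a (suspended id reason) record.
(define (suspend reason)
  (call/cc
    (lambda (k)
      (set! next-id (+ next-id 1))
      (set! paused (cons (cons next-id k) paused))
      (return-to-driver (list 'suspended next-id reason)))))

;; Run a flow thunk until it either finishes or suspends.
(define (run-flow thunk)
  (call/cc
    (lambda (driver)
      (set! return-to-driver driver)
      (let ((v (thunk)))
        ;; Looked up at call time, so a resumed flow reports to the
        ;; driver that resumed it, not the one that started it.
        (return-to-driver (list 'done v))))))

;; Re-enter a paused flow, injecting value as the result of suspend.
(define (flow/resume id value)
  (let ((entry (assv id paused)))
    (if (not entry)
        (list 'error 'unknown-flow id)
        (call/cc
          (lambda (driver)
            (set! return-to-driver driver)
            (drop-flow! id)
            ((cdr entry) value))))))

;; Example flow: pause for a review verdict, then branch on it.
(define (publish-flow)
  (let* ((draft "draft-v1")
         (verdict (suspend 'awaiting-review)))
    (if (eq? verdict 'approved)
        (list 'published draft)
        (list 'rejected draft))))

(define r1 (run-flow publish-flow))
(display r1) (newline)   ; (suspended 1 awaiting-review)

(define r2 (flow/resume (cadr r1) 'approved))
(display r2) (newline)   ; (done (published draft-v1))
```

Routing every exit through `return-to-driver` keeps a resumed flow from falling back into the continuation of the original `run-flow` call, which is the usual pitfall with re-entrant `call/cc`. A durable version would need a serializable representation of the captured state, which is the part `lib/flow/store.sx` is meant to own.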