rose-ash/plans/business-logic-fed-flows.md
giles 915cc29a52 R3: test runner records a raising test as a failure (TDD); R2 deferred (mutex finding)
Failing test first (red: a probe with a raising actual-expr VANISHED — delta 0, total unchanged —
because the loader skips a raising top-level form and args are eager). Fix: host-bl-test is now a
MACRO expanding to (host-bl--check name (fn () actual) expected); the check evaluates the thunk under
(guard (e (true {:__raised …})) …), so an SX raise is recorded as a failure with the error instead
of disappearing. Native exceptions still escape guard — those already fail loud via conformance's
error grep, so this closes the actual silent-skip gap. Keeps the next TDD loop honest.

R2 DEFERRED: investigating it surfaced that lib/host serializes ALL handler evaluation per peer under
one mutex (held across persist IO + the outbound http-request) — zero intra-peer concurrency, so the
outbox 'race' is masked. Logged in plans + memory as the real concurrency task: narrow the handler
mutex for throughput (the multi-co-op future forces it, and that's when masked races become real).

blog suite 260/260; full conformance 662/662.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-07-03 21:47:29 +00:00

# Business logic as composition — a content-addressed DAG over pluggable substrates
**Vision (elevated 2026-07-02):** business logic IS art-dag. An object's behavior is a
**content-addressed DAG** (lib/artdag), declared on its **type** alongside content grammar +
allowed relations. Everything else is a pluggable ADAPTER — the same fold/adapter principle as
render-vs-execute-vs-deps, applied to execution/communication/deployment:
- **Behavior = an artdag DAG** — the invariant, content-addressed (`artdag/dag`, analyze/plan/
optimize/schedule). Business logic, art media pipelines, workflows — all the same abstraction.
- **Execution = an injected RUNNER** (`artdag/run dag RUNNER cache`; `artdag/op-table-runner`).
Substrates are runners on a **capability ladder**, same DAG throughout — a node DECLARES the
capabilities it needs, a runner ADVERTISES what it supports, and the match is checked at bind
time (fail fast, not a mystery at run time). artdag/analyze computes a DAG's required-capability
set → its MINIMUM runner. So "simple in SX / durable in Erlang / distributed in celery-sx" is a
DERIVED property of the DAG, not a human judgment; a trivial rule is a one-node DAG needing only
`{effect}` and runs synchronously with zero ceremony. Business logic and media pipelines are the
SAME structure (a content-addressed op-DAG); they differ only in the capabilities their nodes
require, hence the runner. The ladder (capabilities each runner adds):
  - **op-table / execute-fold runner** — `{effect, branch, each}` — synchronous, local, in-request. Covers P0.
  - **Erlang runner** — `+ suspend/resume/wait` (durable), deterministic replay (flow-on-erlang).
  - **celery-sx runner** — `+ parallel, retry, offload` — distributed/durable task executor, "Celery the way it should have
    been" on erlang-on-sx, ZERO packages. It's a LEAN GLUE of parts we already have, not a
    reimplementation: broker = lib/persist KV (durable enqueue/claim/ack/visibility-timeout) ·
    worker pool = the er-scheduler / Erlang processes · result backend = content-addressed results
    (artdag keys by content-id → dedup/memoization FREE — Celery bolts this on badly) · retries/
    replay = flow-on-erlang · scheduling/fan-out/chords = artdag/schedule (minikanren CLP(FD)) +
    the DAG's topo batches · the plug point = artdag/op-table-runner. The genuinely-new code is
    small (~a few hundred lines): a durable queue + a worker loop (pull node → runner → write
    content-addressed result) + retry/backoff. **BUILD WHEN A DAG DEMANDS IT** — heavy compute,
    long-running/retryable tasks, or fan-out across machines — NOT for the synchronous P0.
  - **real-Celery over artdag/L1** — `+ gpu/heavy-compute` — the existing Python media pipeline (JAX/IPFS) as a runner.
- **Communication = an injected TRANSPORT** (`artdag/federation`, transport injected). Substrates:
fed-sx (ActivityPub/next/), internal HMAC HTTP (services), IPFS (content-addressed). Because
content-ids are global, a result computed on one instance is reusable on another by id.
- **Deployment = PLACEMENT** — a subdomain service, a fed-sx peer, an L1 worker: just where a
runner runs. Not the essence.
- **State change → triggers a DAG** (over a transport) → executed by a runner → effects (data) a
driver dispatches. fed-sx + Erlang is ONE adapter set (durable/federated), not THE architecture.
So: the TYPE carries content-grammar + allowed-relations + a **behavior DAG (+ triggers)**; the
object's state changes emit activities; the platform picks runner/transport/placement per context.
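A minimal Python sketch of the capability ladder described above: derive a DAG's required set, pick the minimum runner, and fail fast at bind time. Every name here is hypothetical (this is not the lib/artdag or lib/host API), and the sketch assumes the ladder is cumulative, so each runner also advertises everything below it.

```python
# Hypothetical sketch; runner names and dict shapes are illustrative only.
BASE = frozenset({"effect", "branch", "each"})
DURABLE = BASE | {"suspend", "resume", "wait"}
DISTRIBUTED = DURABLE | {"parallel", "retry", "offload"}

# Ordered cheapest first, so the first match is the minimum runner.
RUNNERS = [("op-table", BASE), ("erlang", DURABLE), ("celery-sx", DISTRIBUTED)]


def required_caps(dag):
    """Union of the capabilities each node declares under 'needs'."""
    caps = set()
    for node in dag:
        caps |= set(node.get("needs", ()))
    return frozenset(caps)


def minimum_runner(dag, runners=RUNNERS):
    """Return the first (cheapest) runner whose advertised set covers the DAG."""
    need = required_caps(dag)
    for name, advertised in runners:
        if need <= advertised:
            return name
    raise LookupError(f"no runner advertises {sorted(need)}")


def bind(dag, runner_name, runners=RUNNERS):
    """Bind-time check: required must be a subset of advertised, else fail fast."""
    missing = required_caps(dag) - dict(runners)[runner_name]
    if missing:
        raise ValueError(f"{runner_name} lacks {sorted(missing)}")
    return (runner_name, dag)
```

A one-node `{effect}` DAG resolves to the op-table runner; adding a node that needs `suspend` moves the same DAG to the Erlang runner without any hand-set hint.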
**Design (decided 2026-07-02; corrected after review):**
- **Activity log = every OBSERVABLE object-level state change** — the event source. NOT just CID
deltas: relations write `edge:*` rows, NOT the record, so a relation change does NOT shift the
CID. Two ActivityPub-faithful event classes: content/status change → a CID-carrying
`Create`/`Update`; relation change → an `Add`/`Remove` referencing the edge.
- **Verbs are TRANSITIONS, not raw deltas.** `on-publish` = draft→published (fire-once), not every
CID delta of a published post. Emitter picks the verb (Create/Update/Add/Remove/Delete). Global
fire-once is the emitter's job (+ a durable inbox for federation); the seam only dedups in-call.
- **Triggers = declared subscriptions** — a type declares named triggers; flows fire only on
matching ones. Log complete, execution precise.
- **Flows split by CAPABILITY (DERIVED, not a human "complexity" call):** every flow is a
content-addressed op-DAG. A node declares the capabilities it needs (a `wait` needs `suspend`, a
fan-out needs `parallel`, a heavy op needs `offload`); a runner advertises what it supports;
artdag/analyze computes the DAG's required set → its MINIMUM runner; bind-time checks required ⊆
runner (fail fast). So the sync/durable/distributed choice falls out of the DAG — a `{effect}`-only
DAG runs synchronously (no ceremony); a DAG with a `wait` node auto-requires the Erlang runner.
Business logic and media pipelines are the SAME structure — they differ only in node capabilities.
- **Effects are DATA; a DRIVER dispatches them** (perform no IO — a blocking call deadlocks the
er-scheduler). The host is the driver for P0.
- **The type carries its whole contract:** fields+grammar (content) · allowed relations (external)
· **behavior bindings** (triggers → DAG). All composition, all editable in the type-def editor.
- **CANONICAL ACTIVITY = the seam shape** `{:verb :actor :object <cid> :object-type :delta :prev
:ts :id}`; each runner adapter MARSHALS it to its substrate (e.g. the Erlang runner → next/'s
`[{type,…},{object,[…]}]` proplist via host/blog--activity->erl). DONE (P0.4):
host/blog--publish-activity emits the canonical shape; host/blog--activity->erl is the RA marshaller (staged).
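The two event classes and the "verbs are transitions" rule can be sketched in Python as below. The helper names are hypothetical, and the rule that a non-published edit emits nothing is an assumption of this sketch rather than something the design states.

```python
# Hypothetical helpers; these are not the host/blog functions.

def content_verb(prev_status, new_status):
    """Pick the verb for a content change, or None when nothing is emitted."""
    if new_status != "published":
        return None                      # assumption: draft edits stay silent
    return "update" if prev_status == "published" else "create"


def content_activity(cid, object_type, prev_status, new_status):
    """Content/status change: a CID-carrying Create or Update."""
    verb = content_verb(prev_status, new_status)
    if verb is None:
        return None
    return {"verb": verb, "object": cid, "object-type": object_type}


def relation_activity(verb, src, kind, dst):
    """Relation change: Add or Remove referencing the edge; the CID is untouched."""
    if verb not in ("add", "remove"):
        raise ValueError(verb)
    return {"verb": verb, "object": src, "relation": kind, "target": dst}
```

The emitter decides the verb from the status transition, so a published post edited twice yields two `update` activities and never a second `create`.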
**Reference (one durable-runner substrate, verified):** `next/tests/triggers_e2e.sh` = 10/10 —
next/'s trigger_registry + flow_dispatch + blog_publish_digest (suspend/resume/guard/dedup). This
is what the ERLANG RUNNER ADAPTER wraps (Phase RA), not what P0 wires directly.
## The ADAPTER SEAM (design first — the contract every substrate plugs into)
The invariant is an **activity** (state-change event) + a **behavior DAG**. Everything between is
a swappable adapter — each a dict-of-functions (SX-native, like the fold domains). Six contracts:
1. **Activity** (content-addressed event): `{:verb "create"|"update"|"add"|"remove"|"delete"
:actor :object <cid> :object-type :delta :prev <cid> :ts :id}` (`:id` is the dedup identity the engine checks).
2. **Behavior binding** (declared on a TYPE): `{:on {:verb :object-type :guard} :dag <ref>}`. The
TYPE carries content-grammar + allowed-relations + these bindings. NO :runner hint — the runner
is DERIVED: the DAG's required-capability set (artdag/analyze over its nodes; a node declares
`:needs` e.g. `#{suspend}`) selects the minimum runner that advertises them.
3. **Trigger-registry adapter** — `{:register! (fn spec dag hint) :match (fn activity -> [binding])}`.
Impls: a local SX matcher / next/'s Erlang trigger_registry.
4. **Runner adapter** — `{:capabilities <set> :run (fn dag env -> {:status "done"|"suspended"|
"failed" :effects :resume :error})}`. :capabilities is what it ADVERTISES (op-table `{effect,
branch, each}`; Erlang `+ suspend`; celery-sx `+ parallel, retry, offload`); the binder checks
dag-required ⊆ runner-caps and fails fast otherwise. env = `{:activity :actor :ctx :effects
:binding}` (:effects = injected external-read interfaces, deterministic for replay). `:results`
is runner-INTERNAL. Durability = the runner's, not the DAG's. A durable runner that SUSPENDS is
wired at construction to the transport's INBOUND channel and injects its out-of-band completion
there (pump drains it).
5. **Transport adapter** — `{:emit (fn activity) :deliver (fn -> [activity])}`. `:emit` = outbound
(log/publish), `:deliver` = inbound (peers + async runner completions). Impls: in-process /
fed-sx (next/) / HTTP / IPFS. Content-ids global → results move by id.
6. **Effect driver** — `{:dispatch (fn effect -> [activity])}`. Perform the effect-as-data; may emit
NEW activities (loop closure).
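As an illustration of the dict-of-functions style, here is a hedged Python sketch of a local trigger-registry adapter. It simplifies `:register!` to two arguments and treats `:guard` as an optional predicate over the activity; both choices are assumptions of the sketch, not the SX contract.

```python
def make_local_registry():
    """Local matcher exposed as a dict of functions, mirroring the adapter shape."""
    bindings = []

    def register(spec, dag):
        # spec may hold 'verb', 'object-type', and 'guard'; absent keys match anything
        bindings.append({"on": spec, "dag": dag})

    def match(activity):
        hits = []
        for binding in bindings:
            on = binding["on"]
            if "verb" in on and on["verb"] != activity.get("verb"):
                continue
            if "object-type" in on and on["object-type"] != activity.get("object-type"):
                continue
            guard = on.get("guard")
            if guard is not None and not guard(activity):
                continue
            hits.append(binding)
        return hits

    return {"register!": register, "match": match}
```

Because the engine only ever calls `match`, an Erlang-backed registry could replace this one without the engine noticing.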
**Engine** (`behavior/make-engine {:triggers :runner :transport :driver :effects? :ctx-of?}`) →
`behavior/process(engine, activity)` / `behavior/pump(engine)`:
```
step(activity):
  if activity.id already :seen → skip (dedup)            ; in-call cycle guard
  emit(transport, activity)                              ; log
  for b in match(triggers, activity):
    r = run(runner, b.dag, {:activity :actor :ctx :effects :binding})
    case r.status:
      "done"      → for eff in r.effects: for a in dispatch(driver, eff): step(a)   ; loop closes
      "suspended" → record (+ r.resume); a durable runner completes out-of-band → inbound → pump
      "failed"    → record (+ r.error) for flow-level retry / dead-letter
pump(): for a in transport.deliver(): step(a)            ; inbound drain, shared :seen
```
Every stage injected → swap runner (sync→Erlang→celery-sx), transport (in-proc→fed-sx), registry,
driver — DAG + engine unchanged. **Global** idempotency = emitter fire-once + a durable inbox.
**Build order:** (a) **DONE** — seam as SX contracts + reference engine, tested substrate-agnostic:
:status branch (done/suspended/failed), injected env (:ctx + :effects), dedup by activity :id,
behavior/pump for inbound, async-completion loop (suspend → out-of-band inject → pump). **behavior
10/10.** (b) P0 supplies the REAL SYNCHRONOUS adapters; durable/federated runners+transports are
later adapter phases (RA/TA below).
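The engine loop can be exercised end to end with toy adapters. The Python below is a sketch of the step/pump shape only, not the `behavior/` implementation: the adapters are in-memory stand-ins, and the rule that a `newsletter` activity suspends is invented for the demonstration.

```python
from collections import deque


def make_engine(triggers, runner, transport, driver):
    """Bundle the injected adapters with the engine's own bookkeeping."""
    return {"triggers": triggers, "runner": runner, "transport": transport,
            "driver": driver, "seen": set(), "records": []}


def step(engine, activity):
    if activity["id"] in engine["seen"]:       # dedup, also guards in-call cycles
        return
    engine["seen"].add(activity["id"])
    engine["transport"]["emit"](activity)      # log / publish
    for binding in engine["triggers"]["match"](activity):
        env = {"activity": activity, "binding": binding}
        result = engine["runner"]["run"](binding["dag"], env)
        if result["status"] == "done":
            for effect in result["effects"]:
                for follow_up in engine["driver"]["dispatch"](effect):
                    step(engine, follow_up)    # loop closes
        else:                                  # "suspended" or "failed"
            engine["records"].append(result)


def pump(engine):
    """Drain the inbound channel through the same step, sharing 'seen'."""
    for activity in engine["transport"]["deliver"]():
        step(engine, activity)


# Toy adapters, all hypothetical.
outbox, inbox, performed = [], deque(), []


def deliver():
    drained = list(inbox)
    inbox.clear()
    return drained


def run(dag, env):
    activity = env["activity"]
    if activity.get("category") == "newsletter":   # invented suspend rule
        return {"status": "suspended",
                "resume": {"id": activity["id"], "tag": "morning"}}
    return {"status": "done",
            "effects": [{"effect": "notify", "slug": activity["slug"]}]}


transport = {"emit": outbox.append, "deliver": deliver}
triggers = {"match": lambda a: [{"dag": "publish"}] if a["verb"] == "create" else []}
runner = {"capabilities": {"effect", "branch", "suspend"}, "run": run}
driver = {"dispatch": lambda effect: performed.append(effect) or []}
engine = make_engine(triggers, runner, transport, driver)
```

An out-of-band completion is modelled by appending an activity to `inbox` and calling `pump(engine)`; it goes through the same `step`, so the shared `seen` set still deduplicates it.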
## P0 — publish workflow on the seam (SYNCHRONOUS, all-SX)
Prove: live host publish → the seam engine (in-process transport · local-SX trigger registry ·
sync op-table runner over an SX publish-DAG · host driver) → the effect surfaces. NO Erlang, NO
fed-sx yet — those are adapter phases (RA/TA). Every piece swaps later; the DAG + engine don't.
- [x] **P0.1 — publish-activity contract (SX side).** host/blog--publish-activity + post-category.
blog 200/200. NOTE: emits the next/-Erlang shape today; P0.4 reconciles to the canonical seam shape.
- [x] **P0.2 — the publish-DAG + execute-fold runner + the CAPABILITY check. DONE 2026-07-02.**
**HYPOTHESIS-TEST FINDING:** the synchronous business flow expresses NATURALLY as an EXECUTE-FOLD
composition (host/execute.sx: seq/effect/alt — the branch on category IS `alt`, exactly what it's
for), NOT an artdag DAG — artdag is pure DATAFLOW with no control flow. So "business logic =
art-dag" is confirmed at the ABSTRACTION (both content-addressed op-DAGs) and REFINED at the vocabulary:
the SYNCHRONOUS control-flow runner is the execute-fold (caps {effect,branch,each}); artdag is the
DATAFLOW sibling (a different runner). Two instances of one thing, run very differently — as
predicted. Built: lib/host/flows.sx (host/flow--{node-cap, required-caps, subset?, exec-runner,
bind}); host/blog--publish-dag + publish-ctx. Verified: publish-DAG required-caps derived =
{effect,branch} → binds to exec-runner; runs → newsletter→[validate,digest]/urgent→[validate,
notify]/other→[validate,skip]; a `wait` node → required {suspend} → binds FAIL-FAST against the
exec-runner (would need Erlang, RA). flows 7/7, blog 203/203, conformance 591/591.
IMPLICATION for RA/TA: the Erlang runner isn't a "different flow language" — it's the SAME op-DAG
with +{suspend} nodes; RA is the runner that advertises suspend + wraps flow_dispatch.
CAVEAT (don't calcify this finding): execute-fold-vs-artdag is a CURRENT capability SNAPSHOT, NOT
a permanent boundary. artdag MAY GROW control-flow node-kinds (a runner advertising +{effect,
branch, each}), and business logic then MIGRATES to artdag to inherit content-addressed
memoization / optimize (fuse/dedup/dce) / schedule / FEDERATION (a flow result reused across peers
by content-id — the federation vision, free). The capability model makes that migration seamless
(same DAGs, richer runner; the execute-fold is just the pragmatic sync runner NOW). See phase AX.
- [x] **P0.3 — wire the seam on the live host. DONE + LIVE-VERIFIED 2026-07-02.**
host/blog--{transport, triggers (on-publish: create+article → publish-DAG), driver (records each effect),
publish-engine, fire-publish!, maybe-publish!}. Both write handlers (form-submit POST /new,
edit-submit POST /:slug/edit) call maybe-publish!(slug, prev-status, new-status) — a non-published
→ published TRANSITION fires the flow (fire-once), in the handler BODY. /flows renders the flow
log. LIVE PROOF: logged in + POST /new on blog.rose-ash.com → /flows shows `validate` + `notify`
(category defaulted to urgent). behavior→exec-runner→driver all real. blog 207/207, conformance
595/595. GAP: the flow log is IN-MEMORY (clears on restart) — "durable record" is P0.3b (persist
the log to the blog store + boot-load, string-keyed to dodge the keyword/persist split). Also: the
live test post `p0.3-seam-live-test` persists (no delete route) — harmless, clean up if wanted.
- [x] **P0.4 — canonical activity + reconcile. DONE + LIVE-VERIFIED 2026-07-02.**
host/blog--publish-activity now emits the CANONICAL seam shape {:verb :actor :object <cid> :object-type :slug
:category :delta :id} — :object is a content-addressed REFERENCE (the CID, was an inlined dict),
:id the dedup identity, :slug+:category the domain fields the DAG reads. Consumers reconciled: the
trigger matches :verb+:object-type; publish-ctx reads top-level :category+:slug. Added the runner
MARSHALLER host/blog--activity->erl (canonical → next/'s proplist, for RA — defined+tested, unused
until RA). (:ts/:prev omitted — no clock primitive in the host; deferred.) LIVE: published on
blog.rose-ash.com → /flows fired validate+notify with the canonical activity. blog 209/209,
conformance 597/597.
**P0 COMPLETE** — the synchronous publish workflow runs end-to-end on the LIVE host through the
substrate-agnostic seam, durably, in the canonical activity shape, with the Erlang-runner marshaller
staged. Every piece is a swappable adapter: RA (Erlang runner) + TA (fed-sx transport) plug in next
without touching the DAG or the wiring.
### P0 REVIEW (2026-07-02) — findings + carried-forward debt
- **FIXED (was a live bug):** edit-submit ran maybe-publish! BEFORE set-field-values!, so an edit
that set a category AND published branched on the STALE category. Reordered (fields first);
regression test added. blog 210/210.
- **DEBT #1 (blocks P2): activity identity.** Dedup uses :id = the object CID. Relation Add/Remove
events don't change the CID → multiple relation activities would share :id → false dedup. P2 needs
a real activity id (verb+object+seq/edge), NOT the bare CID.
- **DEBT #2 (P1): the capability bind isn't wired into the live engine.** host/blog--publish-engine
calls exec-runner directly; host/flow--bind is tested but decorative in the live path. P1's engine
must DERIVE the runner via bind (required ⊆ advertised), not hardcode it.
- **DEBT #3 (RA): flow execution is synchronous in the request path.** Fine for 2 cheap effects, but
a durable/suspend runner CANNOT block a request — RA needs dispatch moved OFF the request path
(emit in-request → a background loop calls behavior/pump). The seam supports it (pump); the wiring
doesn't exist. This is structural, not just the marshaller.
- **DEBT #4 (minor): the "urgent" category default** notifies everyone for an uncategorised post —
demo-convenient, semantically wrong. Default to skip/none once the demo value is past.
- Acknowledged/known: unbounded flow log (cap/rotate), registry :register! is a stub (P1 makes it
real), :actor "site" placeholder (TA needs the actor model), marshaller unverified-until-RA.
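The edit-submit ordering fix is easy to see in a small Python sketch. The handler below is hypothetical and only stands in for the real one: it applies the submitted fields first, then evaluates the publish transition, so the flow branches on the fresh category.

```python
def edit_submit(post, form, fire_publish):
    """Apply field values FIRST, then evaluate the publish transition."""
    prev_status = post["status"]
    for key, value in form.items():          # stands in for set-field-values!
        post[key] = value
    # stands in for maybe-publish!(slug, prev-status, new-status)
    if prev_status != "published" and post["status"] == "published":
        fire_publish({"slug": post["slug"], "category": post.get("category")})
    return post
```

Swapping the two steps would read `category` before the form is applied, which is the stale-category branch the regression test guards against.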
### Sequencing note (hidden prerequisites)
- **P1's "runner derived via caps" is near-vacuous until a SECOND runner exists** — with only
exec-runner, everything derives to it. The derivation only MEANS something once RA lands.
- **RA is where the real risk lives** (SX→erlang-on-sx dispatch + suspend→resume→pump + the
er-scheduler context bug) AND P1/P2 quietly depend on what RA forces (a 2nd runner, the async
boundary). RECOMMENDATION: a narrow **RA SPIKE** next — prove one dispatch + one suspend/resume/
pump cycle in isolation — de-risks the whole durable/federated half before building P1/P2 on it.
## P1 — types DECLARE behavior (generalize) — DONE + LIVE-VERIFIED 2026-07-02
- [x] The type carries :behavior — a list of flat string-keyed bindings {"verb" "type" "dag"}
(persist-safe, like :type-relations), stored on the type-post (host/blog--type-behavior /
set-type-behavior!). The "article" type declares {"verb" "create" "type" "article" "dag" "publish"}.
- [x] The behavior REGISTRY (host/blog--behaviors) is gathered at boot from ALL posts' declarations
(host/blog--load-behaviors!, in serve.sh after seed-types!); the trigger match (host/blog--triggers
:match = host/blog--match-behaviors) consults it. The hardcoded create+article trigger is GONE.
- [x] The runner is DERIVED, not hinted (DEBT #2 fixed): match-behaviors resolves the :dag via a DAG
registry (host/blog--dag-registry) and picks the runner via host/flow--select-runner over the fleet
(host/blog--runner-fleet = [exec-runner]; RA joins at RA-live). Each binding carries its :runner;
behavior/-run-binding uses it. An {effect,branch} publish-DAG → exec-runner; a {suspend} DAG would
route to RA (proven in ra 9/9 with a 2-runner fleet).
- [x] The type-def view SHOWS each behavior + its derived runner (host/blog--behavior-lines): LIVE at
blog.rose-ash.com/article — "on create → publish DAG · needs {effect, branch} · runner:
synchronous (exec-fold)". Derived + visible, not hand-set.
- LIVE PROOF: published on blog.rose-ash.com → /flows fired validate+notify via the DECLARED path
(registry + derived runner). blog 213/213, full conformance 610/610. FINDING: load-behaviors! must
scan ALL posts, not filter by is-type? (article didn't pass is-type? on the durable store though it
did in-memory) — the type declaration is authoritative, the is-type? classification isn't reliable enough.
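A rough Python sketch of the declared path: gather bindings from every post, resolve the named DAG, and derive the runner from the fleet. The dict shapes follow the flat string-keyed bindings above; the function names, `DAGS`, and `FLEET` are stand-ins, not the host/blog registries.

```python
# Hypothetical stand-ins for the DAG registry and the runner fleet.
DAGS = {"publish": {"needs": {"effect", "branch"}}}
FLEET = [("exec-runner", {"effect", "branch", "each"})]


def load_behaviors(posts):
    """Scan ALL posts; any post carrying a 'behavior' list is a declaration."""
    registry = []
    for post in posts:
        registry.extend(post.get("behavior", []))
    return registry


def select_runner(needs, fleet):
    """First runner in the fleet whose capabilities cover the DAG's needs."""
    for name, caps in fleet:
        if needs <= caps:
            return name
    return None


def match_behaviors(registry, activity, dags=DAGS, fleet=FLEET):
    """Match on verb + object type, then attach the derived runner."""
    matched = []
    for binding in registry:
        if (binding["verb"] == activity["verb"]
                and binding["type"] == activity["object-type"]):
            needs = dags[binding["dag"]]["needs"]
            matched.append({"dag": binding["dag"],
                            "runner": select_runner(needs, fleet)})
    return matched
```

Scanning every post instead of filtering on a type classification reflects the finding above: the declaration itself is treated as authoritative.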
## P2 — state-change → activity emission (ALL events) — DONE + LIVE-VERIFIED 2026-07-02
- [x] TWO event classes emit canonical activities through the seam: CONTENT
(host/blog--content-activity: Create on first publish, Update on a subsequent published edit — object-type DERIVED from
is-a, not hardcoded) and RELATION (host/blog--relation-activity: Add/Remove, carrying :relation +
:target). host/blog--emit! runs any activity through behavior/process (logged + matched);
emit-content-change! (create/update) wired into form-submit + edit-submit; emit-relation!
(add/remove) wired into relate-submit + unrelate-submit.
- [x] DEBT #1 FIXED — per-EVENT :id, not the bare CID. Content = "create:"/"update:"+cid; relation =
"add:"/"remove:"+src:kind:dst (EDGE-based, since a relation change doesn't shift the CID, so a
CID-based id would false-dedup different edges on one object). Verified: different edges → different ids.
- [x] The activity log is the DURABLE EVENT SOURCE (host/blog--activity-log, string-keyed records
persisted under "activitylog", boot-loaded via host/blog-load-activitylog!). Surfaced at /activities.
This is what TA will push to peers.
- LIVE PROOF: on blog.rose-ash.com — publish → /activities "create article <cid>"; relate → "add
article p2-events — add welcome related"; unrelate → "remove …". All three classes, durable.
blog 217/217 (+4 P2, reframed P0.3 fire-once tests for Update semantics), full conformance 614/614.
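The per-event id scheme is small enough to show directly. The Python helpers below are hypothetical, but the string formats follow the DEBT #1 fix: verb plus CID for content, verb plus `src:kind:dst` for relations.

```python
def content_event_id(verb, cid):
    """Content events: 'create:' or 'update:' followed by the CID."""
    return f"{verb}:{cid}"


def relation_event_id(verb, src, kind, dst):
    """Relation events: edge-based, because a relation change leaves the CID alone."""
    return f"{verb}:{src}:{kind}:{dst}"
```

With the bare CID as the id, adding two different edges to one object would collide; the edge-based form keeps them apart, and adding then removing the same edge also yields two distinct ids.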
## RA — the ERLANG (durable) RUNNER adapter ← the old "fed-sx spike", now an adapter
<!-- PREREQ (review): move dispatch OFF the request path (DEBT #3) — background loop calls behavior/pump; suspend can't block a request. -->
**SPIKE DONE 2026-07-02 — RA is VIABLE (plans/ra-spike.sh, 4/4).** Proven from the SX side:
(1) our canonical activity dict → Erlang activity-proplist SOURCE (ra/erl-src ≈ host/blog--activity->erl)
is valid Erlang the flow consumes; (2) it drives pipeline:apply_triggers → blog_publish_digest → done
+ 3 emails (urgent sync branch); (3) the newsletter activity SUSPENDS on the morning timer
(status =:= {ok,{suspended,morning}}); (4) flow_store:resume(Id, morning_ts) COMPLETES it → 3 emails
(the async cycle); (5) NO er-scheduler deadlock — flow-on-erlang's railway threading holds when
driven from SX. KEY FINDINGS for the build: erlang-eval-ast returns Erlang TERMS directly (integers
raw, atoms as {:tag atom :name …}) — the runner must parse results, not assume :name; flow_store
start→{done,V}|{suspended,Tag}, resume(Id,Res) maps 1:1 onto {:status done|suspended :effects :resume};
the flow instance Id is the resume handle.
- [x] **RA RUNNER BUILT + TESTED (module + integration) 2026-07-02.** lib/host/ra.sx — a PURE-SX
seam runner (advertises {effect,branch,each,suspend}) with an INJECTED erl-eval (real =
er-to-sx-deep ∘ erlang-eval-ast; mock in unit tests), so it loads in the plain host and is testable
without the Erlang runtime. host/ra--{atom,bin,erl-src,start-expr,resume-expr,parse,make-runner,
resume,real-eval}: marshals our canonical activity → Erlang source (CID as <<"…">> binary, atoms
single-quoted), starts flow_store, parses (ok Id (flow_done V))→{:status done :effects V :flow-id} /
(ok Id (flow_suspended T))→{:status suspended :resume {:id :tag}}. DUAL-RUNNER ROUTING in flows.sx:
host/flow--required-caps handles a {:erl-flow :needs} DAG (declared caps); host/flow--select-runner
picks the cheapest runner covering the DAG's needs — the capability model is now REAL (2 runners:
an {effect,branch} composition → exec-runner; a {suspend} DAG → RA). ra 9/9 (mock) +
plans/ra-integration.sh 4/4 (the REAL module driving live flow_store: urgent→done, newsletter→suspended
with resume handle, effect-as-data carried). Full host conformance green. next/tests/triggers_e2e.sh
10/10 baseline intact.
- [ ] **RA-LIVE (deferred — the deployment step, prerequisite now PRECISE).** KEY FINDING: gen_servers
do NOT persist across separate erlang-eval-ast calls (flow README: "the scheduler doesn't preserve
spawned processes across separate erlang-eval-ast invocations"). So a boot-per-call proves the
module (done), but TRUE async (suspend → return the request → resume LATER in another call) needs a
PERSISTENT next/ kernel PROCESS holding flow_store — the async boundary (DEBT #3) is deeper than
"off the request path".
**PERSISTENT-KERNEL SPIKE PASSED 2026-07-02 (plans/ra-kernel-spike.sh + ra_kernel.erl).** A
background sx_server running `ra_kernel:start` (flow_store + a blocking http:listen keeps the
er-scheduler + gen_server alive) survives across HTTP requests: GET /start suspends instance 1, a
SEPARATE GET /resume resumes that SAME live instance → done. So a persistent kernel process IS
viable, and the er-scheduler-context fear does NOT bite (er-bif-http-listen spawns each handler
IN-scheduler, so gen_server:call completes). Gotchas: starting a blocking http:listen hangs any
in-process erlang-eval-ast (so the kernel is a DEDICATED process, driven over TCP, not epoch cmds);
binary =:= is buggy (always true) → dispatch paths by PATTERN (byte-list binaries), not =:=.
REMAINING for RA-live: (a) a real kernel module (flow + inbox/outbox routes) run as a persistent
service (its own container/placement); (b) the host's RA runner POSTs activities to it (start) +
the completion re-enters via the transport inbound + behavior/pump (resume); (c) a durable behavior
binding ({:erl-flow "blog_digest" :needs (effect branch suspend)}) routed to RA via select-runner.
The prerequisite is PROVEN; this is now build (kernel service + host HTTP client), not research.
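The result mapping the RA runner performs can be sketched in Python, with Erlang terms modelled as tuples. This is an illustration only: the tuple encoding, the atom-dict handling, and the fallback to `failed` are assumptions of the sketch, not the behaviour of host/ra--parse.

```python
def atom_name(term):
    """Accept a bare string or an evaluator-style atom dict; assume neither shape."""
    if isinstance(term, dict) and term.get("tag") == "atom":
        return term["name"]
    return term


def parse_flow_result(term):
    """Map (ok Id (flow_done V)) / (ok Id (flow_suspended T)) onto a seam result."""
    if isinstance(term, tuple) and len(term) == 3 and atom_name(term[0]) == "ok":
        _, flow_id, outcome = term
        if isinstance(outcome, tuple) and len(outcome) == 2:
            kind, value = atom_name(outcome[0]), outcome[1]
            if kind == "flow_done":
                return {"status": "done", "effects": value, "flow-id": flow_id}
            if kind == "flow_suspended":
                return {"status": "suspended",
                        "resume": {"id": flow_id, "tag": atom_name(value)}}
    return {"status": "failed", "error": term}   # assumption: anything else fails
```

Keeping the flow instance id inside `resume` matches the spike finding that the instance id is the resume handle.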
## TA — the FED-SX TRANSPORT adapter ← federation proper
- [x] **TA TRANSPORT BUILT + the federation LOOP PROVEN 2026-07-02.** lib/host/ta.sx — a seam
transport {:emit :deliver} over a DIRECTIONAL wire (out=outbox→followers, in=inbox←follows). The
transport is the SERIALIZATION boundary: activities cross as SX-source strings via
host/ta--serialize/deserialize (keyword-keyed activity ↔ flat string-keyed wire form — the P2 activity
fields). host/ta--make-transport(out-wire, in-wire) + host/ta--make-mem-wire (an in-memory
directional queue). PROVEN (ta 5/5): a content + a relation activity round-trip through the wire;
the FEDERATION LOOP — instance A emits an activity → the wire carries it → instance B's
behavior/pump delivers + processes it → B's engine fires ITS behavior on A's activity; DIRECTIONAL
(B re-emits to its own outbox, not back into the inbox — no loop). This is "everything works over
fed-sx" proven at the seam. Full host conformance green.
- [x] **TA-LIVE DONE + LIVE-VERIFIED 2026-07-02 (real two-instance A→B federation over HTTP).** The
wire is the HOST's own http-request (not next/ delivery — simpler + already persistent).
host/ta--{post, make-http-wire, federate}; the host gains a POST /inbox (host/blog-inbox → host/blog--receive!
→ process locally, does NOT re-federate). A DURABLE OUTBOX (host/blog--outbox, persisted) gives
fed-sx RELIABILITY: emit! processes locally (always succeeds), QUEUES per-peer, and delivers best-effort
— a peer being DOWN does not fail the local publish (delivery is guarded; failed items stay queued +
retry on next emit / on boot / manual /flows?flush=1). serve.sh: SX_PEERS → host/blog--set-peers!,
boot load+flush. docker-compose: a 2nd host `sx_host_b` (its own store, no peers) as peer B.
LIVE PROOF: (1) a peer POSTs a create/article to blog.rose-ash.com/inbox → A fires validate+notify.
(2) publish on A → federates to B → B's /flows fires validate+notify on A's activity; B's /activities
shows the received create. (3) RESILIENCE — publish with B DOWN → A returns 303 (was 500), activity
queued; start B + flush → B receives the backlog + fires. blog 218/218, conformance green.
NOTE on placement/domains: A = blog.rose-ash.com (Caddy/externalnet); B = sx_host_b, internal-only
(docker DNS, no public domain) — a real peer would get its own Caddy subdomain. FUTURE: the actor
model (:actor "site" placeholder → follower_graph decides WHO to deliver to); a background delivery
loop (currently retry is opportunistic on emit/boot/flush, not a timer); signature verification on /inbox.
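The outbox behaviour (process locally, queue per peer, deliver best-effort, retry on a later flush) can be sketched in Python. Everything here is a stand-in: the queues live in memory where the real outbox is persisted, and `post` is an injected function assumed to raise `OSError` when a peer is unreachable.

```python
def make_outbox(peers, post):
    """Per-peer queues with guarded, in-order, best-effort delivery."""
    queues = {peer: [] for peer in peers}

    def flush():
        for peer, queue in queues.items():
            while queue:
                try:
                    post(peer, queue[0])
                except OSError:
                    break              # peer down: keep the backlog for a later flush
                queue.pop(0)           # drop an item only after a successful post

    def emit(activity, process_locally):
        process_locally(activity)      # local processing never depends on peers
        for queue in queues.values():
            queue.append(activity)
        flush()

    return {"emit": emit, "flush": flush, "queues": queues}
```

Removing an item only after a successful post gives at-least-once delivery, which is why the receiving side needs its own dedup.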
## AX — artdag GROWS control-flow (business logic MIGRATES to artdag) [DEMAND-DRIVEN]
Today artdag is pure dataflow and the execute-fold is the synchronous control-flow runner. That's
a capability snapshot. When a business flow WANTS the artdag engine's benefits — content-addressed
memoization (recompute only on input-CID change), optimize (fuse/dedup/dce), schedule, and above
all FEDERATION (a flow result reused across peers by content-id) — grow artdag's node vocabulary
so its runner advertises `+{effect, branch, each}`, and the SAME behavior DAGs migrate onto it
(the capability model makes this seamless; the seam + DAGs don't change). Two real design pieces:
(1) DYNAMIC control in a static DAG — a `branch` PRUNES a path (conditional nodes make downstream
nodes live/dead; content-addressing holds on the taken path); (2) EFFECT nodes vs memoization —
pure nodes memoize, `{effect}` nodes are marked non-cacheable (must run / be idempotent). Build
when a flow's cost, reuse, or cross-peer sharing makes the execute-fold's re-run-everything
insufficient — not before. The execute-fold stays the lean default for cheap synchronous flows.
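Design piece (2), pure nodes memoize while `{effect}` nodes always run, can be illustrated with a short Python sketch. The content id here is a SHA-256 over the op name and its JSON-serialisable inputs, which is an assumption of the sketch and not how artdag computes its ids.

```python
import hashlib
import json


def content_id(op, inputs):
    """Derive a cache key from the op name and its inputs."""
    payload = json.dumps([op, inputs], sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def run_node(node, inputs, cache, ops):
    """Memoize pure nodes by content id; effect nodes bypass the cache entirely."""
    key = content_id(node["op"], inputs)
    cacheable = "effect" not in node.get("needs", ())
    if cacheable and key in cache:
        return cache[key]
    result = ops[node["op"]](*inputs)
    if cacheable:
        cache[key] = result
    return result
```

Since an effect node reruns on every evaluation, the effect it describes has to be idempotent or deduplicated downstream.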
## RX — celery-sx runner (DEMAND-DRIVEN, not scheduled)
Build the distributed/durable runner adapter the moment a real DAG needs heavy compute /
long-running-retryable tasks / cross-machine fan-out (the artdag/JAX media case, or federated
flows that can't run in-request). New code is small — glue persist (durable queue: enqueue/claim/
ack/visibility-timeout) + er-scheduler (worker loop: pull node → op-table-runner →
content-addressed result) + artdag/schedule (fan-out) + retry/backoff. Slots in at artdag/op-table-runner
alongside the synchronous + Erlang runners. Zero packages. Do NOT pre-build; the op-table runner
covers everything until a DAG's cost/latency/placement forces the substrate.
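To make the "durable queue + worker loop" glue concrete, here is a hedged Python sketch of enqueue/claim/ack with a visibility timeout. It is in-memory where the plan calls for lib/persist, and the clock is passed in as `now` to keep it deterministic; all names are hypothetical.

```python
def make_queue(visibility_timeout):
    """In-memory stand-in for a durable queue with claim/ack semantics."""
    tasks = {}    # task id -> {"payload", "invisible_until", "attempts"}
    order = []    # insertion order, so claims are first-in first-out

    def enqueue(task_id, payload):
        if task_id not in tasks:           # content-addressed ids make this a dedup
            tasks[task_id] = {"payload": payload, "invisible_until": 0, "attempts": 0}
            order.append(task_id)

    def claim(now):
        for task_id in order:
            task = tasks.get(task_id)      # acked tasks are gone from the dict
            if task is not None and task["invisible_until"] <= now:
                task["invisible_until"] = now + visibility_timeout
                task["attempts"] += 1
                return task_id, task["payload"]
        return None

    def ack(task_id):
        tasks.pop(task_id, None)

    return {"enqueue": enqueue, "claim": claim, "ack": ack, "tasks": tasks}


def work_once(queue, runner, results, now):
    """Pull one node, run it, write the result under its id, then ack."""
    claimed = queue["claim"](now)
    if claimed is None:
        return False
    task_id, payload = claimed
    try:
        results[task_id] = runner(payload)
    except Exception:
        return True        # no ack: the task reappears after the visibility timeout
    queue["ack"](task_id)
    return True
```

A crashed or failing worker never acks, so the task becomes claimable again once the timeout passes; retry falls out of the queue semantics without a separate mechanism.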
## P4 — close the loop
- [ ] Flow effects mutate objects back durably (a flow's DescribeEffects → host writes / new
activities), so business logic can change state, which federates, which triggers more flows.
## Progress log (newest first)
- 2026-07-03 — MINI-PASS R1-R3 + a correcting finding. R1 DONE (per-offering atomic stock: buy holds
a 2nd pool "offering:<off>", cap=:cap field ∞-if-unset; the store's product stock, atomic + durable;
259 tests). FINDING (see [[project_host_handler_mutex]]): lib/host http-listen holds ONE per-peer
mutex across the WHOLE handler run (persist IO + the outbound http-request), so there is ZERO
intra-peer handler concurrency — the read-check-write "races" fixed in H3/H5/R1/R2 are NOT
exploitable today (the "10 concurrent buys → 1 seat" test proved the mutex serializes, not that
append-expect beat a race). The atomic-pool work keeps INDEPENDENT value (durable authority
surviving edge-wipes; H5 fixes a real seat LEAK; H6 durable dedup is real at-least-once redelivery),
and it FUTURE-PROOFS for when the mutex is narrowed for throughput. R2 (outbox→stream) DEFERRED —
its stated "no lost deliveries under a race" rationale was masked by the mutex; fold it into the
throughput work. NEW REGISTER ITEM: "narrow the handler mutex" is the real concurrency task — a
handler holds the peer's mutex during a 100s-of-ms cross-domain http-request, blocking the whole
peer; the multi-co-op throughput future forces it, and THAT is when the masked races become real.
R3 (test runner counts a raising test as a failure, not a silent skip) proceeding — independent value.
- 2026-07-03 — HARDENING PASS H1-H7 DONE (TDD, failing test first, all 7) + deployed + live-verified.
H1 internal endpoints HMAC-gated (x-int-sig of the TARGET; unsigned /ticket|/order|/person → 403 —
closed the live capacity-bypass). H2 admin ops (new-film/new-showing/offering-*/add-poll/new-event)
behind protect-html; /vote + /buy-ticket pinned public. H3 votes = atomic claims on stream
vote:<poll> (ev/book!), edge is a projection — dedup survives projection wipes. H4 P2 restored:
all cinema/poll mutations emit (create/schedule/offer/update/retract/vote/sell; voter anonymous on
the wire). H5 two-phase buy: ev/hold! → guarded mint (injectable host/blog--mint-ticket) →
ev/confirm! / ev/release! — the cross-domain seat leak is gone. H6 durable activity dedup: :id
claimed once-ever on stream activities:processed — prerequisite for payment. H7 adjacency streams:
out/in/out-raw fold per-(node,kind) event streams (rel:/rin:), not full kv scans; append-only = no
RMW race; boot reindex migrates legacy stores; add-edge-kv! collapsed in (algebra tests caught the
bypass). blog suite 218→256, FULL conformance 658/658. LIVE: 403 on unsigned mint, /login on
unauth admin, buy sold 2→3 through hold/confirm, vote +1-once, migrated reads correct.
NEXT (commerce arc, decided): single-till co-op settlement. Business+membership+rights → Cart +
per-line owner attribution → SumUp (stub→real) → reseed as the food co-op. Deferred: per-actor
keys + global naming (gate multi-co-op federation), lib/commerce money, blog.sx modular split.
- 2026-07-02 — CROSS-DOMAIN slice 1 DONE + LIVE-VERIFIED: allocate-a-post-to-a-calendar (blog→events).
events.rose-ash.com is now a fed-sx PEER — a lib/host instance with SX_DOMAIN=events, whose
"calendar" TYPE declares an on-allocate behavior (behaviors ARE type-declared — confirmed). Built:
DIRECTED delivery (activity :to <peer> → delivered to that peer's inbox, in addition to followers;
wire gains "to"); host/blog--allocate-activity/allocate! + POST /:slug/allocate?calendar=; serve.sh
SX_DOMAIN gate (blog=article behaviors, events=calendar+allocate-link DAG); the sx_events container
(own store, shared fed secret). LIVE: publish "Gig Night" on blog → allocate to calendar main → the
events peer RECEIVES the directed activity (/activities) and its calendar type's on-allocate behavior
FIRES (/flows "linked gig-night"). Signed + directed cross-domain federation, type-declared reaction.
NEXT (the vision): events runs lib/events (real calendars/events/recurrence/ticketing); make "linked"
a real relation/event; then link an event→post; then shop (lib/commerce) sells tickets. Same shape.
- 2026-07-02 — FEDERATION PRODUCTION LAYER DONE + LIVE-VERIFIED (the actor model + the rest). (1)
ACTOR MODEL: activities carry a real :actor (SX_ACTOR, not "site"); delivery is FOLLOWER-based, not
a static peer list — a peer POSTs {verb:follow, actor, base} to /inbox to subscribe; B follows A at
boot (SX_FOLLOW) → A delivers its activities to B (a follower). host/blog--{followers, add-follower!,
follow!, delivery-bases}. (2) BACKGROUND DELIVERY TIMER: serve.sh's detached loop hits /fed-tick every
15s → re-follow (idempotent, recovers if the target was down) + flush the outbox. Verified: B down →
publish queues → start B → the TIMER auto-delivers (no manual flush). (3) SIGNATURE VERIFICATION:
every fed POST is signed (dr/sess-sig shared-secret MAC over the body, SX_FED_SECRET); /inbox verifies
→ a forged POST gets 403. (4) PUBLIC DOMAIN: B (sx_host_b) is on externalnet (Caddy-reachable) — the
actual subdomain (DNS + a Caddy reverse_proxy route) is external ops config, not in this repo. LIVE
PROOF end-to-end: B follows A (followers:1); publish on A → signed delivery to follower B → B verifies
+ fires validate+notify; forged POST → 403; background timer delivers a backlog. blog 218/218, conformance green.
- 2026-07-02 — RA-LIVE + TA-LIVE DONE + LIVE-VERIFIED. (1) sx_kernel container (durable-execution
service) deployed; the host's RA kernel-runner drives it over HTTP — editing a newsletter article →
durable Update → kernel SUSPENDS (pending) → /flows?resume → done. (2) TA federation: host POST
/inbox receives peers' activities + fires; a 2nd instance sx_host_b (peer B) — publish on A →
federates to B → B fires ITS behaviors on A's activity. (3) DURABLE OUTBOX for fed-sx reliability
(user-driven): B down → A's publish still succeeds (303, was 500) + queues; B up + flush → backlog
delivers. The whole distributed half is LIVE. Remaining polish: actor model, background delivery
timer, /inbox signature verify, expose B on a public domain.
- 2026-07-02 — the REAL KERNEL SERVICE built (next/kernel/host_kernel.erl + serve.sh + tests/
host_kernel.sh, 4/4 over HTTP). A persistent durable-execution service: flow_store + named-flow
registry, parameterised flow routes (GET /flow/start/<category> → "<id>:<status>", GET
/flow/resume/<id>). Verified live: newsletter→instance 1 SUSPENDED, urgent/draft→DONE, resume 1 in a
SEPARATE request→DONE (durable state persists). serve.sh = the persistent launcher (container
entrypoint). This is the RA-live substrate. NEXT for RA-live: deploy the kernel (a container/
placement) + point host/ra.sx's real-eval at it (POST /flow) + route a durable binding to RA. TA-live
adds inbox/outbox routes on the same kernel.
- 2026-07-02 — PERSISTENT-KERNEL SPIKE PASSED (plans/ra-kernel-spike.sh + ra_kernel.erl). The shared
prerequisite for RA-live + TA-live is REACHABLE: a background sx_server (flow_store + blocking
http:listen) holds gen_server state across HTTP requests — /start suspends instance 1, a separate
/resume resumes the SAME live instance → done. The er-scheduler-context fear doesn't bite (handlers
spawn in-scheduler). Chose the persistent-kernel path (B) over host-side replay-log (A) — it serves
BOTH durability + federation on one fed-sx-native substrate + gives the full next/ kernel. Gotchas:
a blocking listener hangs in-process erlang-eval-ast (kernel = a dedicated TCP-driven process);
binary `=:=` comparison is buggy → match paths by pattern instead. RA-live/TA-live are now BUILD
(kernel service + host HTTP
client + actor model), not research. NEXT: build the real kernel service + wire the host as client.
- 2026-07-02 — TA TRANSPORT built + the federation LOOP proven (lib/host/ta.sx, ta 5/5). A seam
transport over a directional wire (serialization boundary; activities cross as SX-source). Proven
in-memory: A emits → wire → B pump → B's engine fires ITS behavior on A's activity (directional, no
loop). "Everything works over fed-sx" proven at the seam. TA-live (real next/ delivery wire) deferred
— needs the persistent kernel (RA-live finding) + the actor model (who to deliver to). NEXT: the
DISTRIBUTED HALF is now all proven-at-the-seam (RA + TA); the remaining live steps (RA-live, TA-live)
share ONE prerequisite — a persistent next/ kernel process + the actor model. Or P4/AX/RX.
- 2026-07-02 — P2 DONE + LIVE-VERIFIED. All observable state changes now emit canonical activities
through the seam: content Create/Update + relation Add/Remove. DEBT #1 fixed (per-event ids; edge-
based for relations). The activity log is the durable event source, surfaced at /activities. Live:
publish→create, relate→add, unrelate→remove all logged. blog 217/217, conformance 614/614. The
event source is now complete + federatable — NEXT: RA-live (persistent kernel) or TA (fed-sx
transport pushes /activities to peers → federation).
- 2026-07-02 — P1 DONE + LIVE-VERIFIED. Types DECLARE :behavior (stored on the type-post, gathered
into a registry at boot); the trigger match consults the registry; the runner is DERIVED via
host/flow--select-runner over the fleet (DEBT #2 fixed — no hardcoded trigger, no runner hint). The
/article page shows the declared behavior + derived runner. Published live → /flows fired via the
declared path. blog 213/213, conformance 610/610. Finding: load-behaviors! scans ALL posts (it is not
is-type?-filtered, because that filter is unreliable on the durable store). NEXT: RA-live (persistent
kernel wires RA into
the fleet → durable bindings route to RA), or P2 (all state-change → activity emission).
- 2026-07-02 — RA RUNNER BUILT + tested (module + integration). lib/host/ra.sx = a pure-SX seam
runner with injected erl-eval (loads in the plain host, mock-testable); marshals our activity →
Erlang, drives flow_store, parses done/suspended → the runner contract. Dual-runner ROUTING in
flows.sx (host/flow--select-runner + required-caps for {:erl-flow :needs} DAGs) makes the capability
model REAL (2 runners). ra 9/9 (mock) + plans/ra-integration.sh 4/4 (REAL module → live flow_store).
Full host conformance 607/607. FINDING: gen_servers don't persist across erlang-eval-ast calls, so
RA-LIVE (true cross-call suspend/resume) needs a persistent next/ kernel process — the async
boundary is deeper than "off the request path". Runner mechanics fully proven; RA-live = lifecycle
+ wiring. NEXT: RA-live (persistent kernel + a durable binding wired to RA), or P1 (capability model
is now real, so it's no longer vacuous).
- 2026-07-02 — RA SPIKE DONE → RA is VIABLE (plans/ra-spike.sh, 4/4). From SX: our canonical activity
serializes to valid Erlang, drives blog_publish_digest through flow_store (done + suspend + resume),
no er-scheduler deadlock. De-risks the whole durable/federated half. Findings: erlang-eval-ast
returns terms directly (parse, don't assume :name); flow_store start/resume maps 1:1 onto the runner
contract. Remaining for full RA: load the Erlang runtime into serving (or out-of-process), the async
dispatch boundary (DEBT #3), CID→binary marshalling, structured result parsing. NEXT: full RA, or
P1 now that a real 2nd runner is proven reachable.
- 2026-07-02 — P0.4 DONE + LIVE-VERIFIED → **P0 COMPLETE**. Canonical seam activity shape
({:verb :object=cid :object-type :slug :category :delta :id}); consumers reconciled (trigger match,
publish-ctx); host/blog--activity->erl marshaller staged for RA. Published live → /flows fired
validate+notify with the canonical activity. blog 209/209, conformance 597/597. The synchronous
publish workflow is end-to-end on the live host through the substrate-agnostic seam, durable, in
the canonical shape. NEXT: P1 (types declare :behavior, engine built per type, runner derived via
caps) — or RA (Erlang durable runner, the marshaller is ready).
- 2026-07-02 — P0.3b DONE + LIVE-VERIFIED. The flow log is now DURABLE: the driver
persists string-keyed effect records to the blog store (dodging the keyword/persist top-level
split); host/blog-load-flowlog! rebuilds it on boot (serve.sh). Proof: published on
blog.rose-ash.com, RESTARTED the container, /flows still showed validate+notify (reloaded from the
store). blog 208/208, conformance 596/596. Whole-list rewrite per effect — cap/rotate later.
- 2026-07-02 — P0.3 DONE + LIVE-VERIFIED. The seam wired into the live publish path: on-publish
registry + in-process transport + host driver + the execute-fold runner, fired by the draft→
published transition in both write handlers. Published a real post on blog.rose-ash.com → /flows
surfaced validate + notify, driven by the actual behavior engine. blog 207/207, conformance
595/595. NEXT: P0.4 (canonical activity shape) or P0.3b (durable flow log); then P1 (types declare
behavior — build the engine per type from its :behavior bindings, runner derived via caps).
- 2026-07-02 — DON'T-CALCIFY note (user: "artdag may in the future contain business logic"). The
execute-fold-vs-artdag split from P0.2 is a capability SNAPSHOT, not a boundary. Added phase AX:
artdag grows +{effect,branch,each} node-kinds and business logic migrates onto it to inherit
content-addressed memoization / optimize / FEDERATION (flow result reused across peers by CID —
the federation vision, free). Design work named: dynamic control (branch prunes) in a static DAG;
effect nodes non-cacheable vs pure nodes memoized. Demand-driven; execute-fold stays the lean
default. P0.2 finding + flows.sx header annotated so the finding doesn't harden.
- 2026-07-02 — P0.2 DONE + the hypothesis CONFIRMED (and refined). The synchronous publish workflow
is NATURAL as an execute-fold composition (seq/effect/alt), NOT artdag dataflow (no branch there).
So business-logic = art-dag holds at the abstraction (content-addressed op-DAG) but the SYNCHRONOUS
runner is the execute-fold, artdag the dataflow sibling — two instances, run differently, exactly
the framing. lib/host/flows.sx (capability layer + exec-runner + bind) + host/blog--publish-dag.
Runner DERIVED via required-caps ⊆ advertised; wait→fail-fast. flows 7/7, blog 203/203, conformance 591/591.
- 2026-07-02 — folded in CAPABILITY-TYPED nodes / CAPABILITY-ADVERTISING runners. A node declares
`:needs` (wait→suspend, fan-out→parallel, heavy→offload); a runner advertises `:capabilities`
(op-table {effect,branch,each}; Erlang +suspend; celery-sx +parallel,retry,offload); artdag/analyze
computes a DAG's min-runner; the binder checks required ⊆ runner-caps (fail fast). So the sync/
durable/distributed split is DERIVED from the DAG, not a human call: an {effect}-only DAG runs with
zero ceremony; a wait node auto-requires Erlang. Removed the :runner hint from the binding. P0.2
gains the hypothesis test: does the publish workflow express naturally as a DAG, and does flipping
a node to `wait` fail-fast against the op-table runner? Clarifies "business logic = art-dag" — same
op-DAG structure, differing only in node capabilities, hence runner. (Insight: they're two
instances of one thing, suitable to run very differently.)
- 2026-07-02 — whole-plan coherence review. The reframe (artdag+seam) had left the middle stale:
P0.2/P0.3 still described the Erlang-bridge-first path; P0.1's activity didn't match the seam
contract; the seam section predated the enrichment. Fixed: P0 rewritten around the seam + SX
op-table runner (all-SX, no Erlang/fed-sx); the Erlang/fed-sx path DEMOTED to explicit adapter
phases RA (durable runner) + TA (fed-sx transport); canonical activity shape + P0.4 reconcile;
seam contract refreshed to behavior.sx (status/dedup/pump/async, behavior 10/10); stray bits cleared.
- 2026-07-02 — seam DONE + reviewed twice. lib/host/behavior.sx (engine + 4 adapters); enriched
substrate-agnostic (status/env/dedup/pump); 2nd review corrected the async-completion contract
(construction-wired inbound + pump, not env :emit) + proved it. behavior 10/10, conformance 580.
- 2026-07-02 — P0.1 done. host/blog--publish-activity + host/blog--post-category; the publish
contract in SX, 200/200. Verified next/ triggers e2e baseline 10/10. Roadmap anchored. NEXT:
P0.2 the dispatch bridge (in-process: serve.sh loads next/ kernel + registers the on-publish
trigger; host emits the activity via the erlang-on-sx bridge to pipeline:apply_triggers).
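
The capability rule recorded in the log above (a node declares `:needs`, a runner advertises `:capabilities`, the binder checks required ⊆ advertised and fails fast) can be sketched as follows. This is an illustration in Python, not the SX in lib/host/flows.sx: the `Runner` class, the dict node shape, and the function names `required_caps` and `select_runner` are hypothetical stand-ins, and only the capability names and the wait→suspend, fan-out→parallel, heavy→offload mapping come from the log.

```python
from dataclasses import dataclass

# Mapping from a node's declared need to the runner capability it demands.
# Taken from the log entry; everything else in this sketch is hypothetical.
NEED_TO_CAP = {"wait": "suspend", "fan-out": "parallel", "heavy": "offload"}


@dataclass(frozen=True)
class Runner:
    name: str
    capabilities: frozenset


def required_caps(dag):
    """Return the minimum capability set a runner needs to execute this DAG."""
    caps = set()
    for node in dag:
        caps.add(node["kind"])  # the node kind itself: effect / branch / each
        for need in node.get("needs", ()):
            caps.add(NEED_TO_CAP[need])
    return frozenset(caps)


def select_runner(dag, fleet):
    """Pick the first runner whose capabilities cover the DAG, else fail fast."""
    needed = required_caps(dag)
    for runner in fleet:
        if needed <= runner.capabilities:  # required is a subset of advertised
            return runner
    raise LookupError(f"no runner advertises {sorted(needed)}")


# Assumed fleet: the op-table runner, and an Erlang runner that adds suspend.
OP_TABLE = Runner("op-table", frozenset({"effect", "branch", "each"}))
ERLANG = Runner("erlang", OP_TABLE.capabilities | {"suspend"})

publish = [{"kind": "effect"}, {"kind": "effect"}]
durable = [{"kind": "effect"}, {"kind": "effect", "needs": ["wait"]}]

print(select_runner(publish, [OP_TABLE, ERLANG]).name)
print(select_runner(durable, [OP_TABLE, ERLANG]).name)
```

Ordering the fleet from leanest to most capable keeps an effect-only DAG on the cheap runner, while flipping one node to `wait` routes the same DAG to the runner that can suspend. With only the op-table runner in the fleet, that flipped DAG raises instead of running, which is the fail-fast behaviour the P0.2 hypothesis test describes.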