diff --git a/plans/abstractions.md b/plans/abstractions.md
new file mode 100644
index 00000000..07112cc6
--- /dev/null
+++ b/plans/abstractions.md
@@ -0,0 +1,594 @@
+# Abstraction Radar — backlog
+
+Maintained by the read-only `radar` loop (see `plans/agent-briefings/radar-loop.md`).
+Detection only — implementation is a separate, coordinated step owned by the
+relevant subsystem loop, never by radar.
+
+**AHA gate to reach _Proposed_:** ≥3 real consumers · all past Phase 2 & API-stable ·
+structurally identical (file:line evidence) · a natural home (usually NOT lib/guest).
+Anything short → _Watching_ (what's missing) or _Rejected_ (why).
+
+---
+
+## Last scan
+
+- **Date:** 2026-06-07 (radar loop, pass 32)
+- **Pass 32 — A1 DONE.** `loops/conformance` merged to architecture (`db76cc8c`); 13 adopters
+  now on the shared driver; radar spot-checked common-lisp = 487/487 green post-merge →
+  coordination flag CLEARED. A1 moved to a new **Done** section. New nascent subsystems
+  `dream` + `maude` (0 files), `fed-prims` resumed (mutex-deadlock fix). The idle
+  `a1-conformance` loop can be retired (worklist complete).
+- **Date:** 2026-06-07 (radar loop, pass 31)
+- **Pass 31 — A1 conformance loop WORKLIST COMPLETE.** tcl excluded (foreign `*.tcl`); final:
+  4 migrated (common-lisp/erlang/feed/go) + 5 excluded (forth/js/ocaml/smalltalk/tcl). A1 =
+  **12 on shared driver + 6 excluded**; only the parity-gated merge to architecture remains.
+  commerce shipped a refund saga on flow (2nd flow use) + finished Phase 5 → going quiescent.
+  relations building graph algos (all-paths) — still unconsumed (W9 unchanged).
+- **Date:** 2026-06-07 (radar loop, pass 30)
+- **Pass 30:** conformance loop near done — `ocaml` + `smalltalk` excluded (both foreign
+  `test.sh`/corpus runners, as predicted). Tally: 4 migrated, 4 excluded, **tcl only** left.
+  Next A1 milestone = the `loops/conformance`→architecture merge under adopter-parity.
No + new candidate; relations/artdag steady (no new W9 delegation). +- **Date:** 2026-06-07 (radar loop, pass 29) +- **Pass 29:** conformance loop excluded `js` (test262 fixtures) → 4 migrated + 2 excluded, + 3 remain (ocaml/smalltalk/tcl). New subsystems advancing fast: `relations` → Phase 4 + federation, `artdag` → Phase 6 federation → both fold into W1 (now 7 federation modules, + theme-not-shape holds) and W9 (relations past Phase 2 but not yet consumed by anyone). +- **Date:** 2026-06-07 (radar loop, pass 28) +- **Pass 28 — fleet expanding again.** Conformance loop: `go` migrated 609/609; **`forth` + excluded** (foreign Forth corpus — classify-then-exclude working). 4 migrated +1 excluded + on the branch; js/ocaml/smalltalk/tcl remain. **2 new subsystems:** `relations` (Phase 1, + parent/child rel facts → new W9 nascent watch) and `artdag` (nascent, 0 files). `events` + MERGED to architecture (its persist+flow adoption now integrated — W4/W8 landed). Briefing + commit hints more incoming: `dream`, `host`, +5 language chisels. +- **Date:** 2026-06-07 (radar loop, passes 26–27) +- **Passes 26–27 (routine tracking):** conformance loop steady at ~1 migration/iteration — + erlang 761/761, then feed 189/189. A1 = 8 on architecture + 3 on the branch; 6 remain. + W4 still gated (host-persist adapter not landed); no new subsystem; app loops on + incremental domain work (commerce Phase 5 payment envelope, content/events/identity/fed-sx). + Nothing new to discover; merge-time adopter-parity flag still open. +- **Date:** 2026-06-07 (radar loop, pass 25) +- **Pass 25:** A1 → **8 adopters** (events via its own loop) + common-lisp 487/487 on the + conformance branch. The conformance loop **extended the shared `lib/guest` driver** + (per-suite counters/preloads) to do it → raised a **coordination flag in A1**: verify the + branch is non-regressive against all 8 adopters before merging to architecture. commerce + drafting Phase 5 provider-neutral payment envelope. 
No new candidate; A1 advancing fast. +- **Date:** 2026-06-07 (radar loop, pass 24) +- **Pass 24 — three real updates.** (1) **A1 → 7 adopters** (search migrated, counters mode + — corrects the earlier exclusion). (2) The dedicated `conformance` loop ran its 1st + iteration: refused to force-migrate common-lisp (parity gate worked) and surfaced a + **driver feature-gap** (per-suite counters + preloads) gating the complex multi-suite + candidates → A1 now splits simple-now vs gated-on-driver-enhancement. (3) **W8 commerce + is LIVE** ("order lifecycle as a durable flow-on-sx flow, Phase 3 done") → 2 live flow + consumers. events shipped TZ/DST; mod reverted its extraction note (declined on re-read). +- **Date:** 2026-06-07 (radar loop, pass 23) +- **Pass 23 — trigger fired (empty streak ends at 19–22).** commerce recorded a Phase 3 + **flow-integration design** (order saga as a flow-on-sx flow, payment suspended until + webhook resume) → 2nd durable-flow consumer; **W8 broadened** from "delivery" to + "externally-resumed orchestration on lib/flow." events made its federation transport + **fed-sx-ready** (injected) → reinforces W1's 5/5 inject-fed-sx seam. acl left tmux + (now fully quiescent). host-persist adapter still not landed (W4 migration still gated). +- **Empty-discovery streak: passes 19–22** (last verified pass 22). Fleet at steady state — + active loops (content CvRDT, events recurrence/reschedule, identity grant-mgmt, fed-sx + outbox internals) are building *inside* their domains, not cross-cutting infra. Census + exhausted (p17); all gates re-tested (W1 p18, W2 p19). No new candidate clears any gate. +- **Radar is now trigger-driven.** The next substantive pass needs one of: **(a)** a new + subsystem worktree spawning (auto-joins scan), or **(b)** host-persist's durable adapter + landing → unblocks the W4 acl/mod→persist/log migration, or **(c)** a quiescent + subsystem (acl/mod/search/commerce, static ~9–16 passes) resuming. 
Polling ~hourly until + one fires; will tighten cadence then. +- **Date:** 2026-06-07 (radar loop, pass 20) +- **Pass 20 — honest empty pass.** 3 new census recurrences since p17 (normalize/index ×2, + query ×3) — all **name collisions** (same noun, domain-specific op), added to the table. + Recorded the meta-pattern: the fleet shares vocabulary, not structure. Most subsystems + quiescent (acl/mod/search/commerce static ~9-15 passes = API-stable); only events/ + identity/content/fed-sx still committing domain features. No new gate-clearer. +- **Date:** 2026-06-07 (radar loop, pass 19) +- **Pass 19 — honest empty pass.** Scanned 10 active subsystems. content/index.sx is a + blog index/tag-cloud listing (presentation, not full-text search — no search reinvention) + and content/multi-doc indexing adds no per-viewer filter. **W2 re-tested: still 2** + (feed, search) — acl's `permit?`-like matches are its own authZ *engine* (the home), + not a downstream read filter. No new candidate cleared any gate. +- **Date:** 2026-06-07 (radar loop, pass 18) +- **Pass 18 — W1 gate re-test.** events shipped Phase 4 federation (5th consumer): a 5th + divergent merge (sorted agenda + `:origin` provenance), trust-gate = runtime list + membership (shares mod's mechanism, not acl's). Reinforces W1's "theme not shape" — but + the **inject-fed-sx-transport seam is now 5/5**, strengthening "all are fed-sx + consumers-in-waiting." Trust sub-pattern refined: mod+events (runtime set) vs acl (rule). +- **Date:** 2026-06-07 (radar loop, pass 17) +- **Pass 17 — filename census declared EXHAUSTED** (see the Census-status table above). + Examined the last unswept ≥2 recurrences (schema/engine = acl⇄mod substrate twins; + catalog/batch = name collisions; store = divergent). No new candidate. Incremental churn + elsewhere (content 621/621, identity PAR, events reminders). Future passes pivot from + censusing to re-testing gates as consumers mature. 
+- **Date:** 2026-06-07 (radar loop, pass 16) +- **Pass 16:** events started Phase 3 — **durable notification delivery on `lib/flow`** + (new W8: at-least-once + idempotency exemplar; fed-sx/mod roll their own outbox). The two + `notify.sx` (feed vs events) are a name collision (read-side digest vs delivery), noted + in W8. Substrate-adoption story deepening: app domains now consume persist (content/ + commerce/events), flow (events), commerce (events), acl-authZ (identity). +- **Date:** 2026-06-07 (radar loop, pass 15) +- **Pass 15:** added the **scanning-method note** above after `query.sx` again proved to + be merged-lib copies (lib/prolog + lib/persist in every worktree). Corrected census + surfaced `wire`×2 (content+mod) → Rejected (shared role, divergent structure: generic SX + serializer vs bespoke pipe-format under a Prolog-env string-prim constraint). events↔ + commerce integration appeared (paid tickets); acl/mod/search quiescent ~7 passes (now + API-stable). No new gate-clearer. +- **Date:** 2026-06-07 (radar loop, pass 14) +- **Pass 14:** filename census flagged `snapshot`×?? — but the `*/lib/persist/snapshot.sx` + copies are just the merged `lib/persist` in each worktree, NOT consumers (same artifact + as `lib/feed/rank.sx` everywhere). The one distinct file, `content/snapshot.sx`, + reimplements persist's projection-checkpoint on raw KV instead of using `persist/snapshot` + → new W7 (persist-adoption nudge). `audit`×3 = the W4 fakes (acl/mod/identity), known. +- **Date:** 2026-06-07 (radar loop, pass 13) +- **Pass 13 — honest re-test, no gate-clearer.** Re-tested the two longest-waiting gates + against the maturing app-domain loops: **W2** (per-viewer visibility) still 2 consumers + (feed, search) — commerce/content/events/identity add no per-viewer read filter; **W3** + (pagination) still 2 (feed, search) — `content/page.sx` is an HTML wrapper, not + pagination (filename collision, noted in W3). Incremental churn only elsewhere. 
+- **Date:** 2026-06-07 (radar loop, pass 12) +- **Pass 12:** `events` shipped **transactional booking on persist** (3rd live persist + consumer) using `persist/append-expect` (optimistic-concurrency CAS, lock-free capacity + safety). W4 ledger now shows a persist feature-ladder append → append-once → append-expect + that the hand-rolled fakes can't match. No new candidate; W4 reinforced. +- **Date:** 2026-06-07 (radar loop, pass 11) +- **Pass 11 — W4 sharpened with a consumer ledger.** commerce built an **order ledger on + persist** (2nd live exemplar; uses `persist/append-once` for webhook idempotency) and + identity a **grant audit ledger** (in-memory Erlang fake, gated on an Erlang↔persist + bridge). The append-only monotonic-seq event-log pattern is now validated across 4 + domains, 2 live on persist + 3 fakes flagged for adoption. See W4 table. +- **Date:** 2026-06-07 (radar loop, pass 10) +- **Pass 10:** commerce/content/events/identity advancing (content 238/238). Probed a + shape outside the routing table — **guarded lifecycle state machines** (mod/lifecycle + + identity/membership) → new W6: shared *design principle*, divergent *structure* + (SX transition-table vs Erlang gen_server), NOT an extraction target. No gate-clearer. +- **Date:** 2026-06-07 (radar loop, pass 9) +- **Pass 9:** `commerce` + `content` reached Phase 2 (`content` 162/162). **Key find: + `content` built its op log directly on `persist/log`** (backend-injected, append+replay- + to-seq) — the live reference exemplar for W4 (see W4). `events` MONTHLY RRULE, + `identity` OAuth2 auth-code + PKCE, search boolean-filtered ranked. A1 still 6 adopters. +- **Date:** 2026-06-06 (radar loop, pass 8) +- **Pass 8 — fleet expanded by 4 app-domain loops** (the briefing's anticipated + `commerce`/`identity` arrivals, auto-picked up by dynamic discovery). All early-stage, + **pre-Phase-2 → moving targets, none count toward any gate yet**: + - `commerce` (Phase 1: `api/cart/catalog/price`). 
Its "per-line audit" is a cost + *breakdown view* (`api.sx:44`), **not** an append-only decision log → NOT a W4 + consumer. + - `events` (Phase 1: `calendar.sx`, RRULE expansion). + - `identity` (early: `session/token`). Defers authZ to acl (`token.sx:15`) — reinforces + W2's "delegate `permit?` to acl-on-sx" routing; identity = authN, acl = authZ. + - `content` (just-started: `block.sx`). + These are the future consumers W2/W3 are waiting on — re-check their per-viewer filters + / pagination once each clears Phase 2. No new gate-clearer this pass. +- **Pass 7:** **A1 jumped 4→6 adopters** — `acl` + `mod` migrated to the shared + conformance driver (first app-domain adopters; proves it generalizes past substrates). + `host-persist` closed its blob-adapter blocker (durable storage adapter now landing → + W4 migration path opening). search shipped proximity/NEAR; flow + persist quiescent. +- **Pass 6:** new worktree **`host-persist`** (active — building persist's durable host + adapter); `feed` went quiescent (left tmux). acl shipped hardening (+25), fed-sx-m1 at + Step 6c. **mod loop independently wrote a shared-plumbing note** (`mod-on-sx.md`, + 538b8a53) corroborating W4/W5 — folded its claims + home disagreements into W1/W4/W5. + No new gate-clearer (audit log still 2 consumers), but consumers are now API-stable. +- **Pass 5:** search (+highlight/snippet) and fed-sx-m1 (+follower_graph) moved; rest + unchanged. Filename census: `api`×6, `fed`×3, then `schema/rank/query/page/explain/ + engine/batch/audit`×2. Examined the ×6 `api.sx` → Rejected (shared name, divergent + structure incl. implicit-vs-explicit-state contract). rank/batch/engine all ≤2 + + substrate/domain-divergent → no new gate-clearer. +- **Pass 4:** no churn vs pass 3 (same worktrees/tmux/HEADs/adopters). Swept audit+explain + surfaces: acl/mod share an append-only-log shape (→ sharpened W4 with persist/log API + evidence) and a proof-explain shape (→ new W5, substrate-bound). No new gate-clearer. 
+- **Pass 3 (earlier today):** subsystem set + tmux + A1 adopters (4) all unchanged vs pass 2. Loops + advanced: acl shipped Phase 4 federation; search shipped Phase 4 + pagination; feed + shipped pagination/threading; mod at Ext 19 (capstone); persist did a worked acl-grants + migration (W4). New shape found: offset/limit pagination → folded into W3. +- **Subsystem set discovered:** loop worktrees `acl, erlang, fed-prims, fed-sx-m1, + feed, flow, go, kernel, mod, ocaml, persist, radar, ruby, search, + sx-vm-extensions`; main-repo `lib/*` incl. merged `feed` + substrates (`apl, + common-lisp, datalog, erlang, forth, go, haskell, hyperscript, js, lua, minikanren, + ocaml, prolog, scheme, smalltalk, tcl`) + `lib/guest`. + Actively looping (tmux): `acl, fed-sx-m1, feed, flow, mod, persist, search` + (+ radar). +- **New since pass 1:** worktrees `kernel` (empty/unset — not yet a repo) and `ocaml` + (`lib/ocaml/baseline` only). Both early-stage, pre–Phase 2 → out of proposal scope. +- Re-enumerate every pass; new loops (e.g. a future `commerce`/`identity`) auto-join. + +**Census status (pass 17): EXHAUSTED.** Every own-namespace filename recurring ≥2× has +been examined and dispositioned — further filename-censusing is low-yield until new +subsystems/modules appear. 
Map:
+
+| filename | owners | verdict |
+|---|---|---|
+| `api` ×10 | all | Rejected — shared role, divergent state contract |
+| `fed`/`federation` | feed/search/mod/acl(+content) | W1 — theme not shape |
+| `audit` ×3 | acl/mod/identity | W4 — append-only log → persist/log |
+| `page` ×3 | feed/search (pagination) + content (HTML wrapper) | W3 + collision noted |
+| `explain` ×2 | acl/mod | W5 — proof tree, substrate-bound |
+| `snapshot` ×2 | persist(facet) + content(reinvents) | W7 |
+| `wire` ×2 | content(SX serializer) / mod(pipe-format) | Rejected — divergent |
+| `schema`,`engine` ×2 | acl/mod | substrate-twin parallels (Datalog vs Prolog); only audit (W4) is liftable |
+| `catalog`,`batch` ×2 | commerce/persist, mod/persist | name collisions, unrelated |
+| `normalize` ×2 | content(tree-prune)/feed(record-coerce) | name collision (pass 20) |
+| `index` ×2 | content(listing)/search(inverted index) | name collision (pass 20) |
+| `query` ×3 | content(doc-block)/search(bool AST)/persist(stream-read) | 3-way name collision (pass 20) |
+| `store` ×2 | content(on persist) / flow(workflow records) | related concept, divergent |
+| `rank` ×2 | feed/search | different domains (activities vs docs), ≤2 |
+
+**acl⇄mod are structural twins** (decision engine over a logic substrate, Datalog vs
+Prolog) — they parallel across engine/schema/explain/audit/fed, but only the *audit log*
+is substrate-agnostic and liftable (→ W4); the rest are substrate-idiomatic. Next passes:
+re-test gates (W2/W3/W8) as consumers mature, watch new modules — not re-census.
+
+**Meta-pattern (pass 20):** new module names keep *recurring* but the operations keep
+*colliding* — same noun, domain-specific op (normalize, index, query, catalog, batch,
+notify, page, store all proved to be collisions). This is *why* genuine extraction
+candidates are rare: the fleet shares vocabulary, not structure.
The real shared assets
+are the **substrate subsystems** (persist, flow, acl, fed-sx) that app domains *adopt*
+(W1/W2/W4/W7/W8), not hand-rolled libs to extract.
+
+**Scanning-method note (learned the hard way, passes 5/12/14/15):** a filename census
+for *cross-subsystem* recurrence MUST restrict to each subsystem's OWN namespace —
+`X/lib/X/*.sx` — never `X/lib/*/`. The merged substrate libs (`lib/prolog`, `lib/persist`,
+`lib/feed`, `lib/datalog`, …) are checked out inside *every* worktree, so a naive census
+reports e.g. `query.sx`/`snapshot.sx`/`rank.sx` ×N as phantom recurrences that are really
+one merged file copied N times. Correct one-liner (substitute the worktree names for
+`<subsystems>`):
+`for w in <subsystems>; do for f in $w/lib/$w/*.sx; do basename $f .sx; done; done | sort | uniq -c | sort -rn`.
+
+---
+
+## Done
+
+### A1 · Shared conformance driver — ✅ COMPLETE (merged `db76cc8c`, pass 32)
+Full closed loop: radar detected it → dedicated `conformance` loop implemented it
+(classify-then-migrate-or-exclude, hard parity gate) → **merged to architecture**
+(`db76cc8c Merge loops/conformance into architecture: A1 conformance-driver migration`)
+→ radar spot-verified post-merge (**common-lisp 487/487 green** on architecture — exercises
+the new per-suite-counters/preloads driver feature, the riskiest change). Final state:
+- **13 on the shared driver:** acl, apl, common-lisp, datalog, erlang, events, feed, go,
+  haskell, mod, prolog, relations, search.
+- **6 correctly excluded** (foreign-program runners — a legitimately different harness):
+  forth, js, ocaml, smalltalk, tcl, lua.
+- The shared driver gained per-suite counters + per-suite preloads (backward-compatible);
+  spot-check confirms existing adopters unaffected. Coordination flag CLEARED.
+Detail of the migration arc retained under the original entry below.
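The own-namespace restriction from the scanning-method note can be sketched in Python. This is a minimal illustration rather than the radar loop's actual tooling: the function name `own_namespace_census` and the sample paths are hypothetical, and the sketch assumes each subsystem's own files sit at `<w>/lib/<w>/<name>.sx`, as the note describes.

```python
from collections import Counter
from pathlib import PurePosixPath


def own_namespace_census(paths):
    """Count .sx basenames, keeping only files in a worktree's OWN namespace.

    A path qualifies only if it has the shape <w>/lib/<w>/<name>.sx, so merged
    substrate copies such as content/lib/persist/query.sx are ignored.
    """
    counts = Counter()
    for raw in paths:
        parts = PurePosixPath(raw).parts
        if len(parts) != 4 or parts[1] != "lib" or not parts[3].endswith(".sx"):
            continue
        worktree, namespace = parts[0], parts[2]
        if worktree == namespace:  # own namespace only
            counts[PurePosixPath(parts[3]).stem] += 1
    return counts


# Hypothetical sample: persist's merged query.sx is checked out in every worktree.
sample = [
    "content/lib/content/query.sx",
    "search/lib/search/query.sx",
    "persist/lib/persist/query.sx",
    "content/lib/persist/query.sx",  # merged copy, not a consumer
    "search/lib/persist/query.sx",   # merged copy, not a consumer
    "acl/lib/acl/audit.sx",
    "mod/lib/mod/audit.sx",
]
census = own_namespace_census(sample)
naive = Counter(PurePosixPath(p).stem for p in sample)
print(census.most_common())
print(naive["query"], "naive vs", census["query"], "own-namespace")
```

On this sample the naive count inflates `query` with the two merged copies, while the restricted census counts only the three files that each live in their owner's namespace.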
+ +## Proposed (cleared the gate) + +_(empty — A1 graduated to Done, pass 32.)_ + +### A1 · Adopt the shared conformance driver across subsystems +- **Pattern:** every subsystem hand-rolls a near-identical `conformance.sh` + (epoch-load → eval → scoreboard emit) and an inline `-test name got expected` + pass/fail counter. +- **Consumers (≥3, overwhelming):** 15 `lib/*/conformance.sh` — `apl, feed, datalog, + flow, mod, lua, erlang, forth, go, common-lisp, haskell, js, ocaml, prolog, + smalltalk, tcl`. +- **Home:** `lib/guest` — the one legitimate exception (the shared driver + `lib/guest/conformance.sh` + `lib/guest/conformance.sx` already exist; modes + `dict` and `counters`). +- **Status: IN PROGRESS — 6 adopters (pass 7).** `prolog` (dict), `haskell` (counters), + `apl` (dict), `datalog` (dict), and **`acl` (dict) + `mod` (dict), newly migrated this + pass** — all 3-line exec shims into `lib/guest/conformance.sh` with a `conformance.conf`. + **acl + mod are the first *app-domain* adopters** (not language substrates) — strong + evidence the driver generalizes beyond the substrate layer, which was the open question. + The `apl` migration earlier *surfaced a latent bug*: the old awk extractor + under-counted `pipeline` (40 vs the real 152 assertions); true apl total is **562**, + not 450 — evidence that adopting the driver also improves correctness. +- **Not a target (different harness shape):** `lua/conformance.sh` is a Python runner + (`lib/lua/conformance.py`) that walks real `*.lua` source files via `lua-eval-ast` + and classifies pass/fail/timeout — it does not run SX `deftest` suites with a + counter/dict scoreboard, so the shared driver does not fit. Excluded, not pending. +- **Remaining hand-rolled candidates (~120–220 lines each):** `common-lisp, erlang, + feed, forth, go, js, ocaml, smalltalk, tcl` — now being worked by the dedicated + `conformance` loop (above). (`lua` excluded: walks real `*.lua` files via Python. 
+ `smalltalk` likely excludes too — runs `*.st` via its own `test.sh`. `search` was + thought to be excluded but DID migrate via counters mode — see the 7-adopter note.) +- **Action:** each remaining subsystem's OWN loop migrates when quiescent — add a + `conformance.conf` (+ a `test-harness.sx` preload defining its counters) and + replace `conformance.sh` with the 1-line exec shim + (`exec bash …/guest/conformance.sh …/conformance.conf "$@"`). Recipe template: + `lib/haskell/conformance.conf` (counters) or `lib/prolog/conformance.conf` (dict). + Keep the `bash lib/X/conformance.sh` entry point so no loop is disrupted. +- **Priority: HIGH** (15 consumers, low risk, interface-preserving, additive). +- **8 adopters on architecture** (pass 25): acl, apl, datalog, **events**, haskell, mod, + prolog, search — `events` migrated via its OWN loop; `search` via counters mode (which + corrects the earlier "search excluded" note). **+4 on the `loops/conformance` branch: + `common-lisp` 487/487, `erlang` 761/761, `feed` 189/189, `go` 609/609** — pending merge. + **5 EXCLUDED — all foreign-runner harnesses** (correctly, not force-migrated): `forth` + (Hayes core.fr via awk+python), `js` (test262 `.js`/`.expected`), `ocaml` (scrapes + `test.sh` + `.ml` baseline), `smalltalk` (scrapes `test.sh` + `*.st` corpus), `tcl` + (foreign `*.tcl` vs `# expected:` annotations). +- **✅ CONFORMANCE LOOP WORKLIST COMPLETE (pass 31).** Final A1 picture: + - **12 on the shared driver:** acl, apl, datalog, events, haskell, mod, prolog, search + (on architecture) + common-lisp, erlang, feed, go (on `loops/conformance`, pending merge). + - **6 correctly excluded** (foreign-program runners — testing a language impl against an + external corpus is legitimately a different harness): forth, js, ocaml, smalltalk, tcl, lua. 
+ - **Honest finding:** the driver's reach is narrower than the raw "15 conformance.sh" + count implied — language substrates that run real `.lua/.st/.ml/.tcl/.js/.fr` programs + *should* keep their foreign runners. ~half migrate, ~half don't, and that's correct. + - **One step left:** merge `loops/conformance` → architecture under the **adopter-parity + check** (the coordination flag above — the shared `lib/guest` driver change must be + proven non-regressive against all existing adopters first). The loop is now idle. +- **NOW IN PROGRESS — dedicated loop (2026-06-07).** A human-triggered `conformance` loop + (worktree `/root/rose-ash-loops/conformance`, branch `loops/conformance`, tmux session + `a1-conformance`, briefing `plans/agent-briefings/conformance-loop.md`) is working the + remaining candidates (common-lisp, erlang, feed, forth, go, js, ocaml, smalltalk, tcl) + one per iteration, **classify-then-migrate-or-exclude with a hard test-count parity gate** + (reverts on any mismatch; never pushes to main/architecture). Radar tracks; it implements. +- **Driver-capability boundary found (pass 24, first iteration).** The loop did NOT + force-migrate `common-lisp` (baseline 305/0 across 12 suites) — the shared driver can't + reproduce it: `MODE=counters` supports only ONE global pass/fail counter pair + ONE fixed + preload set, but common-lisp needs **per-suite counter names** (8 distinct pairs) and + **per-suite preload chains**. It logged a precise blocker + unblock path (extend the + `SUITES` entry format with optional per-suite counters/preloads) and moved on. +- **Driver gap RESOLVED next iteration (pass 25) — but it touched the shared driver.** The + loop extended `lib/guest/conformance.sh` (+38 lines: optional per-suite counters + per-suite + preloads in the `SUITES` format, backward-compatible) and then migrated common-lisp at + **487/487** (above the 305 baseline — likely another extractor under-count correction, à la + apl's `pipeline`). 
The parity gate held throughout.
+- **⚠ COORDINATION FLAG (radar): the `loops/conformance` branch now carries a change to the
+  SHARED `lib/guest` driver** used by all 8 adopters. It's additive by design, but **before
+  this branch merges to `architecture`, re-run the existing adopters' suites under the new
+  driver to confirm zero regression** (acl/apl/datalog/events/haskell/mod/prolog/search).
+  This is the one cross-cutting risk in an otherwise per-subsystem-isolated effort — surfaced
+  here so the merge is gated on adopter-parity, not assumed.
+
+---
+
+## Watching (real but not yet through the gate)
+
+### W1 · Federation scaffold (merge / ingest / backfill / trust-gate)
+- **FAILS the structural-identity gate (deep-dived 2026-06-06, all 4 read).** Consumer
+  count is met (4) but they are *superficially* similar, not structurally identical —
+  the federated unit and merge op differ fundamentally:
+
+  | Subsystem (file) | Federated unit | Merge op | Trust gate | Injected transport |
+  |---|---|---|---|---|
+  | feed (`fed.sx:14,18,40`) | activity streams | dedupe by `(actor verb object)` | none (visibility via `permit?` separately) | `send-fn`, `fetch-fn` |
+  | search (`fed.sx:8`) | inverted indices | relabel DocId `peer*1000+local` + union posting lists | none | none (pure merge fn) |
+  | mod (`fed.sx:11-14,99`) | moderation decisions | advisory-list vs applied-list; bind iff `mod/trusted?` | **yes — runtime list** `mod/trusted? peer scope` | mock outbox / `fed-send!` |
+  | acl (`federation.sx:43,56`) | Datalog delegate facts | pull facts, gate by `trust`/`level_covers` rule, re-saturate | **yes — Datalog rule** at query time | `transport` dict |
+  | events (`federation.sx`) | calendar agendas | fold trusted peers' agendas into one sorted agenda + `:origin` provenance | **yes — runtime list** `ev/trusts?` (peer-id ∈ trust-set) | injected behind `ev/peer-agenda` |
+
+- **The ONLY real commonality is the injection seam** (now 5/5, pass 18), not extractable
+  code: every one says "the real transport is `fed-sx`'s job; inject `send-fn`/`fetch-fn`/
+  `transport`/`peer-agenda` and mock it in tests." That is an architectural *convention the
+  fleet already follows*. The merge op diverges 5 ways (dedupe / index-union / advisory /
+  fact-saturation / agenda-sort). The trust gate, where present, splits: **mod + events use
+  a runtime trust-set membership check; acl uses a declarative Datalog rule** — so even the
+  trust sub-pattern is 2-of-3, and the membership check is a trivial one-liner (below the
+  extraction threshold). No shared merge, no single shared trust mechanism.
+- **Disposition:** do NOT extract a shared "federation lib." When `fed-sx` ships its
+  real transport, these 4 become its *consumers* (wiring `send-fn`/`fetch-fn`/`transport`
+  to it) — that work belongs to each subsystem's loop + the `fed-sx` loop, not a
+  cross-cutting extraction. Stop re-proposing on the shared name. Home: `fed-sx`.
+- **Now 7 federation modules (pass 29):** + `relations` (Phase 4: erel trust-gating,
+  peer_rel/trust, fed-sx mock transport — Datalog-rule trust like acl) and `artdag`
+  (Phase 6: content-addressed cache + trust + **invalidation** — a merge shape unlike any
+  other). Each new one reinforces "theme not shape": 7 divergent merges, all sharing only
+  the inject-fed-sx-transport seam. Verdict unchanged — they're fed-sx consumers-in-waiting.
+- **Narrower sub-claim (mod note, pass 6; refined pass 18):** mod asserts the *fed
+  trust/outbox* shape is shared between mod+acl. Radar evidence refines this: the trust gate
+  splits by mechanism, not by subsystem pair — **mod + events** both use a runtime
+  trust-set membership check (`mod/trusted?`, `ev/trusts?`), while **acl** uses a Datalog
+  rule. So a "trust-set membership" helper has 2 consumers (mod, events) — but it's a
+  one-line `member?` and the merge it gates diverges, so still not worth extracting.
+  Resolve at the architecture-merge point if a heavier shared trust-set surface emerges.
+
+### W2 · Per-viewer visibility / permission filter
+- **2 shipped consumers, same shape** — `filter permit xs`:
+  - `feed/lib/feed/acl.sx:27` `feed/visible = (feed/filter stream (fn (a) (permit? viewer a)))`,
+    capstone at `:34` (stream → ACL → rank → top-N). `permit?` injected, sig `(viewer activity)→bool`.
+  - `search/lib/search/fed.sx:16` `aclFilter permit docs = filter permit docs`;
+    `topNTfIdfAcl n permit ts idx = take n (aclFilter permit (rankTfIdf ts idx))`.
+    `permit` injected, sig `DocId→Bool` (viewer baked in by caller).
+- **NOT a consumer:** `mod/lib/mod/policy.sx` is moderation policy (reviewer actions),
+  no per-viewer read filter. So mod won't be the 3rd.
+- **Missing:** (a) only 2 consumers, need ≥3; (b) the two interfaces *diverge* —
+  feed passes `(viewer, item)`, search bakes the viewer in — so any shared form must
+  pick a convention; (c) both already **inject** the predicate, and the filter body is
+  literally one line (`filter permit xs`). Leaning toward: the predicate's home is
+  `acl-on-sx` (`permit?`), and the one-line filter is too thin to extract.
+- **Home when ripe:** delegate `permit?` to `acl-on-sx`; do NOT extract the filter.
+  Re-check if a 3rd genuine per-viewer read filter ships (e.g. events/commerce).
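The interface divergence listed under **Missing** can be made concrete with a short Python sketch. It is illustrative only: `visible_feed`, `visible_docs`, and the toy `permit` rule are hypothetical stand-ins, not the SX or Haskell code in feed or search. It shows that the feed-style `(viewer, item)` predicate reduces to the search-style one by baking in the viewer, after which the filter is a single line in both cases.

```python
from functools import partial


def permit(viewer, item):
    """Toy authZ rule (assumption): public items, or items owned by the viewer."""
    return item["public"] or item["owner"] == viewer


def visible_feed(stream, viewer, permit_fn):
    """feed convention: the injected predicate takes (viewer, item)."""
    return [a for a in stream if permit_fn(viewer, a)]


def visible_docs(docs, permit_doc):
    """search convention: the caller has already baked the viewer in."""
    return [d for d in docs if permit_doc(d)]


items = [
    {"id": 1, "owner": "ann", "public": True},
    {"id": 2, "owner": "bob", "public": False},
    {"id": 3, "owner": "ann", "public": False},
]

feed_view = visible_feed(items, "ann", permit)
search_view = visible_docs(items, partial(permit, "ann"))
print([i["id"] for i in feed_view])
print(feed_view == search_view)
```

Either way the shared part is one comprehension, which fits the leaning recorded above: the predicate is the part that needs a home, and the filter itself is too thin to extract.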
+ +### W3 · Collection helpers (group-by, dedupe-by-key, stable top-N, distinct-order, offset/limit page) +- feed built all of these on APL primitives. search/commerce/events will want + group-by / top-N. +- **NEW (2026-06-06): offset/limit pagination shipped in 2 subsystems, identical shape** + `take limit (drop offset xs)`: + - `feed/lib/feed/page.sx:9` `feed/page` (offset/limit window over a stream). + - `search/lib/search/page.sx:9` `paginate off lim docs = take lim (drop off docs)`. + - NOT a 3rd: `persist/lib/persist/query.sx:5` has a *since-cursor* for incremental log + consumption — resumable-stream semantics, not result windowing. Different shape. + - feed *also* has cursor-by-`:at` recency pagination (`page.sx:21-44`); search has no + cursor. So only the plain offset/limit window is shared, and it is a literal 1-liner. +- **Missing:** ≥3 stable consumers; AND every item here is collection math that belongs + in the **substrate** (APL/Haskell already expose grade/sort/unique/take/drop), not a + shared lib. A 1-line `take/drop` window is far below the extraction threshold. Watch; + revisit only if a non-substrate subsystem needs the same windowing without take/drop. +- **Filename-collision caution (pass 13):** `content/lib/content/page.sx` is an **HTML + page wrapper** (full HTML5 doc), NOT pagination — do not count it as a 3rd pagination + consumer. `page.sx` now means two unrelated things across the fleet. Re-tested pass 13: + pagination still only feed + search (2). + +### W4 · In-memory store fakes → `persist-on-sx` +- Not an abstraction to extract — a migration target. Every subsystem fakes its + store with a mutable list (`feed/-log`, flow store, mod audit, …). +- **Owner:** `persist-on-sx` (in progress). Tracked there, listed here for visibility. 
+- **Concrete instance (file:line, found pass 4): the append-only decision/audit log.**
+  `acl/lib/acl/audit.sx` and `mod/lib/mod/audit.sx` are the SAME hand-rolled shape, and
+  `persist/lib/persist/log.sx` (the persist *log facet*) already implements it durably:
+
+  | role | acl/audit.sx | mod/audit.sx | persist/log.sx (target) |
+  |---|---|---|---|
+  | log var | `acl-audit-log` :9 | `mod/*audit-log*` :10 | backend stream |
+  | monotonic seq | `acl-audit-seq` :10 | `mod/*audit-seq*` :11 | per-stream high-water :1 |
+  | append (auto-seq) | `acl-audit-decide!` | commit :32 | `persist/append` :17 |
+  | count | `acl-audit-count` :51 | `mod/audit-count` :44 | `persist/count` :12 |
+  | read-all oldest-first | snapshot/tail :73 | `mod/audit-all` :43 | `persist/read` :29 |
+  | read seq≥from | — | by-seq | `persist/read-from` :31 |
+
+  Both deliberately use a monotonic seq with **no wall-clock** (deterministic/testable) —
+  identical to persist/log's design. Action when persist's host adapter lands: acl + mod
+  loops swap their in-memory log for `persist/log`. 2 consumers today; not a new lib —
+  the home already exists. Belongs to acl/mod loops × persist loop, not an extraction.
+- **Cross-loop corroboration (pass 6):** the mod loop independently reached the same
+  conclusion — `mod/plans/mod-on-sx.md` (commit 538b8a53): *"mod-sx (Prolog) and acl-sx
+  (Datalog) converged on the same module shape … only the audit log + fed trust/outbox
+  shapes truly share; extract at the architecture-merge point, refactoring both consumers
+  atomically, not unilaterally from a loop branch."* Confirms the shape AND the
+  do-not-extract-unilaterally stance.
+- **Home disagreement to resolve at merge:** mod's note proposes lifting the audit-log
+  primitives into **`lib/guest/`**. Radar routing disagrees: a durable append-only log is
+  a **`persist-on-sx`** concern (the log facet already exists), not language-impl plumbing.
+ Hold the line — `lib/guest` is lexer/parser/AST/HM/test-runner, not an event log. +- **Migration is becoming concrete:** new `host-persist` loop (worktree + tmux, pass 6) + is building the durable-storage host adapter persist was blocked on — once it lands, + acl/mod can actually swap to `persist/log`. +- **LIVE REFERENCE EXEMPLAR (pass 9): `content` already does it right.** `content` + (Phase 2 complete, 162/162) built its op log directly on `persist/log` instead of + faking it — `content/lib/content/store.sx`: backend injected via `(persist/open)` + ("content knows nothing about which backend", :10); append op as event + `persist/append b (content/-stream doc-id) …` (:20); read `persist/read` (:36); + `persist/last-seq` (:47); **version = replay op stream up to a seq** + (filter `persist/event-seq ev <= seq`, :61). "The op log is the source of truth … + the materialised doc is a cache, never primary state." + This proves the W4 target is real, not hypothetical: acl + mod's hand-rolled + monotonic-seq logs should adopt exactly content's `persist/log` pattern. 
+- **Consumer ledger of the append-only monotonic-seq event log (pass 11):**
+
+  | consumer | what | backing | note |
+  |---|---|---|---|
+  | content (`store.sx`) | doc op log | **persist/log ✓ live** | plain append + replay-to-seq |
+  | commerce (`ledger.sx`) | order ledger | **persist/log ✓ live** | `persist/append-once` — idempotent, webhook-replay-safe :40,58 |
+  | events (`booking.sx`) | booking roster | **persist/log ✓ live** | `persist/append-expect` — optimistic-concurrency CAS, capacity-safe, lock-free |
+  | acl (`audit.sx`) | decision log | in-memory fake (SX) | migrate directly when host adapter lands |
+  | mod (`audit.sx`) | decision log | in-memory fake (SX) | migrate directly |
+  | identity (`audit.sx`) | grant ledger | in-memory fake (**Erlang**) | `{Seq,Subject,Action}`; needs an **Erlang↔persist bridge** first — author scoped it out until persist lands ("queryable semantics identical") |
+
+- **Two takeaways:** (1) the pattern is **validated across domains** — CRDT doc ops,
+  financial orders, event bookings, rule decisions, OAuth grants all reduce to the same
+  append-only monotonic-seq stream; (2) migrating to `persist/log` is strictly *better*
+  than the fakes — persist exposes a **feature ladder the fakes don't have**:
+  `append` (content) → `append-once`/idempotency (commerce) →
+  `append-expect`/optimistic-concurrency (events). Every fake would have to reinvent a
+  weaker version of these.
+  This is an **adoption** item (the home already exists), NOT a new extraction — owned by
+  persist/host-persist × each consumer loop. The SX fakes (acl, mod) migrate directly;
+  the Erlang fake (identity) is gated on an Erlang↔persist bridge.
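The feature ladder named in the takeaways above (`append`, then `append-once`, then `append-expect`) can be illustrated with a small in-memory model. This is a sketch only: the class `EventLog` and its Python method names are hypothetical, and the semantics shown (idempotency key returns the original seq; a stale expected seq rejects the write) are assumptions inferred from the ledger notes, not the actual `persist/log` API.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class EventLog:
    """Hypothetical in-memory model of one append-only, monotonic-seq stream."""
    events: list = field(default_factory=list)     # (seq, payload), oldest first
    seen_keys: dict = field(default_factory=dict)  # idempotency key -> seq

    def last_seq(self) -> int:
        # Seq is a counter, not a wall-clock: deterministic and testable.
        return len(self.events)

    def append(self, payload: Any) -> int:
        """Rung 1: plain append with an auto-assigned seq."""
        seq = self.last_seq() + 1
        self.events.append((seq, payload))
        return seq

    def append_once(self, key: str, payload: Any) -> int:
        """Rung 2 (assumed semantics): a replayed key is a no-op returning the first seq."""
        if key in self.seen_keys:
            return self.seen_keys[key]
        seq = self.append(payload)
        self.seen_keys[key] = seq
        return seq

    def append_expect(self, expected_seq: int, payload: Any) -> Optional[int]:
        """Rung 3 (assumed semantics): compare-and-set on the stream's last seq."""
        if self.last_seq() != expected_seq:
            return None  # someone else appended first; caller must re-read and retry
        return self.append(payload)

    def read_from(self, from_seq: int) -> list:
        return [e for e in self.events if e[0] >= from_seq]


log = EventLog()
log.append({"op": "create"})
first = log.append_once("webhook-123", {"op": "paid"})
again = log.append_once("webhook-123", {"op": "paid"})  # replayed webhook
stale = log.append_expect(1, {"op": "book"})            # stale view of the stream
fresh = log.append_expect(log.last_seq(), {"op": "book"})
```

A hand-rolled fake typically stops at the first rung; the point of the sketch is that the second and third rungs need extra state (the key index, the expected-seq check) that each fake would otherwise have to reinvent.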
+ +### W5 · Proof-tree explanation over a logic-program derivation +- `acl/lib/acl/explain.sx` (reconstructs a canonical proof by goal-directed search over a + saturated Datalog db) and `mod/lib/mod/explain.sx` (renders a Prolog-style proof tree + goal-by-goal with proved/unproved marks + unification bindings) are the same *idea*. +- **Missing / disposition:** only 2 consumers, and they sit on **different substrates** + (acl→`lib/datalog`, mod→`lib/prolog`). Proof reconstruction/rendering is logic-engine + machinery → it belongs in each **substrate** (datalog/prolog), not a shared app lib. + Watch; revisit only if a 3rd logic-backed subsystem reimplements proof explanation. +- **Cross-loop note (pass 6):** mod's note calls `mod/proof-goals` (re-query-each-goal) + generic and proposes lifting it into **`lib/guest/`**. Radar caveat: proof-tree + reconstruction *is* engine-agnostic logic machinery, but `lib/guest` is for + lexer/parser/AST/HM/match/test-runner — a logic-engine proof helper is a poor fit there. + If genuinely shared by ≥3 engines, a `lib/logic`-style substrate helper is the better + home than `lib/guest`. Still 2 consumers → stays Watching either way. + +--- + +### W9 · Parent/child relationship tracking → the new `relations` subsystem (nascent) +- **New subsystem (pass 28):** `relations` (loops/relations, Phase 1 — `schema.sx`+`api.sx`, + rel facts + `relate`/`unrelate`/`children`/`parents`/`related`, 22 tests). Per CLAUDE.md + it's the canonical "cross-domain parent/child relationship tracking." +- **Why watch:** several subsystems already track parent/child *locally* — feed reply-to + threading (`thread`/`replies`), content nested block trees, events occurrence/RECURRENCE-ID + links. If `relations` becomes the shared home, those are candidate *delegators* (like + acl=authZ, persist=log). But it's **Phase 1, pre-Phase-2, moving target** — and each + local impl is currently domain-specific (different keys/semantics). Do NOT propose yet. 
+ Re-check when relations is past Phase 2 AND ≥3 subsystems' relationship logic could + genuinely delegate to it. `artdag` also just spawned (nascent, 0 files) — tracking only. + (pass 32: `dream` + `maude` also spawned, nascent 0-files; `fed-prims` resumed.) +- **Update pass 29:** relations rocketed to **Phase 4** (one gate — past Phase 2 — now met), + but it's building ITSELF out (schema/federation), **not yet being consumed** by anyone. + The blocker is the other gate: 0 subsystems currently *delegate* their parent/child logic + to it (feed/content/events still track locally). Watch for the first real delegation. + (artdag also raced to Phase 6 — these ports advance fast; treat committed state as truth.) + +### W8 · Durable externally-resumed orchestration on `lib/flow` (suspend→host-IO→resume) +- **The shared shape:** a durable `flow` that `request`s an external action (a suspend + point), the **host** performs the IO, then `flow/resume`s the flow with the outcome; + flow's deterministic replay means a completed step never re-runs on recovery. +- **Consumers (pass 24): 2 LIVE** (events delivery, commerce order saga). + - `events/lib/events/notify.sx` (**live**) — reminders/digests as durable flows; + suspend on delivery `dispatch`, resume with send outcome. At-least-once + idempotency key. + - `commerce` (**LIVE** as of pass 24 — "order lifecycle as a durable flow-on-sx flow, + 21 tests, Phase 3 done") — order saga `(defflow ordf … (request 'reserve oid) … )`: + reserve→pay→fulfil as a flow, **payment stays suspended until the payment webhook calls + `flow/resume`**. Carries only the order-id; pure orchestration over `ledger.sx`. + - **Now 2 LIVE consumers** of the *same* pattern: long-running process, external resume + (delivery dispatch vs payment webhook). fed-sx/mod still roll their own outbox (watch + for convergence). Strengthens "lib/flow is the home"; still adoption, not extraction. 
+- **Disposition:** `lib/flow` IS the abstraction (events proves it, commerce adopts it) → + this is an **adoption** observation like W4, NOT an extraction. Home = `lib/flow`. +- **Flow-onboarding friction (light signal):** commerce's note logs real gotchas adopting + flow — `flow-make-env` returns a large likely-cyclic env (don't print it), env build is + slow (budget ~540s like flow's own suite). If ≥3 subsystems hit the same onboarding + gotchas, that's a signal to smooth `lib/flow`'s adopter API — flow's concern, flagged here. +- **Name-collision caveat:** `notify.sx` means two unrelated things — `feed/notify.sx` is + a *read-side digest* (group inbox by verb+object), NOT delivery. Do not pair them. + +### W7 · Snapshot/projection-checkpoint reimplemented vs `persist/snapshot` (delegate) +- `persist/lib/persist/snapshot.sx` already provides a **generic** projection checkpoint: + store `{:value :seq}` in the kv facet under a namespaced key; the headline property is + **snapshot + tail == full replay** (pure, clock-free). +- `content/lib/content/snapshot.sx` **reimplements that same pattern on raw persist KV** + rather than delegating: `persist/kv-put b (content/-snap-key doc-id) {:doc … :seq seq}` + (:20), `persist/kv-has?`/`kv-get` (:27-28), and its own tail-replay (:53-59). It never + calls `persist/snapshot-*`. content's doc-materialisation *is* a projection fold over + its op stream — exactly what `persist/snapshot` checkpoints generically. +- **Disposition:** persist-adoption nudge (like W4): content could delegate to + `persist/snapshot` (its projection = "fold ops → doc"), dropping the duplicated + KV+replay code. Home already exists → NOT an extraction; owned by content × persist + loops. Only 1 reinventor today; watch whether commerce/events/identity also hand-roll a + snapshot on raw KV instead of using the facet (would strengthen the nudge). NB timeline: + unclear if `persist/snapshot` predated content's — flag, don't blame. 
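The headline property of W7, snapshot + tail == full replay, can be shown with a short pure fold. This is a hedged sketch: the function names `replay`, `take_snapshot`, and `restore` are hypothetical, and only the `{:value :seq}` checkpoint shape is taken from the notes above; it does not reproduce the real `persist/snapshot` facet.

```python
def replay(events, step, init, after_seq=0):
    """Fold every event with seq > after_seq into the state (pure, clock-free)."""
    state = init
    for seq, op in events:
        if seq > after_seq:
            state = step(state, op)
    return state


def take_snapshot(events, step, init, upto_seq):
    """Checkpoint the projection at upto_seq as a {value, seq} record."""
    value = init
    for seq, op in events:
        if seq <= upto_seq:
            value = step(value, op)
    return {"value": value, "seq": upto_seq}


def restore(snapshot, events, step):
    """Resume from the checkpoint and apply only the tail of the stream."""
    return replay(events, step, snapshot["value"], after_seq=snapshot["seq"])


# Toy projection: a document is the list of ops applied so far.
def apply_op(doc, op):
    return doc + [op]


events = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
snap = take_snapshot(events, apply_op, [], upto_seq=2)
from_snapshot = restore(snap, events, apply_op)
from_scratch = replay(events, apply_op, [])
```

Because the step function is pure and the seq is monotonic, the two results agree for any checkpoint position, which is what makes a generic facet able to serve any consumer whose state is a fold over its op stream.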
+ +### W6 · Guarded lifecycle state machine (illegal transition = explicit error) +- Recurs as a **design principle**, NOT a shared structure (found pass 10): + - `mod/lib/mod/lifecycle.sx` — pure SX: immutable case `{:state :error :history …}`, + explicit transition table `mod/lc-transitions` (:31), illegal transition returns the + case unchanged with `:error` set. States open→triaged→decided→appealed→final. + - `identity/lib/identity/membership.sx` — an **Erlang `gen_server`** fragment (identity + runs on erlang-on-sx): a `receive` loop with `case find(...) of … {error, St}` guards. + States none→pending→active→lapsed→revoked. +- **Both share the guideline** ("invalid transitions are explicit errors, never silent + no-ops") but **implement it substrate-idiomatically** — SX transition-table over + immutable values vs an Erlang process loop with per-message case guards. Same W1/`api.sx` + trap: shared *idea*, divergent *structure*. +- **Disposition:** not an extraction target — the FSM mechanism is ~10 substrate-specific + lines; the value is in each domain's state graph, not the plumbing. At most a **design + guideline** ("model lifecycle as a guarded FSM with explicit-error transitions"). Watch + whether commerce-checkout / events-booking add their own — if so it confirms the + *guideline*, still not a lib. Do not propose extracting a shared state-machine lib. + +## Rejected (considered, declined — do not re-propose) + +- **"Continuous auto-implementing abstractor loop."** Rejected at design time: an + agent writing across `lib//**` breaks the worktree isolation that makes the + fleet safe, and is rewarded for manufacturing premature/wrong abstractions. The + radar is read-only by design. (This file is the alternative.) 
+- **Shared `api.sx` "public boundary" module (×6).** Rejected pass 4-5: every subsystem + has an `api.sx` (acl, feed, flow, mod, persist, search — a 100% filename match), but it + is a naming *convention for the public entry point*, not a shared structure. They + disagree on the most basic contract: acl/feed use **implicit module state** + (`acl/api.sx` "implicit current db", `feed/api.sx` "single mutable log") while + `persist/api.sx` threads an **explicit backend as every call's first arg**; flow's api + *builds a Scheme env*, search's api *concatenates a Haskell source string*, mod's is a + *lifecycle state-machine façade* (17 defs vs persist's 1). Same role, no common shape — + the W1 coincidental-resemblance trap. Do not re-propose on the filename. +- **Shared `wire.sx` "serialization" module (×2).** Rejected pass 15: content + mod both + have a `wire.sx`, but `content/wire.sx` uses the **generic SX serializer** + (`serialize`/`parse`, full-fidelity round-trip) while `mod/wire.sx` is a **bespoke + versioned pipe-delimited line** (subset of fields, `split` hand-built over slice/len + because mod's Prolog-loaded env strips string prims). Shared role (wire format), + divergent structure + substrate constraint → not a candidate; the SX serializer is + already the shared tool for SX-substrate subsystems, and mod can't use it. (Same family + as the `api.sx` rejection above.) +- **Dumping app-domain plumbing into `lib/guest`.** Rejected: `lib/guest` is for + language-implementation plumbing. App patterns route to acl/fed-sx/persist/ + substrate/host instead (see the routing rule in the briefing). diff --git a/plans/rose-ash-on-sx-migration.md b/plans/rose-ash-on-sx-migration.md new file mode 100644 index 00000000..c2a04a33 --- /dev/null +++ b/plans/rose-ash-on-sx-migration.md @@ -0,0 +1,170 @@ +# Re-implementing rose-ash on SX — migration strategy + +Status: **strategy proposal** (drafted by the `radar` loop, 2026-06-07). 
Not a +unilateral architecture decision — a starting point for the fleet to refine. Radar's +role here is detection: the `*-on-sx` subsystems have converged into a host-agnostic +re-implementation of rose-ash's domain logic, so this doc proposes *when* and *how* to +wire them to production. + +--- + +## 1. Premise: we are ~70% into a re-implementation already + +The fleet of `lib/` SX subsystems is not a set of experiments — it is rose-ash's +domain logic, re-expressed substrate-by-substrate, deliberately **host-agnostic**: + +| SX subsystem (`lib/`) | rose-ash production domain | +|---|---| +| content-on-sx (CRDT docs, versioning, `page.sx` HTML render) | **blog** | +| commerce-on-sx (catalog, pricing, cart, order + refund sagas) | **market + cart + orders** | +| events-on-sx (calendar, ticketing, booking) | **events** | +| feed-on-sx (activity streams, AP-shaped, threading) | **federation** | +| identity-on-sx (OAuth2, sessions, grants, membership) | **account** | +| acl-on-sx (permissions) | cross-cutting authZ | +| relations / likes | **relations / likes** (internal) | +| persist-on-sx (log / kv / snapshot facets) | per-service Postgres layer | +| flow-on-sx (durable sagas) | order/refund/delivery workflows | +| mod-on-sx, search-on-sx | new capabilities | + +**The architectural enabler:** every core was built with *injected seams* — `permit?`, +`send-fn`/`fetch-fn`, `transport`, `dispatch`, `backend`. That is ports-and-adapters +(hexagonal) on purpose. Evidence from the radar backlog (`plans/abstractions.md`): +W1 (7/7 federation modules inject the fed-sx transport), W4 (content/commerce/events run +live on `persist/log`), W8 (events+commerce run sagas on `lib/flow`). 
**The cores do not +depend on how they're hosted, persisted, or federated.** + +**Corollary that makes the whole migration tractable:** because logic is separated from +rendering and storage, we can hold the **domain logic to parity** while **freely +redesigning the presentation** — the two are different layers with different rules. + +--- + +## 2. The gating insight: the cores are *ahead of the host* + +The domain logic is mature. What is *not* yet production-grade is the **host trio** — and +that is the real critical path: + +- **host-on-sx** — HTTP / request-response / session host (briefing exists; the OCaml SX + HTTP server already serves `sx.rose-ash.com`). +- **host-persist** — durable storage adapter (real disk/pg/ipfs) under `persist`'s + facets (content-addressed blob blocker recently closed). +- **fed-sx** — the real ActivityPub transport every core injects (well into m2). + +> **So "when do we start?" answers itself: start when the host trio is production-grade, +> not when the cores are done — they mostly already are.** Prioritise the host loops over +> further domain features. + +--- + +## 3. The model: duplicate → cut over → diverge (per slice) + +This is the "duplicate first, then change" approach, made precise. Each domain slice goes +through three phases independently: + +**Phase A — Duplicate (hold logic to parity).** Stand the SX implementation of the slice +up *in parallel*, behind the existing edge, serving no users yet. Get its **domain/data +behaviour** to match Python (see §4 on how). Presentation can start as a rough port or an +early new design — it doesn't have to match. + +**Phase B — Cut over (strangler flip).** Point the edge route for that slice at the SX +host. Python stays as instant rollback. The slice is now live on SX. + +**Phase C — Diverge (change freely).** With the slice live and validated, evolve the +look/feel and functionality on the SX side. 
The validated domain logic underneath is +untouched, so UX/feature changes can't silently corrupt data. + +You never rewrite the whole platform at once; you walk slices through A→B→C, oldest tree +strangled last. + +--- + +## 4. The two techniques, and how "we'll change things" reshapes them + +### Strangler edge +The edge (Caddy) is the front door every request hits. Add routing rules so **one route +at a time** goes to the SX host while everything else still goes to Python. Properties: +the site is never half-broken; any single route flips back to Python instantly; the old +app is strangled route-by-route. (Opposite of big-bang swap, which is how these die.) + +### Shadow diff — split by layer +Run the new version on real traffic in the background, discard its output, and **log how +it differs** from Python. Flip the edge only when diffs are zero/intended. + +But because we *intend* to change look/feel + functionality, parity is a tool we apply +**only where we want sameness**, not a straitjacket: + +| Layer | Want parity? | Oracle | +|---|---|---| +| **Domain/data** (totals, tax, permissions, what's stored, who-sees-what) | **YES — silent difference = data corruption** | shadow-diff at the *core* boundary; deterministic cores → replay real request logs through the harness and diff | +| **Presentation/UX** (HTML, layout, look, feel, flows) | **NO — this is what we're changing** | manual QA + design review; this is the Phase-C divergence | + +Practical shape: shadow-diff hits the **domain core's output** (the computed order, the +visible-activity set, the permission decision) — not the rendered HTML. The deterministic, +harness-replayable cores are the single biggest advantage we have here; it's the same +parity discipline that made the A1 conformance migration safe (one reference slice, hard +parity gate, revert on mismatch). + +--- + +## 5. Readiness gates (start the production migration when ALL hold) + +1. 
**Host trio production-grade** — host-on-sx (HTTP/session), host-persist (durable + adapter), fed-sx (AP transport) — each conformance-green. +2. **Data-migration story exists** — a way to get existing production Postgres state into + `persist` event streams (event-source the current state, or dual-write during overlap). + This is the honest long-pole; it is *not* domain logic and nobody has built it yet. +3. **One vertical slice proven end-to-end** at data-parity in production — the reference + migration, the way the conformance loop migrated one subsystem before the rest. + +--- + +## 6. Sequencing + +1. **Host trio first** (critical path — it's behind the cores). +2. **Build the strangler edge + shadow-diff harness** as first-class tooling: edge routing + rules + a dual-run logger that diffs *core outputs* (not HTML) and stores discrepancies. +3. **First slice = lowest risk × highest readiness × cleanest data oracle.** + Recommended: **the blog read path (content-on-sx)** or **the feed read path** + — read-heavy, no money, CRDT/versioning + `page.sx` HTML already exist, and the data + oracle is clean. *Avoid cart/orders/payments first* (transactional + SumUp webhooks = + highest blast radius). +4. **Persistence-first, federation-last.** Land host-persist + migrate per-domain event + stores before any cutover. Do fed-sx federation as a *coordinated* cut near the end — + W1 shows all 7 cores light up federation together once the shared transport ships. +5. **Walk the remaining slices A→B→C**, retiring Python routes as each cuts over. + +--- + +## 7. The honest long tail (mostly host + adapters, not cores) + +The cores are pure domain logic; the production *tail* is not in them yet and is most of +the remaining real effort: + +- Auth: first-party cookies / Safari-ITP, CSRF, silent SSO, grant caching. +- Cross-cutting: rate limiting, observability/metrics, error pages, caching. +- Integrations: SumUp payment + webhooks, Ghost CMS sync. 
+- Presentation: the actual HTMX templates + CSS (this is also where the redesign happens). +- **Live data migration** — the single biggest non-core workstream. + +--- + +## 8. Concrete next steps + +1. Treat the **host trio** as the fleet's critical path; prioritise over more domain features. +2. Stand up the **strangler edge + core-level shadow-diff harness** as a tool. +3. Prove **one slice** (blog/content read path) end-to-end in production as the reference. +4. **Spec the Postgres → persist data migration** (the long-pole nobody has started). +5. Then walk slices through duplicate → cut over → diverge, redesigning UX in Phase C. + +--- + +## 9. Why this is low-risk despite being a platform rewrite + +- It's **wiring host-agnostic cores to a host**, not rewriting domain logic from scratch. +- The **strangler edge** means the site always works and any route reverts in seconds. +- **Deterministic cores** make data-parity *mechanically checkable* (replay + diff), so + correctness isn't a matter of faith. +- **Logic/presentation separation** lets us change look/feel + functionality (Phase C) + *without* re-risking the validated domain logic. +- It's the **same discipline that just shipped A1**: one reference migration, a hard + parity gate, honest exclusions, verify-before-merge.
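The replay-and-diff check that §4 and the list above rely on can be sketched in a few lines. Everything named here is hypothetical (`shadow_diff`, the two toy `*_order_total` cores, the request shape); it only illustrates diffing *core outputs* rather than rendered HTML, under the assumption that both cores are deterministic functions of the recorded request.

```python
def shadow_diff(requests, reference_core, candidate_core):
    """Replay recorded requests through both cores and collect discrepancies."""
    discrepancies = []
    for req in requests:
        expected = reference_core(req)
        try:
            actual = candidate_core(req)
        except Exception as exc:  # a crash in the shadow is a diff, not an outage
            actual = {"error": repr(exc)}
        if actual != expected:
            discrepancies.append(
                {"request": req, "expected": expected, "actual": actual}
            )
    return discrepancies


# Toy stand-ins for the two implementations; amounts are integer pence.
def reference_order_total(order):
    return {"total": sum(price * qty for price, qty in order["lines"])}


def candidate_order_total(order):
    # Deliberate bug for the demo: quantity is capped at 9.
    return {"total": sum(price * min(qty, 9) for price, qty in order["lines"])}


recorded = [
    {"id": 1, "lines": [(250, 2)]},
    {"id": 2, "lines": [(100, 12)]},
]
diffs = shadow_diff(recorded, reference_order_total, candidate_order_total)
```

An empty `diffs` list for a slice is the kind of evidence the cut-over gate asks for; a non-empty one names the exact request to replay while debugging.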