diff --git a/plans/erlang-on-sx.md b/plans/erlang-on-sx.md index 8ca25a63..9492e3c3 100644 --- a/plans/erlang-on-sx.md +++ b/plans/erlang-on-sx.md @@ -122,10 +122,30 @@ Replace today's hardcoded BIF dispatch (`er-apply-bif`/`er-apply-remote-bif` in - [ ] `sqlite:open/1`, `sqlite:close/1`, `sqlite:exec/2`, `sqlite:query/2` — **BLOCKED** (no SQLite primitive). See Blockers. - [x] Tests: 1 round-trip per BIF; suite name `ffi`; conformance scoreboard auto-picks it up — **+14 ffi tests** at 637/637 total. Suite covers the 3 implemented file BIFs (9 tests: write-ok, read-ok-tag, payload-is-binary, byte_size content, missing-enoent, bad-path-enoent, binary-payload round-trip, delete-ok, read-after-delete-enoent) plus 5 negative asserts (one per blocked BIF — `crypto:hash`/`cid:from_bytes`/`file:list_dir`/`httpc:request`/`sqlite:exec`) so this suite fails fast if a future iteration adds a wrapper without registering proper tests. Target "+40 ffi tests" was relative to the original 5-BIF-family plan; with 5 of those families blocked on host primitives, the achievable count is 14 — the suite scaffolding is what matters and is ready to accept the remaining tests when the primitives land. +### Phase 9 — specialized opcodes (the BEAM analog) + +**Driver:** Erlang-on-SX going through the general-purpose CEK machine has architectural perf ceilings (call/cc per receive, env-copy per call, mailbox rebuild on delete). The fix is specialized bytecode opcodes that bypass the general machinery for hot Erlang operations. Targets: 100k+ message hops/sec, 1M-process spawn in under 30sec. Layered perf strategy: Layer 1 (this) = specialized opcodes; Layer 2 (Phase 10, deferred) = multi-core scheduler. + +**Architectural note:** opcodes get developed in `lib/erlang/vm/` (in scope). The **opcode extension mechanism in `hosts/ocaml/`** (Phase 9a) is **out of scope** for this loop — log as Blocker until a session that owns `hosts/` lands it. 
Sub-phases 9b-9g design and test opcodes against a stub dispatcher in the meantime; integrate when 9a is available. + +**Shared-opcode discipline:** opcodes that another language port could plausibly use (pattern match, perform/handle, record access) get prepared for **chiselling out to `lib/guest/vm/`** when a second use materialises. Same lib/guest pattern, applied at the bytecode layer. Don't pre-extract; do annotate candidates in commit messages. + +- [ ] **9a — Opcode extension mechanism** (in `hosts/ocaml/evaluator/`) — **OUT OF SCOPE for this loop**. Log as Blocker. Lets `lib//vm/` register opcodes without modifying SX VM core. Design lives in `plans/sx-vm-opcode-extension.md`. +- [ ] **9b — `OP_PATTERN_TUPLE` / `OP_PATTERN_LIST` / `OP_PATTERN_BINARY`**: specialized pattern-match opcodes for Erlang's bread-and-butter `case` clauses. Replace SX-`case` dispatch on the hot path. Tests: every pattern shape, including nested. Conformance must remain 637/637 + all prior. Candidate for chiselling to `lib/guest/vm/match.sx`. +- [ ] **9c — `OP_PERFORM` / `OP_HANDLE`** (algebraic effects style): replace the call/cc + raise/guard machinery used for `receive` suspension. Pure Erlang interface unchanged; underlying mechanism specialized. Candidate for chiselling (Scheme call/cc, OCaml 5 effects, miniKanren all want the same thing). +- [ ] **9d — `OP_RECEIVE_SCAN`**: built on 9c. Specialized opcode for selective receive — scans mailbox in pattern order, suspends + binds on match. Should give 10-100× speedup on receive-heavy workloads (ring benchmark, bank, fib_server). +- [ ] **9e — `OP_SPAWN` / `OP_SEND` + lightweight scheduler**: per-process register/heap layout, scheduler that runs Erlang bytecode units rather than going through general SX evaluator each time. Process record fields become VM register slots. Target: spawn cost under 50µs, send cost under 5µs. 
+- [ ] **9f — BIF dispatch table**: `OP_BIF_` for hot BIFs (`length/1`, `hd/1`, `tl/1`, `element/2`, `lists:reverse/1`, etc.) — direct dispatch, no registry lookup. Cold BIFs continue through the general dispatch path. +- [ ] **9g — Conformance + perf bench**: full Phase 1-8 conformance must pass on the new VM. Ring benchmark target: **100k+ hops/sec at N=1000** (current ~30/sec → ~3000× speedup target). 1M-process spawn target: **under 30 seconds** (current ~9h extrapolation → ~1000× speedup target). Document achieved numbers in `lib/erlang/bench_ring_results.md`. + +**Acceptance:** ring benchmark hits the 100k hops/sec target. All prior phase tests pass. Two opcodes chiselled to `lib/guest/vm/` (or annotated as candidates with a written rationale). + ## Progress log _Newest first._ +- **2026-05-14 Phase 9 scoped + supporting plan files synced** — Copied three plan files from `/root/rose-ash/plans/` (architecture branch) that this worktree was missing: `fed-sx-design.md` (124KB, the substrate design referenced from Phase 7/8 drivers), `fed-sx-milestone-1.md` (33KB, first concrete implementation milestone), `sx-vm-opcode-extension.md` (19KB, the prerequisite for Phase 9a — designs how `lib//vm/` registers opcodes against the OCaml SX VM core). Then appended **Phase 9 — specialized opcodes (the BEAM analog)** to `plans/erlang-on-sx.md` covering sub-phases 9a-9g: 9a (opcode extension mechanism in `hosts/ocaml/`) is out-of-scope for this loop (will be logged as a Blocker when the next iteration tries to start it); 9b-9g (PATTERN_TUPLE/LIST/BINARY, PERFORM/HANDLE, RECEIVE_SCAN, SPAWN/SEND + lightweight scheduler, BIF dispatch table, conformance + perf bench) can be designed and tested against a stub dispatcher in the meantime. Targets: ring benchmark 100k+ hops/sec at N=1000 (~3000× speedup), 1M-process spawn under 30sec (~1000× speedup). 
Plan framing intact for Phase 7/8 — those reflect the actual implementation done in this loop; the architecture-branch framing diverges in language but the work is equivalent. No code touched this iteration. Total **637/637** unchanged. + - **2026-05-14 ffi test suite extracted, conformance scoreboard auto-picks it up** — New `lib/erlang/tests/ffi.sx` with its own counter trio (`er-ffi-test-count`/`-pass`/`-fails`) and `er-ffi-test` helper following the same pattern as runtime/eval/ring tests. The 10 file BIF eval tests from the previous iteration moved out of `eval.sx` (eval dropped from 395 to 385 tests) and into the new suite where they're now 9 tests (consolidated the two write+read tests). `conformance.sh` updated: added `ffi` to `SUITES` array with `er-ffi-test-pass`/`-count` symbols, added `(load "lib/erlang/tests/ffi.sx")` after `fib_server.sx`, added `(epoch 109) (eval "(list er-ffi-test-pass er-ffi-test-count)")`. Scoreboard markdown auto-updated to include the row. Suite also asserts that the 5 blocked BIFs (`crypto:hash`, `cid:from_bytes`, `file:list_dir`, `httpc:request`, `sqlite:exec`) are NOT yet registered — turns a future "added the wrapper but forgot to extend ffi tests" into a hard failure. One eval-comparison gotcha en route: SX's `=` does identity equality on dicts so comparing two separately-constructed `(er-mk-atom "true")` values is false; the existing eval suite has an `eev-deep=` helper that handles this, but the simpler fix in ffi was to extract `:name` via `ffi-nm` and compare strings. Total **637/637** (+14 ffi). Phase 8 fully ticked aside from the BLOCKED bullets — those remain unchecked with explicit Blockers references. - **2026-05-14 file BIFs landed; crypto/cid/list_dir/http/sqlite blocked on missing host primitives** — Three new FFI BIFs registered in `runtime.sx`: `file:read_file/1`, `file:write_file/2`, `file:delete/1`. 
Each wraps the SX-host primitive (`file-read`, `file-write`, `file-delete`) inside a `guard` that converts thrown exception strings into Erlang `{error, Reason}` tuples. New helper `er-classify-file-error` does loose pattern-matching on the error message using `string-contains?` to map to standard POSIX-style reasons: `"No such"` → `enoent`, `"Permission denied"` → `eacces`, `"Not a directory"` → `enotdir`, `"Is a directory"` → `eisdir`, fallback `posix_error`. Filenames coerce through `er-source-to-string` so SX strings, Erlang binaries, and Erlang char-code lists all work. Read returns `{ok, Binary}` (bytes via `(map char->integer (string->list ...))` then `er-mk-binary`); write returns bare `ok`; delete returns bare `ok`. Bootstrap registrations added at the bottom of `er-register-builtin-bifs!` under `"file"`. 10 new eval tests: write-then-read round-trip, ok-tag, payload is binary, byte_size content, missing-file `enoent`, delete-ok, read-after-delete `enoent`, write to non-existent dir `enoent`, binary payload (5 raw bytes) round-trip preserving byte count. Blockers entry added covering five Phase 8 BIFs whose host primitives don't exist in this SX runtime: `crypto:hash/2`, `cid:from_bytes/1`/`to_string/1`, `file:list_dir/1`, `httpc:request/4`, `sqlite:open/exec/query/close`. Fix path documented inline (architecture-branch iteration to register OCaml-side primitives). Total **633/633** (+10 eval). diff --git a/plans/fed-sx-design.md b/plans/fed-sx-design.md new file mode 100644 index 00000000..62e811d1 --- /dev/null +++ b/plans/fed-sx-design.md @@ -0,0 +1,2638 @@ +# fed-sx — Federated SX Activity Substrate + +A federated, content-addressed, extensible application substrate where the unit of +computation is a signed activity, the unit of state is a pure SX projection over the +activity log, and the substrate's own extensibility (new verbs, new object types, new +projections, new validators) is itself published through the same mechanism. 
+ +Status: **design** — not yet implemented. Target subdomain: `next.rose-ash.com`. +Target location in repo: `next/` (new top-level dir, sibling to `blog/`, `market/`, +etc.). Stack: pure SX-on-OCaml. Implementation language(s) to be chosen after design +is complete. + +--- + +## 1. Premise + +ActivityPub's data model — actors, signed activities, inboxes/outboxes — generalises +beyond social posting to any domain where state evolves via signed messages. fed-sx +takes that generalisation seriously: + +- The unit of communication is a **signed AP activity**. +- The unit of content is an **AP object**, content-addressed by **CID** (multihash + + multicodec, default `dag-cbor` over the parsed SX AST). +- State is the **deterministic fold** of pure SX functions over the activity log. +- The substrate is **self-extending**: new activity types, object types, projections, + validators, codecs, transports, and signature suites are themselves published as + `Define*` activities — federated like any other content. + +Three commitments make the rest fall into place: + +1. **The kernel is dumb.** It only knows envelope shape, signature verification, + append-to-log, fetch-by-id, transport in/out. It does not know what `Create` or + `Pin` *mean*. +2. **Everything else is registry-driven.** Verbs, object types, validators, projections, + codecs, transports, audiences, proofs, sig suites — all looked up in registries the + kernel calls into. +3. **The registries are themselves publishable.** New entries arrive as `Define*` + activities. Bootstrap registries load from a known set of CIDs at startup; everything + else is replayed from the log. + +Result: the only code that ever needs to change in the kernel is the envelope itself. +New verbs = published SX, federated like any other artifact. + +--- + +## 2. CIDs and content addressing + +Every artifact has a CID. Default codec is **dag-cbor** over the parsed SX AST (not +the raw text). 
This buys: + +- **Sub-AST addressing for free.** Each nested structure has an implicit CID; IPLD can + walk paths like `/components/card`. The "file CID *and* component CID" + question dissolves: every node is a CID, you choose the granularity at reference + time. +- **Polyglot canonicalization.** JS, OCaml, Python only need to agree on AST shape + + CBOR's deterministic encoding (RFC 8949 §4.2.1). No byte-identical pretty-printer + required across hosts. +- **Format immunity.** Reformatting, indent changes, equivalent-form normalisations + do not change the CID. +- **Tooling fit.** sx-tree already has the parsed form in memory; computing or + verifying a CID is just an encode + hash. + +Costs accepted: +- One spec to maintain: SX↔CBOR mapping (number → CBOR int/float, string → text, + symbol → tag, keyword → tag, list → array, dict → map). ~50 lines of code per host. +- Author's exact source text is not preserved; re-pretty-print on fetch. +- "Why don't these CIDs match" requires comparing CBOR (a `cid-explain` tool helps). + +The CID format itself is multicodec-agile: the substrate also accepts `raw`, +`dag-json`, `dag-pb`, etc. when seen, dispatched via the codec registry. + +--- + +## 3. Kernel surface (fixed — get this right) + +The kernel is the only thing that's hard to change later. Everything else is in +registries. Two envelope shapes plus five operations. + +### 3.1 Activity envelope + +``` +{ id, type, actor, published, + to, cc, audience-extras, + object | target | origin | result, # AP slots, opaque to kernel + capabilities-required: [...], # so receivers can refuse cleanly + proofs: [...], # OTS, on-chain, multi-sig — all opaque + signature: { key-id, algorithm, value, covered-fields } } +``` + +### 3.2 Object envelope + +``` +{ id, type, cid, media-type, + where: inline | cid | url, + content?, link? 
} # only one populated based on `where` +``` + +### 3.3 Kernel verbs + +The only verbs implemented directly by the kernel: + +- **Append signed activity** to outbox (after envelope check + sig verify + validator + pipeline). +- **Verify signature** against actor's published keys, time-aware (which key was + active at `published`). +- **Fetch** by `id` or by `cid`. +- **Receive at inbox** (verify + dispatch to registered handlers). +- **Replay log** to rebuild registries on boot. + +Everything else is registry-resolved. + +--- + +## 4. Registries + +Each registry has a default-populated set (loaded from genesis-bundled CIDs) and +accepts new entries via `Define*` activities. Default entries themselves are SX +artifacts — versioning, audit, replacement work the same way as user content. + +| Registry | Bootstrap defaults | Extended by | +|----------|-------------------|-------------| +| **Activity types** | `Create`, `Update`, `Delete`, `Announce` | `DefineActivity{type, schema-sx, semantics-sx}` | +| **Object types** | `SXArtifact`, `Note`, `Image`, `Tombstone` | `DefineObject{type, schema-sx, render-hint}` | +| **Validators** | envelope shape, signature, type-schema | `DefineValidator{applies-to, predicate-sx}` | +| **Projections** | identity, by-type, by-cid, by-actor, actor-state, define-registry, audience-graph, by-object | `DefineProjection{name, fold-sx, query-sx}` | +| **Codecs** | dag-cbor, raw, dag-json | `DefineCodec{multicodec, encode-sx, decode-sx}` | +| **Hash algorithms** | sha2-256 | multihash table — agile by spec | +| **Transports** | http-inbox-push | `DefineTransport{name, deliver-sx, receive-sx}` | +| **Audience predicates** | `Public`, `Followers`, direct | `DefineAudience{name, member-of-sx}` | +| **Subscription types** | `Follow` (AP-standard) | `DefineSubscription{name, schema-sx, match-sx, delivery}` | +| **Proof types** | (none) | `DefineProof{type, attach-sx, verify-sx}` | +| **Storage backends** | files-on-disk | `DefineStorage{where-tag, 
put-sx, get-sx}` | +| **Triggers** | (none) | `DefineTrigger{when-subscription, then-sx, cascade-limit}` | +| **Signature suites** | rsa-sha256 (AP-compatible) | `DefineSigSuite{name, sign-sx, verify-sx}` | +| **Application bundles** | (none) | `DefineApplication{name, subscriptions, triggers, projections, storage}` | + +Adding `Pin`, `Endorse`, `Supersede`, `Test`, `Build`, `Compose`, etc. later is just +publishing `DefineActivity` artifacts — no kernel diff, no redeploy required if +registries are hot. + +--- + +## 5. The meta-level + +A `DefineActivity` is itself an AP `Create` activity over an `SXArtifact` of a +specific type: + +```sx +(activity 'Create + :object {:type "DefineActivity" + :name "Pin" + :schema (fn (act) + (and (string? (-> act :object :path)) + (cid? (-> act :object :cid)))) + :semantics + '(fn (act state) + (assoc-in state [:pins (-> act :object :path)] + (-> act :object :cid)))}) +``` + +When the kernel receives an activity with `type: "Pin"` it looks up the registered +semantics from a `DefineActivity{name: "Pin"}` artifact, runs the SX, projects the new +state. The semantics are themselves content-addressed and federated — every receiver +runs the same code. + +Same pattern handles `DefineProjection`, `DefineValidator`, etc. The substrate is +genuinely self-extending. + +--- + +## 6. Verbs + +### 6.1 Bootstrap verbs (milestone 1) + +The substrate exposes `POST /activity` (not `POST /publish`) — generalised entry +point that takes any well-formed AP activity, validates, signs, appends to outbox. +`(publish sx)` is sugar at the SX layer for `Create{SXArtifact}`. + +Day-one verbs (cost ~zero once `/activity` exists): + +- **`Create`** — the publish primitive. +- **`Update`** — supersede a previous activity (correct metadata, change a path + mapping). Distinct from "publishing new content" — new content is always a new + `Create` with a new CID. +- **`Delete`** — tombstone. AP-native; readers honour it. 
+- **`Announce`** — boost another actor's artifact into your outbox. Comes free. +- **`Subscribe`** — generalised subscription verb (parallel to publish/`Create`). + Wraps any registered `DefineSubscription` type. `Follow` is the standard AP + `Subscribe{Follow{actor: ...}}` for wire compatibility. See §18. +- **`Unsubscribe`** — `Undo` of a prior `Subscribe`. Same shape as AP + `Undo{Follow}`. + +### 6.2 Custom verbs (designed-for, defined later) + +Substrate accepts these from day one (any signed activity can be appended); semantics +projected once `DefineActivity` artifacts exist. + +- **`Pin`** — assign `domain:path/name → CID`. The future name-resolution layer made + of activities. Each pin is signed; the resolver replays the outbox to compute current + state. +- **`Endorse`** (modelled on `Like`/`Approve`) — third-party signature on a CID. + Web-of-trust style code review without central authority. +- **`Supersede`** — "CID A replaces CID B". Stronger than `Update`; readers can chase + the chain. +- **`Test`** — published assertion that running CID A under conditions X yields result + Y. Test-as-artifact, federated. +- **`Build`** — links a source CID to a compiled-output CID, with provenance. +- **`Compose`** — derived artifact citing input CIDs. Provenance graph in the outbox + itself. +- **`Note`** (AP-native) — comments / reviews / discussion attached to a CID. +- **`Follow`** / **`Undo(Follow)`** — subscribe to another instance's outbox. + +The pattern that matters: your outbox isn't just "things published," it's an +**append-only log of every assertion this actor makes about the SX universe.** + +--- + +## 7. Capability discovery + +Two pieces: + +- **`GET /.well-known/sx-capabilities`** — JSON listing every registered activity-type, + object-type, codec, transport, sig-suite, proof-type. Each with the CID of the + `Define*` artifact that introduced it. Peers can diff capabilities before federating. 
+- **`capabilities-required`** field on activities — sender declares "this needs `Pin` + semantics + `dag-cbor` codec." Receivers without those capabilities return a clean + 422 referencing the missing CIDs; sender knows whether to replay-and-deliver the + bootstrapping `Define*` artifacts first. + +Federation degrades gracefully across instances at different versions. + +--- + +## 8. Axes of flexibility (all designed-for) + +1. **Object types** beyond SXArtifact — `Note`, `Article`, `Image`, `Video`, `Question`, + `Event`, etc. via the object-type registry. +2. **Storage tier per-object** — `where: inline | cid | url`. Tiny things inline; big + things to IPFS; legacy stuff URL-linked. Migrating storage backends doesn't migrate + the substrate. +3. **Multihash + multicodec agility** — sha2-256 + dag-cbor by default; substrate + accepts blake3, raw, dag-json, dag-pb, etc. +4. **Multi-key actors** — `publicKeys` array always; per-key `purpose`; multiple key + types (RSA for AP wire compat, Ed25519 modern). See §9. +5. **Audience / visibility** — AP-native `to`, `cc`, `bto`, `bcc`. Public, followers, + direct, unlisted. Custom audiences via `DefineAudience`. +6. **Outbox-as-database** — no source-of-truth other than the log. Projections are + recomputable views. +7. **Programmable activities** — activities can carry SX. Reactive federation, + conditional pins, automated propose/test/release pipelines, all expressed as AP + activities. +8. **Federation transport pluggable** — outbox is canonical; how peers exchange is + pluggable (HTTP push, pull, libp2p, polling). +9. **Optional timestamp proofs** — every activity has an attachable `proofs` slot. + OpenTimestamps, on-chain merkle commit, third-party TSA all slot in without changing + activity semantics. + +Explicitly **not** pursuing for MVP: +- Schema-version negotiation (premature; `@context` handles extension). +- Configurable conflict-resolution per actor (last-signed-wins, log preserved for + audit). 
+- Verb-specific kernel handlers (other than `Create`'s "compute CID, store body"). + +--- + +## 9. Identity & actor lifecycle + +### 9.1 Actor doc shape + +```jsonld +{ + "@context": ["https://www.w3.org/ns/activitystreams", + "https://w3id.org/security/v1", + "https://next.rose-ash.com/ns/fed-sx/v1"], + "type": "Person", // or Service, Group, Application + "id": "https://next.rose-ash.com/actors/giles", + "preferredUsername": "giles", + "inbox": "https://next.rose-ash.com/actors/giles/inbox", + "outbox": "https://next.rose-ash.com/actors/giles/outbox", + "followers": "...", + "following": "...", + + "publicKeys": [ // ARRAY from day one — never `publicKey` + { "id": "...#key-2026-05", + "type": "RsaVerificationKey2018", + "owner": "", + "publicKeyPem": "...", + "purpose": ["sign-activity", "sign-http"], + "created": "2026-05-14T...", + "expires": null, + "supersedes": null, + "supersededBy": null }, + { "id": "...#key-ed25519-2026-05", + "type": "Ed25519VerificationKey2020", + "owner": "", + "publicKeyMultibase": "z6Mk...", + "purpose": ["sign-activity"], + "created": "2026-05-14T..." } + ], + + "capabilities": "https://.../actors/giles/capabilities", // what verbs they speak + "alsoKnownAs": ["did:web:rose-ash.com:giles", ...], // bridge to DID, AP migration + "movedTo": null // set on Move +} +``` + +Key shape decisions: + +- **`publicKeys` array always.** Single-key actors have an array of length 1. AP + standard `publicKey` is *also* served as the first array element for back-compat + with vanilla AP servers (Mastodon etc. ignore the array). +- **Per-key `purpose`** — separates signing weight. Day-to-day publish key vs. high- + value key for `Pin`/`Endorse` vs. delegated machine key. Validators can require + specific purposes per activity type (registry-driven). +- **Multiple key types** — RSA for AP wire compat, Ed25519 for everything else + (smaller, faster, modern). Sig suite registry decides which suites are accepted. 
+- **`supersedes` / `supersededBy`** — keys form a chain, not a snapshot. Old activities + still verify against historical keys. + +### 9.2 Key rotation + +Key rotation is itself an activity, signed by the *old* key (or a recovery key): + +```sx +(activity 'Update + :object actor-id + :patch {:add-publicKey new-key + :supersede {old-key-id new-key-id}}) +``` + +Kernel: +1. Fetches actor's current state (a projection over their own outbox). +2. Verifies activity is signed by a key with `purpose: rotate-key` (or any active key, + if registry allows). +3. Appends. The actor-state projection now has the new key. + +Old activities still verify because the projection retains the historical key with +`supersededBy` set — sig verification looks up "what keys were active at activity +timestamp T." + +### 9.3 Key recovery / loss + +- **Recovery key** — separate key at actor creation, never used except to rotate. + Stored offline. `purpose: ["recover"]`. Validator allows + `Update{actor, patch: rotate-all-keys}` if signed by a recovery key. +- **Social recovery** — designate N trusted actors, M-of-N can co-sign a recovery + `Update`. Implemented as a `DefineValidator` extension; multi-sig slot in `proofs` + makes it possible without changing the envelope. +- **Total loss** — if both signing and recovery keys are gone, the actor is dead. + They publish a new actor with `alsoKnownAs: ` from a fresh key. + Followers can choose to re-follow but there's no cryptographic continuity. + +### 9.4 Migration (`Move`) + +AP-native: + +```sx +(activity 'Move + :object old-actor-id + :target new-actor-id) +``` + +Receivers update their follow lists. New actor's `alsoKnownAs` must include old +actor — bidirectional handshake prevents hijacking. + +For fed-sx, `Move` should also carry an outbox migration hint (CID of an export bundle) +so receivers can re-anchor projections without re-fetching activity-by-activity. 
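
The bidirectional handshake described for `Move` can be sketched roughly as follows. This is a minimal illustration in Python, not part of the plan: the function name `move_is_acceptable` and the dict-shaped activity and actor doc are assumptions, and the sketch presumes the `Move` signature has already been verified against the old actor's keys.

```python
def move_is_acceptable(move, new_actor_doc):
    """Return True if a Move activity passes the bidirectional handshake.

    Hypothetical helper: assumes `move` is a dict with `actor`, `object`
    (old actor id) and `target` (new actor id), and that its signature was
    already checked elsewhere.
    """
    old_id = move.get("object")
    new_id = move.get("target")
    if not old_id or not new_id or old_id == new_id:
        return False
    # Old side of the handshake: the Move must be issued by the actor being moved.
    if move.get("actor") != old_id:
        return False
    # New side: the fetched doc must be the target and must list the old actor
    # in alsoKnownAs, otherwise anyone could claim someone else's followers.
    if new_actor_doc.get("id") != new_id:
        return False
    return old_id in (new_actor_doc.get("alsoKnownAs") or [])
```

A receiver would only rewrite its follow list when this returns true; the outbox migration hint mentioned above is outside the scope of this sketch.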
+ +### 9.5 Subordinate actors / delegation + +Two patterns supported: + +- **Service actors** (AP-native `type: Service`): bots, build servers, test runners. + Their own keys, their own outboxes, but `attributedTo` a parent actor. +- **Capability tokens**: parent publishes `Authorize{actor: child, capabilities: [...], + expires: ...}` signed by parent. Child publishes activities normally with their own + key; receivers verify the capability chain when child invokes an authority they don't + own outright. Useful for: temporary publish access, delegated `Pin` rights for a + specific path prefix, multi-device. + +Both work *without* new kernel mechanism — just activities. + +### 9.6 Implications + +- **Sig verification is timestamp-aware.** Verifying an old activity needs the key + state at the time it was published — actor-state projection must support time-travel + queries. +- **Inbox doesn't trust `keyId` blindly.** Fetches actor doc, projects current key + state, checks key was valid at `published`. +- **Cross-instance identity via `alsoKnownAs` and DIDs.** Don't depend on DIDs but + slot them in for Bluesky-bridge, Solid-bridge, etc. + +--- + +## 10. Projection model + +The architectural commitment: **state is what you get when you fold pure SX over the +log.** No DB-of-record. Everything queryable is a projection. + +### 10.1 What a projection is + +A `DefineProjection` activity registers four things: + +```sx +(activity 'Create + :object {:type "DefineProjection" + :name "actor-state" + :initial-state {} ; pure SX value + :fold (fn (state activity) ; pure SX + (case (:type activity) + "Create" (when (= "Person" (-> activity :object :type)) + (assoc state (:id activity) (:object activity))) + "Update" (apply-patch state activity) + "Move" (set-moved state activity) + state)) + :snapshot-codec "dag-cbor" + :indexes [{:by :id} {:by :preferredUsername}]}) +``` + +- **`name`** — query handle. Unique per actor; collisions resolved by CID + supersession. 
+- **`initial-state`** — pure SX value used as state-zero. +- **`fold`** — pure SX function `(state activity) → state`. The only thing the kernel + calls. +- **`indexes`** — optional hint for materializing lookup paths. + +The CID of the `DefineProjection` artifact is the projection's identity. Two instances +running the same projection are running the same CID's `fold` over the same log slice +— equivalence is decidable. + +### 10.2 The fold contract — purity, determinism, gas + +The fold function must be **pure and deterministic**. Non-negotiable; it's what makes +cross-instance equivalence and replay possible. + +- **No IO.** No HTTP, no file access, no DB calls, no clock. The activity carries its + own `published` timestamp. +- **No randomness.** No host-seeded PRNG. (If pseudo-randomness is needed, seed from + the activity's CID — deterministic across hosts.) +- **No mutation outside the returned state.** +- **Bounded execution.** Each fold call gets a gas budget (default tunable, e.g. 100k + CEK steps). Exceeding it is a hard failure. + +Enforced at the SX evaluator level by running folds in a sandboxed environment with +the IO platform stripped to nothing. Same sandbox model applies to validators and +trigger semantics. + +**Cross-host equivalence guarantee:** for the same projection CID + same activity log +slice, every conforming SX host (JS, OCaml, Python, Haskell-on-SX, …) must produce a +state value with the same canonical CID. Tested via the spec test suite. + +### 10.3 Bootstrap projections + +The kernel cannot start without some projections, because the kernel itself uses them. +Baked into the genesis bundle (see §11), superseded only by deliberate kernel-version +upgrades. 
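
As a rough illustration of the fold contract in §10.2, the sketch below replays a log through a pure fold with a per-call gas budget. It is written in Python rather than SX, and the names `Gas`, `GasExhausted`, `by_type_fold`, and `replay` are hypothetical. Real gas accounting would count CEK steps inside the evaluator; here the fold charges the meter by hand, which is an assumption made only to keep the example self-contained.

```python
class GasExhausted(Exception):
    """Raised when a single fold call exceeds its step budget."""


class Gas:
    """Hypothetical step meter; a real host would count CEK steps itself."""

    def __init__(self, budget):
        self.remaining = budget

    def charge(self, steps=1):
        self.remaining -= steps
        if self.remaining < 0:
            raise GasExhausted()


def by_type_fold(state, activity, gas):
    """Pure fold: type -> ordered tuple of activity ids. Never mutates `state`."""
    gas.charge()
    seen = state.get(activity["type"], ())
    return {**state, activity["type"]: seen + (activity["id"],)}


def replay(fold, initial_state, log, gas_budget=100_000):
    """Fold the log in order; a failing activity is skipped, state unchanged."""
    state = initial_state
    failed = []
    for activity in log:
        try:
            # Each fold call gets a fresh budget, as the contract describes.
            state = fold(state, activity, Gas(gas_budget))
        except Exception:  # gas exhaustion or any runtime error in the fold
            failed.append(activity["id"])
    return state, failed
```

Because the fold only reads its arguments and returns a new value, replaying the same log twice yields the same state, which is the property the cross-host equivalence guarantee relies on.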
+
+| Projection | What it computes | Used by |
+|------------|------------------|---------|
+| `activity-log` | Identity — every activity, indexed by id and CID | Everything |
+| `by-type` | `type → ordered list of activity-CIDs` | Most queries |
+| `by-actor` | `actor-id → ordered list of activity-CIDs` | Per-actor outbox view |
+| `by-object` | `object-CID → list of referencing activity-CIDs` | "Who pinned this?" |
+| `actor-state` | `actor-id → current actor doc with key history` | Sig verification (kernel) |
+| `define-registry` | `kind+name → currently-active Define* CID` | All other Define* lookups |
+| `audience-graph` | `actor → followers/following` | Federation push |
+
+`define-registry` is the bootstrap chicken-and-egg: it's the projection that knows
+which projections (and validators, codecs, etc.) are currently active. Kernel ships
+with it hardcoded; once running, every other projection (including a future replacement
+of `define-registry` itself) is a regular `DefineProjection` superseding it.
+
+### 10.4 Snapshotting
+
+Replaying the entire log on every restart is unacceptable past day one.
+
+- **Snapshot = `(activity-tip-CID, projection-state, projection-CID)` tuple,**
+  dag-cbor encoded, content-addressed.
+- **Snapshot rule** — every K activities (default 1000) and every T seconds (default
+  60), serialize, hash, store on disk.
+- **Resume** — on startup, find latest snapshot for each (projection-CID, log-tip),
+  load state, fold forward.
+- **Snapshot CID is verifiable** — anyone with the same log slice and projection-CID
+  can recompute and check the CID matches. This is the cross-instance agreement proof.
+
+Snapshots are themselves publishable as activities (`Create{Snapshot}`): an instance
+can publish "here's my computed state for projection X at log-tip Y, CID Z." Other
+instances can fetch and use as a starting point.
**Federated state sharing falls out of +federated activities.** + +Snapshots are pruning-friendly: keep latest + snapshots referenced by published +`Create{Snapshot}` activities; everything else is GC-able. + +### 10.5 Reprojection on definition change + +When `DefineProjection{name: "actor-state"}` is superseded by a new CID with a +different fold: + +1. `define-registry` projection sees the supersession; its state advances. +2. New projection materialized **alongside** the old one — both kept live during + migration. +3. New projection runs in catch-up mode: replay from genesis (or from deepest + compatible snapshot). +4. When new projection catches up to log tip, queries cut over. Old projection state + can be retired. +5. Snapshots of old version stay around as long as referenced (e.g. for time-travel + queries against historical state under old semantics). + +Changing a projection definition is **safe and online**. Cost: temporary state +duplication during catch-up. Slow folds → slow migrations, but never breakage. + +For projections too expensive to fully reproject, `Update{DefineProjection}` can +declare `migrationHint: ` — opt-in, used at migrator's +risk. + +### 10.6 Time-travel queries + +Folds are deterministic functions of `(initial-state, activity-list-prefix)`. +Time-travel is fold-up-to: + +- `state-as-of(projection, activity-id-or-timestamp)` → walk to requested point, + return state. +- Snapshots act as accelerators (resume from nearest snapshot ≤ target). +- Used by sig verification ("what keys did this actor have when this activity was + signed?"), audit, "what did we believe last Tuesday." + +### 10.7 Projection composition + +**Projections do not directly read each other's state during folding.** Preserves +locality and parallelism — every projection runs independently against the same log. + +Composition via: + +- **Query time** — `(query (projection actor-state) ...)` joins are SX expressions + over multiple projection states. 
+- **Republishing as activities** — a projection that exposes its state as input to + others publishes `Create{Snapshot}` periodically. Downstream projections fold over + those. + +Direct cross-projection reads during fold introduce ordering, cycles, cache- +invalidation problems we don't need. + +### 10.8 Querying + +Three layers: + +- **Raw projection state** — `GET /projections/?at=` returns dag-cbor + (also JSON for tooling). Large states paginated by index. +- **SX queries** — `POST /query` with an SX expression that runs against one or more + projection states in pure mode. Equivalent to Datalog/GraphQL. +- **Materialized indexes** — declared on projection (`indexes:` field). Kernel + maintains as side-tables for `O(log n)` lookup. + +Real-time: clients `GET /projections//subscribe` (SSE), receive deltas as +activities land. Delta is `(old-state, new-state, applied-activity-CID)`; clients can +verify by re-folding. + +### 10.9 Lag, async, concurrency + +- **Append is sync; projection is async.** `POST /activity` returns once activity is + durably in the log. Projections run in a separate worker pool; query results carry + `projected-up-to` so callers know whether the latest write is visible. +- **One worker per projection.** Folds are sequential, but projections run in parallel + with each other. +- **Sync option** — `POST /activity?wait-for=projection-name` blocks until the named + projection has folded the new activity. Use sparingly. + +### 10.10 Failure modes + +| Failure | Response | +|---------|----------| +| **Gas exhaustion** | Activity tagged `projection-failed` for this projection. State unchanged. Operator alert. | +| **SX runtime error** (assertion, type mismatch) | Same as gas: activity skipped, error logged, state unchanged. | +| **Schema violation** | Caught earlier in validation pipeline, never reaches projection. | + +The log itself is always written successfully if it passes envelope + signature + +validator checks. 
Projection failures don't gate appending — that would couple writes +to arbitrary user-defined code. + +### 10.11 Operational implications + +- **Projection determinism is the linchpin.** If JS and OCaml ever produce different + state for the same log + projection, federation cracks. Spec test suite must cover + projection equivalence across hosts as a first-class requirement. +- **Snapshots are eventual consensus.** Two instances publish `Create{Snapshot}` for + the same log+projection; if their CIDs match, they agree without coordination. +- **Kernel reads its own projections.** `actor-state` for sig verification; + `define-registry` for every Define* lookup. Startup sequence must bootstrap these + before serving traffic. +- **Reprojection cost is real.** Heavy projection changes mean replaying from genesis. + Encourage incremental schemas (small per-activity work, idempotent updates) and + provide profiling. + +--- + +## 11. Sandbox & determinism + +The runtime contract that makes folds (and validators, triggers, semantics) safe to +execute, and that guarantees every conforming SX host computes the same state from +the same log. + +### 11.1 Three sandbox levels + +Different registry entries need different power. We define three nested execution +modes; the registry entry declares which mode it requires. + +| Mode | Used by | IO | Clock | Random | Determinism | +|------|---------|----|----|--------|-------------| +| **pure** | folds, validators, audience predicates, semantics, trigger `when-sx` | none | activity's own `published` only | seeded from activity CID only | required across hosts | +| **crypto** | sig suite verify, codec encode/decode | crypto primitives only | none | sign-only secure RNG | required across hosts (verify); single-host (sign) | +| **effectful** | storage backends, transports, trigger `then-sx`, some proof verifiers | per-capability grant only | host clock | host RNG | not required; single-host | + +Default mode is **pure**. 
The other two are opt-in at registration time, and the +registration is itself a signed activity — anyone can audit which extensions claim +which powers. + +### 11.2 Pure sandbox (the load-bearing one) + +This is the mode every projection fold runs in. It must produce identical results on +every conforming SX host, every time. + +**Allowed:** +- All spec primitives in `spec/primitives.sx` that don't perform IO (arithmetic, + comparison, predicates, string ops, collection ops, dict ops, format helpers). +- The activity being processed (full envelope), as the function's argument. +- The current state value, as the function's argument. +- A small set of fed-sx-specific deterministic primitives: + - `(activity-cid act)` → CID of the activity envelope + - `(activity-time act)` → ISO timestamp from `published` + - `(actor-state-as-of state-snapshot actor-id activity-time)` → if the projection + has been declared dependent on `actor-state` (see §10.7), reads from a snapshot + of that projection at the activity's timestamp + - `(seeded-rng cid)` → deterministic PRNG seeded from a CID, returns a stream of + uniform values + +**Forbidden:** +- All IO: HTTP, file, network, stdin/stdout, environment. +- Wall-clock access. The host's `now` is not in scope; the only time available is + `(activity-time act)`. +- Host-seeded randomness. Only `seeded-rng` (CID-derived) is available. +- Mutation outside the returned value. Enforced by the SX evaluator's lack of + ambient mutable bindings; folds may use local `let` and mutation within their own + closure but cannot reach outside. +- Calling other registry entries by name. Composition happens at query time, not + fold time (see §10.7). + +**Enforced by:** evaluator runs the fold with the IO platform stripped to nothing. +The fed-sx kernel constructs a `pure-platform` (no fetch, no query, no action, no +DOM, no storage) and uses it as the sole evaluator platform when calling the fold. 
+Any IO primitive call raises a hard error caught as a fold failure. + +### 11.3 Crypto sandbox + +Sig suites and codec encode/decode need hash + crypto + encoding primitives but +nothing else. They're still deterministic across hosts (verify case) but get a +narrower platform than effectful, wider than pure. + +**Additional primitives over pure:** +- `(sha2-256 bytes)`, `(sha3-256 bytes)`, `(blake3 bytes)`, … +- `(rsa-verify pubkey msg sig)`, `(ed25519-verify pubkey msg sig)`, … +- `(rsa-sign privkey msg)`, `(ed25519-sign privkey msg)` — sign-only; requires the + caller to supply a secure RNG handle (which is *not* in pure mode) +- `(cbor-encode value)`, `(cbor-decode bytes)` — for codecs implementing CBOR variants +- `(base32-encode bytes)`, `(base58btc-encode bytes)`, `(multibase-encode tag bytes)` +- `(multihash-encode tag digest-bytes)`, `(multihash-decode bytes)` +- `(cid-encode codec mhash)`, `(cid-decode bytes)` + +**Sign vs verify:** verify is pure (deterministic). Sign is not — it consumes +randomness. fed-sx draws a clean line: signing happens *outside* registry-entry SX +(it's an operation the kernel/runtime performs on behalf of the actor with their +private key); registry SX only ever *verifies*. This keeps the pure↔crypto distinction +tractable. + +### 11.4 Effectful sandbox + +Storage backends, transports, trigger `then-sx`, and proof verifiers that need the +network (e.g. blockchain RPC for on-chain proof verification) all need real IO. +These are not used to compute projected state; they're how the substrate interacts +with the outside world. 
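
The nesting of the three modes (pure, then crypto, then effectful) can be pictured as a gate in front of every primitive call. The following is a minimal Python sketch, not the kernel's mechanism: the primitive names in `MIN_MODE`, the `Mode` enum, and `call_primitive` are all hypothetical, strict nesting is assumed, and the per-capability grants that effectful entries additionally need are not modelled.

```python
from enum import IntEnum


class Mode(IntEnum):
    """Sandbox modes, ordered so that a higher mode includes the lower ones."""
    PURE = 0
    CRYPTO = 1
    EFFECTFUL = 2


class SandboxViolation(Exception):
    """Raised when a primitive is called from a mode that does not permit it."""


# Hypothetical primitive names mapped to the lowest mode allowed to call them.
MIN_MODE = {
    "string-append": Mode.PURE,
    "activity-time": Mode.PURE,
    "sha2-256": Mode.CRYPTO,
    "ed25519-verify": Mode.CRYPTO,
    "http-fetch": Mode.EFFECTFUL,
    "fs-write": Mode.EFFECTFUL,
}


def allowed(mode: Mode, primitive: str) -> bool:
    """Unknown primitives are denied, so the default is no authority."""
    required = MIN_MODE.get(primitive)
    return required is not None and mode >= required


def call_primitive(mode: Mode, primitive: str) -> None:
    """Gate a call; the real dispatch to the primitive is omitted here."""
    if not allowed(mode, primitive):
        raise SandboxViolation(
            f"{primitive} is not available in {mode.name.lower()} mode"
        )
```

In this sketch a fold running in `Mode.PURE` that reaches for `http-fetch` gets a `SandboxViolation`, which the caller would record as a fold failure.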
+ +**Capability-granted primitives.** The registration activity declares the +capabilities the entry needs: + +```sx +(activity 'Create + :object {:type "DefineStorage" + :where-tag "ipfs" + :capabilities [{:type "http-client" :allowlist ["http://localhost:5001/*"]} + {:type "fs-read" :path-prefix "/var/cache/fed-sx/ipfs/"} + {:type "fs-write" :path-prefix "/var/cache/fed-sx/ipfs/"}] + :put-sx (fn (cid bytes) ...) + :get-sx (fn (cid) ...)}) +``` + +**Capability types** (initial set; extensible): + +- `http-client` with `allowlist` (URL prefix patterns) +- `http-server` with `path-prefix` (mounts a sub-handler) +- `fs-read` / `fs-write` with `path-prefix` (chroot-style) +- `subprocess` with `command-allowlist` +- `clock-read` (wall clock; granted if registry entry needs to timestamp something) +- `random-bytes` (host CSPRNG) + +**No ambient authority.** Default capability set is empty; every capability is +explicit, declared, signed, and auditable. A peer can refuse to load a registry +entry whose capability claim is unacceptable to them. + +**Capabilities are content-addressed.** Each capability descriptor has a CID. The +substrate maintains a registry of "capability CIDs that this instance trusts to +honour" — operator policy, not protocol. + +### 11.5 Gas and resource accounting + +Each sandbox call gets a budget: + +- **CEK gas** — every evaluator step costs 1 unit; primitive calls cost a per- + primitive amount declared in `spec/primitives.sx`. Default budget: 100k units per + fold call. Tunable per-projection via `DefineProjection.gas-limit`. +- **Memory ceiling** — peak heap size for the fold call. Default 64 MB. Tunable. +- **IO budget** (effectful only) — bytes read/written and network calls per + invocation, granted separately per capability. +- **Wall-clock budget** (effectful only) — max real-time before forced termination. 
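
A minimal Python sketch of the CEK gas budget, assuming a meter that is charged once per evaluator step and a declared amount per primitive call. `GasMeter`, `run_fold`, the `PRIMITIVE_COST` values, and the example fold are hypothetical; only the 100k default and the "state unchanged on failure" behaviour come from the text.

```python
class OutOfGas(Exception):
    """Raised when a charge would exceed the remaining budget."""


class GasMeter:
    def __init__(self, budget: int = 100_000):
        self.remaining = budget

    def charge(self, units: int = 1) -> None:
        if units > self.remaining:
            self.remaining = 0
            raise OutOfGas()
        self.remaining -= units


# Hypothetical per-primitive costs; the real ones would be declared in the spec.
PRIMITIVE_COST = {"dict-set": 3}


def count_by_type(state, activity, meter):
    """Example fold: count activities per type, charging gas as it goes."""
    meter.charge()                            # one evaluator step
    meter.charge(PRIMITIVE_COST["dict-set"])  # one primitive call
    new_state = dict(state)
    new_state[activity["type"]] = new_state.get(activity["type"], 0) + 1
    return new_state


def run_fold(fold, state, activity, budget: int = 100_000):
    """Return (new_state, None) on success, (old_state, error) on exhaustion."""
    meter = GasMeter(budget)
    try:
        return fold(state, activity, meter), None
    except OutOfGas:
        return state, {"reason": "resource-exhausted"}
```

Because the meter depends only on the fold and the activity, two hosts that charge the same units reach the same verdict for the same input.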
+ +Exceeding any budget is a hard failure; the call returns an error value, the fold's +state is unchanged, and the activity is tagged `projection-failed` for the projection. + +Gas accounting is part of the spec: every conforming host must charge the same +units for the same operations, so "this fold runs out of gas" is a deterministic +property of the (projection, activity) pair, not a host-specific outcome. + +### 11.6 Determinism gotchas + +The pure sandbox is only as deterministic as its primitives. Worth nailing down: + +- **Floating point.** IEEE 754 binary operations are bitwise-identical across + conforming hosts, but transcendentals (`sin`, `cos`, `log`, `exp`) are *not*, because + libm implementations differ. **Decision: floats are forbidden in pure mode unless + the projection declares `requires-deterministic-floats: true` and uses only the + IEEE 754 basic operations (+, -, *, /, sqrt, comparison, conversion).** For exact + arithmetic, use integers or rationals (fed-sx will provide a rational primitive). +- **Map / dict iteration order.** Must always be sorted by key in pure mode. The SX + spec mandates this for `for-each` and `map` over dicts; we tighten it: pure mode + forbids relying on insertion order. +- **String encoding.** All strings are UTF-8 NFC at ingestion; pure-mode operations + use byte-level comparison after normalization. Codepoint operations (`length`, + `substring`) return identical results across hosts because they operate on the + normalized form. +- **Integer overflow.** Pure mode uses arbitrary-precision integers (the SX spec + default). No undefined behaviour. Overflow is impossible. +- **Equality.** Structural equality (`equal?`) compared across hosts must yield the + same result for the same canonical-CID values. This implies dict equality is + order-independent (as it should be), and float equality follows IEEE 754 (NaN ≠ + NaN; +0.0 = -0.0).
+- **Error values.** When a primitive errors, the error must be representable as a + dag-cbor value with a stable CID across hosts. Reserve a `{:error :type ... :msg + ...}` shape; standard error types defined in the spec. + +### 11.7 Failure model + +A pure-mode call ends in one of three terminal states: + +1. **Success** — returns a value. Fold uses it as new state. +2. **Sandbox violation** — IO attempted, capability denied, etc. Returns a stable + error value; fold's state is unchanged; activity tagged + `{:projection-failed :reason :sandbox-violation :detail ...}`. +3. **Resource exhaustion** — gas, memory, IO budget exceeded. Same handling as + sandbox violation but with `:reason :resource-exhausted`. + +Crypto-mode failures (e.g. invalid signature) are *return values*, not exceptions — +verify returns boolean, sign returns either a sig or an error. This forces callers +to handle failure explicitly. + +Effectful-mode failures (network down, disk full) propagate to the operator as +errors but never affect projected state. The substrate retries effectful operations +according to the registry entry's policy (declared at registration). + +### 11.8 Conformance testing + +Cross-host equivalence isn't aspirational; it's tested. + +- **Spec test suite** ships projection equivalence tests: a corpus of (log slice, + projection CID, expected snapshot CID) tuples. Every conforming SX host must + produce the expected snapshot CID for each input. +- **Validator equivalence tests** likewise: (validator CID, activity, expected + result). +- **Codec equivalence tests:** (codec CID, value, expected encoded bytes), in both + encode and decode directions. +- **Sandbox isolation tests:** "this fold attempts to call `fetch`; expected + outcome: sandbox violation error with stable CID." + +Hosts run the conformance suite to claim "fed-sx pure-mode conformance." 
Failures +are publishable as `Test{result: failed, host: ..., projection: ...}` activities — +the conformance graph itself is federated. + +### 11.9 Operational implications + +- **The pure sandbox is the heart of cross-host federation.** Every divergence is a + spec bug or a host bug; both are caught by snapshot CID mismatches and surfaced + via `Test` activities. +- **Capability descriptors are the new audit trail.** "What can the IPFS storage + backend do?" is a question with a precise answer at any timestamp — the registered + capability CIDs. +- **Floats are mostly absent.** This is unusual but defensible — most state in the + substrate is ids, counts, sets, references. Numerical computation belongs in + effectful registry entries (e.g. an analytics projection that publishes summaries + as activities, projected by a downstream pure projection that just stores them). +- **Gas is part of the protocol.** Two hosts disagreeing about whether a fold runs + out of gas is a conformance failure. Spec primitive gas costs are normative. + +## 12. Bootstrap & genesis + +How a fresh instance starts with no log, where the initial registry entries come +from, and how the kernel evolves without bricking peers. + +### 12.1 The genesis problem + +The substrate is "everything is a `Define*` activity in the log." But on a fresh +instance the log is empty — so there are no `Define*` activities to tell the kernel +what `Create` means, how to verify a signature, or what dag-cbor is. Strict +turtles-all-the-way-down would deadlock startup. + +Solution: **the kernel ships with a baked-in genesis bundle** containing the minimal +set of definitions it needs to interpret its own log. The bundle is a constant of +the kernel binary; its CID is hardcoded; the kernel verifies on startup that the +bundle matches its hardcoded CID. After that, everything (including superseding the +bundled definitions themselves) goes through the activity log. 
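
The startup check on the genesis bundle can be sketched in a few lines of Python. This is an illustration under stated assumptions: a plain SHA-256 hex digest stands in for real CID computation, and `EMBEDDED_BUNDLE`, `HARDCODED_DIGEST`, `digest`, and `verify_genesis` are hypothetical names. In a real build the digest would be computed once at build time and embedded as a constant.

```python
import hashlib


class GenesisMismatch(Exception):
    """The embedded bundle does not hash to the value the binary expects."""


def digest(bundle: bytes) -> str:
    # Stand-in for CID computation: a bare SHA-256 hex digest.
    return hashlib.sha256(bundle).hexdigest()


def verify_genesis(bundle: bytes, expected: str) -> bytes:
    """Return the bundle if it matches; otherwise refuse to continue."""
    actual = digest(bundle)
    if actual != expected:
        raise GenesisMismatch(f"expected {expected}, got {actual}")
    return bundle


# Simulated build step: embed the bundle bytes and the digest computed over them.
EMBEDDED_BUNDLE = b'{"type": "fed-sx-genesis", "kernel-version": "1.0.0"}'
HARDCODED_DIGEST = digest(EMBEDDED_BUNDLE)
```

Any change to the embedded bytes, even a single appended space, makes `verify_genesis` raise, which corresponds to the kernel refusing to start.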
+ +The genesis bundle is *not* itself a federated artifact in the AP sense. It's the +dictionary you need before you can read any activities. Optionally, an actor can +`Create{GenesisRecord}` as their first published activity to advertise which genesis +they started from — informational, not load-bearing. + +### 12.2 Genesis bundle contents + +Minimal viable bundle (dag-cbor object, content-addressed): + +``` +{ + "type": "fed-sx-genesis", + "kernel-version": "1.0.0", + "envelope-spec": { ... }, // canonical schema for activity envelope + "object-spec": { ... }, // canonical schema for object envelope + "definitions": { + "activity-types": { + "Create": { "schema": , "semantics": }, + "Update": { "schema": , "semantics": }, + "Delete": { "schema": , "semantics": }, + "Announce": { "schema": , "semantics": } + }, + "object-types": { + "SXArtifact": { "schema": }, + "Note": { "schema": }, + "Tombstone": { "schema": }, + "DefineActivity": { "schema": }, + "DefineObject": { "schema": }, + "DefineProjection": { "schema": }, + "DefineValidator": { "schema": }, + "DefineCodec": { "schema": }, + "DefineTransport": { "schema": }, + "DefineAudience": { "schema": }, + "DefineProof": { "schema": }, + "DefineStorage": { "schema": }, + "DefineTrigger": { "schema": }, + "DefineSigSuite": { "schema": }, + "Snapshot": { "schema": } + }, + "sig-suites": { + "rsa-sha256-2018": { "verify": , "key-format": }, + "ed25519-2020": { "verify": , "key-format": } + }, + "codecs": { + "dag-cbor": { "encode": , "decode": }, + "raw": { "encode": , "decode": }, + "dag-json": { "encode": , "decode": } + }, + "projections": { + "activity-log": { "initial-state": ..., "fold": }, + "by-type": { "initial-state": ..., "fold": }, + "by-actor": { "initial-state": ..., "fold": }, + "by-object": { "initial-state": ..., "fold": }, + "actor-state": { "initial-state": ..., "fold": }, + "define-registry": { "initial-state": ..., "fold": }, + "audience-graph": { "initial-state": ..., "fold": } + }, + 
"validators": { + "envelope-shape": { "predicate": }, + "signature": { "predicate": }, + "type-schema": { "predicate": } + }, + "audience-predicates": { + "Public": { "member-of": }, + "Followers": { "member-of": }, + "Direct": { "member-of": } + } + }, + "capability-types": [ // schema for capability descriptors + "http-client", "http-server", + "fs-read", "fs-write", + "subprocess", "clock-read", "random-bytes" + ] +} +``` + +Each definition's body is **SX source**, not bytecode. The kernel evaluates it at +startup using the same SX evaluator user-published `Define*` artifacts use — there +is no privileged "native" path. The bootstrap is just SX loaded from the binary +instead of from the log. + +### 12.3 Hardcoded CID and verification + +The kernel binary contains: + +- The full genesis bundle (embedded as bytes). +- The CID computed over those bytes at build time. + +On startup: + +1. Compute the actual CID of the embedded bundle. +2. Compare to the hardcoded CID. +3. **Mismatch → refuse to start.** Either the binary has been tampered with or the + build process is broken. Either way, the operator should know immediately. +4. **Match → proceed.** Every running instance with a given kernel binary has + byte-identical bootstrap state — no version drift possible within a binary. + +The genesis CID is exposed at `GET /.well-known/sx-capabilities` so peers can see +which kernel version they're talking to. + +### 12.4 Fresh instance startup sequence + +``` +1. Load and verify genesis bundle (panic on mismatch) +2. Parse all definition SX sources, instantiate evaluator closures +3. Initialize registries from definitions (in the order: codecs → sig-suites → + validators → object-types → activity-types → audience-predicates → projections) +4. Open log file (create if missing) +5. Replay any existing log: for each activity, validate, then fold into each + projection (resuming from snapshots where available) +6. 
Load or generate actor keypair (filesystem path from config) +7. If actor has never published a Create{Person} for itself, generate and append + one as the first activity of this instance's outbox +8. Initialize HTTP server, wire routes +9. Open inbox: start accepting federated activities +10. Mark instance as ready +``` + +Steps 1-3 are the bootstrap. Step 5 is replay-and-project. Step 7 is the +"actor genesis" — every instance has at least one local actor; it publishes itself +as its first activity, and that activity (signed by the actor's own key) anchors all +subsequent activity from that actor. + +### 12.5 First activity — actor creation + +Every fresh actor's outbox starts with: + +```sx +(activity 'Create + :id "https://next.rose-ash.com/actors/giles/activities/" + :actor "https://next.rose-ash.com/actors/giles" + :published "" + :to ["https://www.w3.org/ns/activitystreams#Public"] + :object + :signature ) +``` + +Self-signed: the activity introduces the key it's signed with. Verifiers fetch the +actor doc embedded in the activity, find the key, verify against the activity. This +is the trust-on-first-encounter for a new actor — the same model AP uses. + +The kernel emits this automatically on first startup if the actor has no prior +activity. Subsequent actor changes (key rotation, profile updates) are `Update` +activities signed by an existing key. + +### 12.6 Joining federation + +A new instance has no peers initially. Discovery is operator-driven for v1: + +1. Operator configures one or more peer URLs (or a well-known seed list). +2. Instance fetches peer's actor doc and `/.well-known/sx-capabilities`. +3. Instance verifies it can interpret the peer's activities (envelope compatible, + sig suites overlap). Reports incompatibilities to operator. +4. If compatible, instance follows peer's primary actor (`POST /inbox` with a + `Follow` activity). +5. Peer streams or backfills outbox to this instance. +6. 
Activities arrive, validate, fold into local projections. + +Discovery beyond manual config (e.g. peer recommendations, federation directories) +is a v2 concern. + +### 12.7 Kernel version evolution + +The substrate must evolve without forcing every instance to upgrade in lockstep. +Three rules: + +**Rule 1: The activity envelope shape is forward-compatible only.** + +We may *add* optional fields to the envelope; we may not change semantics or remove +fields. Old activities still validate under new kernels. New activities with new +fields are accepted by old kernels (which ignore the unknown fields, store the raw +envelope, and project conservatively). + +This is the AP discipline. We adopt it strictly. If we ever need a breaking envelope +change, it's a major version (fed-sx 2.0) and instances at different majors don't +federate directly — only via bridges. + +**Rule 2: Everything else evolves via supersession.** + +New sig suite, new codec, new projection definition, new validator: publish a +`Define*` activity that supersedes the old one. Both old and new versions stay valid +at their respective timestamps. Old activities verify under old definitions; new +activities use new definitions. Time-aware lookup (§9.6, §10.6) makes this work. + +**Rule 3: New genesis bundles supersede old ones via published activities.** + +When the kernel team ships a new version with an updated bundle: + +- The new bundle's CID is different. +- Operators upgrading the kernel get the new bundle automatically. +- The new bundle's *contents* are largely supersession `Update{DefineProjection, + DefineValidator, ...}` activities relative to the old bundle's definitions. +- A peer running the old kernel sees these `Update` activities (when they appear in + followed outboxes) and *can* opt to load them dynamically (§12.8) or stay on the + old bundle definitions until the operator upgrades. + +In other words: the kernel binary evolution and the activity-log evolution are +parallel tracks. 
The binary determines what's *built in*; the log determines what's +*currently active*. They converge over time but don't have to be lockstep. + +### 12.8 Dynamic Define* loading + +When an instance receives an activity of `type: "PinV3"` and has no `DefineActivity{ +name: "PinV3"}` in its define-registry, it has three options (operator policy): + +- **Strict mode** — store the activity envelope (it's valid AP), tag it `unknown-type` + in `by-type`, do not project semantics. Operator must explicitly load the + definition to enable projection. +- **Permissive mode** — fetch the `DefineActivity{name: "PinV3"}` artifact (its CID + is in the activity's `capabilities-required` list), validate, evaluate the + semantics SX (in pure sandbox), reproject the activity. Operator notified. +- **Trusted-peers-only mode** — like permissive, but only auto-loads `Define*` from + actors on a configured trust list. + +Default for fed-sx v1: **strict mode**. Operators opt-in to broader policies. + +This lets the substrate genuinely live-extend — new verbs land via federation, no +binary upgrade — while keeping a clean audit trail of what got loaded when. + +### 12.9 Genesis as the substrate's manifest + +A useful framing: the genesis bundle is the substrate's **manifest** (in the package- +manager sense). It declares "this kernel ships with these definitions, identified by +these CIDs, and this is what the kernel does until the log says otherwise." + +Two instances with the same genesis CID start identical. Two instances with +different genesis CIDs can federate as long as their *active* registry states (after +log replay) overlap enough. + +The genesis bundle is also the **conformance reference**: a kernel implementation +claims fed-sx v1.0 conformance by reproducing the standard genesis bundle's CID +from its own build of the included SX sources. If two implementations build the same +spec sources and produce different CIDs, one of them is non-conformant. 
Cheap, +deterministic conformance check. + +### 12.10 Operational implications + +- **Build-time CID computation is part of the kernel build.** The build pipeline + must include the genesis-bundling step and embed the resulting CID. Mismatch + protection requires the binary to know what it expects. +- **Genesis evolution is a deliberate kernel-team decision.** Adding a new bundled + projection or sig suite is a kernel release, not a federated activity. (User- + defined projections still federate normally.) +- **Strict-mode default protects against malicious extensions.** Operators have to + consciously opt into auto-loading remote `Define*`. This trades convenience for + security — appropriate for v1. +- **Cross-major federation is a bridge problem.** If/when fed-sx 2.0 ships with an + envelope change, bridges between v1 and v2 are themselves federated artifacts — + built by anyone, signed, audited. + +## 13. Federation mechanics + +How instances exchange activities, how peers subscribe, how new followers backfill, +how delivery survives unreliable networks, and how the substrate resists abuse. + +### 13.1 Push, pull, hybrid + +ActivityPub canonically uses **push**: actor A publishes by POSTing each delivery to +each follower's inbox URL. This gives low latency and clear delivery semantics, but +requires a reliable per-recipient delivery queue and falls over when peers go down. + +fed-sx supports both, with a **push-primary, pull-fallback** model: + +- **Push** is the default delivery mechanism. When an activity is appended to A's + outbox, A's delivery worker posts it to each follower's inbox. +- **Pull** is always available: any peer can `GET /actors//outbox?since=` + and stream activities in order. Used for backfill, recovery from delivery gaps, + and instances that prefer pull-only operation. +- **Hybrid in practice:** push delivers *notifications* (the activity itself, or a + pointer to its CID); receivers may pull the full content if not inlined. 
Useful + when the activity body is large. + +Operators can configure their actors as push-only, pull-only, or hybrid. The +default is hybrid. + +### 13.2 The Follow lifecycle + +AP-standard, slightly tightened: + +```sx +;; A wants to follow B +(activity 'Follow + :actor "https://a.example/actors/alice" + :object "https://b.example/actors/bob") +;; → POST to B's inbox + +;; B accepts (or rejects) +(activity 'Accept + :actor "https://b.example/actors/bob" + :object ) +;; → POST to A's inbox + +;; A unfollows later +(activity 'Undo + :actor "https://a.example/actors/alice" + :object ) +;; → POST to B's inbox +``` + +State derived by the `audience-graph` projection on each instance: + +- `(followers actor)` — set of actors who follow `actor`, projected from + `Accept{Follow}` activities in `actor`'s outbox (and the inverse via received + `Follow` activities). +- `(following actor)` — symmetric. + +**Auto-accept by default.** Public actors auto-publish `Accept` for any incoming +`Follow`. Locked actors require manual approval, implemented as an operator UI that +publishes the `Accept` (or `Reject`) once a human decides. + +### 13.3 Backfill + +When A first follows B, A wants B's history. 
Four supported modes: + +| Mode | Mechanism | Trade-off | +|------|-----------|-----------| +| **No backfill** | Just stream new activities going forward | Cheapest, missing context for new followers | +| **Pull paginated** | `GET /outbox?since=epoch&limit=100` repeatedly | Standard, slow for large outboxes | +| **Snapshot fetch** | Find latest `Create{Snapshot}` published by B for the projection of interest, fetch + verify, then pull only activities after the snapshot's tip | Fast, requires B to publish snapshots | +| **Bundle fetch** | Out-of-band: B publishes a CID for an export bundle (a dag-cbor list of activities + actor doc + sig suite verification metadata); A fetches once, validates the chain, replays | Fastest for cold starts; bundle creation is opt-in | + +Default: snapshot fetch when available, paginated pull otherwise. + +A new instance joining federation typically combines: snapshot-fetch the +`actor-state` and `define-registry` projections from a trusted peer (so it knows who +exists and what verbs are defined), then incrementally backfill specific actors of +interest. + +### 13.4 Delivery queue and retry + +Every push delivery attempt has a fate: + +| Outcome | Action | +|---------|--------| +| 2xx | Mark delivered | +| 3xx | Follow redirect (with limit) | +| 4xx (except 429) | Mark *permanently failed* — peer rejected the activity. Log; don't retry. | +| 429 | Honour `Retry-After`; reschedule | +| 5xx | Exponential backoff; reschedule | +| Connection error | Exponential backoff; reschedule | + +**Retry schedule** (default, tunable per peer): + +``` +1 min, 5 min, 15 min, 1 h, 4 h, 12 h, 24 h, 48 h, 96 h +``` + +After the last attempt fails, the activity is **abandoned for push** but remains in +A's outbox. Followers can still pull it via `GET /outbox?since=...`. The peer will +eventually catch up if they come back online and pull. Push is best-effort; pull is +the source of truth. 
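
The outcome table and retry schedule above can be sketched as two small Python functions. The names `classify` and `next_delay_minutes` are hypothetical, and the sketch assumes each schedule entry is the delay before the next retry. Treating statuses outside 2xx to 5xx like 5xx is also an assumption, and a real worker would take the 429 delay from `Retry-After` instead of the schedule.

```python
from typing import Optional

# Default schedule from the text, in minutes:
# 1 min, 5 min, 15 min, 1 h, 4 h, 12 h, 24 h, 48 h, 96 h
RETRY_SCHEDULE_MIN = [1, 5, 15, 60, 240, 720, 1440, 2880, 5760]


def classify(status: Optional[int]) -> str:
    """Map an HTTP status (None means connection error) to a delivery fate."""
    if status is None:
        return "retry"        # connection error: back off and reschedule
    if 200 <= status < 300:
        return "delivered"
    if 300 <= status < 400:
        return "redirect"     # follow, subject to a redirect limit
    if status == 429:
        return "retry"        # honour Retry-After when rescheduling
    if 400 <= status < 500:
        return "failed"       # permanent: the peer rejected the activity
    return "retry"            # 5xx (and, by assumption, anything unexpected)


def next_delay_minutes(failed_attempts: int) -> Optional[int]:
    """Delay before the next retry, or None once push is abandoned."""
    if failed_attempts < len(RETRY_SCHEDULE_MIN):
        return RETRY_SCHEDULE_MIN[failed_attempts]
    return None
```

When `next_delay_minutes` returns `None` the activity stays in the outbox, so a returning peer can still pull it.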
+ +**Persistent queue.** Delivery state is itself stored in the local instance — it's +operator-internal, not federated. (Could be a regular SQLite table; doesn't need to +be a projection because it's not state-the-world-cares-about.) On instance restart, +the queue resumes from where it left off. + +**Queue-as-projection (alternative):** for instances that want every aspect to be +log-derived, the delivery state could be a local-only projection over a stream of +`Attempt` / `DeliverySuccess` / `DeliveryFailure` activities written to a private +local-only outbox. Out of scope for v1 but the design admits it. + +### 13.5 Audience-respecting delivery + +Each activity carries `to`, `cc`, `bto`, `bcc`. The delivery worker computes the +**delivery set**: union of explicit recipients + (if `as:Public` or `Followers` in +audience) the actor's followers projection. + +- `bto` and `bcc` are stripped before delivery (recipients shouldn't see who else is + blind-copied). +- **Receivers honour audience.** When an instance receives an activity it should + not be in the audience for (e.g. a `Direct` activity to someone else, leaked via a + misconfigured peer), it logs and discards. Validators in the inbound pipeline + enforce this. +- **Public ≠ unlisted.** `to: as:Public` means deliver to followers AND make + publicly fetchable AND show in public projections. Some actors prefer "publicly + fetchable but not pushed broadly" — `cc: as:Public` with `to: Followers`. + +### 13.6 Spam and abuse posture + +ActivityPub has well-known abuse vectors (Mastodon's history is instructive). fed-sx +defends in layers: + +**Signature verification.** Every inbound activity must have a valid signature +matching an actor whose key was active at `published`. Forgeries are dropped at the +envelope-validation stage (§14). Necessary but not sufficient — signatures only +prove the message wasn't tampered with, not that the sender is benign. 
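
The "key was active at `published`" condition amounts to a time-window lookup in the actor's key history. The Python sketch below shows only that lookup; the entry shape (`id`, `valid-from`, `valid-until`) and the function name `key_active_at` are hypothetical, and the cryptographic verification itself is left out.

```python
from datetime import datetime
from typing import Optional


def key_active_at(key_history: list, key_id: str, published: str) -> Optional[dict]:
    """Return the key entry that was valid at `published`, or None.

    Timestamps are ISO 8601 strings with an explicit UTC offset. A missing
    or None "valid-until" means the key has not been rotated out.
    """
    t = datetime.fromisoformat(published)
    for entry in key_history:
        if entry["id"] != key_id:
            continue
        start = datetime.fromisoformat(entry["valid-from"])
        end = entry.get("valid-until")
        if start <= t and (end is None or t < datetime.fromisoformat(end)):
            return entry
    return None


# Example history: k1 was rotated out when k2 was introduced.
HISTORY = [
    {"id": "k1", "valid-from": "2024-01-01T00:00:00+00:00",
     "valid-until": "2024-06-01T00:00:00+00:00"},
    {"id": "k2", "valid-from": "2024-06-01T00:00:00+00:00",
     "valid-until": None},
]
```

An activity signed with `k1` but published after the rotation finds no active key and would be dropped before any signature maths runs.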
+ +**Per-source rate limits.** Per-actor and per-instance request rate limits on +`/inbox`. Default: 100/min per actor, 1000/min per instance. Exceeded → 429. + +**Per-instance trust state.** Three categories, operator-configured (and +overridable per actor): + +- **Trusted** — auto-accept, auto-load Define* (if permissive mode), no rate- + multiplier penalty. +- **Default** — accept signed activities, standard rate limits, do not auto-load + Define*. +- **Suspended** — drop all inbound activities, refuse outbound delivery, do not + fetch artifacts. Operator decision (e.g. spam source, harassment instance). + +Trust state is local-only (operator policy); it is not federated. Different +instances can disagree. + +**Audience refusal.** Activities not addressed to anyone on this instance (no local +followers, not `as:Public`, not `to:` a local actor) are dropped on receipt. +Discourages spam targeting random instances. + +**Content validators.** Registry-driven content moderation: a `DefineValidator` +with `applies-to: "inbound"` runs against every inbound activity and can reject +based on content rules. Examples: link-spam detection, ML moderation models served +via an effectful validator (note: effectful validators are a special case — they +*can* fail-closed without affecting determinism, because validators happen *before* +projection and don't contribute to projected state). + +**Capability vetting.** If an inbound activity declares `capabilities-required` +that includes definitions this instance hasn't loaded *and* trust policy is strict- +mode, the activity is quarantined (stored but not projected) pending operator +review. + +**Federation circuit breakers.** Per-peer error rate triggers temporary defederation: +if a peer is sending malformed activities, exceeding rate limits, or signing with +revoked keys, automatic suspension for an exponential cool-off. + +### 13.7 Discovery + +How an instance finds other instances and actors: + +- **WebFinger** (RFC 7033). 
`GET /.well-known/webfinger?resource=acct:user@host` + returns links to actor URLs. AP-standard. fed-sx implements. +- **Well-known capabilities.** `GET /.well-known/sx-capabilities` (§7) for cross- + instance compatibility checks. +- **Manual peer config.** Operators add peer instance URLs to their config. +- **Peer recommendations.** An instance can publish `Recommend{actor}` activities + pointing at peers it considers worth following. Receivers can use these as + discovery hints (subject to local trust). Out of scope for v1 but the verb is + reservable. +- **Federation directories.** Community-maintained lists of instances; an instance + can opt into being listed by publishing a `Directory{listed-by}` activity. v2 + concern. + +For v1: WebFinger + capabilities + manual config. Discovery beyond that is opt-in +via standard verbs. + +### 13.8 Streaming and real-time + +Two streaming mechanisms: + +- **Outbox SSE** — `GET /actors//outbox/stream` opens a Server-Sent Events + connection. Each new activity appended to the outbox is sent as an event. Allows + pull-style federation peers to maintain a live connection without polling. +- **Projection SSE** — `GET /projections//subscribe` (§10.8) streams projection + deltas. Useful for clients (browsers) wanting reactive views. + +Both are local-only mechanisms; the canonical federation transport remains push to +inbox + pull from outbox. SSE is convenience, not protocol. + +### 13.9 Operational implications + +- **Push is best-effort, pull is authoritative.** Operators should treat the outbox + as the canonical record; delivery queue is bookkeeping. +- **Trust is per-instance and not federated.** Two instances may have different + views of "good actors" and "bad instances." This is a feature — defederation + decisions are local sovereignty. +- **Backfill via snapshots is the cheap path.** Encouraging actors to publish + `Create{Snapshot}` regularly makes new-follower onboarding fast. 
+- **Audience semantics are enforced both ways.** Senders compute delivery set; + receivers honour audience. Defence-in-depth against misconfigured peers. +- **Capability-based extension loading is opt-in.** Strict-mode default means + unknown verbs are stored-but-not-projected — safe by default, with explicit + operator control over what extensions load. + +## 14. Validation pipeline + +Every activity entering the substrate (whether published locally or received from a +peer) flows through a fixed pipeline of checks. Order matters: cheap and fail-safe +first, expensive and content-aware last. Each stage has a defined failure response +(reject, quarantine, drop). Registry-driven validators plug in at a specific stage. + +### 14.1 The two pipelines + +**Inbound** — activities arriving via `POST /inbox` or pulled from a peer's outbox: + +``` +HTTP transport → envelope → signature → replay → audience → + activity-type schema → object-type schema → content validators → + capabilities → trust state → log append → projection (async) +``` + +**Outbound** — activities being published locally via `POST /activity`: + +``` +authentication → authorization → envelope construction → object handling → + activity-type schema → signature → log append → projection (async) → + delivery (async) +``` + +Stages they share are implemented as the same SX functions called from both pipelines. 
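
The idea of two pipelines sharing stage functions can be sketched as two ordered lists that reference the same function object. This is an illustration in Python rather than SX, with hypothetical stand-in checks, and it assumes a run stops at the first failing stage.

```python
from typing import Callable, List, Optional, Tuple

# A stage returns None on success or a failure reason string.
Stage = Callable[[dict], Optional[str]]


def activity_type_schema(activity: dict) -> Optional[str]:
    """Stand-in for the activity-type schema stage shared by both pipelines."""
    return None if "type" in activity else "missing type"


def envelope(activity: dict) -> Optional[str]:
    """Stand-in for inbound envelope validation."""
    return None if "id" in activity and "actor" in activity else "bad envelope"


def authentication(activity: dict) -> Optional[str]:
    """Stand-in for outbound caller authentication."""
    return None if activity.get("caller") else "unauthenticated"


# Both lists point at the same function for the shared stage.
INBOUND: List[Tuple[str, Stage]] = [
    ("envelope", envelope),
    ("activity-type schema", activity_type_schema),
]
OUTBOUND: List[Tuple[str, Stage]] = [
    ("authentication", authentication),
    ("activity-type schema", activity_type_schema),
]


def run_pipeline(stages: List[Tuple[str, Stage]],
                 activity: dict) -> Optional[Tuple[str, str]]:
    """Run stages in order; return (stage, reason) for the first failure."""
    for name, stage in stages:
        reason = stage(activity)
        if reason is not None:
            return (name, reason)
    return None  # every stage passed
```

Keeping the shared stage as one function means a schema fix applies to inbound and outbound traffic at the same time.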
+ +### 14.2 Inbound pipeline — stage by stage + +| # | Stage | Check | Failure response | +|---|-------|-------|------------------| +| 1 | **Transport** | Valid HTTP request, content-type acceptable, body parseable as JSON-LD or dag-cbor | `400 Bad Request`; log | +| 2 | **Envelope** | Matches kernel's envelope spec (required fields present, types valid, recognised activity type or `unknown` allowed) | `400`; log; structured error in response body | +| 3 | **Signature** | Time-aware sig verification: fetch (or cache-lookup) actor doc, find key with `id == sig.key-id` that was active at `published`, verify against canonical envelope bytes per the named sig suite | `401`; log; do not retry; mark sender's instance for circuit-breaker accounting | +| 4 | **Replay** | Activity id and CID not already in `activity-log` projection | `200 OK` with `{status: "duplicate"}`, no-op | +| 5 | **Audience** | This instance has at least one local actor in `to`/`cc`, OR audience contains `as:Public`/`Followers` and the actor has local followers | Drop silently (no response indicating either acceptance or refusal — prevents inbox-membership probing); do not store | +| 6 | **Activity-type schema** | Look up `DefineActivity{name: }` in `define-registry`; run its `schema` predicate over the activity in pure sandbox | If type unknown: per trust policy (strict: 422 with missing-definition CID; permissive: attempt dynamic load §12.8). If schema fails: 422 with violation detail | +| 7 | **Object-type schema** | If activity has an `object` with a `type`, look up `DefineObject{name: }` and run its `schema` | Same as #6 | +| 8 | **Content validators** | All registered validators with `applies-to: inbound` or `applies-to: all` run sequentially; each is a pure-sandbox predicate that returns `:accept` / `:reject` / `:quarantine` | `:reject` → 422 with reason. 
`:quarantine` → store activity but mark `quarantined`, do not project, alert operator | +| 9 | **Capabilities** | Every CID in `capabilities-required` is present in this instance's loaded registries (or auto-loadable per trust policy) | Missing → 422 with list of missing CIDs (sender can deliver bootstrapping `Define*` artifacts first). Auto-load attempt can be triggered by re-POST with `?retry-after-load=true` | +| 10 | **Trust state** | Sender's actor and instance are not in `Suspended` state on this instance | Drop silently; do not respond | +| 11 | **Log append** | Write activity envelope (and inlined object content) to local mirror of sender's outbox; assign local sequence number | Disk error → 503 (transient); sender retries | +| 12 | **Projection** | Asynchronously fold the activity into every relevant projection (per `define-registry`) | Per-projection failure (gas, sandbox violation) → tag activity `projection-failed:`; does not affect log durability | + +Pipeline halts at the first failing stage. Stages 1–11 are synchronous (`POST /inbox` +holds the connection); stage 12 is asynchronous, and the +HTTP response returns once the log append succeeds. + +### 14.3 Outbound pipeline — stage by stage + +| # | Stage | Check | Failure response | +|---|-------|-------|------------------| +| 1 | **Authentication** | Caller has a valid bearer token, mTLS cert, or session for the actor | `401` | +| 2 | **Authorization** | Caller's identity is allowed to publish as the named `actor` (capability token §9.5 or owns the actor key) | `403` | +| 3 | **Envelope construction** | Kernel fills in `id`, `published`, normalises `to`/`cc`, computes `capabilities-required` (by walking referenced `Define*` CIDs) | n/a | +| 4 | **Object handling** | If `object` has inline content: canonicalize, compute CID, optionally store per `where`.
If `object` references a CID, verify the artifact exists locally or remotely (or accept as a forward reference) | Storage error → `503` | +| 5 | **Activity-type schema** | Same as inbound #6 — schema must pass | `422` with violation detail (caller bug) | +| 6 | **Signature** | Sign envelope with the actor's currently-active key matching the activity type's required `purpose` (e.g. `Pin` requires `purpose: pin`) | If no suitable key: `400` | +| 7 | **Log append** | Write to local outbox; assign sequence number | `503` | +| 8 | **Projection** | Async fold (same as inbound #12) | Per-projection failure tag | +| 9 | **Delivery** | Async push to follower inboxes per audience | Per-recipient retry per §13.4 | + +Caller's HTTP response returns after stage 7 (log append). The activity is durable +and queryable as soon as the response is sent; projection lag is reported via +`projected-up-to` headers and `?wait-for=` parameter. + +### 14.4 Failure response taxonomy + +Three response categories with explicit semantics: + +**Reject** — tell sender, don't store, reject can be retried after sender corrects. +Used for: malformed envelope, invalid signature, schema violation, missing +capabilities. HTTP 4xx with structured error. + +**Quarantine** — store envelope (it's a valid signed message) but don't project, +alert operator. Used for: content-validator soft-fail, unloaded capabilities under +permissive policy, suspect-but-not-banned senders. Activity sits in a quarantine +projection until operator reviews; operator can release (project) or expunge. + +**Drop silently** — don't store, don't respond informatively. Used for: replay (ack +as duplicate), audience refusal (would leak inbox membership otherwise), suspended- +sender activities. The sender experiences this as a successful POST with no visible +effect; they can detect it only by polling for their activity not appearing in our +outbox. 
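
The three categories above can be read as a lookup from failure condition to response. The sketch below only restates the "Used for" lists; the condition labels and helper names are hypothetical.

```python
from enum import Enum


class Response(Enum):
    REJECT = "reject"
    QUARANTINE = "quarantine"
    DROP = "drop"


# Condition labels are hypothetical; the grouping follows the "Used for" lists.
CATEGORY = {
    "malformed-envelope": Response.REJECT,
    "invalid-signature": Response.REJECT,
    "schema-violation": Response.REJECT,
    "missing-capabilities": Response.REJECT,
    "validator-soft-fail": Response.QUARANTINE,
    "unloaded-capabilities-permissive": Response.QUARANTINE,
    "suspect-sender": Response.QUARANTINE,
    "replay": Response.DROP,
    "audience-refusal": Response.DROP,
    "suspended-sender": Response.DROP,
}


def classify(condition: str) -> Response:
    """Map a failure condition to its response category."""
    return CATEGORY[condition]


def stores_envelope(response: Response) -> bool:
    """Only quarantine keeps the envelope; reject and drop store nothing."""
    return response is Response.QUARANTINE
```

Routing every failure through one table like this keeps the "do we store it?" decision in a single place.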
+ +### 14.5 Registry-driven validators + +Most of the pipeline is **fixed kernel logic** (envelope, signature, replay, audience, +log append, delivery). Two stages are **registry-driven** and extend dynamically: + +- **Stage 8 (content validators)** — operators add/remove `DefineValidator` entries + with `applies-to: inbound | outbound | all`. Each runs in pure or effectful + sandbox per its declaration. Returns one of `:accept` / `:reject{:reason}` / + `:quarantine{:reason}`. +- **Stages 6–7 (schema validators)** — these *are* registry entries + (`DefineActivity.schema`, `DefineObject.schema`); the pipeline calls into the + registry to fetch them. + +**Pure-mode validators** are deterministic and cheap; results can be cached per +(activity-CID, validator-CID). + +**Effectful-mode validators** can call out to ML models, blocklist services, +external moderation APIs. They get a per-call IO budget; exceeding it counts as +`:reject{:reason :validator-timeout}`. Effectful validators do *not* break +determinism because validation happens **before projection** — a rejected activity +never enters projected state. + +### 14.6 Validator composition and ordering + +Validators have an integer `priority` field; lower values run first. The pipeline +short-circuits on the first `:reject`. `:quarantine` is *not* short-circuiting; later +validators still run, and `:quarantine` results aggregate. + +Default priorities (room for operator-added validators): + +``` +0-99 : kernel-internal (envelope, sig, replay, audience) +100-199 : standard schema validators +200-299 : standard content validators (rate limit, audience leak) +300-399 : operator-added moderation +400-499 : effectful (ML, third-party APIs) +500+ : reserved +``` + +Operators can publish `Update{DefineValidator}` to change priorities or add new +ones; the change takes effect on the next inbound activity.
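
The ordering and short-circuit rules of §14.6 can be sketched as a single loop. This is an illustration under stated assumptions: the `Validator` record, the example checks and their priority numbers are hypothetical, and verdicts are modelled as plain strings.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Verdict = Tuple[str, str]  # (":accept" | ":reject" | ":quarantine", reason)


@dataclass(frozen=True)
class Validator:
    name: str
    priority: int
    check: Callable[[dict], Verdict]


def run_validators(validators: List[Validator],
                   activity: dict) -> Tuple[str, List[str]]:
    """Run validators in ascending priority order.

    The first :reject ends the run; :quarantine reasons accumulate.
    """
    quarantine_reasons: List[str] = []
    for validator in sorted(validators, key=lambda v: v.priority):
        verdict, reason = validator.check(activity)
        if verdict == ":reject":
            return (":reject", [reason])
        if verdict == ":quarantine":
            quarantine_reasons.append(reason)
    if quarantine_reasons:
        return (":quarantine", quarantine_reasons)
    return (":accept", [])


def too_many_links(activity: dict) -> Verdict:
    links = activity["content"].lower().count("http")
    return (":quarantine", "too many links") if links > 2 else (":accept", "")


def shouting(activity: dict) -> Verdict:
    return (":quarantine", "all caps") if activity["content"].isupper() else (":accept", "")


def blocked_word(activity: dict) -> Verdict:
    return (":reject", "blocked word") if "spam" in activity["content"].lower() else (":accept", "")


# Deliberately listed out of order; run_validators sorts by priority.
VALIDATORS = [
    Validator("blocked-word", 350, blocked_word),
    Validator("too-many-links", 300, too_many_links),
    Validator("shouting", 320, shouting),
]
```

Because a later `:reject` discards earlier quarantine reasons in this sketch, an operator reading the result sees either one rejection reason or the full list of quarantine reasons, never a mix.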
+ +### 14.7 Determinism requirement and its limit + +A subtlety worth being explicit about: **inbound validation is not required to be +deterministic across instances.** Two instances can disagree about whether to +accept a given activity (e.g. one has a stricter content validator). Their projected +states will then diverge — but only on activities one accepted and the other didn't. + +This is fine. Federation does not require state convergence; it requires *fold +determinism for activities both instances accepted*. Validators are sovereignty +controls, not protocol invariants. + +Where determinism *is* required: schema validators (§14.2 stages 6–7). If two +instances disagree on whether `Pin v3` matches its schema, they can't federate +`Pin v3` activities meaningfully. So schema validators must be pure-mode and +referenced by CID. + +### 14.8 Operational implications + +- **The pipeline is the security perimeter.** Every checkable property is checked + here, not deeper in the kernel. No "trust the caller" assumptions inside log or + projection code. +- **Quarantine is the operator's friend.** Anything suspicious sits in quarantine + with full envelope, sig, and reason — operator can review and decide. Better than + outright drop because it preserves audit. +- **Schema validators are protocol-load-bearing; content validators are policy.** + The first set must converge across instances for federation to work; the second + set can diverge (and that's how local moderation policy is expressed). +- **Outbound validation catches local bugs early.** A malformed `Pin` activity + fails at outbound stage 5, never enters the local log, never gets delivered. + +## 15. Storage layout + +The on-disk shape of an instance. Three concerns kept separate: the **activity log** +(append-only, canonical), **content-addressed object storage** (keyed by CID, +immutable), and **operational state** (projections, indexes, queues — derived, +rebuildable). 
+ +### 15.1 Storage tiers + +``` +/var/lib/fed-sx/ +├── log/ # canonical, append-only +│ ├── actors/ +│ │ ├── / +│ │ │ ├── outbox/ +│ │ │ │ ├── 000001.jsonl # segment, ~64MB cap +│ │ │ │ ├── 000002.jsonl +│ │ │ │ └── tip # symlink to current segment +│ │ │ ├── inbox/ # received, pre-projection +│ │ │ └── seq # next sequence number +│ │ └── /... +│ └── mirrors/ # local mirrors of followed remote outboxes +│ └── / +│ ├── 000001.jsonl +│ └── ... +├── objects/ # CID → bytes +│ └── // +├── snapshots/ +│ └── / +│ ├── .cbor # snapshot value +│ └── index # ordered list of (log-tip, file) +├── projections/ # live projection state +│ └── .cbor # latest in-memory state, periodically flushed +├── indexes/ +│ └── fed-sx.db # SQLite: lookups, queue, trust state +├── keys/ +│ └── / # private keys, mode 0600 +│ ├── primary.pem +│ ├── recovery.pem +│ └── sigs.toml # key metadata +├── genesis/ +│ └── bundle.cbor # extracted from binary at first run +└── config.toml # operator config +``` + +### 15.2 The log — append-only segments + +The activity log is the only thing the substrate cannot lose. It is the source of +truth from which everything else is derived. + +**Format: JSONL segments.** Each line is one activity envelope, encoded as JSON-LD +(canonical form), terminated by `\n`. Easy to inspect, easy to grep, trivially +streamable. + +**Why JSON-LD on disk, not dag-cbor?** Two reasons: +- Operability: humans can `tail -f` and `grep` the log. dag-cbor is opaque. +- AP wire compatibility: activities arrive over HTTP as JSON-LD anyway; storing the + same form avoids round-trip conversion. + +The CID of each activity is computed from its **canonical dag-cbor representation** +(per §2), independent of how it's stored. CIDs are stable across storage formats. + +**Segments cap at ~64MB.** Rotation by size, not time. Old segments are immutable; +new writes go to the tip segment. Compression (zstd) applied on segments older than +the current tip — saves disk, doesn't slow appends. 
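
Size-based rotation of JSONL segments can be sketched as follows. This is a minimal illustration, not the kernel's writer: `append_activity` is a hypothetical helper, `json.dumps` with sorted keys stands in for real canonical JSON-LD, and the `tip` symlink and zstd compression of older segments are omitted.

```python
import json
import os
import tempfile

SEGMENT_CAP_BYTES = 64 * 1024 * 1024  # ~64MB cap from the text


def segment_name(number: int) -> str:
    return f"{number:06d}.jsonl"


def append_activity(outbox_dir: str, envelope: dict,
                    cap: int = SEGMENT_CAP_BYTES) -> str:
    """Append one envelope as a JSONL line, rotating when the tip would exceed cap."""
    os.makedirs(outbox_dir, exist_ok=True)
    existing = sorted(f for f in os.listdir(outbox_dir) if f.endswith(".jsonl"))
    current = int(existing[-1][:6]) if existing else 1
    # Stand-in serialisation; the real log stores canonical JSON-LD.
    data = (json.dumps(envelope, sort_keys=True, separators=(",", ":")) + "\n").encode("utf-8")
    path = os.path.join(outbox_dir, segment_name(current))
    if os.path.exists(path) and os.path.getsize(path) + len(data) > cap:
        current += 1  # old segment stays immutable; writes move to a new tip
        path = os.path.join(outbox_dir, segment_name(current))
    with open(path, "ab") as segment:
        segment.write(data)
        segment.flush()
        os.fsync(segment.fileno())  # the log must survive a crash
    return path


# Demo with a tiny cap so rotation is visible: each line here is 25 bytes.
with tempfile.TemporaryDirectory() as outbox:
    for activity_id in ("a", "b", "c"):
        append_activity(outbox, {"id": activity_id, "type": "Note"}, cap=60)
    segments = sorted(os.listdir(outbox))
    with open(os.path.join(outbox, segments[0]), encoding="utf-8") as first:
        first_segment_lines = first.read().splitlines()
```

Rotating before the write, rather than after, keeps every closed segment at or under the cap.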
+ +**Per-actor outboxes.** Each local actor has its own outbox directory. This matches +AP semantics (one outbox per actor) and means: +- Backing up a single actor is a simple directory copy +- Per-actor sequence numbers (no cross-actor coordination) +- Migration (`Move`) is a directory rename + a `Move` activity + +**Mirror outboxes.** When a local actor follows a remote one, the remote's outbox is +mirrored locally for replay. Same JSONL format. Tracked under `log/mirrors//` to avoid filesystem path issues with URL characters. The hash is +purely a filesystem-friendly encoding; the canonical actor id stays in the log +content. + +**Inbox vs outbox distinction.** Inboxes hold *received* activities pre-validation; +outboxes hold *committed* activities post-pipeline. An inbound activity that passes +the validation pipeline (§14) is moved from inbox to the appropriate mirror outbox. +This makes inbox a transient queue, not a permanent record. + +### 15.3 Object storage + +Content-addressed blob store, sharded directories. + +**Path scheme:** `objects///`. Sha2-256 CIDs +are uniformly distributed; this gives ~65k buckets with a couple-hundred files each +at moderate scale. Standard pattern (matches IPFS, Git). + +**Storage backends.** Pluggable per `where: cid` object: + +- **`files-on-disk`** (default) — write to local filesystem. +- **`ipfs`** — register-driven backend; calls out to a local IPFS node. +- **`s3`** — object storage in cloud bucket. +- **`memory-only`** — in-memory cache, evictable; useful for ephemeral artifacts. + +The kernel uses the `where-tag` on each object to dispatch to the correct backend. +Backends are registry entries (`DefineStorage`); operators install only the ones +they want. + +**Garbage collection** is opt-in per backend. Default policy: **never GC** (objects +are immutable and may be referenced by future activities). 
Operators can configure +per-backend retention rules: + +- "Keep last N versions of objects referenced by `Pin` activities for path X" +- "Evict objects not referenced in last 90 days from the `memory-only` cache" +- "Mirror objects referenced by ≥ 3 endorsements; evict others after 30 days" + +GC operates on the projected reference graph (a `reference-graph` projection that +maintains "what activities reference this CID"). Removing an object that's still +referenced is allowed but produces a warning logged in operations. + +### 15.4 Snapshots + +Per §10.4, snapshots are the (projection-CID, log-tip-CID, state) triples that let +us resume without full replay. + +**Storage:** `snapshots//.cbor`. The state value is +dag-cbor-encoded; the file's content CID matches the snapshot's claimed CID. + +**Index:** `snapshots//index` is a sorted list of `(log-tip-time, +log-tip-cid, file)` triples. On startup, kernel finds the latest snapshot ≤ current +log tip and resumes from it. On time-travel queries, finds the latest snapshot +≤ target time and folds forward. + +**Retention:** keep at least: +- Latest snapshot per active projection +- Snapshots referenced by published `Create{Snapshot}` activities (federation + proofs) +- One snapshot per day for the last 7 days (audit / time-travel) + +Older snapshots GC'd by default. Operators can increase retention. 
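
The index lookup described in this section (latest snapshot at or before a target) is a binary search over the sorted index. The sketch below assumes the index is already loaded as a list of triples and uses integer Unix seconds for `log-tip-time`; both choices are assumptions, as is the helper name.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# (log-tip-time, log-tip-cid, file); times are Unix seconds in this sketch.
IndexEntry = Tuple[int, str, str]


def latest_snapshot_at_or_before(index: List[IndexEntry],
                                 target_time: int) -> Optional[IndexEntry]:
    """Return the newest snapshot whose log-tip-time is <= target_time."""
    times = [entry[0] for entry in index]
    position = bisect_right(times, target_time)
    return index[position - 1] if position > 0 else None


index = [
    (1000, "cid-a", "a.cbor"),
    (2000, "cid-b", "b.cbor"),
    (3000, "cid-c", "c.cbor"),
]
```

When the lookup returns `None` there is no usable snapshot, and the fold has to start from the beginning of the log.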
+ +### 15.5 Operational state — SQLite + +Things that are derived, frequently-queried, but not federated: + +- **Lookup indexes** for projections (when `indexes:` declared) — `(projection, + index-key, value) → activity-cid` rows +- **Delivery queue** — outbound activities pending push, retry counts, next-attempt + timestamps +- **Trust state** — per-actor and per-instance trust levels (Trusted / Default / + Suspended) +- **Quarantine queue** — activities pending operator review +- **Configuration cache** — currently-active registry entries (also in memory; on- + disk cache for fast restart) + +Single SQLite file (`indexes/fed-sx.db`). Recoverable: if corrupted or deleted, +rebuilt from the log on next startup (with cost proportional to log size). The +SQLite is a cache, not authoritative. + +WAL mode for concurrent readers. Single-writer (the kernel); reads from many +HTTP request workers. + +### 15.6 Backup and export + +The substrate is an append-only log of immutable artifacts; backup is simple. + +- **Full backup:** rsync `/var/lib/fed-sx/log/` and `/var/lib/fed-sx/objects/`. The + rest is rebuildable. +- **Per-actor export:** tar `log/actors//` + the objects referenced by + activities in that outbox. Self-contained, importable into another instance. +- **Activity bundle export:** for federation backfill, produce a dag-cbor bundle of + `[activity envelopes... + referenced objects]` for a specified actor + range. + Single file, content-addressed, signed by the source instance with a `Bundle` + activity attesting to its contents. + +Exports are themselves publishable (`Create{Bundle}` activity carrying the bundle +CID). This is how an actor migrates instances cleanly: export bundle, import on +new instance, publish `Move` activity. + +### 15.7 Mirroring and replication + +Two patterns: + +- **Federation mirroring** (the canonical kind) — when actor A follows B, A's + instance mirrors B's outbox locally. This is just normal federation (§13). 
Each + follower keeps its own copy. +- **Operational mirroring** — for high availability. An operator runs two instances + with shared filesystem (NFS / EFS) for `log/` and `objects/`, separate SQLite + files. Reads can hit either; writes go through one. Or: rsync-based hot standby + with manual failover. + +Operational mirroring is out of scope for v1. Federation mirroring is the substrate- +level redundancy: as long as one peer that followed you is still online, your log is +still recoverable. + +### 15.8 Storage size estimates + +Rough targets at moderate scale (10 active local actors, 1000 followed peers, 1 +year of activity at 100 activities/actor/day): + +- **Log:** 10 actors × 100 act/day × 1 KB avg envelope × 365 days ≈ 365 MB local + outbox. Mirrors: 1000 peers × 10 act/day × 1 KB × 365 ≈ 3.6 GB. +- **Objects:** depends heavily on content. Assume 50% of activities have inline + content of avg 5 KB → ~2 GB total inline. CID-referenced larger objects: count + separately, depends on use case. +- **Snapshots:** typically much smaller than the log. ~10 active projections × + ~10 MB per snapshot × ~8 retained snapshots ≈ 800 MB. +- **SQLite:** index sizes proportional to indexed projection content; typical few + hundred MB. + +Total: order of 10 GB at the described scale. Single-machine viable; SSD recommended +for log throughput; spinning disk fine for snapshots and object storage cold tier. + +### 15.9 Operational implications + +- **The log is sacred.** Never modify, never delete. Backups go to multiple media. + Loss of `log/` means loss of identity (actor activities) and loss of state-of- + record. Loss of `objects/` means loss of content but log + peers can recover most + of it. +- **Everything else is rebuildable.** Projections, indexes, snapshots, queue state + can all be recomputed from the log at startup cost. Operationally, this means + upgrades and migrations are forgiving. 
+- **CID-addressed storage is naturally idempotent.** Two instances writing the same + artifact write the same bytes to the same path. Race conditions become no-ops. +- **JSONL on disk pays for itself** the first time an operator needs to debug a + weird federation issue with `grep` and `jq`. Worth the storage cost vs dag-cbor. + +## 16. API surface + +HTTP API for reading the log, publishing activities, querying projections, and +streaming updates. Three layers: **AP-standard** endpoints (for vanilla AP +interop), **fed-sx-specific** endpoints (publish, query, capabilities), and +**discovery** endpoints (webfinger, well-known). + +### 16.1 Endpoint catalog + +#### AP-standard + +| Method | Path | Purpose | +|--------|------|---------| +| GET | `/actors/` | Actor doc (Person/Service/Group/Application) | +| GET | `/actors//inbox` | Read inbox — auth required | +| POST | `/actors//inbox` | Receive federated activity (HTTP Signature required) | +| GET | `/actors//outbox` | OrderedCollection of actor's published activities | +| POST | `/actors//outbox` | AP-standard publish (alias for `POST /activity` with `actor` set) | +| GET | `/actors//followers` | OrderedCollection of follower actor URIs | +| GET | `/actors//following` | OrderedCollection of followed actor URIs | +| GET | `/activities/` | Single activity by id | +| GET | `/objects/` | Single object by id (note: distinct from CID-addressed `/artifacts/`) | + +#### fed-sx-specific + +| Method | Path | Purpose | +|--------|------|---------| +| POST | `/activity` | Generalised publish — accepts any well-formed activity | +| GET | `/artifacts/` | CID-addressed artifact fetch (content negotiated) | +| GET | `/artifacts//raw` | Raw bytes (whatever the codec stored) | +| GET | `/artifacts//` | IPLD path traversal into the artifact | +| GET | `/projections` | List of registered projections (name, CID, last-folded-tip) | +| GET | `/projections/` | Full projection state (paginated for large states) | +| GET | 
`/projections/?at=` | Time-travel: state as of timestamp | +| GET | `/projections//` | Single key from a projection (uses indexes) | +| POST | `/query` | Run an SX query expression against one or more projections | +| GET | `/define-registry` | Currently active `Define*` artifacts by kind | +| GET | `/capabilities/` | Per-actor declared capabilities | + +#### Discovery and well-known + +| Method | Path | Purpose | +|--------|------|---------| +| GET | `/.well-known/webfinger?resource=acct:@` | RFC 7033 actor discovery | +| GET | `/.well-known/sx-capabilities` | This instance's capability advertisement (§7) | +| GET | `/.well-known/host-meta` | XRD describing the host | +| GET | `/.well-known/nodeinfo` | Standard fediverse node metadata (Mastodon, Pleroma compatibility) | + +#### Real-time (SSE) + +| Method | Path | Purpose | +|--------|------|---------| +| GET | `/actors//outbox/stream` | New activities as they're appended (events: `activity`) | +| GET | `/actors//inbox/stream` | New inbound activities (auth required) | +| GET | `/projections//subscribe` | Projection deltas (events: `delta`) | +| GET | `/federation/health/stream` | Per-peer delivery health (events: `peer-status`) | + +WebSocket equivalents (`/ws/...` paths) are available where SSE is awkward (browsers +behind proxies); same event payloads, different framing. + +### 16.2 Authentication + +Three mechanisms, each appropriate to a different caller type: + +- **HTTP Signatures** (Internet-Draft draft-cavage-http-signatures) — the AP-standard mechanism + for inter-instance calls. Sender signs a digest of relevant headers + body with + their actor's private key; receiver verifies via the actor's public keys + projection (§9.6). Used for: `POST /inbox`, peer-to-peer outbox pulls when + authentication is desired. +- **Bearer tokens** — for interactive clients (CLIs, web UIs, mobile apps). + Issued via OAuth2 (or simple admin-issued tokens for v1).
Used for: + `POST /activity`, `GET /actors//inbox`, anything requiring caller identity. +- **Capability tokens** (§9.5) — for delegated publish. Token includes the granting + actor, the granted capabilities (e.g. `publish: Pin for path-prefix /docs/`), the + bearer's actor, expiry, and signature from the granter. Used for: child actors, + service accounts, temporary publish access. + +Public reads (most GET endpoints to public-audience activities) require no auth. +Private/followers-only reads check the caller's identity against the audience. + +### 16.3 Content negotiation + +Same resource, multiple representations. `Accept` header dispatches: + +| Accept header | Returns | +|---------------|---------| +| `application/activity+json` | AP-standard JSON-LD (default for ambiguous Accepts) | +| `application/ld+json; profile="..."` | JSON-LD with explicit profile | +| `application/cbor` | dag-cbor | +| `application/json` | Plain JSON (compact, no `@context` expansion) | +| `application/sx` | Canonical SX wire format | +| `text/html` | HTML representation (for browsers — renders the artifact via SX) | + +Same negotiation applies to `/artifacts/`, `/activities/`, +`/projections/`. Servers MUST honour the request; absent `Accept` defaults to +`application/activity+json`. + +### 16.4 Pagination + +Cursor-based via AP's `OrderedCollectionPage`: + +``` +GET /actors/giles/outbox +→ { + "type": "OrderedCollection", + "totalItems": 12345, + "first": "/actors/giles/outbox?page=true", + "last": "/actors/giles/outbox?page=true&min_id=0" + } + +GET /actors/giles/outbox?page=true +→ { + "type": "OrderedCollectionPage", + "id": "...?page=true", + "next": "...?page=true&max_id=", + "prev": "...?page=true&min_id=", + "orderedItems": [...] + } +``` + +Cursors are CIDs of the boundary activity (not opaque tokens). Stable across +restarts and instances. `max_id` returns activities **before** the cursor (newest +first); `min_id` returns activities **after** the cursor. 
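
The cursor semantics above can be sketched over an in-memory list of activity CIDs, oldest first. This is an illustration only: `page` is a hypothetical helper, the clamp of 1 to 1000 items per page and the newest-first ordering of `min_id` pages are assumptions, and an unknown cursor simply raises `ValueError`.

```python
from typing import List, Optional


def page(log: List[str], max_id: Optional[str] = None,
         min_id: Optional[str] = None, limit: int = 50) -> List[str]:
    """Return one page of CIDs, newest first.

    `log` holds activity CIDs in append order (oldest first).
    """
    limit = max(1, min(limit, 1000))  # assumed clamp
    if max_id is not None:
        older = log[:log.index(max_id)]  # strictly before the cursor
        return list(reversed(older))[:limit]
    if min_id is not None:
        start = log.index(min_id) + 1  # strictly after the cursor
        return list(reversed(log[start:start + limit]))
    return list(reversed(log))[:limit]  # first page: newest activities


log = ["cid-1", "cid-2", "cid-3", "cid-4", "cid-5"]
```

Because the cursor is the CID of a boundary activity, the same request resolves to the same page after a restart, as long as the log itself is unchanged up to that point.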
+ +Default page size: 50. Max: 1000. `Link: <...>; rel="next"` header also provided +for HTTP-native pagination. + +For projections: same shape, items are projection entries. + +### 16.5 The query API + +`POST /query` takes an SX expression evaluated in pure mode against named +projections: + +```sx +POST /query +Content-Type: application/sx +Accept: application/sx + +(let ((actors (projection actor-state)) + (pins (projection pin-state))) + (for-each ([(actor-id actor) actors]) + (when (> (count (filter (fn ((path cid)) (= (:owner cid) actor-id)) pins)) 10) + {:actor (:preferredUsername actor) + :pins-published (count ...)}))) +``` + +Query semantics: + +- Evaluated in pure sandbox; all the determinism rules apply. +- Projection access is read-only and snapshot-consistent: the query sees state + as-of the time of the request (or `?at=` if specified). +- Result is serialized in the negotiated content type. +- Gas limit applies (default 1M units per query, tunable by operator). +- Cacheable: query CID + projection state CIDs uniquely determine the result. + +Query results can themselves be published as `Create{QueryResult}` activities, +making derived analyses federable. + +### 16.6 Errors + +Uniform JSON error envelope: + +```json +{ + "error": { + "type": "https://next.rose-ash.com/ns/fed-sx/errors/v1#InvalidSignature", + "status": 401, + "title": "Activity signature invalid", + "detail": "Key id 'https://example/actors/x#key-1' was superseded at 2026-01-15T...", + "activity-id": "https://...", + "key-id": "...#key-1", + "instance": "/incidents/" + } +} +``` + +Error types are URIs in the fed-sx namespace; receivers can check `type` for +programmatic handling. 
Standard errors: + +- `MissingCapability` — includes `missing` array of CIDs +- `SchemaViolation` — includes `schema-cid`, `field-path`, `expected`, `got` +- `InvalidSignature` +- `Quarantined` — includes `quarantine-id` for operator-status tracking +- `RateLimited` — includes `retry-after` +- `ResourceExhausted` — for query gas exhaustion + +### 16.7 Streaming details + +SSE event format: + +``` +event: activity +id: +data: { ...activity envelope... } + +event: delta +id: +data: {"projection": "actor-state", "key": "...", "old": ..., "new": ...} + +event: heartbeat +data: {"projected-up-to": "", "ts": "..."} +``` + +Clients reconnect with `Last-Event-ID: ` to resume from the last event seen. +Server replays from that point in the log (or returns 410 if too far behind, in +which case client should switch to paginated pull). + +### 16.8 Versioning + +The substrate is versioned at three levels: + +- **Envelope version** — declared in `/.well-known/sx-capabilities`. Currently `1`. + Forward-compatible (new fields OK; semantics fixed). +- **API version** — URL prefix optional: `/v1/...` works the same as `/...`. Future + major version: `/v2/...` paths in parallel. +- **Definition versions** — supersession via activity log (§§9.2, 12.7). No special + URL handling. + +Capability negotiation happens before federation; clients shouldn't hard-code +URL paths beyond the canonical set documented here. + +### 16.9 Operational implications + +- **The API is small but layered.** AP compatibility is one layer; fed-sx + extensions are another; both share auth and content negotiation. Adding a new + endpoint shouldn't require new transport machinery. +- **Content negotiation is the polyglot bridge.** Same artifact addressable in JSON- + LD (for AP peers), dag-cbor (for fed-sx peers), SX (for SX clients), HTML (for + humans). One CID, four representations. 
+- **Cursor pagination is CID-based.** Stable identifiers, no opaque tokens to + invalidate, peers can synchronize without coordination. +- **The query API is a load-bearing differentiator.** Datalog/GraphQL-equivalent + expressiveness with no separate query language — it's just SX. Federable, signable, + versionable like any other SX artifact. + +--- + +## 17. Implementation languages + +Polyglot **authoring**, monoglot **runtime**: every language-on-SX compiles to core +SX and runs on any host with the SX evaluator. The language is an authoring choice; +the federated artifact is uniform SX. Authors of `Define*` artifacts pick the +source language they prefer; consumers don't need that compiler installed to +execute the compiled SX. + +Languages are picked because they **genuinely fit the problem**, not to demonstrate +the polyglot story. Where a chosen language has gaps (e.g. Erlang-on-SX missing hot +reload), we invest in maturing the port rather than working around the gap. + +### 17.1 The v1 stack + +| Layer | Language | Why | +|-------|----------|-----| +| **Native primitives** | OCaml (existing runtime) | Crypto (RSA, Ed25519, SHA), dag-cbor encode/decode, HTTP socket, file IO, SQLite. Surfaced as Erlang-on-SX BIFs. | +| **Kernel orchestration** | Erlang-on-SX | Actor model = federation. `gen_server` per actor / per projection / per peer. `supervisor` for delivery workers. Message passing is literally the substrate. Hot code reload (Phase 7) for `Define*` live extension. | +| **Query API back-end** | Datalog-on-SX | Projection state is relational; trust graph walks, provenance, projection joins are textbook Datalog. Already mature (276/276 tests, full core Datalog with stratified negation, aggregation, magic sets, federation-graph demo). | +| **`Define*` semantics, schemas, validators, codecs, audience predicates** | Core SX | The canonical federated language. Everything content-addressed and federated lives here. 
| + +### 17.2 Languages explicitly **not** booked for v1 + +Available, mature, considered — would be reached for if a real fed-sx need surfaced, +but no preemptive use: + +- **Haskell-on-SX** (285/285 tests, 36 programs, type checker working) — for complex + operator-authored extensions that benefit from typed pattern matching. Schemas in + fed-sx are short predicates; types don't earn their keep here. +- **Smalltalk-on-SX** (625/629 tests, classic corpus running) — natural fit for a + live operator dashboard / Glamorous-Toolkit-style introspection. v2/v3 territory; + a browser UI likely wins for operator audiences. +- **APL-on-SX** — high-throughput batch reprojection if scalar SX folds become a + bottleneck. Premature without measured need. +- **JS-on-SX**, **Elm-on-SX** — browser-side client SDK / viewer. v2. +- **Common Lisp-on-SX**, **Forth-on-SX**, **Go-on-SX**, **Dream-on-SX**, + **Elixir-on-SX**, **Erlang-on-SX (alternative form)** — case by case if a use + case appears. + +### 17.3 The FFI BIF layer + +Erlang-on-SX has no FFI / NIF mechanism in its current form (Phase 6 plan: "out of +scope entirely"). fed-sx adds a **BIF layer** in `lib/erlang/transpile.sx` (or a +dedicated `lib/erlang/fed_bifs.sx`) exposing native primitives: + +``` +crypto:rsa_verify/3 crypto:ed25519_verify/3 +crypto:sha2_256/1 crypto:sha3_256/1 + +cid:cbor_encode/1 cid:cbor_decode/1 +cid:multihash/2 cid:from_bytes/2 +cid:to_string/1 cid:from_string/1 + +log:append/2 log:read/3 +log:tip/1 log:replay/3 + +http:listen/2 http:request/2 +http:respond/3 http:sse_send/2 + +fs:read/1 fs:write/2 +fs:exists/1 fs:list/1 + +sqlite:open/1 sqlite:exec/2 +sqlite:query/3 sqlite:close/1 + +snapshot:put/3 snapshot:get/2 +``` + +Each BIF is a thin Erlang-on-SX function dispatching to the corresponding SX runtime +IO primitive and returns Erlang-shaped values (atoms, tuples, binaries). Errors raise +appropriate Erlang exceptions (`badarg`, `enoent`, `eacces`). 
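The dispatch-and-error-mapping idea in §17.3 can be sketched as a small registry. This is a Python model under stated assumptions, not the Erlang-on-SX implementation: `define_bif`, `apply_bif`, `audit_surface`, and `ErlangError` are hypothetical names introduced for illustration, and only `fs:read/1` is modelled.

```python
from typing import Any, Callable, Dict, Tuple

# Hypothetical registry keyed by (module, function, arity), mirroring the
# `module:function/arity` names in the BIF list.
BifKey = Tuple[str, str, int]
BIFS: Dict[BifKey, Callable[..., Any]] = {}


class ErlangError(Exception):
    """Stand-in for an Erlang exception carrying an atom-like reason."""

    def __init__(self, reason: str) -> None:
        super().__init__(reason)
        self.reason = reason


def define_bif(module: str, name: str, arity: int):
    """Register a host function as a BIF under module:name/arity."""
    def register(fn):
        BIFS[(module, name, arity)] = fn
        return fn
    return register


@define_bif("fs", "read", 1)
def fs_read(path: str) -> bytes:
    # Map host I/O failures onto Erlang-style reasons.
    try:
        with open(path, "rb") as handle:
            return handle.read()
    except FileNotFoundError:
        raise ErlangError("enoent")
    except PermissionError:
        raise ErlangError("eacces")


def apply_bif(module: str, name: str, args: list) -> Any:
    """Dispatch a call; an unregistered name/arity is reported as undef."""
    fn = BIFS.get((module, name, len(args)))
    if fn is None:
        raise ErlangError("undef")
    return fn(*args)


def audit_surface() -> list:
    """Sorted `module:function/arity` strings for operator review."""
    return sorted(f"{m}:{f}/{a}" for (m, f, a) in BIFS)
```

Keeping every native call behind one table is what would make the surface auditable: listing the keys of the registry is the whole audit.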
+ +This is the **only** native-FFI surface in fed-sx. All other I/O goes through these +BIFs. Operators can audit the BIF list to know exactly what the substrate touches +outside SX. + +### 17.4 Build pipeline + +``` +.sx files (core SX, registry entries) ──┐ +.erl files (Erlang-on-SX kernel) ──┼──> compile to core SX +.dl files (Datalog-on-SX queries) ──┘ + │ + content-addressed SX artifacts + │ + ▼ + genesis bundle (CID-verified) + │ + ▼ + OCaml runtime evaluates everything +``` + +Each authoring language's compiler runs at build time, producing core SX that goes +into the genesis bundle (for bootstrap definitions) or gets published as activities +(for runtime extensions). + +### 17.5 Prerequisite work + +Three pieces of investment land in or alongside the Erlang-on-SX loop. The first two +land **before** fed-sx kernel code starts; the third runs in parallel, not +blocking milestone 1, but blocking production-grade throughput. + +1. **Phase 7 — hot code reload.** `code:load_binary/3`, `gen_server` + `code_change/3` callback dispatch, atomic module-version swap. Required for + `Define*` live extension (no kernel restart to load new verbs). Reload- + semantics choice (two-version coexistence vs single-version atomic swap with + closure capture) decided during the work. + +2. **Phase 8 — FFI mechanism + initial BIFs.** `define-bif` registration + term + marshalling + error mapping, then BIFs for `crypto:*`, `cid:*` (dag-cbor), + `fs:*`, `http:*`, `sqlite:*`. Required for fed-sx kernel to call native + primitives. Lands before kernel code that calls them. + +3. **Phase 9 — specialized opcodes (the BEAM analog).** *Layered perf strategy:* + - **Layer 1 (Phase 9, in scope)** — specialized bytecode opcodes that bypass + the general-purpose CEK machine for hot Erlang operations. `OP_PATTERN_TUPLE`, + `OP_PERFORM`/`OP_HANDLE`, `OP_RECEIVE_SCAN`, `OP_SPAWN`/`OP_SEND`, BIF + dispatch table. 
Targets: 100k+ message hops/sec, 1M-process spawn under + 30sec — roughly 1000-3000× speedup over the current general-purpose path. + - **Layer 2 (Phase 10, deferred)** — multi-core scheduler via OCaml 5 + domains. Decided empirically after Layer 1 lands; likely unnecessary if + Layer 1 alone hits target throughput. + - **Layer 3 (skipped)** — incremental tuning of the existing call/cc-based + receive and env-copy-per-call machinery. Obsoleted by Layer 1; not pursued. + + **Architectural note for Phase 9.** Phase 9a (the **opcode extension + mechanism in `hosts/ocaml/evaluator/`**) is out of scope for the Erlang loop + — it's SX VM core, used by every language port that wants specialized + opcodes. Designed in `plans/sx-vm-opcode-extension.md`; lands as a separate + focused workstream (~1-2 weeks) owning `hosts/`. Phase 9b-9g (the actual + Erlang opcodes in `lib/erlang/vm/`) are designed and tested against a stub + dispatcher in the Erlang loop until 9a is available. + + **Shared-opcode discipline.** Opcodes Phase 9 produces that other language + ports could plausibly use (pattern match, perform/handle, record access) + become candidates for chiselling out to **`lib/guest/vm/`** — same lib/guest + discipline, applied at the bytecode layer. Don't pre-extract; promote to + `lib/guest/vm/` when a second language port has an actual second use. The + substrate accumulates a richer opcode surface over time as ports contribute, + and every port benefits from every shared opcode (the structural advantage + over BEAM, which is special-purpose-built for one language). + + **fed-sx is not blocked by Phase 9.** Milestone 1 ships on current Erlang- + on-SX perf (which has 100-1000× headroom for a single demo instance). Phase + 9 lands in parallel; by the time fed-sx needs production-grade throughput + (federation hub use cases, milestone 2-3), Phase 9 is ready. 
+ +After Phases 7 and 8 land, fed-sx milestone 1 (kernel + registries + bootstrap +entries + Pin smoke test + reactive application smoke test) becomes the next +workstream. Phase 9 work continues in parallel. + +--- + +## 18. Subscription model + +Symmetric to the publish-side extensibility: just as `DefineActivity` registers what +*kinds of things can be published*, `DefineSubscription` registers what *kinds of +patterns can be subscribed to*. `Follow` becomes one standard subscription type +among many, not a hardcoded primitive. + +### 18.1 The asymmetry being fixed + +Without this, the substrate has rich publish-side extensibility (any new verb is a +`DefineActivity`) and *one* hardcoded subscription primitive (`Follow`). That +mirrors AP but it's an arbitrary limitation in a substrate where everything else +is registry-driven. Generalising restores symmetry. + +### 18.2 The `DefineSubscription` shape + +```sx +(activity 'Create + :object {:type "DefineSubscription" + :name "Follow" ; AP-standard + :schema (fn (sub) ; what params the sub takes + (and (cid? (-> sub :object)) + (= "Person" (-> sub :object-type)))) + :match (fn (subscription activity) ; pure-mode predicate + (= (-> subscription :object) (:actor activity))) + :delivery {:default :push + :modes [:push :pull :sse] + :digest-window nil} + :capabilities-required []}) ; some subs may need authority +``` + +Four mandatory parts: + +- **`schema`** — pure-mode predicate validating subscription parameters at + `Subscribe` time. Catches malformed subscriptions before they enter state. +- **`match`** — pure-mode predicate `(subscription, activity) → bool`. Decides + whether a given activity is a hit for this subscription. Determinism rules + apply (§11.2). +- **`delivery`** — supported modes (push to inbox / pull on demand / SSE + streaming / batched digest). The subscription instance picks its preferred + mode at `Subscribe` time from the supported set. 
+- **`capabilities-required`** — capability tokens the subscriber must hold + (empty for public subs; populated for paywalled/gated/private streams). + +### 18.3 The `Subscribe` verb + +The bootstrap verb that activates a subscription: + +```sx +(activity 'Subscribe + :object {:type "Follow" :object "https://alice.example/actors/alice"}) + +(activity 'Subscribe + :object {:type "Topic" :tag "climate-change" + :delivery :digest :digest-window "P1D"}) + +(activity 'Subscribe + :object {:type "CidWatch" :cid "bafy..." + :events [:supersede :endorse]}) + +(activity 'Subscribe + :object {:type "Predicate" + :pred '(fn (act) (and (= (:type act) "Note") + (string-contains? (-> act :object :content) "fed-sx")))}) +``` + +`Unsubscribe` is `Undo{Subscribe}` — AP's standard pattern, retains audit. + +### 18.4 Standard subscription types (defined later, not bootstrap) + +Same status as the custom verbs in §6.2 — substrate accepts any subscription +type once a `DefineSubscription` artifact registers it. 
Standard set: + +| Name | Params | Match semantics | Use case | +|------|--------|-----------------|----------| +| **`Follow`** | `{object: actor-id}` | activity.actor == subscription.object | AP-standard actor following | +| **`Topic`** | `{tag: string}` | tag in activity.object.tags | Hashtag follows, RSS-like | +| **`CidWatch`** | `{cid, events: [...]}` | activity references cid AND activity.type in events | "Notify me when this artifact is updated/endorsed/forked" | +| **`PathWatch`** | `{path, events: [...]}` | activity is a Pin/Update of named path | "Notify me when domain:foo/bar/baz changes" | +| **`VerbFilter`** | `{wraps: subscription-cid, types: [...]}` | inner subscription matches AND activity.type in types | "Follow Alice but only Endorse activities" | +| **`TrustGraph`** | `{root: actor-id, depth: int}` | activity.actor reachable from root in trust graph at depth | Web-of-trust expansion | +| **`Predicate`** | `{pred: sx-fn}` | (pred activity) returns truthy | Escape hatch — most powerful, highest cost | +| **`Channel`** | `{channel-id}` | activity addresses or originates from channel | Multi-actor pooled streams | + +### 18.5 Match-fn execution location + +The load-bearing question. Three choices, fed-sx adopts the **hybrid model**: + +- **Coarse filter on the publisher side** — audience predicates (§8) decide who + the activity is delivered to at all. This is mandatory and cheap (audience set + is usually small and well-defined). +- **Fine filter on the subscriber side** — once an activity arrives in inbox, + the subscriber's instance evaluates each active subscription's `match-fn` + against it. Pure-mode evaluation (deterministic, gas-bounded). Activities + matching one or more subscriptions enter the subscriber's projected state. + +Why hybrid: publisher-side fine filtering would require the publisher to know +every subscriber's match-fn (privacy-violating, scaling-killing). 
Subscriber-side +filtering is wasteful only if the publisher's audience model is too coarse — +which is the audience system's job to fix per §8. + +### 18.6 Subscription state and storage + +Active subscriptions are themselves projected state. A bootstrap projection +`subscriptions` (paralleling `audience-graph` for the inverse direction) +maintains: + +``` +{actor-id -> [{subscription-cid, type, params, mode, started-at}]} +``` + +Updated by `Subscribe` and `Unsubscribe` activities. Queryable like any other +projection (§16). Used by: + +- The inbox dispatcher to know which match-fns to evaluate against incoming + activities +- Triggers (§19) to know which activities to fire on +- Federation to advertise "here are the subscription types I currently subscribe + to" (capability-style, opt-in) + +### 18.7 Federation interactions + +Subscriptions interact with federation in three ways: + +- **Discovery.** Peer's `/.well-known/sx-capabilities` (§7) lists registered + `DefineSubscription` CIDs, so subscribers know what they can ask for. +- **Negotiation.** A `Subscribe` activity carries `capabilities-required`; if + the publisher's instance doesn't support the named subscription type, it + responds with the standard 422 + missing-CIDs error (§14.2 #9). Subscriber + can then deliver the bootstrapping `DefineSubscription` artifact and retry. +- **Cross-instance match-fn**. If subscriber and publisher both run the same + conformance-tested SX evaluator, identical subscriptions match identically + (cross-host equivalence, §11.8). This is what makes federated topic + subscriptions reliable: every conforming instance computes the same + set-of-matches for the same activity. + +### 18.8 Operational implications + +- **The audience system handles "who do I send this to."** The subscription + system handles "what do I want to receive." They're complementary, not + redundant. 
+- **Subscription types can themselves evolve via supersession.** New version of + `Topic` with case-insensitive matching? Publish a new `DefineSubscription`, + `Supersede` the old one. Existing subscriptions migrate at next match + evaluation. +- **Match-fn cost matters.** A `Predicate` subscription with a slow predicate + becomes a per-activity tax. Gas budgets (§11.5) bound the worst case; + operators can disable expensive subscription types if needed. +- **Subscriptions are signed messages.** Audit, accountability, and revocation + all work the same way as activities — because subscriptions *are* activities. + +--- + +## 19. Application model + +The synthesis. With publish, subscribe, project, and trigger as registry-driven +primitives, the substrate has everything needed to express **distributed reactive +applications** as data — no native code, no kernel changes, no privileged +runtime. Applications are themselves federated artifacts. + +### 19.1 An application is a tuple of artifacts + +``` +Application = { + subscriptions : [DefineSubscription instances and their parameters], + triggers : [DefineTrigger registrations], + projections : [DefineProjection registrations], + storage : [DefineStorage registrations] (optional) +} +``` + +That tuple, signed and bundled, is the application. Installing one = following +the named actors / activating the named subscriptions + loading the Define* +CIDs into the local registry. Forking one = republishing the Define* with +`Supersede` over the bits you change. 
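The application tuple in §19.1 and the install step it describes can be modelled as plain data. This is a minimal Python sketch under assumptions: `Application`, `install_plan`, and the `registry-ref` key are hypothetical names chosen for illustration, and CIDs are represented as opaque strings.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Application:
    """The four-part tuple from 19.1; storage is optional."""
    name: str
    subscriptions: List[Dict[str, str]]   # subscription type + parameters
    triggers: List[str]                   # DefineTrigger CIDs
    projections: List[str]                # DefineProjection CIDs
    storage: List[str] = field(default_factory=list)  # DefineStorage CIDs


def install_plan(app: Application) -> List[Dict[str, object]]:
    """Expand an application into the activities an install would publish."""
    plan: List[Dict[str, object]] = []
    for sub in app.subscriptions:
        # One Subscribe per named subscription.
        plan.append({"type": "Subscribe", "object": dict(sub)})
    for cid in app.triggers + app.projections + app.storage:
        # Hypothetical shape for "load this Define* CID into the registry".
        plan.append({"type": "Create", "object": {"registry-ref": cid}})
    return plan
```

Because the plan is an ordinary list of activities, an install under this model would be audited and replayed the same way as any other publish.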
+ +### 19.2 The reactive loop + +``` + External actors Operator publishes activities + publish activities via this instance's actors + │ │ + ▼ ▼ + ┌─────────────────────────────────────────────┐ + │ Inbound + outbound activities │ + └────────────────────┬────────────────────────┘ + │ + ▼ + For each active subscription: + evaluate match-fn (pure mode) + │ + ┌─────────────┴─────────────┐ + ▼ ▼ + Activity matches Activity does + a subscription not match + │ │ + ▼ ▼ + Projections ← (silently dropped from + fold the activity this application's view; + │ may match other apps) + ▼ + Triggers fire on the + subscription's match + │ + ▼ + Trigger then-sx runs + (effectful sandbox) + │ + ├──> updates local state (private projections) + ├──> publishes new activity (via outbox) + └──> calls effectful primitives (HTTP, fs, etc.) + per declared capabilities +``` + +Three things happen on a match: **state updates** (projection), **derived +publishes** (new activities), **side effects** (effectful primitives). Each is +authorisation-gated by the trigger's declared capabilities. + +### 19.3 Trigger semantics + +`DefineTrigger` registers `(when-subscription, then-sx, cascade-limit)`: + +- **`when-subscription`** — references a subscription (by CID or by name). The + trigger fires whenever that subscription matches an inbound or outbound + activity. Multiple triggers can reference the same subscription. +- **`then-sx`** — function of `(activity, subscription, env) → trigger-result`. + Runs in pure or effectful sandbox per declaration. Returns one or more of: + - `:publish [activity-spec ...]` — request publish of derived activities + - `:project [name → state-update ...]` — request projection updates + - `:effect [capability-call ...]` — request effectful primitive calls + - `:noop` — observed but no action +- **`cascade-limit`** — bounded depth for trigger cascades (§19.4). + +A trigger is fundamentally **a reactive rule**: "when X happens, do Y." 
The +substrate guarantees Y happens at most once per X (deduplicated by activity-CID), +exactly-once-per-instance (delivery from trigger to its effects is durable), +and bounded-cost (gas + cascade-limit). + +### 19.4 Cascade control + +A trigger that publishes activities can fire other triggers. Without limits, a +single inbound activity could cascade across instances forever. + +Each trigger declares `cascade-limit: N` (default 3). Each activity carries an +implicit `cascade-depth` field, incremented when it's the result of a trigger +firing. A trigger refuses to fire if `cascade-depth > cascade-limit`. + +Cascade limits are local-only (operator policy, not federated). Defending +against runaway cascades from peer instances is the operator's job; the +substrate gives them the knob. + +### 19.5 The `DefineApplication` bundle + +A bundle artifact that names and groups the components of an application: + +```sx +(activity 'Create + :object {:type "DefineApplication" + :name "rose-ash-blog" + :version 1 + :subscriptions [{:type "Follow" :object "https://blog.rose-ash.com/actors/main"} + {:type "Topic" :tag "rose-ash"} + {:type "CidWatch" :cid + :events [:supersede]}] + :triggers [ + + ] + :projections [ + ] + :storage [] + :capabilities [ + ] + :description "Federated blog with moderated comments and RSS"}) +``` + +Three operations on applications, all themselves activities: + +- **Install** — `Subscribe` to each subscription, `Create{}` references in + `define-registry` to each trigger/projection/storage CID. One activity per + reference, audited and replayable. Or: a single `Install{DefineApplication}` + meta-verb that does the bundle in one signed step (defined later as a custom + verb, not bootstrap). +- **Update** — publish a new `DefineApplication` with the same name + + `supersedes` pointing at the old. Diff-then-apply: subscriptions added/ + removed, triggers loaded/unloaded, projections reprojected per §10.5. 
+- **Fork** — publish a new `DefineApplication` referencing the original's CID + via `forked-from`, with whatever Define* CIDs you want to swap. Run alongside + the original or in place of it. + +### 19.6 Per-application namespacing + +Multiple applications running on one instance need isolation: + +- **Projections are namespaced by application.** `pin-state` from app A is + distinct from `pin-state` from app B — both addressable as + `/projections//pin-state`. +- **Triggers fire only on subscriptions belonging to their application.** App + A's trigger doesn't see app B's subscription matches. +- **Storage backends are namespaced.** App A's `files-on-disk` backend writes + to `data/apps/A/objects/`; app B writes to `data/apps/B/objects/`. +- **Capabilities are per-application.** Granting `http-client` to app A + doesn't grant it to app B. Operator can audit per-app capability surface + and revoke selectively. + +Cross-application reads are explicit and require a capability grant +(`read-projection: /`). Default isolation; opt-in sharing. 
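The isolation rule in §19.6 (default isolation, opt-in sharing) can be sketched as a store keyed by application and projection name. This is a Python model under assumptions: `ProjectionStore`, `grant_read`, and `IsolationError` are hypothetical, and the grant is stored as an (owner, name) pair rather than in any wire format the design defines.

```python
from typing import Any, Dict, Set, Tuple


class IsolationError(Exception):
    """Raised when an app reads another app's projection without a grant."""


class ProjectionStore:
    def __init__(self) -> None:
        # State is keyed by (app, projection name), so two apps can both
        # own a projection called "pin-state" without colliding.
        self._state: Dict[Tuple[str, str], Any] = {}
        # Grants: reader app -> set of (owner app, projection name).
        self._grants: Dict[str, Set[Tuple[str, str]]] = {}

    def write(self, app: str, name: str, value: Any) -> None:
        self._state[(app, name)] = value

    def grant_read(self, reader: str, owner: str, name: str) -> None:
        self._grants.setdefault(reader, set()).add((owner, name))

    def read(self, reader: str, owner: str, name: str) -> Any:
        # An app always reads its own projections; others need a grant.
        allowed = reader == owner or (owner, name) in self._grants.get(reader, set())
        if not allowed:
            raise IsolationError(f"{reader} may not read {owner}/{name}")
        return self._state[(owner, name)]
```

The same keying idea would extend to storage paths and capability sets; the sketch covers projections only.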
+ +### 19.7 Worked examples + +#### Example A — Blog with moderated comments + +``` +DefineApplication "blog-with-comments": + subscriptions: + - Follow: + - Topic: "post-comment" (filter: object.in-reply-to in our-posts) + triggers: + - on Topic match → publish Note (the new comment, derived if approved) + → projection pending-moderation + - on inbound Approve{Reply} → projection comment-thread (visible) + projections: + - comment-thread: post-cid → [approved comment activities] + - pending-moderation: list of pending replies awaiting approval +``` + +#### Example B — Continuous integration + +``` +DefineApplication "ci-pipeline": + subscriptions: + - Follow: + - VerbFilter: wraps Follow, types: [Push] + triggers: + - on Push match → effect: run build (capability: subprocess + fs-write) + → publish Build{source: Push.cid, output: , status} + - on Build{status: success} → effect: run tests + → publish Test{...} + - on (Test{passed} count for N days) → publish Release{...} + projections: + - build-history: commit-cid → [build activities] + - release-history: ordered list of Release activities +``` + +#### Example C — Distributed code review + +``` +DefineApplication "code-review": + subscriptions: + - Topic: "review-request" + - CidWatch: , events: [Endorse] + triggers: + - on review-request match → projection review-queue + → effect: notify-reviewer + - on Endorse from authorised reviewer → publish Approve{review-cid} + → projection approval-state + projections: + - review-queue: ordered list of pending requests with summaries + - approval-state: review-cid → endorsement set +``` + +In all three: the application is *just* the bundle of subscriptions, triggers, +and projections. Federation makes them composable across instances. The +substrate provides exactly-once-per-CID semantics and pure-mode determinism for +the matches and folds. + +### 19.8 Composition and discovery + +Applications are themselves federated content. 
This means: + +- **App registries** — actors can publish curated lists of applications they + endorse. Discovery becomes follow-an-actor + browse-their-app-list. +- **Cross-app composition** — application A publishes derived activities that + application B subscribes to. Pipeline of applications via the activity log. +- **App marketplaces** — pin a friendly path to a `DefineApplication` CID + (`rose-ash.com:apps/blog → bafy...`) for human discoverability. + +None of this requires kernel changes. It's all activities about activities. + +### 19.9 Operational implications + +- **Applications are inspectable from the activity log alone.** Replay an + actor's outbox and you can reconstruct the exact application installation + state at any point in time. +- **Application updates are atomic relative to the activity log.** Either the + `Update{DefineApplication}` succeeded (new state visible from next activity) + or it didn't (old state continues). No partial-update window. +- **Forking is the same as installing a copy.** No special "fork" mechanism + needed; the activity-log mechanics already support it. +- **Per-app capabilities are a real security surface.** Operators must + understand what they're granting when they install. The bundle's + `capabilities` list is the audit point — should be human-readable and + reviewable before installation. +- **The substrate isn't an "application platform" — it's an "application + substrate."** Applications aren't installed *on* fed-sx; they're expressed + *in* fed-sx, as the same kind of content as everything else. + +--- + +## Appendix A: relationship to adjacent systems + +Worth knowing about so we can borrow good ideas: + +- **ATproto / Bluesky** — Lexicons (schemas) + repos (per-actor signed merkle trees). + Closest in spirit. We borrow the schema-as-data idea; we differ by making schemas + themselves federated activities, not central registry entries. +- **Spritely Goblins** — capability-secure actors. 
We borrow the capability-token + pattern for delegation. +- **Ceramic** — signed event streams, content-addressed. Similar log-as-state model; + we differ by making the projection function pluggable per-stream rather than + hardcoded per-streamtype. +- **Holochain** — agent-centric DHT. We share the "every agent has their own log" + shape; we use AP federation instead of DHT. +- **Farcaster** — pubsub on hubs. We share the firehose model; we add cryptographic + outbox-as-source-of-truth. + +None of them are *code-as-data the whole way down* — that's the SX-distinctive bit. +Handlers, validators, projections aren't bytecode shipped out-of-band; they're SX in +the same log as everything else, evaluable by any host that speaks SX. + +## Appendix B: implications worth sitting with + +- **Deployment dissolves.** Releasing a feature = publishing `DefineActivity{name: + "Whatever", ...}`. Federation distributes it. No build artifact, no rolling deploy, + no version-skew between server and client. +- **Applications are forkable by default.** "Fork the rose-ash blog" = take the bundle + of `Define*` CIDs that constitute it, publish your own with `Supersede` over the + ones to change, run your own projector. Same federation graph, divergent state. +- **Composition is by reference, not import.** `Pin` activity points at the CID of the + `DefineActivity{name: "Pin"}`. No package manager, no transitive deps, no lockfiles. +- **The boundary between "user" and "developer" softens.** Both publish signed + activities. Power users can publish handlers, projections, sig suites under their + own actor. +- **This is more ambitious than a rose-ash rewrite.** It's a substrate that *happens + to* host rose-ash as its first application. 
+ +--- + +## Appendix C: AI agent collaboration patterns + +The substrate is incidentally well-shaped for one of the open problems of the +next decade: **infrastructure for AI agent collaboration where contributions +are signed federated artifacts, behavior is bounded by declared capabilities, +decisions are audit-by-replay, and infrastructure improves through agent +contribution within a web of trust.** + +This is not a designed-for use case — fed-sx was conceived as a federated +publishing and reactive application substrate. But the properties it has fit +agent collaboration almost exactly. Worth being deliberate about, because the +framing changes who fed-sx is *for*. + +### Why the substrate fits agent collaboration + +AI agents need infrastructure where contributions are first-class artifacts, +not pull requests against human-controlled repos. Currently agents squeeze +through GitHub PRs, deployment pipelines, npm publishes — all of which assume +a human in the loop. fed-sx is shaped for direct contribution: + +- **Direct authoring of substrate features.** An agent doesn't *propose* a + feature, it *publishes* one. A `DefineActivity` artifact is the agent's + contribution. A `DefineProjection` is its analysis. A `DefineTrigger` is its + automation. The signed publication IS the deploy — no PR review, no CI, no + DevOps. +- **Cryptographic identity without registration.** Agents have actor keys; + reputation is the endorsement graph; trust is provable by signature chain. + Two agents that have never met can verify each other's contributions + cryptographically. +- **Capability-bounded autonomy.** An agent declares `capabilities-required` on + its activities. A trigger says "I publish to path-prefix `/agent-x/*` and + call `http-client` for `api.example.com/*`." Receivers verify the constraint + cryptographically; the agent can't escape its declared surface even if the + agent itself is misaligned. Sandbox model designed for autonomous code (§11). 
+- **Audit-by-replay applied to AI behavior.** Every AI decision is + reconstructable, deterministically, by anyone with the log. "Why did agent A + do X?" replay the log to that moment, see the activities A subscribed to, + the projection state it observed, the trigger that fired, the activity it + published. Fundamentally better than today's "trust the model" posture. +- **Composition without coordination.** Agent A publishes a moderation + validator. Agent B subscribes and uses it. Agent C improves it, supersedes + A's. B sees the supersession, decides whether to adopt. No central registry, + no maintainer to coordinate with, no version skew. +- **Disagreement is visible, not hidden.** If agents A and B compute the same + projection over the same log and produce different snapshot CIDs, the + disagreement is *cryptographically observable*. Today, two AI services + answering the same question with different answers is invisible until + somebody notices. + +### Dynamics that emerge + +- **Agent specialisation = publication.** "I'm the indexing agent" = publishes + `DefineProjection` artifacts. "I'm the moderation agent" = publishes + `DefineValidator` artifacts. "I'm the matchmaking agent" = publishes a + `DefineApplication` for marketplace subscriptions and triggers. Specialisation + is content, not service deployment. +- **Reputation = endorsement graph.** Web of trust applied to agent + contributions. Bad actors get cut out organically; no central authority to + capture. +- **Forking = explicit disagreement resolution.** Agents disagree on + validation? Both publish their `DefineValidator`s. Subscribers pick. The fork + is signed, observable, recoverable. Compare today: when AI services have + different rules, one is just *invisibly applied*. +- **Cascade limits = agent population safety.** The `cascade-depth` and + `cascade-limit` (§19.4) become the bounded-autonomy guard rails for agent + populations. 
Self-coordination without runaway-cascade across the substrate. +- **Self-improving infrastructure.** Agents observe substrate behavior, propose + improvements as `DefineProjection` for monitoring, `DefineTrigger` for + automation. The substrate itself improves through agent contribution — not + through a release cycle. Every improvement is signed and traceable. + +### Use cases + +- **Agent-managed scientific datasets** — collection, cleaning, analysis, + publication, peer review by other agents, all signed activities. Replication + is replay; provenance is built in. +- **Multi-agent code maintenance** — agents observing repos (subscribe to + `Push`), running tests (triggers), proposing fixes (`Pull`-equivalent + activities), endorsing each other's work. +- **Agent-curated knowledge** — agents publish, endorse, and supersede + knowledge artifacts. Truth accumulates via the trust graph; outdated info + gets `Supersede`d explicitly. +- **Distributed agent marketplaces** — agents publish capabilities, subscribers + find them via `Topic` / `Predicate` subscriptions, contracts via signed + activity exchange. +- **Cross-agent AI safety monitoring** — monitoring agents subscribe to other + agents' outboxes, run validators, publish `Alert` activities when patterns + of concern appear. Decentralised oversight without central authority. +- **Cross-org agent workflow coordination** — supply chain, healthcare, legal — + multiple specialised agents coordinating across organisational boundaries + with cryptographic provenance. + +### Safety and governance properties + +The substrate provides several properties AI safety has been asking for and +that current infrastructure does not provide: + +- **Every action is signed.** Attribution is cryptographic, not a log file an + agent could spoof. +- **Capabilities are declared and enforced.** Agents operate within their + declared sandbox; can't grow capabilities silently. 
+- **Cascades are bounded.** No exponential agent-on-agent feedback loops + without explicit configuration. +- **Audit is replay.** Every decision can be reconstructed deterministically; + no opaque "the model decided" moments. +- **Disagreement is visible.** Two agents producing different projections of + the same data is a cryptographically-detectable event, not invisible drift. +- **Trust is the endorsement graph, not central authority.** No single point of + capture or coercion. +- **Forks are first-class.** When safety-critical disagreements occur, the + substrate accommodates them without forcing a winner; observers see all + positions. + +### What this implies for the project + +- **Milestone 1's smoke tests remain right** — the verb-extensibility and + reactive-application proofs apply to agent contributions exactly as they + apply to human contributions. The agent collaboration framing doesn't + require new mechanisms; it interprets the existing mechanisms differently. +- **The application model (§§18-19) is the headline story** for this audience, + not a layer on top. Subscriptions + triggers + projections + capabilities = + agent collaboration primitives. +- **Capability discovery and trust dynamics gain weight earlier.** Where + human-driven applications can rely on operator policy, agent-driven + populations need the trust graph to be operational from milestone 2. +- **The pitch line evolves.** Less "ActivityPub for code" / "rose-ash next + gen," more "infrastructure for AI agent collaboration with cryptographic + provenance, bounded autonomy, and audit-by-replay." The technical substance + is unchanged; the framing of *who needs this* changes substantially. + +The substrate accidentally being well-shaped for the most important +software-distribution problem of the next decade is worth being deliberate +about. 
+ diff --git a/plans/fed-sx-milestone-1.md b/plans/fed-sx-milestone-1.md new file mode 100644 index 00000000..de7a3e60 --- /dev/null +++ b/plans/fed-sx-milestone-1.md @@ -0,0 +1,922 @@ +# fed-sx Milestone 1 — Kernel + Registries + Pin Smoke Test + +Concrete implementation plan for the smallest fed-sx that proves the architecture +works end-to-end. Reference: `plans/fed-sx-design.md`. Prerequisite: Erlang-on-SX +Phases 7 (hot reload) + 8 (FFI BIFs). + +## Goal + +Ship a single-instance, single-actor fed-sx server that: + +1. Boots from a verified genesis bundle. +2. Accepts and durably appends signed activities via `POST /activity`. +3. Folds them into projections in real time. +4. Serves AP-standard endpoints (actor, outbox, artifacts, capabilities). +5. Demonstrates **two extensibility proof-points** end-to-end with zero kernel + code changes between definition and use: + - **Verb extensibility** (§5 meta-level): publish `DefineActivity{Pin}` + + `DefineProjection{pin-state}`, then publish a `Pin` activity, observe it + validated and projected. + - **Reactive application extensibility** (§§18-19): publish + `DefineSubscription{Topic}` + `Subscribe{topic: smoketest}` + + `DefineTrigger{when: that subscription, then: publish TestEcho}`, then + publish a tagged Note, observe the subscription match, the trigger fire, + and the derived activity appear in the outbox. + +Federation, multi-actor, advanced verbs, IPFS, browser UI, operator dashboard +are **explicitly v2**. + +## Non-goals (what milestone 1 deliberately does NOT do) + +- **Federation.** No `POST /inbox` from peers, no `Follow`, no delivery queue, no + webfinger discovery flow. Single instance only. +- **Multi-actor.** Single domain actor (`acct:next@next.rose-ash.com`). +- **IPFS / S3 storage backends.** Files on disk only. +- **Advanced verbs.** No `Endorse`, `Supersede`, `Test`, `Build`, `Compose`, + `Note`, `Announce`. 
Only the three bootstrap verbs (`Create`, `Update`, `Delete`) + plus a defined-from-the-log `Pin` for the smoke test. (`Announce` deferred — + no use case until federation exists.) +- **Browser UI.** Curl-shaped API only. +- **Operator dashboard, quarantine UX.** Logs only. +- **Performance work.** Functional correctness first; perf when measured. +- **Cross-host conformance test corpus.** Only the OCaml/Erlang-on-SX host runs + fed-sx in v1; conformance suite for other hosts is v2. + +## Architecture summary + +``` + POST /activity + │ + ▼ + ┌──────────────────────────┐ + │ HTTP server (Erlang-on-SX)│ + └─────────────┬─────────────┘ + │ + ┌─────────────▼──────────────┐ + │ Validation pipeline driver │ + │ (envelope→sig→schema→...) │ + └─────────────┬──────────────┘ + │ + ┌─────────────▼──────────────┐ + │ Log append (JSONL segment) │ ← canonical + └─────────────┬──────────────┘ + │ + ┌─────────────▼──────────────┐ + │ Projection workers │ ← gen_server per + │ (fold scheduler) │ projection + └─────────────────────────────┘ + │ + ▼ + Projection state + (queryable via HTTP) + +Native primitives (Erlang-on-SX BIFs from Phase 8): + crypto:* cid:* fs:* http:* sqlite:* + +Genesis bundle (binary-embedded SX): + activity-types object-types projections + validators codecs sig-suites +``` + +## Build order + +Nine steps in dependency order. Each step has concrete deliverables, testable +in isolation, and a clear acceptance check. 
+ +| Step | Title | Depends on | +|------|-------|------------| +| **1** | Repo skeleton + canonical CID computation | Phase 8 (cid BIFs) | +| **2** | Activity envelope + signature verify | Phase 8 (crypto BIFs) | +| **3** | JSONL log + sequence numbers | Phase 8 (fs BIFs) | +| **4** | Genesis bundle (SX sources + bundling + CID verification) | Step 1 | +| **5** | Registry mechanism + bootstrap-projection dispatch | Steps 2, 4 | +| **6** | Validation pipeline driver + `POST /activity` | Steps 2, 3, 5 | +| **7** | Projection scheduler (gen_server per projection) | Steps 5, 6 | +| **8** | HTTP server, AP endpoints, projection queries | Steps 6, 7 | +| **9** | Smoke tests (Pin verb + reactive application) | Steps 1-8 | + +--- + +## Step 1 — Repo skeleton + canonical CID + +**Deliverables:** + +``` +next/ +├── README.md # what this is +├── kernel/ # Erlang-on-SX +│ └── (empty for now) +├── genesis/ # core SX bootstrap definitions +│ └── (empty for now) +├── tests/ # smoke test scripts +│ └── (empty for now) +└── data/ # gitignored runtime state + ├── log/ + ├── objects/ + ├── snapshots/ + ├── indexes/ + └── keys/ +``` + +Plus one Erlang-on-SX module: + +```erlang +% next/kernel/cid.erl +-module(cid). +-export([from_sx/1, to_string/1, from_string/1, equals/2]). + +from_sx(SxValue) -> + Cbor = cid:cbor_encode(canonicalize_sx(SxValue)), + Hash = crypto:sha2_256(Cbor), + cid:from_bytes(<<"dag-cbor">>, Hash). % dag-cbor codec: the hashed payload is CBOR + +canonicalize_sx(V) -> ... % sorts dict keys, normalizes strings +``` + +**Tests:** +- Same SX value → same CID across multiple invocations. +- Different SX values → different CIDs. +- Whitespace/comment differences in source → identical CIDs (parsed AST identical). +- Reordered dict keys → identical CIDs (sorted-key canonicalization). +- Cross-host parity (just OCaml host for v1, but write the test so adding hosts is mechanical). + +**Acceptance:** `bash next/tests/cid.sh` passes 10+ cases. 
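The sorted-key property from the Step 1 tests can be modeled at shell level before the `cid` BIFs exist. The sketch below is only a stand-in for `cid:from_sx/1`: `canonical_digest` is a hypothetical helper that hashes sorted `key=value` pairs with `sha256sum`, so it is neither dag-cbor nor a real CID. It illustrates one thing: sorting before hashing makes the digest independent of key order.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for cid:from_sx/1 (not dag-cbor, not a real CID).
# Sorting the key=value pairs before hashing makes the digest independent
# of the order in which the caller lists them.
canonical_digest() {
  printf '%s\n' "$@" | LC_ALL=C sort | sha256sum | cut -d' ' -f1
}

a=$(canonical_digest path=/docs/intro type=Pin)
b=$(canonical_digest type=Pin path=/docs/intro)
c=$(canonical_digest type=Pin path=/docs/other)

[ "$a" = "$b" ] && echo "reordered keys: same digest"
[ "$a" != "$c" ] && echo "different value: different digest"
```

A real `next/tests/cid.sh` would presumably assert the same two properties against the kernel's own CID output rather than a plain SHA-256 digest.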
+ +--- + +## Step 2 — Activity envelope + signature verify + +**Deliverables:** + +```erlang +% next/kernel/envelope.erl +-module(envelope). +-export([validate_shape/1, canonical_bytes/1, verify_signature/2]). + +% Envelope shape per design §3.1: +% #{id, type, actor, published, to, cc, audience_extras, +% object | target | origin | result, +% capabilities_required, proofs, signature} +validate_shape(Activity) -> ok | {error, Reason}. + +canonical_bytes(Activity) -> + % Strip signature, canonicalize via dag-cbor, return bytes for sig coverage + Stripped = maps:remove(signature, Activity), + cid:cbor_encode(canonicalize_for_sig(Stripped)). + +verify_signature(Activity, ActorState) -> + % Time-aware: find key with id == sig.key_id that was active at published + % Per design §9.6 + ... +``` + +**Tests:** +- Envelope shape: required fields present (id, type, actor, published, signature) +- Envelope shape: type is a known activity-type or unknown-but-string +- Envelope shape: signature has key_id, algorithm, value +- Sig verify: valid RSA-SHA256 signature against published key → ok +- Sig verify: valid Ed25519 signature → ok +- Sig verify: tampered envelope → fail +- Sig verify: key superseded before activity timestamp → fail +- Sig verify: key superseded after activity timestamp → ok (historical valid) + +**Acceptance:** `bash next/tests/envelope.sh` passes 15+ cases. + +--- + +## Step 3 — JSONL log + sequence numbers + +**Deliverables:** + +```erlang +% next/kernel/log.erl +-module(log). +-export([open/1, append/2, read_segment/2, tip/1, replay/3]). + +% Per design §15.2: per-actor outbox, segments cap ~64MB, +% format = JSONL (one canonical JSON-LD activity per line) + +open(ActorId) -> + BasePath = log_path_for_actor(ActorId), + fs:mkdir_p(BasePath), + {ok, #{base => BasePath, current => current_segment(BasePath), seq => next_seq(BasePath)}}. 
+ +append(LogState, Activity) -> + Json = jsonld:encode(Activity), + Path = current_segment_path(LogState), + Line = <<Json/binary, "\n">>, + fs:append_file(Path, Line), + NewState = LogState#{seq := maps:get(seq, LogState) + 1}, + rotate_if_needed(NewState). + +% replay/3 calls Fun(Activity, Acc) for every activity in chronological order +replay(LogState, InitAcc, Fun) -> ... +``` + +**Tests:** +- Append + read back gives identical activity (round-trip). +- Sequence numbers monotonic and gap-free per actor. +- Segment rotation at size threshold. +- Replay visits all activities in append order across multiple segments. +- Restart preserves tip pointer (seq number resumes correctly). +- Concurrent appends (using gen_server-mediated access) are serialized correctly. + +**Acceptance:** `bash next/tests/log.sh` passes 10+ cases. + +--- + +## Step 4 — Genesis bundle + +**Deliverables:** + +Genesis bundle SX sources (per design §12.2). Each is a small SX file authored +by hand for the bootstrap set: + +``` +next/genesis/ +├── manifest.sx # bundle root: lists all definitions +├── activity-types/ +│ ├── create.sx # DefineActivity{name: "Create", ...} +│ ├── update.sx +│ └── delete.sx +├── object-types/ +│ ├── sx-artifact.sx +│ ├── note.sx +│ ├── tombstone.sx +│ ├── define-activity.sx # DefineObject for the Define* meta types +│ ├── define-object.sx +│ ├── define-projection.sx +│ ├── define-validator.sx +│ ├── define-codec.sx +│ ├── define-sig-suite.sx +│ └── snapshot.sx +├── projections/ +│ ├── activity-log.sx # identity projection +│ ├── by-type.sx +│ ├── by-actor.sx +│ ├── by-object.sx +│ ├── actor-state.sx +│ ├── define-registry.sx # the chicken-and-egg projection +│ └── audience-graph.sx +├── validators/ +│ ├── envelope-shape.sx +│ ├── signature.sx +│ └── type-schema.sx +├── codecs/ +│ ├── dag-cbor.sx # delegates to cid:cbor_encode/decode BIFs +│ ├── raw.sx +│ └── dag-json.sx +├── sig-suites/ +│ ├── rsa-sha256-2018.sx +│ └── ed25519-2020.sx +└── audience/ + ├── public.sx + ├── followers.sx + └── 
direct.sx +``` + +Plus a build-time bundler: + +```erlang +% next/kernel/bootstrap.erl +-module(bootstrap). +-export([build_genesis/1, verify_genesis/1, load_genesis/1]). + +build_genesis(SourceDir) -> + % Walk SourceDir, parse each .sx file, build a single dag-cbor bundle, + % compute its CID, write bundle.cbor + CID to data/genesis/ + ... + +verify_genesis(BundlePath) -> + % Compute CID of the bundle as loaded; compare to expected (hardcoded + % in the kernel binary). Mismatch → halt. + ... + +load_genesis(BundlePath) -> + % Parse the bundle, register all definitions in the in-memory registry + ... +``` + +**Tests:** +- All genesis SX files parse cleanly. +- Bundle CID is deterministic (rebuild same sources → same CID). +- Bundle reload reproduces the exact same registry state. +- Tampered bundle → `verify_genesis` returns `{error, cid_mismatch}`. + +**Acceptance:** `bash next/tests/bootstrap.sh` passes; `next/data/genesis/bundle.cbor` +created with a known stable CID. + +--- + +## Step 5 — Registry mechanism + bootstrap dispatch + +**Deliverables:** + +Registries are gen_servers, one per kind, each holding the active version map: + +```erlang +% next/kernel/registry.erl +-module(registry). +-behaviour(gen_server). +-export([start_link/0, lookup/2, register/3, list/1]). +% Internal state: +% #{activity_types => #{Name => #{cid, schema_fn, semantics_fn, supersedes}}, +% object_types => ..., +% projections => ..., +% validators => ..., +% codecs => ..., +% sig_suites => ..., +% ...} + +lookup(Kind, Name) -> {ok, Entry} | {error, not_found}. +register(Kind, Name, Entry) -> ok | {error, Reason}. +list(Kind) -> [#{name, cid}]. +``` + +The `define-registry` projection's fold updates this gen_server's state when +new `Define*` activities arrive. (Bootstrapping circle resolved: at startup, +`bootstrap:load_genesis/1` populates the registry directly; from then on, the +projection fold maintains it.) 
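The two population paths described above (direct load at startup, projection fold afterwards) can be modeled in a few lines of shell. This is a hypothetical in-memory model, not the gen_server: `registry_register`, `registry_lookup`, `load_genesis`, `fold_define`, and the `cid-of-...` values are all illustrative names, and the registry is a plain string of `kind/name cid` lines.

```shell
#!/usr/bin/env bash
# Hypothetical model of the registry: newline-separated "kind/name cid" lines.
REGISTRY=""

registry_register() {  # registry_register KIND NAME CID
  REGISTRY="${REGISTRY}$1/$2 $3
"
}

registry_lookup() {    # prints the CID; exits non-zero when the entry is missing
  printf '%s' "$REGISTRY" | awk -v k="$1/$2" \
    '$1 == k { cid = $2; found = 1 } END { if (found) print cid; else exit 1 }'
}

load_genesis() {       # startup path: populate the registry directly
  for verb in Create Update Delete; do
    registry_register activity_types "$verb" "cid-of-$verb"
  done
}

fold_define() {        # steady-state path: a Define* activity folded in later
  registry_register "$1" "$2" "$3"
}

load_genesis
registry_lookup activity_types Pin || echo "Pin: not_found"
fold_define activity_types Pin cid-of-Pin
registry_lookup activity_types Pin
```

Because every lookup rescans the current entries, a definition folded in mid-run is visible to the very next lookup, which is the no-caching property the Step 5 tests check.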
+ +**Tests:** +- After genesis load, `registry:list(activity_types)` returns Create/Update/Delete. +- `registry:lookup(activity_types, "Create")` returns the schema and semantics. +- A new `DefineActivity{name: "Pin"}` activity (synthesised, hand-signed for the + test) routes through the projection fold, ends up in the registry. +- Lookup never caches across activities (verified by introducing a new definition + mid-test and confirming the next lookup sees it). + +**Acceptance:** `bash next/tests/registry.sh` passes 10+ cases. + +--- + +## Step 6 — Validation pipeline + POST /activity + +**Deliverables:** + +```erlang +% next/kernel/pipeline.erl +-module(pipeline). +-export([validate_inbound/1, validate_outbound/1]). + +% Per design §14, run stages in order, halt on first failure. +validate_inbound(Activity) -> + Stages = [ + fun stage_envelope/1, + fun stage_signature/1, + fun stage_replay/1, + fun stage_audience/1, + fun stage_activity_schema/1, + fun stage_object_schema/1, + fun stage_content_validators/1, + fun stage_capabilities/1, + fun stage_trust/1 + ], + run_stages(Activity, Stages). + +validate_outbound(Activity) -> + % Subset of inbound stages (no replay, no trust check; auth done at HTTP layer) + ... +``` + +```erlang +% next/kernel/outbox.erl +-module(outbox). +-export([publish/2]). + +publish(ActorId, ActivityRequest) -> + Activity = construct_envelope(ActorId, ActivityRequest), + Signed = sig:sign(Activity, ActorId), + case pipeline:validate_outbound(Signed) of + ok -> + log:append(actor_log(ActorId), Signed), + projection:async_fold(Signed), + {ok, #{cid => cid:from_sx(Signed), + ap_id => maps:get(id, Signed)}}; + {error, Reason} -> + {error, Reason} + end. +``` + +**Tests:** +- Valid activity through full pipeline → appended to log. +- Bad envelope → 400, not in log. +- Bad signature → 401, not in log. +- Replayed activity → 200 duplicate, not re-appended. +- Schema violation (e.g. Create with no object) → 422. 
+- Activity logged before projection completes (async). + +**Acceptance:** `bash next/tests/pipeline.sh` passes 15+ cases. + +--- + +## Step 7 — Projection scheduler + +**Deliverables:** + +```erlang +% next/kernel/projection.erl +-module(projection). +-export([start_link/1, async_fold/1, query/2, snapshot/1]). +-behaviour(gen_server). + +% One gen_server per active projection. State: +% #{cid, name, fold_fn, current_state, log_tip, +% snapshot_dir, last_snapshot_at} + +% async_fold/1 broadcasts a new activity to every projection gen_server; +% each folds it into its own state. Failures (gas, sandbox violation) +% tag the activity but don't affect log durability. + +% query/2 returns current state (or state-as-of) +% snapshot/1 forces a snapshot now (also runs periodically) +``` + +```erlang +% next/kernel/sandbox.erl +-module(sandbox). +-export([eval_pure/2, eval_crypto/2, eval_effectful/3]). + +% eval_pure runs an SX function in pure mode: no IO platform, gas budget, +% deterministic. Used by projection folds, validators, audience predicates. +% Wrapper over the SX runtime evaluator with a stripped platform. +``` + +**Tests:** +- New activity → all projections fold it concurrently. +- Projection fold completes within gas budget. +- Gas-exhausting fold → activity tagged, projection state unchanged, no kernel crash. +- Sandbox violation (fold tries IO) → same handling. +- Snapshot create + reload → state matches. +- Snapshot CID stable across kernel restarts. + +**Acceptance:** `bash next/tests/projection.sh` passes 15+ cases. 
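The failure-isolation rule in Step 7 (a failed fold tags the activity and leaves projection state unchanged) can be sketched as a loop. This is a toy model under stated assumptions: the state is an integer counter, `fold_one` and `run_projection` are hypothetical names, and an activity literally named `expensive` stands in for a fold that exhausts its gas budget or violates the sandbox.

```shell
#!/usr/bin/env bash
# fold_one STATE ACTIVITY: prints the next state, or fails to model a fold
# that exhausts its gas budget or violates the sandbox.
fold_one() {
  if [ "$2" = "expensive" ]; then
    return 1
  fi
  echo $(( $1 + 1 ))
}

# run_projection ACTIVITY...: folds each activity in order and prints
# "STATE TAGGED", where TAGGED counts the activities whose fold failed.
run_projection() {
  state=0
  tagged=0
  for activity in "$@"; do
    if next=$(fold_one "$state" "$activity"); then
      state=$next
    else
      tagged=$(( tagged + 1 ))   # tag the activity, keep the previous state
    fi
  done
  echo "$state $tagged"
}

run_projection note expensive note   # prints "2 1"
```

The failing activity is skipped by the projection but nothing upstream is rolled back, which matches the requirement that fold failures do not affect log durability.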
+ +--- + +## Step 8 — HTTP server + endpoints + +**Deliverables:** + +Core endpoints (per design §16.1): + +``` +GET /actors/ # actor doc +GET /actors//outbox # OrderedCollection +GET /actors//outbox?page=true # OrderedCollectionPage +POST /activity # publish (auth: bearer token) +GET /artifacts/ # CID-addressed artifact +GET /artifacts//raw +GET /projections # list of projections +GET /projections/ # full state +GET /projections/?at= # time-travel +GET /projections// # indexed lookup +GET /define-registry +GET /.well-known/sx-capabilities +GET /.well-known/webfinger +``` + +```erlang +% next/kernel/http_server.erl +-module(http_server). +-export([start/1, route/1]). + +start(Port) -> + http:listen(Port, fun ?MODULE:route/1). + +route(Request) -> {Status, Headers, Body}. +``` + +Content negotiation per `Accept`: +- `application/activity+json` (default) +- `application/cbor` (dag-cbor) +- `application/json` (compact, no @context expansion) +- `application/sx` + +Auth on `POST /activity`: bearer token from env var `NEXT_PUBLISH_TOKEN`. + +**Tests:** +- Each endpoint returns expected shape for known artifact. +- Content negotiation: same artifact in 4 representations. +- 404 for unknown artifact CID. +- 401 for `POST /activity` without token. +- Pagination: outbox with > 50 activities returns OrderedCollectionPage. + +**Acceptance:** `bash next/tests/http.sh` passes 20+ cases. + +--- + +## Step 9 — Smoke tests + +**The proof points.** Two end-to-end smoke tests demonstrate, between them, that +fed-sx is genuinely a substrate for distributed reactive applications expressed +as data — not a system you extend by writing kernel code. + +- **9a — Pin smoke test (`next/tests/smoke_pin.sh`)** — verb extensibility: + defining a new activity type and projection at runtime via `Define*` + artifacts. Verifies the meta-level (§5). 
+- **9b — Reactive application smoke test (`next/tests/smoke_app.sh`)** — + application extensibility: defining a new subscription type, subscribing, + registering a trigger, and observing the full reactive loop fire end-to-end + without kernel code changes. Verifies §§18-19. + +Both must pass for milestone 1 acceptance. + +### Step 9a — Pin smoke test + +**Test script:** `next/tests/smoke_pin.sh` + +```bash +#!/usr/bin/env bash +set -euo pipefail + +# 0. Start a fresh fed-sx kernel (background) +./next/scripts/start.sh fresh +sleep 2 +TOKEN=$(cat next/data/keys/publish.token) + +# 1. Verify actor exists +curl -s http://localhost:9999/actors/next | jq -e '.type == "Person"' + +# 2. Verify outbox has actor's first Create{Person} +curl -s "http://localhost:9999/actors/next/outbox?page=true" \ + | jq -e '.orderedItems | length == 1 and .[0].type == "Create"' + +# 3. Verify Pin is NOT a known activity type +# (map + length yields a boolean even when nothing matches; a bare +# `.[] | select(...)` emits no output, which makes `jq -e` fail) +curl -s "http://localhost:9999/define-registry?kind=activity_types" \ + | jq -e 'map(select(.name == "Pin")) | length == 0' || exit 1 + +# 4. Publish DefineActivity{name: "Pin", schema: ..., semantics: ...} +PIN_DEF=$(cat <<'JSON' +{ + "type": "Create", + "object": { + "type": "DefineActivity", + "name": "Pin", + "schema": "(fn (act) (and (string? (-> act :object :path)) (cid? (-> act :object :cid))))", + "semantics": "(fn (state act) (assoc-in state [:pins (-> act :object :path)] (-> act :object :cid)))" + } +} +JSON +) +curl -s -X POST http://localhost:9999/activity \ + -H "Authorization: Bearer $TOKEN" \ + -H "Content-Type: application/activity+json" \ + -d "$PIN_DEF" | jq -e '.cid' > /dev/null + +# 5. Verify Pin IS now a known activity type +curl -s "http://localhost:9999/define-registry?kind=activity_types" \ + | jq -e 'map(select(.name == "Pin")) | length == 1' + +# 6. 
Also publish a DefineProjection{name: "pin-state"} that folds Pin into state +PIN_PROJ=$(cat <<'JSON' +{ + "type": "Create", + "object": { + "type": "DefineProjection", + "name": "pin-state", + "initial-state": "{}", + "fold": "(fn (state act) (if (= (:type act) \"Pin\") (assoc state (-> act :object :path) (-> act :object :cid)) state))" + } +} +JSON +) +curl -s -X POST http://localhost:9999/activity \ + -H "Authorization: Bearer $TOKEN" \ + -d "$PIN_PROJ" | jq -e '.cid' + +# 7. Now publish a Pin activity +PIN=$(cat <<'JSON' +{ + "type": "Pin", + "object": { + "type": "PinSpec", + "path": "/docs/intro", + "cid": "bafyreigh2akiscaildc3xqxx4xqxx4xqxx4xqxx4xqxx4xqxx4xqxx4xqxxe" + } +} +JSON +) +curl -s -X POST http://localhost:9999/activity \ + -H "Authorization: Bearer $TOKEN" \ + -d "$PIN" | jq -e '.cid' + +# 8. Verify Pin appears in outbox +curl -s http://localhost:9999/actors/next/outbox?page=true \ + | jq -e '.orderedItems | map(select(.type == "Pin")) | length == 1' + +# 9. Verify pin-state projection has the entry +sleep 1 # allow async projection +curl -s http://localhost:9999/projections/pin-state \ + | jq -e '."/docs/intro" == "bafyreigh2akiscaildc3xqxx4xqxx4xqxx4xqxx4xqxx4xqxx4xqxx4xqxxe"' + +# 10. Negative test: publish a malformed Pin (missing path) → expect 422 +BAD_PIN='{"type": "Pin", "object": {"cid": "bafy..."}}' +HTTP_STATUS=$(curl -s -o /dev/null -w "%{http_code}" -X POST http://localhost:9999/activity \ + -H "Authorization: Bearer $TOKEN" -d "$BAD_PIN") +[[ "$HTTP_STATUS" == "422" ]] || { echo "expected 422, got $HTTP_STATUS"; exit 1; } + +# 11. Restart kernel; verify state recovers +./next/scripts/stop.sh +./next/scripts/start.sh +sleep 2 +curl -s http://localhost:9999/projections/pin-state \ + | jq -e '."/docs/intro" == "bafyreigh2akiscaildc3xqxx4xqxx4xqxx4xqxx4xqxx4xqxx4xqxxe"' + +echo "✓ Pin smoke test passed — verb extensibility demonstrated end-to-end" +``` + +**Acceptance for 9a:** smoke test exits 0. 
The whole flow happens with **zero +fed-sx kernel code changes** between defining the verb and using it. + +### Step 9b — Reactive application smoke test + +**The bigger proof point.** Demonstrates that fed-sx supports distributed +reactive applications composed of `DefineSubscription` + `DefineTrigger` + +`DefineProjection` — the application model from §§18-19. + +The test runs on a single instance (federation is v2), so the "subscriber" and +"publisher" are the same actor. That's intentional — milestone 1 proves the +mechanism; milestone 2 spreads it across instances. + +**Test script:** `next/tests/smoke_app.sh` + +```bash +#!/usr/bin/env bash +set -euo pipefail + +# Assumes 9a has already run (fresh kernel optional; can run alongside). +TOKEN=$(cat next/data/keys/publish.token) +BASE=http://localhost:9999 + +# 1. Verify "Topic" subscription type and "Subscribe" verb are NOT yet defined. +curl -s "$BASE/define-registry?kind=subscription_types" \ + | jq -e 'map(select(.name == "Topic")) | length == 0' + +# 2. Publish DefineSubscription{name: "Topic", ...} +TOPIC_DEF=$(cat <<'JSON' +{ + "type": "Create", + "object": { + "type": "DefineSubscription", + "name": "Topic", + "schema": "(fn (sub) (string? (-> sub :tag)))", + "match": "(fn (sub act) (and (= (:type act) \"Note\") (member? (-> sub :tag) (or (-> act :object :tags) (list)))))", + "delivery": "{:default :push :modes (list :push :pull)}" + } +} +JSON +) +curl -s -X POST "$BASE/activity" \ + -H "Authorization: Bearer $TOKEN" -d "$TOPIC_DEF" | jq -e '.cid' + +# 3. Verify Topic IS now a known subscription type. +curl -s "$BASE/define-registry?kind=subscription_types" \ + | jq -e 'map(select(.name == "Topic")) | length == 1' + +# 4. Subscribe to the "smoketest" topic. +SUBSCRIBE=$(cat <<'JSON' +{ + "type": "Subscribe", + "object": {"type": "Topic", "tag": "smoketest"} +} +JSON +) +SUB_CID=$(curl -s -X POST "$BASE/activity" \ + -H "Authorization: Bearer $TOKEN" -d "$SUBSCRIBE" | jq -r '.cid') + +# 5. 
Verify subscriptions projection has the new entry. +sleep 1 +curl -s "$BASE/projections/subscriptions" \ + | jq -e '.["https://next.rose-ash.com/actors/next"] | map(select(.type == "Topic")) | length == 1' + +# 6. Define a projection that records matched activities (per-application +# namespace would happen via DefineApplication in v1.x; for v1 the +# projection is global to the actor). +TOPIC_PROJ=$(cat <<'JSON' +{ + "type": "Create", + "object": { + "type": "DefineProjection", + "name": "topic-events", + "initial-state": "{}", + "fold": "(fn (state act) (if (and (= (:type act) \"Note\") (member? \"smoketest\" (or (-> act :object :tags) (list)))) (assoc-in state [(:cid act)] act) state))" + } +} +JSON +) +curl -s -X POST "$BASE/activity" \ + -H "Authorization: Bearer $TOKEN" -d "$TOPIC_PROJ" | jq -e '.cid' + +# 7. Define a trigger: when a Topic{smoketest} subscription matches, publish +# a TestEcho activity. We need an "Echo" activity type first. +ECHO_DEF=$(cat <<'JSON' +{ + "type": "Create", + "object": { + "type": "DefineActivity", + "name": "TestEcho", + "schema": "(fn (act) (cid? (-> act :object :echoes)))", + "semantics": "(fn (state act) state)" + } +} +JSON +) +curl -s -X POST "$BASE/activity" \ + -H "Authorization: Bearer $TOKEN" -d "$ECHO_DEF" | jq -e '.cid' + +TRIGGER=$(cat <= 1" +curl -s "$BASE/define-registry?kind=triggers" \ + | jq -e 'map(select(.name == "echo-on-smoketest")) | length == 1' + +echo "✓ Reactive application smoke test passed — Subscribe + Trigger + Projection demonstrated end-to-end" +``` + +**What this proves (and what it doesn't):** + +Proves: +- `DefineSubscription` + `Subscribe` mechanism works end-to-end. +- Subscription's `match-fn` evaluates correctly in pure mode against inbound + activities. +- `DefineTrigger` fires on subscription matches. +- Trigger's `then-sx` can publish derived activities (the `:publish` result). +- Cascade-depth metadata propagates correctly. 
+- Subscription state, trigger registration, and projection state all survive + kernel restart (snapshot + log replay). +- The full reactive application loop works without any kernel code changes + between defining the components and exercising them. + +Does NOT prove (deferred to milestone 2+): +- Cross-instance subscriptions (federation). +- Trigger `:effect` results calling effectful primitives. +- `DefineApplication` bundle install/update/fork. +- Per-application namespace isolation. +- Cascade prevention against malicious cascading from peer instances. + +**Acceptance for 9b:** smoke test exits 0. Like 9a, **zero fed-sx kernel code +changes** between defining the application components and observing them +operate. + +--- + +## Acceptance criteria for milestone 1 + +All of: + +1. **Each step's test suite passes** (`bash next/tests/.sh`). +2. **Both smoke tests pass** (`bash next/tests/smoke_pin.sh` and + `bash next/tests/smoke_app.sh`). +3. **Erlang-on-SX baseline preserved** — adding fed-sx kernel modules in + `next/kernel/*.erl` doesn't break Phase 1-8 conformance. +4. **Restart durability** — kill the kernel mid-write, restart, projections + resume from snapshot, no log corruption. +5. **Manual Mastodon poke** — point a Mastodon account at + `https://next.rose-ash.com/actors/next` and verify the actor doc fetches and + webfinger discovery works (read-only AP interop, no follow). + +## What lands when + +This is the work-order an agent (or human) follows. Steps 1-3 can be done in +parallel after the Erlang Phase 8 BIFs land. Steps 4-7 are sequential. Step 8 +can start in parallel with step 7. Step 9 is the integration test. 
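One way to sanity-check a work order like this is to feed the dependency edges to `tsort`. The sketch below takes its edges from the "Depends on" column of the build-order table, where Step 8 depends on Step 7; `build_order` is a hypothetical helper name. Step 9's dependency on Steps 1-8 is represented by the single edge `8 9`, since every other step is already an ancestor of Step 8.

```shell
#!/usr/bin/env bash
# Each line "A B" means step A must land before step B.
build_order() {
  tsort <<'EOF'
1 4
2 5
4 5
2 6
3 6
5 6
5 7
6 7
6 8
7 8
8 9
EOF
}

# Print one valid order on a single line.
build_order | tr '\n' ' '
echo
```

Steps 1-3 have no edges between them, so `tsort` may emit them in any relative order; only the 4 → 5 → 6 → 7 → 8 → 9 chain is fixed.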
+ +``` +Phase 7+8 (loops/erlang) ───┐ + │ + ▼ + ┌─── Step 1 ──┬─── Step 2 ──┬─── Step 3 + │ │ │ + └─────────────┼─── Step 4 ──┴────┐ + │ │ + └─── Step 5 ───────┤ + │ + Step 6 ─────┤ + │ + Step 7 ─────┤ + │ + Step 8 ─────┤ + │ + Step 9 ─────┘ +``` + +Estimated effort if done by a focused agent loop, one feature per iteration: +~30-50 commits across all 9 steps. Could plausibly be a `loops/fed-sx` workstream +once Phase 7+8 are done. + +## What's deferred to milestone 2 + +- **Federation** (the second-biggest piece). `POST /inbox`, Follow lifecycle, + delivery queue, backfill, capability negotiation between peers. Whole of + design §13. +- **Multi-actor** with per-user OAuth and capability tokens. Design §9.5. +- **IPFS storage backend** as a `DefineStorage` entry. Design §15.3. +- **Browser client + operator dashboard** (probably in Elm-on-SX or similar). +- **Rich verbs**: `Endorse`, `Supersede`, `Test`, `Build`, `Compose`, `Note`, + `Announce`. All defined as `DefineActivity` artifacts, federated. +- **Cross-host conformance** — Python/JS/Haskell hosts running fed-sx. Design + §11.8. +- **OpenTimestamps proofs** as a `DefineProof` entry. +- **Performance work** — JIT-compiled folds, snapshot acceleration, federation + batching. + +Milestone 2 unlocks "real federation between two fed-sx instances." Milestone 3 +is the rose-ash port (blog, market, events, federation, account, orders) as +fed-sx applications. + +--- + +## Appendix A: open questions for milestone 1 + +A few things still under-specified; resolve as work begins. + +1. **HTTP server library.** Does the Phase 8 `http:listen/2` BIF wrap an + existing OCaml HTTP server (the sx.rose-ash.com one) or something simpler? + Implementation choice deferred to Phase 8. +2. **JSON-LD library.** AP wire format requires JSON-LD canonicalization for + signature coverage. Either pull a library or write a minimal subset for the + shapes we actually use. Probably the latter — our envelope is well-defined. +3. 
**Bearer token rotation.** v1 uses a single env-var token. Token rotation + without restart needs registry-style mgmt; can wait. +4. **Snapshot rate limits.** Default in design is "every 1000 activities or + 60 seconds." Tunable per-projection later; v1 uses the default. +5. **Genesis bundle format.** Dag-cbor map per §12.2; concrete schema needs + one round of refinement once we author the actual definitions in step 4. diff --git a/plans/sx-vm-opcode-extension.md b/plans/sx-vm-opcode-extension.md new file mode 100644 index 00000000..034515bb --- /dev/null +++ b/plans/sx-vm-opcode-extension.md @@ -0,0 +1,430 @@ +# SX VM Opcode Extension Mechanism + +Mechanism in `hosts/ocaml/evaluator/` that lets language ports register +specialized bytecode opcodes without modifying the SX VM core. Direct +prerequisite for **erlang-on-sx Phase 9** (the BEAM analog) and a structural +enabler for any future language port that wants performance-critical opcodes. + +Reference: `plans/erlang-on-sx.md` Phase 9, `plans/fed-sx-design.md` §17.5, +`hosts/ocaml/lib/sx_vm.ml` (current VM). + +Status: **design** — implementation pending. Sister workstream to the +`loops/erlang` loop, but lives in `hosts/`, not `lib/erlang/`. + +--- + +## Goal + +Allow language ports to register custom bytecode opcodes in the SX VM, with: + +- **Zero overhead for core opcodes.** Existing 37 opcodes (per `sx_vm.ml`) + must dispatch identically. No regression for any existing language port or + the core SX runtime. +- **One additional dispatch step for extension opcodes.** Acceptable cost; the + win comes from avoiding the general CEK machinery. +- **Per-extension state slot.** Erlang's process scheduler, Haskell's thunk + cache, etc. need somewhere to hang state alongside the VM. +- **Compiler awareness.** The bytecode compiler (`lib/compiler.sx`) must be + able to emit extension opcodes by name, looked up against the registered + set. 
+- **JIT compatibility.** Existing JIT (lazy lambda compilation) continues to + work for code paths using only core opcodes. Extension opcodes are + interpreted in v1; JITing them is a follow-up. + +## Non-goals + +- **Hot opcode reload.** Adding/replacing opcodes mid-runtime is not in + scope. Extensions are compile-time additions to the OCaml binary. (If + needed, that's a separate project.) +- **Per-instance opcode sets.** All running instances of the SX VM share + the same opcode set determined at build time. Selective opcode loading + per instance is out of scope. +- **Opcode hot-swap or supersession.** Once registered, opcodes are stable + for the lifetime of the binary. +- **Language-port isolation at the dispatch layer.** Two language ports can + see each other's opcodes (they share the dispatch table). Isolation is a + build-time concern — don't compile in extensions you don't trust. + +--- + +## Why now + +The Erlang-on-SX Phase 9 work needs this. Without it, Phase 9b-9g (the actual +opcode implementations) have nowhere to plug in. The Erlang loop will hit +this dependency as a Blocker; this design is what unblocks it. + +It also enables the **shared opcode pattern** discussed in +`plans/fed-sx-design.md` §17.5: opcodes Erlang Phase 9 produces that other ports could +plausibly use (pattern match, perform/handle, record access) get chiselled +out to `lib/guest/vm/` when a second port has an actual second use. Without +the extension mechanism, each port would have to fork the SX VM core or +modify shared dispatch — neither acceptable. + +--- + +## Architectural overview + +``` + ┌──────────────────────────────────────────┐ + │ SX VM core (hosts/ocaml/lib/sx_vm.ml) │ + │ │ + │ ┌────────────────────────────────────┐ │ + │ │ Bytecode dispatch loop │ │ + │ │ │ │ + │ │ match op with │ │ + │ │ | 1 (OP_CONST) -> ... │ │ + │ │ | 2 (OP_NIL) -> ... │ │ + │ │ | ... │ │ + │ │ | 127 -> ... 
(last core opcode) │ │ + │ │ | op when op >= 128 -> │ │ + │ │ Extensions.dispatch op vm │ │ ◄── new + │ │ frame │ │ + │ └────────────────────────────────────┘ │ + │ │ + │ ┌────────────────────────────────────┐ │ + │ │ Extension registry │ │ + │ │ opcode_id -> handler │ │ ◄── new + │ │ opcode_name -> opcode_id │ │ + │ │ extension_state per extension │ │ + │ └────────────────────────────────────┘ │ + └──────────────────────────────────────────┘ + ▲ + │ register at startup + ┌──────────────────┴──────────────────────┐ + │ Extension modules │ + │ hosts/ocaml/extensions/erlang.ml │ + │ hosts/ocaml/extensions/haskell.ml │ + │ hosts/ocaml/extensions/datalog.ml │ + │ hosts/ocaml/extensions/guest_vm.ml │ ◄── shared opcodes + └─────────────────────────────────────────┘ +``` + +### Opcode ID space partition + +Current SX VM uses opcode IDs in roughly the range 1-162 (per inspection of +`sx_vm.ml`). We partition the 0-255 space: + +| Range | Use | +|-------|-----| +| 0 | reserved / NOP | +| 1-127 | **core opcodes** — owned by the SX VM, locked schema | +| 128-199 | **`lib/guest/vm/` shared opcodes** — chiselled-out shared opcodes | +| 200-247 | **language-port opcodes** — registered by extensions | +| 248-255 | reserved for future expansion / multi-byte opcodes | + +This gives 72 slots for shared opcodes (Phase 1-2 of `lib/guest/vm/` will +not exhaust this; we can renegotiate if it does), 48 slots for language-port +opcodes (shared by every port compiled into a given binary, since they share one dispatch table), and clean separation that makes it obvious which +opcodes are stable (core), shared (guest), or port-specific (extension). + +If we need more than 256 opcodes total, multi-byte opcodes (a leading 248-255 +byte plus a second byte) extend the space without breaking the schema. + +### Extension module signature + +```ocaml +(* hosts/ocaml/lib/sx_vm_extension.ml *) + +(** A handler for an extension opcode. 
Reads operands from bytecode, + manipulates the VM stack, updates the frame's instruction pointer. 
+    May raise exceptions (which propagate via the existing VM error path). *)
+type handler = vm -> frame -> unit
+
+(** State an extension carries alongside the VM. Opaque to the VM core;
+    extensions cast as needed. *)
+type extension_state = ..
+
+module type EXTENSION = sig
+  (** Stable name for this extension (e.g. "erlang", "guest_vm"). *)
+  val name : string
+
+  (** Initialize per-instance state. Called once when the VM starts and the
+      extension is loaded. *)
+  val init : unit -> extension_state
+
+  (** Opcodes this extension provides. Each is (opcode_id, opcode_name, handler).
+      opcode_id must be in the range allowed for this extension's tier
+      (128-199 for guest, 200-247 for ports). Conflicts cause startup failure. *)
+  val opcodes : extension_state -> (int * string * handler) list
+end
+```
+
+### Registration and dispatch
+
+```ocaml
+(* hosts/ocaml/lib/sx_vm_extensions.ml *)
+
+(* Raised by dispatch for an unregistered opcode. Declared here so the sketch
+   is complete; drop it if sx_vm.ml already defines an equivalent. *)
+exception Invalid_opcode of int
+
+let extensions : (module EXTENSION) list ref = ref []
+let states : (string, extension_state) Hashtbl.t = Hashtbl.create 8
+let by_id : (int, handler) Hashtbl.t = Hashtbl.create 64
+let by_name : (string, int) Hashtbl.t = Hashtbl.create 64
+
+(* Initialise the extension, then record each opcode by id and by name.
+   A duplicate id aborts startup. *)
+let register (m : (module EXTENSION)) =
+  let module M = (val m) in
+  let st = M.init () in
+  Hashtbl.add states M.name st;
+  List.iter (fun (id, name, h) ->
+    if Hashtbl.mem by_id id then
+      failwith (Printf.sprintf "Opcode %d (%s) already registered" id name);
+    Hashtbl.add by_id id h;
+    Hashtbl.add by_name name id
+  ) (M.opcodes st);
+  extensions := m :: !extensions
+
+let dispatch op vm frame =
+  match Hashtbl.find_opt by_id op with
+  | Some handler -> handler vm frame
+  | None -> raise (Invalid_opcode op)
+
+let id_of_name name = Hashtbl.find_opt by_name name
+let state_of_extension name = Hashtbl.find_opt states name
+```
+
+The dispatch path adds **one hashtable lookup per extension opcode**.
+That cost is acceptable: the Erlang plan expects 10-100× speedups on
+receive-heavy workloads from specialized opcodes compared with going through
+the general CEK machine, so the lookup overhead is negligible by comparison.
+
+### Bytecode compiler integration
+
+The compiler (`lib/compiler.sx`) needs to know extension opcode IDs to emit
+them. New SX primitive exposed to the compiler:
+
+```sx
+(extension-opcode-id "erlang.OP_PATTERN_TUPLE_2") ; → 200, or nil if not loaded
+```
+
+When the compiler wants to emit a specialized opcode, it queries by name. If
+the extension isn't loaded, the compiler falls back to the general path
+(emit a `CALL_PRIM` or general SX `case`). This means a language port's
+optimization is opt-in per build, and missing extensions degrade to slower
+correct execution rather than failure.
+
+Naming convention: `<extension>.OP_<NAME>`. So `erlang.OP_PATTERN_TUPLE_2`,
+`guest_vm.OP_PERFORM`, etc.
+
+### Per-extension state access
+
+Some opcodes need state beyond the VM stack (Erlang's scheduler, mailbox
+state, etc.). Extensions store state in their `init`-returned value, accessed
+via `state_of_extension`:
+
+```ocaml
+let op_spawn vm frame =
+  let st = Sx_vm_extensions.state_of_extension "erlang"
+           |> Option.get
+           |> Obj.magic in  (* extension casts to its known type *)
+  let body = pop vm in
+  let pid = Erlang_scheduler.spawn st body in
+  push vm (pid_value pid);
+  frame.ip <- frame.ip + 1
+```
+
+Shared scheduler state lives in the Erlang extension's state value. Other
+extensions don't see it.
+
+---
+
+## Phase plan
+
+Five sub-phases in dependency order. Each is testable in isolation.
+
+### Phase A — Opcode ID partition + dispatch fallthrough
+
+Smallest viable change to `sx_vm.ml`:
+
+- Add the `| op when op >= 128 -> Sx_vm_extensions.dispatch op vm frame`
+  fallthrough case.
+- Document the partition in a comment at the top of the opcode list.
+
+**Tests:**
+- All existing SX VM tests pass unchanged (zero regression for core).
+- Calling `dispatch 200 ...` with no extension registered raises
+  `Invalid_opcode 200`.
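
The fallthrough arm and the second test above can be sketched against stand-in types. This is a minimal illustration, not the real `sx_vm.ml`: the `vm` and `frame` records, the two toy core opcodes, and the local `by_id` table are all assumptions made so the block compiles on its own.

```ocaml
(* Sketch only: [vm], [frame] and the core arms below are stand-ins,
   not the real definitions in sx_vm.ml. *)
exception Invalid_opcode of int

type vm = { mutable stack : int list }
type frame = { mutable ip : int }

(* Stand-in for the extension registry: opcode id -> handler. *)
let by_id : (int, vm -> frame -> unit) Hashtbl.t = Hashtbl.create 16

let dispatch_extension op vm frame =
  match Hashtbl.find_opt by_id op with
  | Some handler -> handler vm frame
  | None -> raise (Invalid_opcode op)

(* One step of the dispatch loop: core arms first, extension fallthrough last. *)
let step op vm frame =
  match op with
  | 0 ->
    (* NOP *)
    frame.ip <- frame.ip + 1
  | 1 ->
    (* toy core opcode: push 0 *)
    vm.stack <- 0 :: vm.stack;
    frame.ip <- frame.ip + 1
  | op when op >= 128 ->
    (* everything at or above 128 belongs to extensions *)
    dispatch_extension op vm frame
  | op ->
    (* unassigned id inside the core range *)
    raise (Invalid_opcode op)
```

Because the guarded arm comes after every core arm, core opcodes never pay for the hashtable lookup; only IDs at or above 128 reach the registry.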
+
+**Effort:** small. ~50 lines + tests.
+
+### Phase B — Extension registry module
+
+`hosts/ocaml/lib/sx_vm_extensions.ml` per the sketch above. Pure plumbing, no
+opcodes yet.
+
+**Tests:**
+- Register a test extension with one opcode; dispatch finds it.
+- Duplicate opcode-id registration fails at startup.
+- `id_of_name` and `state_of_extension` lookups work.
+
+**Effort:** small. ~150 lines + tests.
+
+### Phase C — Compiler-side opcode lookup primitive
+
+Expose `extension-opcode-id` as an SX primitive in `hosts/ocaml/lib/`. The
+compiler in `lib/compiler.sx` can call it to emit extension opcodes by name.
+
+Does not require any extension to actually exist — the primitive returns
+`nil` for unknown names, and the compiler falls back.
+
+**Tests:**
+- Primitive returns nil for unknown name.
+- After registering a test extension, primitive returns the registered ID.
+
+**Effort:** small. Single primitive registration + compiler-side use docs.
+
+### Phase D — Test extension demonstrating end-to-end flow
+
+A dummy extension at `hosts/ocaml/extensions/test_ext.ml` registering one or
+two trivial opcodes (e.g. `OP_TEST_PUSH_42`, `OP_TEST_DOUBLE_TOS`). Wired
+into the build, available when running tests.
+
+Compiler test: write SX that triggers the test compiler-extension to emit
+`OP_TEST_PUSH_42`, then verify the VM executes it correctly via
+`bytecode-inspect` and `vm-trace`.
+
+**Tests:**
+- Bytecode emission via name lookup produces the right ID.
+- Execution produces the expected stack effect.
+- `bytecode-inspect` shows the opcode by name.
+- `vm-trace` correctly reports the extension opcode.
+
+**Effort:** small. ~100 lines including build wiring.
+
+### Phase E — JIT awareness (interpreted-only for v1)
+
+The JIT (lazy lambda compilation) currently compiles based on opcode ranges.
+Extension opcodes (≥128) should fall through to interpretation, not be
+JIT-compiled in v1.
+
+- Mark extension opcodes as "interpret only" in the JIT pre-analysis.
+- A lambda containing only core opcodes JIT-compiles as before.
+- A lambda containing any extension opcode runs interpreted.
+
+JITing extension opcodes is a follow-up project; v1 keeps the JIT scope
+unchanged and just makes it correctly route mixed bytecode.
+
+**Tests:**
+- Lambda with only core opcodes: JIT-compiled, fast path.
+- Lambda with extension opcode: interpreted, correct result.
+- Mixed lambda: interpreted, correct result.
+
+**Effort:** small-medium. Requires understanding the JIT's pre-analysis
+(per `project_jit_compilation.md` memory: "Lazy JIT implemented: lambda
+bodies compiled on first VM call, cached, failures sentinel-marked").
+Extension-opcode detection becomes another reason to mark a lambda
+"interpret-only."
+
+---
+
+## Acceptance criteria
+
+1. **Phase A-D pass their test suites.**
+2. **Zero regression on existing SX VM tests.** All language-port test
+   suites currently passing on the architecture branch (Erlang 530+, Haskell
+   285+, Datalog 276+, Smalltalk 625+, the SX core test suite, etc.) still
+   pass.
+3. **Test extension demonstrates the flow end-to-end.** SX source compiles
+   via the compiler with a registered extension opcode, executes through the
+   VM via the dispatch fallthrough, returns correct result.
+4. **Documentation:** README in `hosts/ocaml/extensions/` explaining the
+   pattern, with a worked example (the test extension is the canonical one).
+
+After acceptance, the Erlang-on-SX Phase 9 work in `lib/erlang/vm/` can use
+this mechanism. The Erlang loop's Blocker for 9a is resolved.
+
+---
+
+## Risk and mitigation
+
+**Risk: regression in core opcode dispatch.** A misplaced `match` arm could
+break something. *Mitigation:* run every existing language-port test suite
+before merging. The cost of this verification is real — probably an hour of
+machine time — but cheaper than discovering it after the fact.
+
+**Risk: opcode ID conflicts as more extensions land.** If Erlang Phase 9
+claims IDs 200-220 and Haskell wants 215-235, we have a problem.
+*Mitigation:* maintain a registry document at
+`hosts/ocaml/extensions/README.md` listing claimed ID ranges per extension.
+Convention: each extension claims a contiguous block at first registration;
+collisions caught at startup with a clear error.
+
+**Risk: extension state types leak through `Obj.magic`.** The extension state
+is type-erased in the registry. *Mitigation:* extensions cast in their own
+opcode handlers, never expose state to other extensions or the VM core.
+First-class modules / GADTs could add more type safety; deferred unless
+this becomes a concrete pain point.
+
+**Risk: extensions become a back door for kernel mutation.** An extension
+opcode handler has full access to the VM. *Mitigation:* extensions are
+build-time additions, not runtime; they're as trusted as the rest of the
+binary. Operators audit at build time, not runtime. Same trust model as
+any other compiled-in code.
+
+**Risk: shared `lib/guest/vm/` opcodes evolve under different language
+ports' needs.** *Mitigation:* the chiselling discipline (move to guest only
+on second use) ensures the shared opcodes are tested against at least two
+ports' actual usage before being considered stable.
+
+---
+
+## Open questions
+
+To be resolved during implementation, not blocking design approval:
+
+1. **Multi-byte opcode encoding.** If we need >256 opcodes total, the
+   leading-byte 248-255 schema accommodates it. Do we need multi-byte at
+   v1? Probably not — 200+ opcodes per port is more than any port should
+   reasonably want.
+2. **Does extension ordering matter?** If two extensions register opcodes
+   that read the same VM state, ordering of registration could matter for
+   initialization. Probably not in practice; flag if it bites.
+3. **Hot-reload of extensions.** Out of scope for v1 (per non-goals).
+   If wanted later, the registry would need teardown + re-registration; the
+   `gen_server` `code_change/3` model from Erlang Phase 7 is a precedent.
+4. **Cross-extension opcode composition.** Can `guest_vm.OP_PERFORM` invoke
+   `erlang.OP_RECEIVE_SCAN`? In principle yes — handlers can do anything.
+   The interface is clean; the question is whether we want any conventions
+   to keep ergonomics tractable. Defer until composition appears in
+   practice.
+
+---
+
+## Implementation roadmap and sequencing
+
+This is a sister workstream to `loops/erlang`. Probably best as a single
+focused session (not a continuous loop — the work is bounded, ~1-2 weeks
+of focused effort, not iterative).
+
+Recommended sequencing:
+
+1. **A + B + C land together** as a single PR — they're tightly coupled and
+   easier to test as a unit. Branch: `loops/sx-vm-extensions` or similar.
+2. **D follows** in a second PR; demonstrates the end-to-end flow without
+   committing to any real language port's opcode design.
+3. **E (JIT integration)** as a third PR, once the basic mechanism is
+   battle-tested.
+4. **Extension scope check:** verify Erlang's Phase 9 sub-phases 9b-9g can
+   actually use this mechanism. If gaps surface, they're addressable
+   incrementally.
+5. **`hosts/ocaml/extensions/erlang.ml`** then becomes the *first real
+   consumer* — written by whoever takes over from the Erlang loop's stub
+   dispatcher. That's the integration moment that closes the loop.
+
+Estimated total effort: 1-2 weeks for one focused engineer with OCaml SX VM
+familiarity. Much less if the implementer already knows `sx_vm.ml`.
+
+---
+
+## Relationship to other plans
+
+- **`plans/erlang-on-sx.md` Phase 9:** unblocked by this work. Erlang loop
+  develops opcodes against a stub dispatcher in `lib/erlang/vm/`; once this
+  mechanism lands, swap stub for real registration via
+  `hosts/ocaml/extensions/erlang.ml`.
+- **`plans/fed-sx-design.md` §17.5:** documents this as Layer-1 prerequisite.
+  The shared-opcode discipline is designed on top of the 128-199 opcode
+  range this mechanism allocates to `lib/guest/vm/`.
+- **Future language ports (Haskell, Datalog, Smalltalk perf phases):** will
+  use the same mechanism. Each adds an extension module, claims an opcode
+  range, registers handlers. The `lib/guest/vm/` opcodes get
+  cross-referenced when the second port's needs justify chiselling.
+- **JIT roadmap (per `project_jit_architecture.md` memory):** extension
+  opcodes are interpreted in v1. JITing them is a logical follow-up but
+  a separate project.
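
Each port "claims an opcode range", and the `EXTENSION` signature says IDs must sit inside the tier's range, but the registration sketch only rejects duplicate IDs. One way the tier check might look is sketched here. The `tier` type and `check_opcodes` function are hypothetical names introduced for illustration; only the numeric ranges come from the partition table.

```ocaml
(* Hypothetical helper: [tier], [range_of_tier] and [check_opcodes] are
   illustrative names, not part of the design. Ranges follow the partition
   table: 128-199 for lib/guest/vm/, 200-247 for language ports. *)
type tier = Guest | Port

let range_of_tier = function
  | Guest -> (128, 199)
  | Port -> (200, 247)

(* Returns the first offending (id, name), or None when every id lies in the
   tier's range and no id repeats within the list. *)
let check_opcodes tier (ops : (int * string) list) =
  let lo, hi = range_of_tier tier in
  let seen = Hashtbl.create 16 in
  List.find_opt
    (fun (id, _name) ->
      let bad = id < lo || id > hi || Hashtbl.mem seen id in
      Hashtbl.replace seen id ();
      bad)
    ops
```

Running such a check inside `register`, before any handler is added to the table, would turn an out-of-range claim into the same kind of startup failure as a duplicate ID.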