diff --git a/plans/erlang-on-sx.md b/plans/erlang-on-sx.md
index 8ca25a63..9492e3c3 100644
--- a/plans/erlang-on-sx.md
+++ b/plans/erlang-on-sx.md
@@ -122,10 +122,30 @@ Replace today's hardcoded BIF dispatch (`er-apply-bif`/`er-apply-remote-bif` in
 - [ ] `sqlite:open/1`, `sqlite:close/1`, `sqlite:exec/2`, `sqlite:query/2` — **BLOCKED** (no SQLite primitive). See Blockers.
 - [x] Tests: 1 round-trip per BIF; suite name `ffi`; conformance scoreboard auto-picks it up — **+14 ffi tests** at 637/637 total. Suite covers the 3 implemented file BIFs (9 tests: write-ok, read-ok-tag, payload-is-binary, byte_size content, missing-enoent, bad-path-enoent, binary-payload round-trip, delete-ok, read-after-delete-enoent) plus 5 negative asserts (one per blocked BIF — `crypto:hash`/`cid:from_bytes`/`file:list_dir`/`httpc:request`/`sqlite:exec`) so this suite fails fast if a future iteration adds a wrapper without registering proper tests. Target "+40 ffi tests" was relative to the original 5-BIF-family plan; with 5 of those families blocked on host primitives, the achievable count is 14 — the suite scaffolding is what matters and is ready to accept the remaining tests when the primitives land.
 
+### Phase 9 — specialized opcodes (the BEAM analog)
+
+**Driver:** Erlang-on-SX going through the general-purpose CEK machine has architectural perf ceilings (call/cc per receive, env-copy per call, mailbox rebuild on delete). The fix is specialized bytecode opcodes that bypass the general machinery for hot Erlang operations. Targets: 100k+ message hops/sec, 1M-process spawn in under 30sec. Layered perf strategy: Layer 1 (this) = specialized opcodes; Layer 2 (Phase 10, deferred) = multi-core scheduler.
+
+**Architectural note:** opcodes get developed in `lib/erlang/vm/` (in scope). The **opcode extension mechanism in `hosts/ocaml/`** (Phase 9a) is **out of scope** for this loop — log as Blocker until a session that owns `hosts/` lands it. Sub-phases 9b-9g design and test opcodes against a stub dispatcher in the meantime; integrate when 9a is available.
+
+**Shared-opcode discipline:** opcodes that another language port could plausibly use (pattern match, perform/handle, record access) get prepared for **chiselling out to `lib/guest/vm/`** when a second use materialises. Same lib/guest pattern, applied at the bytecode layer. Don't pre-extract; do annotate candidates in commit messages.
+
+- [ ] **9a — Opcode extension mechanism** (in `hosts/ocaml/evaluator/`) — **OUT OF SCOPE for this loop**. Log as Blocker. Lets `lib/<lang>/vm/` register opcodes without modifying SX VM core. Design lives in `plans/sx-vm-opcode-extension.md`.
+- [ ] **9b — `OP_PATTERN_TUPLE` / `OP_PATTERN_LIST` / `OP_PATTERN_BINARY`**: specialized pattern-match opcodes for Erlang's bread-and-butter `case` clauses. Replace SX-`case` dispatch on the hot path. Tests: every pattern shape, including nested. Conformance must remain 637/637 + all prior. Candidate for chiselling to `lib/guest/vm/match.sx`.
+- [ ] **9c — `OP_PERFORM` / `OP_HANDLE`** (algebraic effects style): replace the call/cc + raise/guard machinery used for `receive` suspension. Pure Erlang interface unchanged; underlying mechanism specialized. Candidate for chiselling (Scheme call/cc, OCaml 5 effects, miniKanren all want the same thing).
+- [ ] **9d — `OP_RECEIVE_SCAN`**: built on 9c. Specialized opcode for selective receive — scans mailbox in pattern order, suspends + binds on match. Should give 10-100× speedup on receive-heavy workloads (ring benchmark, bank, fib_server).
+- [ ] **9e — `OP_SPAWN` / `OP_SEND` + lightweight scheduler**: per-process register/heap layout, scheduler that runs Erlang bytecode units rather than going through general SX evaluator each time. Process record fields become VM register slots. Target: spawn cost under 50µs, send cost under 5µs.
+- [ ] **9f — BIF dispatch table**: `OP_BIF_<name>` for hot BIFs (`length/1`, `hd/1`, `tl/1`, `element/2`, `lists:reverse/1`, etc.) — direct dispatch, no registry lookup. Cold BIFs continue through the general dispatch path.
+- [ ] **9g — Conformance + perf bench**: full Phase 1-8 conformance must pass on the new VM. Ring benchmark target: **100k+ hops/sec at N=1000** (current ~30/sec → ~3000× speedup target). 1M-process spawn target: **under 30 seconds** (current ~9h extrapolation → ~1000× speedup target). Document achieved numbers in `lib/erlang/bench_ring_results.md`.
+
+**Acceptance:** ring benchmark hits the 100k hops/sec target. All prior phase tests pass. Two opcodes chiselled to `lib/guest/vm/` (or annotated as candidates with a written rationale).
+
 ## Progress log
 
 _Newest first._
 
+- **2026-05-14 Phase 9 scoped + supporting plan files synced** — Copied three plan files from `/root/rose-ash/plans/` (architecture branch) that this worktree was missing: `fed-sx-design.md` (124KB, the substrate design referenced from Phase 7/8 drivers), `fed-sx-milestone-1.md` (33KB, first concrete implementation milestone), `sx-vm-opcode-extension.md` (19KB, the prerequisite for Phase 9a — designs how `lib/<lang>/vm/` registers opcodes against the OCaml SX VM core). Then appended **Phase 9 — specialized opcodes (the BEAM analog)** to `plans/erlang-on-sx.md` covering sub-phases 9a-9g: 9a (opcode extension mechanism in `hosts/ocaml/`) is out-of-scope for this loop (will be logged as a Blocker when the next iteration tries to start it); 9b-9g (PATTERN_TUPLE/LIST/BINARY, PERFORM/HANDLE, RECEIVE_SCAN, SPAWN/SEND + lightweight scheduler, BIF dispatch table, conformance + perf bench) can be designed and tested against a stub dispatcher in the meantime. Targets: ring benchmark 100k+ hops/sec at N=1000 (~3000× speedup), 1M-process spawn under 30sec (~1000× speedup). Plan framing intact for Phase 7/8 — those reflect the actual implementation done in this loop; the architecture-branch framing diverges in language but the work is equivalent. No code touched this iteration. Total **637/637** unchanged.
+
 - **2026-05-14 ffi test suite extracted, conformance scoreboard auto-picks it up** — New `lib/erlang/tests/ffi.sx` with its own counter trio (`er-ffi-test-count`/`-pass`/`-fails`) and `er-ffi-test` helper following the same pattern as runtime/eval/ring tests. The 10 file BIF eval tests from the previous iteration moved out of `eval.sx` (eval dropped from 395 to 385 tests) and into the new suite where they're now 9 tests (consolidated the two write+read tests). `conformance.sh` updated: added `ffi` to `SUITES` array with `er-ffi-test-pass`/`-count` symbols, added `(load "lib/erlang/tests/ffi.sx")` after `fib_server.sx`, added `(epoch 109) (eval "(list er-ffi-test-pass er-ffi-test-count)")`. Scoreboard markdown auto-updated to include the row. Suite also asserts that the 5 blocked BIFs (`crypto:hash`, `cid:from_bytes`, `file:list_dir`, `httpc:request`, `sqlite:exec`) are NOT yet registered — turns a future "added the wrapper but forgot to extend ffi tests" into a hard failure. One eval-comparison gotcha en route: SX's `=` does identity equality on dicts so comparing two separately-constructed `(er-mk-atom "true")` values is false; the existing eval suite has an `eev-deep=` helper that handles this, but the simpler fix in ffi was to extract `:name` via `ffi-nm` and compare strings. Total **637/637** (+14 ffi). Phase 8 fully ticked aside from the BLOCKED bullets — those remain unchecked with explicit Blockers references.
 
 - **2026-05-14 file BIFs landed; crypto/cid/list_dir/http/sqlite blocked on missing host primitives** — Three new FFI BIFs registered in `runtime.sx`: `file:read_file/1`, `file:write_file/2`, `file:delete/1`. Each wraps the SX-host primitive (`file-read`, `file-write`, `file-delete`) inside a `guard` that converts thrown exception strings into Erlang `{error, Reason}` tuples. New helper `er-classify-file-error` does loose pattern-matching on the error message using `string-contains?` to map to standard POSIX-style reasons: `"No such"` → `enoent`, `"Permission denied"` → `eacces`, `"Not a directory"` → `enotdir`, `"Is a directory"` → `eisdir`, fallback `posix_error`. Filenames coerce through `er-source-to-string` so SX strings, Erlang binaries, and Erlang char-code lists all work. Read returns `{ok, Binary}` (bytes via `(map char->integer (string->list ...))` then `er-mk-binary`); write returns bare `ok`; delete returns bare `ok`. Bootstrap registrations added at the bottom of `er-register-builtin-bifs!` under `"file"`. 10 new eval tests: write-then-read round-trip, ok-tag, payload is binary, byte_size content, missing-file `enoent`, delete-ok, read-after-delete `enoent`, write to non-existent dir `enoent`, binary payload (5 raw bytes) round-trip preserving byte count. Blockers entry added covering five Phase 8 BIFs whose host primitives don't exist in this SX runtime: `crypto:hash/2`, `cid:from_bytes/1`/`to_string/1`, `file:list_dir/1`, `httpc:request/4`, `sqlite:open/exec/query/close`. Fix path documented inline (architecture-branch iteration to register OCaml-side primitives). Total **633/633** (+10 eval).
diff --git a/plans/fed-sx-design.md b/plans/fed-sx-design.md
new file mode 100644
index 00000000..62e811d1
--- /dev/null
+++ b/plans/fed-sx-design.md
@@ -0,0 +1,2638 @@
+# fed-sx — Federated SX Activity Substrate
+
+A federated, content-addressed, extensible application substrate where the unit of
+computation is a signed activity, the unit of state is a pure SX projection over the
+activity log, and the substrate's own extensibility (new verbs, new object types, new
+projections, new validators) is itself published through the same mechanism.
+
+Status: **design** — not yet implemented. Target subdomain: `next.rose-ash.com`.
+Target location in repo: `next/` (new top-level dir, sibling to `blog/`, `market/`,
+etc.). Stack: pure SX-on-OCaml. Implementation language(s) to be chosen after design
+is complete.
+
+---
+
+## 1. Premise
+
+ActivityPub's data model — actors, signed activities, inboxes/outboxes — generalises
+beyond social posting to any domain where state evolves via signed messages. fed-sx
+takes that generalisation seriously:
+
+- The unit of communication is a **signed AP activity**.
+- The unit of content is an **AP object**, content-addressed by **CID** (multihash +
+  multicodec, default `dag-cbor` over the parsed SX AST).
+- State is the **deterministic fold** of pure SX functions over the activity log.
+- The substrate is **self-extending**: new activity types, object types, projections,
+  validators, codecs, transports, and signature suites are themselves published as
+  `Define*` activities — federated like any other content.
+
+Three commitments make the rest fall into place:
+
+1. **The kernel is dumb.** It only knows envelope shape, signature verification,
+   append-to-log, fetch-by-id, transport in/out. It does not know what `Create` or
+   `Pin` *mean*.
+2. **Everything else is registry-driven.** Verbs, object types, validators, projections,
+   codecs, transports, audiences, proofs, sig suites — all looked up in registries the
+   kernel calls into.
+3. **The registries are themselves publishable.** New entries arrive as `Define*`
+   activities. Bootstrap registries load from a known set of CIDs at startup; everything
+   else is replayed from the log.
+
+Result: the only code that ever needs to change in the kernel is the envelope itself.
+New verbs = published SX, federated like any other artifact.
+
+---
+
+## 2. CIDs and content addressing
+
+Every artifact has a CID. Default codec is **dag-cbor** over the parsed SX AST (not
+the raw text). This buys:
+
+- **Sub-AST addressing for free.** Each nested structure has an implicit CID; IPLD can
+  walk paths like `<file-cid>/components/card`. The "file CID *and* component CID"
+  question dissolves: every node is a CID, you choose the granularity at reference
+  time.
+- **Polyglot canonicalization.** JS, OCaml, Python only need to agree on AST shape +
+  CBOR's deterministic encoding (RFC 8949 §4.2.1). No byte-identical pretty-printer
+  required across hosts.
+- **Format immunity.** Reformatting, indent changes, equivalent-form normalisations
+  do not change the CID.
+- **Tooling fit.** sx-tree already has the parsed form in memory; computing or
+  verifying a CID is just an encode + hash.
+
+Costs accepted:
+- One spec to maintain: SX↔CBOR mapping (number → CBOR int/float, string → text,
+  symbol → tag, keyword → tag, list → array, dict → map). ~50 lines of code per host.
+- Author's exact source text is not preserved; re-pretty-print on fetch.
+- "Why don't these CIDs match" requires comparing CBOR (a `cid-explain` tool helps).
+
+The CID format itself is multicodec-agile: the substrate also accepts `raw`,
+`dag-json`, `dag-pb`, etc. when seen, dispatched via the codec registry.
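+
+The "encode + hash" step can be sketched in Python. This is a minimal illustration
+under stated assumptions, not the fed-sx codec: it handles only integers, text, lists
+and maps, leaves out the symbol/keyword tags (the design does not fix their tag
+numbers), and stops at the SHA-256 digest instead of building a full multiformat CID.
+The names `cbor_head`, `encode` and `ast_digest` are hypothetical.
+
+```python
+import hashlib
+
+
+def cbor_head(major: int, n: int) -> bytes:
+    """Encode a CBOR head in its shortest form (RFC 8949 deterministic rule)."""
+    if n < 24:
+        return bytes([(major << 5) | n])
+    for extra, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
+        if n < (1 << (8 * size)):
+            return bytes([(major << 5) | extra]) + n.to_bytes(size, "big")
+    raise ValueError("integer too large for this sketch")
+
+
+def encode(value) -> bytes:
+    """Deterministically encode a small subset of values as CBOR."""
+    if isinstance(value, bool):
+        raise TypeError("booleans are out of scope for this sketch")
+    if isinstance(value, int):
+        # Major type 0 for non-negative, major type 1 for negative integers.
+        return cbor_head(0, value) if value >= 0 else cbor_head(1, -1 - value)
+    if isinstance(value, str):
+        raw = value.encode("utf-8")
+        return cbor_head(3, len(raw)) + raw
+    if isinstance(value, list):
+        return cbor_head(4, len(value)) + b"".join(encode(v) for v in value)
+    if isinstance(value, dict):
+        # Sort entries by the bytes of their encoded keys, so insertion order
+        # in the source never leaks into the output.
+        items = sorted((encode(k), encode(v)) for k, v in value.items())
+        return cbor_head(5, len(items)) + b"".join(k + v for k, v in items)
+    raise TypeError(f"unsupported type: {type(value).__name__}")
+
+
+def ast_digest(ast) -> str:
+    """SHA-256 over the canonical encoding; the hash half of a CID."""
+    return hashlib.sha256(encode(ast)).hexdigest()
+```
+
+Two dicts that differ only in key order encode to the same bytes, which is the
+"format immunity" property in miniature: the digest depends on AST shape, not on how
+the source was laid out.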
+
+---
+
+## 3. Kernel surface (fixed — get this right)
+
+The kernel is the only thing that's hard to change later. Everything else is in
+registries. Two envelope shapes plus five operations.
+
+### 3.1 Activity envelope
+
+```
+{ id, type, actor, published,
+  to, cc, audience-extras,
+  object | target | origin | result,    # AP slots, opaque to kernel
+  capabilities-required: [...],         # so receivers can refuse cleanly
+  proofs: [...],                        # OTS, on-chain, multi-sig — all opaque
+  signature: { key-id, algorithm, value, covered-fields } }
+```
+
+### 3.2 Object envelope
+
+```
+{ id, type, cid, media-type,
+  where: inline | cid | url,
+  content?, link? }                     # only one populated based on `where`
+```
+
+### 3.3 Kernel verbs
+
+The only verbs implemented directly by the kernel:
+
+- **Append signed activity** to outbox (after envelope check + sig verify + validator
+  pipeline).
+- **Verify signature** against actor's published keys, time-aware (which key was
+  active at `published`).
+- **Fetch** by `id` or by `cid`.
+- **Receive at inbox** (verify + dispatch to registered handlers).
+- **Replay log** to rebuild registries on boot.
+
+Everything else is registry-resolved.
+
+---
+
+## 4. Registries
+
+Each registry has a default-populated set (loaded from genesis-bundled CIDs) and
+accepts new entries via `Define*` activities. Default entries themselves are SX
+artifacts — versioning, audit, replacement work the same way as user content.
+
+| Registry | Bootstrap defaults | Extended by |
+|----------|-------------------|-------------|
+| **Activity types** | `Create`, `Update`, `Delete`, `Announce` | `DefineActivity{type, schema-sx, semantics-sx}` |
+| **Object types** | `SXArtifact`, `Note`, `Image`, `Tombstone` | `DefineObject{type, schema-sx, render-hint}` |
+| **Validators** | envelope shape, signature, type-schema | `DefineValidator{applies-to, predicate-sx}` |
+| **Projections** | identity, by-type, by-cid, by-actor, actor-state, define-registry, audience-graph, by-object | `DefineProjection{name, fold-sx, query-sx}` |
+| **Codecs** | dag-cbor, raw, dag-json | `DefineCodec{multicodec, encode-sx, decode-sx}` |
+| **Hash algorithms** | sha2-256 | multihash table — agile by spec |
+| **Transports** | http-inbox-push | `DefineTransport{name, deliver-sx, receive-sx}` |
+| **Audience predicates** | `Public`, `Followers`, direct | `DefineAudience{name, member-of-sx}` |
+| **Subscription types** | `Follow` (AP-standard) | `DefineSubscription{name, schema-sx, match-sx, delivery}` |
+| **Proof types** | (none) | `DefineProof{type, attach-sx, verify-sx}` |
+| **Storage backends** | files-on-disk | `DefineStorage{where-tag, put-sx, get-sx}` |
+| **Triggers** | (none) | `DefineTrigger{when-subscription, then-sx, cascade-limit}` |
+| **Signature suites** | rsa-sha256 (AP-compatible) | `DefineSigSuite{name, sign-sx, verify-sx}` |
+| **Application bundles** | (none) | `DefineApplication{name, subscriptions, triggers, projections, storage}` |
+
+Adding `Pin`, `Endorse`, `Supersede`, `Test`, `Build`, `Compose`, etc. later is just
+publishing `DefineActivity` artifacts — no kernel diff, no redeploy required if
+registries are hot.
+
+---
+
+## 5. The meta-level
+
+A `DefineActivity` is itself an AP `Create` activity over an `SXArtifact` of a
+specific type:
+
+```sx
+(activity 'Create
+  :object {:type "DefineActivity"
+           :name "Pin"
+           :schema (fn (act)
+             (and (string? (-> act :object :path))
+                  (cid? (-> act :object :cid))))
+           :semantics
+           '(fn (act state)
+             (assoc-in state [:pins (-> act :object :path)]
+                       (-> act :object :cid)))})
+```
+
+When the kernel receives an activity with `type: "Pin"` it looks up the registered
+semantics from a `DefineActivity{name: "Pin"}` artifact, runs the SX, projects the new
+state. The semantics are themselves content-addressed and federated — every receiver
+runs the same code.
+
+Same pattern handles `DefineProjection`, `DefineValidator`, etc. The substrate is
+genuinely self-extending.
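+
+The lookup-then-run step might look like the following Python sketch. It is an
+illustration only: a plain dict stands in for the activity-type registry,
+`ACTIVITY_SEMANTICS` and `apply_activity` are hypothetical names, and `pin_semantics`
+is a hand translation of the SX semantics above rather than evaluated SX.
+
+```python
+def pin_semantics(act, state):
+    """Python rendering of the `Pin` semantics: record path -> CID."""
+    pins = dict(state.get("pins", {}))  # copy, so the old state is untouched
+    pins[act["object"]["path"]] = act["object"]["cid"]
+    return {**state, "pins": pins}
+
+
+# Hypothetical in-memory stand-in for the activity-type registry.
+ACTIVITY_SEMANTICS = {"Pin": pin_semantics}
+
+
+def apply_activity(state, act):
+    """Dispatch on the activity type; the caller knows nothing about `Pin`."""
+    semantics = ACTIVITY_SEMANTICS.get(act["type"])
+    if semantics is None:
+        # Assumption: an unregistered verb leaves projected state unchanged.
+        return state
+    return semantics(act, state)
+```
+
+Registering a new verb in this sketch is one more dict entry; `apply_activity` itself
+never changes, which mirrors the "no kernel diff" claim.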
+
+---
+
+## 6. Verbs
+
+### 6.1 Bootstrap verbs (milestone 1)
+
+The substrate exposes `POST /activity` (not `POST /publish`) — generalised entry
+point that takes any well-formed AP activity, validates, signs, appends to outbox.
+`(publish sx)` is sugar at the SX layer for `Create{SXArtifact}`.
+
+Day-one verbs (cost ~zero once `/activity` exists):
+
+- **`Create`** — the publish primitive.
+- **`Update`** — supersede a previous activity (correct metadata, change a path
+  mapping). Distinct from "publishing new content" — new content is always a new
+  `Create` with a new CID.
+- **`Delete`** — tombstone. AP-native; readers honour it.
+- **`Announce`** — boost another actor's artifact into your outbox. Comes free.
+- **`Subscribe`** — generalised subscription verb (parallel to publish/`Create`).
+  Wraps any registered `DefineSubscription` type. The standard AP `Follow` is
+  expressed as `Subscribe{Follow{actor: ...}}` for wire compatibility. See §18.
+- **`Unsubscribe`** — `Undo` of a prior `Subscribe`. Same shape as AP
+  `Undo{Follow}`.
+
+### 6.2 Custom verbs (designed-for, defined later)
+
+Substrate accepts these from day one (any signed activity can be appended); semantics
+projected once `DefineActivity` artifacts exist.
+
+- **`Pin`** — assign `domain:path/name → CID`. The future name-resolution layer made
+  of activities. Each pin is signed; the resolver replays the outbox to compute current
+  state.
+- **`Endorse`** (modelled on `Like`/`Approve`) — third-party signature on a CID.
+  Web-of-trust style code review without central authority.
+- **`Supersede`** — "CID A replaces CID B". Stronger than `Update`; readers can chase
+  the chain.
+- **`Test`** — published assertion that running CID A under conditions X yields result
+  Y. Test-as-artifact, federated.
+- **`Build`** — links a source CID to a compiled-output CID, with provenance.
+- **`Compose`** — derived artifact citing input CIDs. Provenance graph in the outbox
+  itself.
+- **`Note`** (AP-native) — comments / reviews / discussion attached to a CID.
+- **`Follow`** / **`Undo(Follow)`** — subscribe to another instance's outbox.
+
+The pattern that matters: your outbox isn't just "things published," it's an
+**append-only log of every assertion this actor makes about the SX universe.**
+
+---
+
+## 7. Capability discovery
+
+Two pieces:
+
+- **`GET /.well-known/sx-capabilities`** — JSON listing every registered activity-type,
+  object-type, codec, transport, sig-suite, proof-type. Each with the CID of the
+  `Define*` artifact that introduced it. Peers can diff capabilities before federating.
+- **`capabilities-required`** field on activities — sender declares "this needs `Pin`
+  semantics + `dag-cbor` codec." Receivers without those capabilities return a clean
+  422 referencing the missing CIDs; sender knows whether to replay-and-deliver the
+  bootstrapping `Define*` artifacts first.
+
+Federation degrades gracefully across instances at different versions.
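+
+A receiver-side check for `capabilities-required` could be sketched as below. This is
+a hedged illustration: the design does not fix how capability entries are spelled, so
+they are treated here as opaque identifiers, and `missing_capabilities` and
+`check_activity` are hypothetical names.
+
+```python
+def missing_capabilities(required, registered):
+    """Return the required capability identifiers this instance lacks."""
+    return [name for name in required if name not in registered]
+
+
+def check_activity(activity, registered):
+    """Return a 422-style refusal, or None when nothing is missing."""
+    required = activity.get("capabilities-required", [])
+    missing = missing_capabilities(required, registered)
+    if missing:
+        # The sender can use `missing` to decide which Define* artifacts
+        # to deliver before retrying.
+        return {"status": 422, "missing": missing}
+    return None  # normal inbox processing continues
+```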
+
+---
+
+## 8. Axes of flexibility (all designed-for)
+
+1. **Object types** beyond SXArtifact — `Note`, `Article`, `Image`, `Video`, `Question`,
+   `Event`, etc. via the object-type registry.
+2. **Storage tier per-object** — `where: inline | cid | url`. Tiny things inline; big
+   things to IPFS; legacy stuff URL-linked. Migrating storage backends doesn't migrate
+   the substrate.
+3. **Multihash + multicodec agility** — sha2-256 + dag-cbor by default; substrate
+   accepts blake3, raw, dag-json, dag-pb, etc.
+4. **Multi-key actors** — `publicKeys` array always; per-key `purpose`; multiple key
+   types (RSA for AP wire compat, Ed25519 modern). See §9.
+5. **Audience / visibility** — AP-native `to`, `cc`, `bto`, `bcc`. Public, followers,
+   direct, unlisted. Custom audiences via `DefineAudience`.
+6. **Outbox-as-database** — no source-of-truth other than the log. Projections are
+   recomputable views.
+7. **Programmable activities** — activities can carry SX. Reactive federation,
+   conditional pins, automated propose/test/release pipelines, all expressed as AP
+   activities.
+8. **Federation transport pluggable** — outbox is canonical; how peers exchange is
+   pluggable (HTTP push, pull, libp2p, polling).
+9. **Optional timestamp proofs** — every activity has an attachable `proofs` slot.
+   OpenTimestamps, on-chain merkle commit, third-party TSA all slot in without changing
+   activity semantics.
+
+Explicitly **not** pursuing for MVP:
+- Schema-version negotiation (premature; `@context` handles extension).
+- Configurable conflict-resolution per actor (last-signed-wins, log preserved for
+  audit).
+- Verb-specific kernel handlers (other than `Create`'s "compute CID, store body").
+
+---
+
+## 9. Identity & actor lifecycle
+
+### 9.1 Actor doc shape
+
+```jsonld
+{
+  "@context": ["https://www.w3.org/ns/activitystreams",
+               "https://w3id.org/security/v1",
+               "https://next.rose-ash.com/ns/fed-sx/v1"],
+  "type": "Person",                       // or Service, Group, Application
+  "id": "https://next.rose-ash.com/actors/giles",
+  "preferredUsername": "giles",
+  "inbox": "https://next.rose-ash.com/actors/giles/inbox",
+  "outbox": "https://next.rose-ash.com/actors/giles/outbox",
+  "followers": "...",
+  "following": "...",
+
+  "publicKeys": [                         // ARRAY from day one — never `publicKey`
+    { "id": "...#key-2026-05",
+      "type": "RsaVerificationKey2018",
+      "owner": "<actor-id>",
+      "publicKeyPem": "...",
+      "purpose": ["sign-activity", "sign-http"],
+      "created": "2026-05-14T...",
+      "expires": null,
+      "supersedes": null,
+      "supersededBy": null },
+    { "id": "...#key-ed25519-2026-05",
+      "type": "Ed25519VerificationKey2020",
+      "owner": "<actor-id>",
+      "publicKeyMultibase": "z6Mk...",
+      "purpose": ["sign-activity"],
+      "created": "2026-05-14T..." }
+  ],
+
+  "capabilities": "https://.../actors/giles/capabilities",  // what verbs they speak
+  "alsoKnownAs": ["did:web:rose-ash.com:giles", ...],       // bridge to DID, AP migration
+  "movedTo": null                                            // set on Move
+}
+```
+
+Key shape decisions:
+
+- **`publicKeys` array always.** Single-key actors have an array of length 1. The
+  AP-standard singular `publicKey` field is *also* served, mirroring the first array
+  element, for back-compat with vanilla AP servers (Mastodon etc. ignore the array).
+- **Per-key `purpose`** — separates signing weight. Day-to-day publish key vs.
+  high-value key for `Pin`/`Endorse` vs. delegated machine key. Validators can require
+  specific purposes per activity type (registry-driven).
+- **Multiple key types** — RSA for AP wire compat, Ed25519 for everything else
+  (smaller, faster, modern). Sig suite registry decides which suites are accepted.
+- **`supersedes` / `supersededBy`** — keys form a chain, not a snapshot. Old activities
+  still verify against historical keys.
+
+### 9.2 Key rotation
+
+Key rotation is itself an activity, signed by the *old* key (or a recovery key):
+
+```sx
+(activity 'Update
+  :object actor-id
+  :patch {:add-publicKey new-key
+          :supersede {old-key-id new-key-id}})
+```
+
+Kernel:
+1. Fetches actor's current state (a projection over their own outbox).
+2. Verifies activity is signed by a key with `purpose: rotate-key` (or any active key,
+   if registry allows).
+3. Appends. The actor-state projection now has the new key.
+
+Old activities still verify because the projection retains the historical key with
+`supersededBy` set — sig verification looks up "what keys were active at activity
+timestamp T."
+
+### 9.3 Key recovery / loss
+
+- **Recovery key** — separate key at actor creation, never used except to rotate.
+  Stored offline. `purpose: ["recover"]`. Validator allows
+  `Update{actor, patch: rotate-all-keys}` if signed by a recovery key.
+- **Social recovery** — designate N trusted actors, M-of-N can co-sign a recovery
+  `Update`. Implemented as a `DefineValidator` extension; multi-sig slot in `proofs`
+  makes it possible without changing the envelope.
+- **Total loss** — if both signing and recovery keys are gone, the actor is dead.
+  They publish a new actor with `alsoKnownAs: <old-actor-id>` from a fresh key.
+  Followers can choose to re-follow but there's no cryptographic continuity.
+
+### 9.4 Migration (`Move`)
+
+AP-native:
+
+```sx
+(activity 'Move
+  :object old-actor-id
+  :target new-actor-id)
+```
+
+Receivers update their follow lists. New actor's `alsoKnownAs` must include old
+actor — bidirectional handshake prevents hijacking.
+
+For fed-sx, `Move` should also carry an outbox migration hint (CID of an export bundle)
+so receivers can re-anchor projections without re-fetching activity-by-activity.
+
+### 9.5 Subordinate actors / delegation
+
+Two patterns supported:
+
+- **Service actors** (AP-native `type: Service`): bots, build servers, test runners.
+  Their own keys, their own outboxes, but `attributedTo` a parent actor.
+- **Capability tokens**: parent publishes `Authorize{actor: child, capabilities: [...],
+  expires: ...}` signed by parent. Child publishes activities normally with their own
+  key; receivers verify the capability chain when child invokes an authority they don't
+  own outright. Useful for: temporary publish access, delegated `Pin` rights for a
+  specific path prefix, multi-device.
+
+Both work *without* new kernel mechanism — just activities.
+
+### 9.6 Implications
+
+- **Sig verification is timestamp-aware.** Verifying an old activity needs the key
+  state at the time it was published — actor-state projection must support time-travel
+  queries.
+- **Inbox doesn't trust `keyId` blindly.** Fetches actor doc, projects current key
+  state, checks key was valid at `published`.
+- **Cross-instance identity via `alsoKnownAs` and DIDs.** Don't depend on DIDs but
+  slot them in for Bluesky-bridge, Solid-bridge, etc.
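+
+The "which keys were active at T" lookup can be sketched in Python. The rules below
+are assumptions made for illustration, not part of the design: a key becomes active at
+`created`, stops at `expires` (exclusive), and a superseded key stops being active at
+the moment its successor was created. Timestamps are `datetime` values rather than
+ISO strings, and `keys_active_at` is a hypothetical name.
+
+```python
+from datetime import datetime, timezone
+
+
+def keys_active_at(public_keys, when):
+    """Return the ids of keys that were valid at time `when`."""
+    by_id = {key["id"]: key for key in public_keys}
+    active = []
+    for key in public_keys:
+        if key["created"] > when:
+            continue  # not yet issued
+        expires = key.get("expires")
+        if expires is not None and when >= expires:
+            continue  # already expired
+        successor = by_id.get(key.get("supersededBy"))
+        if successor is not None and when >= successor["created"]:
+            continue  # assumed: supersession takes effect at successor creation
+        active.append(key["id"])
+    return active
+```
+
+A verifier would then accept a signature only if its `key-id` appears in the list
+computed for the activity's `published` time.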
+
+---
+
+## 10. Projection model
+
+The architectural commitment: **state is what you get when you fold pure SX over the
+log.** No DB-of-record. Everything queryable is a projection.
+
+### 10.1 What a projection is
+
+A `DefineProjection` activity registers four things:
+
+```sx
+(activity 'Create
+  :object {:type "DefineProjection"
+           :name "actor-state"
+           :initial-state {}                        ; pure SX value
+           :fold (fn (state activity)               ; pure SX
+                   (case (:type activity)
+                     "Create"  (if (= "Person" (-> activity :object :type))
+                                 (assoc state (:id activity) (:object activity))
+                                 state)
+                     "Update"  (apply-patch state activity)
+                     "Move"    (set-moved state activity)
+                     state))
+           :snapshot-codec "dag-cbor"
+           :indexes [{:by :id} {:by :preferredUsername}]})
+```
+
+- **`name`** — query handle. Unique per actor; collisions resolved by CID + supersession.
+- **`initial-state`** — pure SX value used as state-zero.
+- **`fold`** — pure SX function `(state activity) → state`. The only thing the kernel
+  calls.
+- **`indexes`** — optional hint for materializing lookup paths.
+
+The CID of the `DefineProjection` artifact is the projection's identity. Two instances
+running the same projection are running the same CID's `fold` over the same log slice
+— equivalence is decidable.
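+
+As a rough Python analogue of the fold contract: a projection is a pure step function
+reduced over the log. Only the `Create` branch is shown, since `apply-patch` and
+`set-moved` are not specified here; `actor_state_fold` and `project` are hypothetical
+names, and the sketch is not the SX evaluator's behaviour.
+
+```python
+from functools import reduce
+
+
+def actor_state_fold(state, activity):
+    """Pure step: returns a new dict and never mutates `state`."""
+    obj = activity.get("object")
+    if (
+        activity["type"] == "Create"
+        and isinstance(obj, dict)
+        and obj.get("type") == "Person"
+    ):
+        return {**state, activity["id"]: obj}
+    return state  # every other activity leaves this projection unchanged
+
+
+def project(fold, initial_state, log):
+    """Fold the whole log from state-zero."""
+    return reduce(fold, log, initial_state)
+```
+
+Because the step function reads nothing but its arguments, running `project` twice
+over the same log gives equal results, which is the property the equivalence claim
+relies on.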
+
+### 10.2 The fold contract — purity, determinism, gas
+
+The fold function must be **pure and deterministic**. Non-negotiable; it's what makes
+cross-instance equivalence and replay possible.
+
+- **No IO.** No HTTP, no file access, no DB calls, no clock. The activity carries its
+  own `published` timestamp.
+- **No randomness.** No host-seeded PRNG. (If pseudo-randomness is needed, seed from
+  the activity's CID — deterministic across hosts.)
+- **No mutation outside the returned state.**
+- **Bounded execution.** Each fold call gets a gas budget (default tunable, e.g. 100k
+  CEK steps). Exceeding it is a hard failure.
+
+Enforced at the SX evaluator level by running folds in a sandboxed environment with
+the IO platform stripped to nothing. Same sandbox model applies to validators and
+trigger semantics.
+
+**Cross-host equivalence guarantee:** for the same projection CID + same activity log
+slice, every conforming SX host (JS, OCaml, Python, Haskell-on-SX, …) must produce a
+state value with the same canonical CID. Tested via the spec test suite.
+
+### 10.3 Bootstrap projections
+
+The kernel cannot start without some projections, because the kernel itself uses them.
+Baked into the genesis bundle (see §11), superseded only by deliberate kernel-version
+upgrades.
+
+| Projection | What it computes | Used by |
+|------------|------------------|---------|
+| `activity-log` | Identity — every activity, indexed by id and CID | Everything |
+| `by-type` | `type → ordered list of activity-CIDs` | Most queries |
+| `by-actor` | `actor-id → ordered list of activity-CIDs` | Per-actor outbox view |
+| `by-object` | `object-CID → list of referencing activity-CIDs` | "Who pinned this?" |
+| `actor-state` | `actor-id → current actor doc with key history` | Sig verification (kernel) |
+| `define-registry` | `kind+name → currently-active Define* CID` | All other Define* lookups |
+| `audience-graph` | `actor → followers/following` | Federation push |
+
+`define-registry` is the bootstrap chicken-and-egg: it's the projection that knows
+which projections (and validators, codecs, etc.) are currently active. Kernel ships
+with it hardcoded; once running, every other projection (including a future replacement
+of `define-registry` itself) is a regular `DefineProjection` superseding it.
+
+### 10.4 Snapshotting
+
+Replaying the entire log on every restart is unacceptable past day one.
+
+- **Snapshot = `(activity-tip-CID, projection-state, projection-CID)` tuple,**
+  dag-cbor encoded, content-addressed.
+- **Snapshot rule** — every K activities (default 1000) and every T seconds (default
+  60), serialize, hash, store on disk.
+- **Resume** — on startup, find latest snapshot for each (projection-CID, log-tip),
+  load state, fold forward.
+- **Snapshot CID is verifiable** — anyone with the same log slice and projection-CID
+  can recompute and check the CID matches. This is the cross-instance agreement proof.
+
+Snapshots are themselves publishable as activities (`Create{Snapshot}`): an instance
+can publish "here's my computed state for projection X at log-tip Y, CID Z." Other
+instances can fetch and use as a starting point. **Federated state sharing falls out of
+federated activities.**
+
+Snapshots are pruning-friendly: keep latest + snapshots referenced by published
+`Create{Snapshot}` activities; everything else is GC-able.
+
+### 10.5 Reprojection on definition change
+
+When `DefineProjection{name: "actor-state"}` is superseded by a new CID with a
+different fold:
+
+1. `define-registry` projection sees the supersession; its state advances.
+2. New projection materialized **alongside** the old one — both kept live during
+   migration.
+3. New projection runs in catch-up mode: replay from genesis (or from deepest
+   compatible snapshot).
+4. When new projection catches up to log tip, queries cut over. Old projection state
+   can be retired.
+5. Snapshots of old version stay around as long as referenced (e.g. for time-travel
+   queries against historical state under old semantics).
+
+Changing a projection definition is **safe and online**. Cost: temporary state
+duplication during catch-up. Slow folds → slow migrations, but never breakage.
+
+For projections too expensive to fully reproject, `Update{DefineProjection}` can
+declare `migrationHint: <fn from old-state to new-state>` — opt-in, used at migrator's
+risk.
+
+### 10.6 Time-travel queries
+
+Folds are deterministic functions of `(initial-state, activity-list-prefix)`.
+Time-travel is fold-up-to:
+
+- `state-as-of(projection, activity-id-or-timestamp)` → walk to requested point,
+  return state.
+- Snapshots act as accelerators (resume from nearest snapshot ≤ target).
+- Used by sig verification ("what keys did this actor have when this activity was
+  signed?"), audit, "what did we believe last Tuesday."
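+
+A minimal Python sketch of fold-up-to with snapshot acceleration, assuming the log is
+an in-memory sequence and positions are plain integers. `Snapshot`, `state_as_of`,
+and the integer positions are hypothetical stand-ins for illustration, not fed-sx API:
+
+```python
+from dataclasses import dataclass
+from typing import Any
+
+
+@dataclass(frozen=True)
+class Snapshot:
+    position: int  # number of activities already folded into `state`
+    state: Any
+
+
+def state_as_of(log, fold, initial_state, snapshots, target):
+    """Return the projection state after the first `target` activities."""
+    # Only snapshots at or before the target are usable as a starting point.
+    usable = [s for s in snapshots if s.position <= target]
+    start = max(usable, key=lambda s: s.position,
+                default=Snapshot(0, initial_state))
+    state = start.state
+    # Fold forward from the snapshot position up to the requested point.
+    for activity in log[start.position:target]:
+        state = fold(state, activity)
+    return state
+```
+
+A snapshot taken after the target is never used, because its state already includes
+activities the query must not see.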
+
+### 10.7 Projection composition
+
+**Projections do not directly read each other's state during folding.** Preserves
+locality and parallelism — every projection runs independently against the same log.
+
+Composition via:
+
+- **Query time** — `(query (projection actor-state) ...)` joins are SX expressions
+  over multiple projection states.
+- **Republishing as activities** — a projection that exposes its state as input to
+  others publishes `Create{Snapshot}` periodically. Downstream projections fold over
+  those.
+
+Direct cross-projection reads during fold introduce ordering, cycle, and
+cache-invalidation problems we don't need.
+
+### 10.8 Querying
+
+Three layers:
+
+- **Raw projection state** — `GET /projections/<name>?at=<timestamp>` returns dag-cbor
+  (also JSON for tooling). Large states paginated by index.
+- **SX queries** — `POST /query` with an SX expression that runs against one or more
+  projection states in pure mode. Fills the role that Datalog or GraphQL plays in
+  other systems.
+- **Materialized indexes** — declared on projection (`indexes:` field). Kernel
+  maintains as side-tables for `O(log n)` lookup.
+
+Real-time: clients `GET /projections/<name>/subscribe` (SSE), receive deltas as
+activities land. Delta is `(old-state, new-state, applied-activity-CID)`; clients can
+verify by re-folding.
+
+### 10.9 Lag, async, concurrency
+
+- **Append is sync; projection is async.** `POST /activity` returns once activity is
+  durably in the log. Projections run in a separate worker pool; query results carry
+  `projected-up-to` so callers know whether the latest write is visible.
+- **One worker per projection.** Folds are sequential, but projections run in parallel
+  with each other.
+- **Sync option** — `POST /activity?wait-for=projection-name` blocks until the named
+  projection has folded the new activity. Use sparingly.
+
+### 10.10 Failure modes
+
+| Failure | Response |
+|---------|----------|
+| **Gas exhaustion** | Activity tagged `projection-failed` for this projection. State unchanged. Operator alert. |
+| **SX runtime error** (assertion, type mismatch) | Same as gas: activity skipped, error logged, state unchanged. |
+| **Schema violation** | Caught earlier in validation pipeline, never reaches projection. |
+
+The log itself is always written successfully if it passes envelope + signature +
+validator checks. Projection failures don't gate appending — that would couple writes
+to arbitrary user-defined code.
+
+### 10.11 Operational implications
+
+- **Projection determinism is the linchpin.** If JS and OCaml ever produce different
+  state for the same log + projection, federation cracks. Spec test suite must cover
+  projection equivalence across hosts as a first-class requirement.
+- **Snapshots are eventual consensus.** Two instances publish `Create{Snapshot}` for
+  the same log+projection; if their CIDs match, they agree without coordination.
+- **Kernel reads its own projections.** `actor-state` for sig verification;
+  `define-registry` for every Define* lookup. Startup sequence must bootstrap these
+  before serving traffic.
+- **Reprojection cost is real.** Heavy projection changes mean replaying from genesis.
+  Encourage incremental schemas (small per-activity work, idempotent updates) and
+  provide profiling.
+
+---
+
+## 11. Sandbox & determinism
+
+The runtime contract that makes folds (and validators, triggers, semantics) safe to
+execute, and that guarantees every conforming SX host computes the same state from
+the same log.
+
+### 11.1 Three sandbox levels
+
+Different registry entries need different power. We define three nested execution
+modes; the registry entry declares which mode it requires.
+
+| Mode | Used by | IO | Clock | Random | Determinism |
+|------|---------|----|----|--------|-------------|
+| **pure** | folds, validators, audience predicates, semantics, trigger `when-sx` | none | activity's own `published` only | seeded from activity CID only | required across hosts |
+| **crypto** | sig suite verify, codec encode/decode | crypto primitives only | none | sign-only secure RNG | required across hosts (verify); single-host (sign) |
+| **effectful** | storage backends, transports, trigger `then-sx`, some proof verifiers | per-capability grant only | host clock | host RNG | not required; single-host |
+
+Default mode is **pure**. The other two are opt-in at registration time, and the
+registration is itself a signed activity — anyone can audit which extensions claim
+which powers.
+
+### 11.2 Pure sandbox (the load-bearing one)
+
+This is the mode every projection fold runs in. It must produce identical results on
+every conforming SX host, every time.
+
+**Allowed:**
+- All spec primitives in `spec/primitives.sx` that don't perform IO (arithmetic,
+  comparison, predicates, string ops, collection ops, dict ops, format helpers).
+- The activity being processed (full envelope), as the function's argument.
+- The current state value, as the function's argument.
+- A small set of fed-sx-specific deterministic primitives:
+  - `(activity-cid act)` → CID of the activity envelope
+  - `(activity-time act)` → ISO timestamp from `published`
+  - `(actor-state-as-of state-snapshot actor-id activity-time)` → if the projection
+    has been declared dependent on `actor-state` (see §10.7), reads from a snapshot
+    of that projection at the activity's timestamp
+  - `(seeded-rng cid)` → deterministic PRNG seeded from a CID, returns a stream of
+    uniform values
+
+**Forbidden:**
+- All IO: HTTP, file, network, stdin/stdout, environment.
+- Wall-clock access. The host's `now` is not in scope; the only time available is
+  `(activity-time act)`.
+- Host-seeded randomness. Only `seeded-rng` (CID-derived) is available.
+- Mutation outside the returned value. Enforced by the SX evaluator's lack of
+  ambient mutable bindings; folds may use local `let` and mutation within their own
+  closure but cannot reach outside.
+- Calling other registry entries by name. Composition happens at query time, not
+  fold time (see §10.7).
+
+**Enforced by:** evaluator runs the fold with the IO platform stripped to nothing.
+The fed-sx kernel constructs a `pure-platform` (no fetch, no query, no action, no
+DOM, no storage) and uses it as the sole evaluator platform when calling the fold.
+Any IO primitive call raises a hard error caught as a fold failure.
+
+### 11.3 Crypto sandbox
+
+Sig suites and codec encode/decode need hash + crypto + encoding primitives but
+nothing else. They're still deterministic across hosts (verify case) but get a
+narrower platform than effectful, wider than pure.
+
+**Additional primitives over pure:**
+- `(sha2-256 bytes)`, `(sha3-256 bytes)`, `(blake3 bytes)`, …
+- `(rsa-verify pubkey msg sig)`, `(ed25519-verify pubkey msg sig)`, …
+- `(rsa-sign privkey msg)`, `(ed25519-sign privkey msg)` — sign-only; requires the
+  caller to supply a secure RNG handle (which is *not* in pure mode)
+- `(cbor-encode value)`, `(cbor-decode bytes)` — for codecs implementing CBOR variants
+- `(base32-encode bytes)`, `(base58btc-encode bytes)`, `(multibase-encode tag bytes)`
+- `(multihash-encode tag digest-bytes)`, `(multihash-decode bytes)`
+- `(cid-encode codec mhash)`, `(cid-decode bytes)`
+
+**Sign vs verify:** verify is pure (deterministic). Sign is not in general: randomized
+schemes (e.g. RSA-PSS) consume randomness, and every scheme needs private-key access.
+fed-sx draws a clean line: signing happens *outside* registry-entry SX (it's an
+operation the kernel/runtime performs on behalf of the actor with their private key);
+registry SX only ever *verifies*. This keeps the pure↔crypto distinction tractable.
+
+### 11.4 Effectful sandbox
+
+Storage backends, transports, trigger `then-sx`, and proof verifiers that need the
+network (e.g. blockchain RPC for on-chain proof verification) all need real IO.
+These are not used to compute projected state; they're how the substrate interacts
+with the outside world.
+
+**Capability-granted primitives.** The registration activity declares the
+capabilities the entry needs:
+
+```sx
+(activity 'Create
+  :object {:type "DefineStorage"
+           :where-tag "ipfs"
+           :capabilities [{:type "http-client" :allowlist ["http://localhost:5001/*"]}
+                          {:type "fs-read"    :path-prefix "/var/cache/fed-sx/ipfs/"}
+                          {:type "fs-write"   :path-prefix "/var/cache/fed-sx/ipfs/"}]
+           :put-sx (fn (cid bytes) ...)
+           :get-sx (fn (cid) ...)})
+```
+
+**Capability types** (initial set; extensible):
+
+- `http-client` with `allowlist` (URL prefix patterns)
+- `http-server` with `path-prefix` (mounts a sub-handler)
+- `fs-read` / `fs-write` with `path-prefix` (chroot-style)
+- `subprocess` with `command-allowlist`
+- `clock-read` (wall clock; granted if registry entry needs to timestamp something)
+- `random-bytes` (host CSPRNG)
+
+**No ambient authority.** Default capability set is empty; every capability is
+explicit, declared, signed, and auditable. A peer can refuse to load a registry
+entry whose capability claim is unacceptable to them.
+
+**Capabilities are content-addressed.** Each capability descriptor has a CID. The
+substrate maintains a registry of "capability CIDs that this instance trusts to
+honour" — operator policy, not protocol.
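+
+A rough Python sketch of how a host might check a requested operation against declared
+capabilities. The function names and the "trailing `*` means prefix match" rule are
+assumptions made for illustration; the plan does not fix the pattern syntax:
+
+```python
+import posixpath
+
+
+def url_allowed(url, allowlist):
+    """True if `url` matches an entry; a trailing '*' is treated as a prefix match."""
+    for pattern in allowlist:
+        if pattern.endswith("*"):
+            if url.startswith(pattern[:-1]):
+                return True
+        elif url == pattern:
+            return True
+    return False
+
+
+def path_allowed(path, path_prefix):
+    """True if the normalized `path` stays under `path_prefix` (no '..' escape)."""
+    normalized = posixpath.normpath(path)
+    prefix = posixpath.normpath(path_prefix)
+    return normalized == prefix or normalized.startswith(prefix + "/")
+
+
+def is_permitted(capabilities, kind, target):
+    """Empty capability list means nothing is permitted (no ambient authority)."""
+    for cap in capabilities:
+        if cap["type"] != kind:
+            continue
+        if kind == "http-client" and url_allowed(target, cap["allowlist"]):
+            return True
+        if kind in ("fs-read", "fs-write") and path_allowed(target, cap["path-prefix"]):
+            return True
+    return False
+```
+
+Paths are normalized before the prefix comparison so that a `..` segment cannot walk
+out of the granted directory.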
+
+### 11.5 Gas and resource accounting
+
+Each sandbox call gets a budget:
+
+- **CEK gas** — every evaluator step costs 1 unit; primitive calls cost a
+  per-primitive amount declared in `spec/primitives.sx`. Default budget: 100k units
+  per fold call. Tunable per-projection via `DefineProjection.gas-limit`.
+- **Memory ceiling** — peak heap size for the fold call. Default 64 MB. Tunable.
+- **IO budget** (effectful only) — bytes read/written and network calls per
+  invocation, granted separately per capability.
+- **Wall-clock budget** (effectful only) — max real-time before forced termination.
+
+Exceeding any budget is a hard failure; the call returns an error value, the fold's
+state is unchanged, and the activity is tagged `projection-failed` for the projection.
+
+Gas accounting is part of the spec — every conforming host must charge the same
+units for the same operations, so "this fold runs out of gas" is a deterministic
+property of the (projection, activity) pair, not a host-specific outcome.
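+
+A toy Python sketch of the gas budget around a fold call. In the real design the
+evaluator loop would do the charging; here the fold charges the meter itself, and
+`GasMeter`, `run_fold`, and `counting_fold` are hypothetical names for illustration:
+
+```python
+class GasExhausted(Exception):
+    """Raised when a fold call spends more than its budget."""
+
+
+class GasMeter:
+    def __init__(self, budget=100_000):
+        self.remaining = budget
+
+    def charge(self, units=1):
+        if units > self.remaining:
+            raise GasExhausted()
+        self.remaining -= units
+
+
+def run_fold(fold, state, activity, budget=100_000):
+    """Return (new_state, tag); on exhaustion the old state is kept unchanged."""
+    meter = GasMeter(budget)
+    try:
+        return fold(state, activity, meter), None
+    except GasExhausted:
+        return state, "projection-failed"
+
+
+def counting_fold(state, activity, meter):
+    # Toy fold: one unit per loop step, standing in for evaluator steps.
+    for _ in range(activity):
+        meter.charge()
+    return state + activity
+```
+
+Because the charge per step is fixed, whether a given call exhausts its budget depends
+only on the inputs, which is the property the spec requires across hosts.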
+
+### 11.6 Determinism gotchas
+
+The pure sandbox is only as deterministic as its primitives. Worth nailing down:
+
+- **Floating point.** IEEE 754 binary operations are bitwise-identical across
+  conforming hosts, but transcendentals (`sin`, `cos`, `log`, `exp`) are *not* —
+  libm implementations differ. **Decision: floats are forbidden in pure mode unless
+  the projection declares `requires-deterministic-floats: true` and uses only the
+  IEEE 754 basic operations (+, -, *, /, sqrt, comparison, conversion).** For exact
+  arithmetic, use integers or rationals (fed-sx will provide a rational primitive).
+- **Map / dict iteration order.** Must be sorted-key always in pure mode. The SX
+  spec mandates this for `for-each` and `map` over dicts; we tighten it: pure mode
+  forbids relying on insertion order.
+- **String encoding.** All strings are UTF-8 NFC at ingestion; pure-mode operations
+  use byte-level comparison after normalization. Codepoint operations (`length`,
+  `substring`) return identical results across hosts because they operate on the
+  normalized form.
+- **Integer overflow.** Pure mode uses arbitrary-precision integers (the SX spec
+  default). No undefined behaviour. Overflow is impossible.
+- **Equality.** Structural equality (`equal?`) compared across hosts must yield the
+  same result for the same canonical-CID values. Implies dict equality is
+  order-independent (as it should be), and float equality follows IEEE 754 (NaN ≠
+  NaN; +0.0 = -0.0).
+- **Error values.** When a primitive errors, the error must be representable as a
+  dag-cbor value with a stable CID across hosts. Reserve a `{:error :type ... :msg
+  ...}` shape; standard error types defined in the spec.
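+
+A small Python sketch of two of the rules above: NFC normalization at ingestion and
+sorted-key dict iteration. Sorting by the UTF-8 bytes of the normalized key is an
+assumption made here; the plan says "sorted-key" without fixing the exact ordering:
+
+```python
+import unicodedata
+
+
+def normalize_text(s):
+    """Normalize a string to Unicode NFC, as done at ingestion."""
+    return unicodedata.normalize("NFC", s)
+
+
+def canonical_items(d):
+    """Return (key, value) pairs in an order independent of insertion order."""
+    pairs = [(normalize_text(k), v) for k, v in d.items()]
+    # Assumed ordering: bytewise comparison of the UTF-8 encoded key.
+    return sorted(pairs, key=lambda kv: kv[0].encode("utf-8"))
+```
+
+Two dicts built in different insertion orders yield the same sequence, so a fold that
+iterates through `canonical_items` cannot observe insertion order.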
+
+### 11.7 Failure model
+
+A pure-mode call ends in one of three terminal states:
+
+1. **Success** — returns a value. Fold uses it as new state.
+2. **Sandbox violation** — IO attempted, capability denied, etc. Returns a stable
+   error value; fold's state is unchanged; activity tagged
+   `{:projection-failed :reason :sandbox-violation :detail ...}`.
+3. **Resource exhaustion** — gas, memory, IO budget exceeded. Same handling as
+   sandbox violation but with `:reason :resource-exhausted`.
+
+Crypto-mode failures (e.g. invalid signature) are *return values*, not exceptions —
+verify returns boolean, sign returns either a sig or an error. This forces callers
+to handle failure explicitly.
+
+Effectful-mode failures (network down, disk full) propagate to the operator as
+errors but never affect projected state. The substrate retries effectful operations
+according to the registry entry's policy (declared at registration).
+
+### 11.8 Conformance testing
+
+Cross-host equivalence isn't aspirational; it's tested.
+
+- **Spec test suite** ships projection equivalence tests: a corpus of (log slice,
+  projection CID, expected snapshot CID) tuples. Every conforming SX host must
+  produce the expected snapshot CID for each input.
+- **Validator equivalence tests** likewise: (validator CID, activity, expected
+  result).
+- **Codec equivalence tests:** (codec CID, value, expected encoded bytes), in both
+  encode and decode directions.
+- **Sandbox isolation tests:** "this fold attempts to call `fetch`; expected
+  outcome: sandbox violation error with stable CID."
+
+Hosts run the conformance suite to claim "fed-sx pure-mode conformance." Failures
+are publishable as `Test{result: failed, host: ..., projection: ...}` activities —
+the conformance graph itself is federated.
+
+### 11.9 Operational implications
+
+- **The pure sandbox is the heart of cross-host federation.** Every divergence is a
+  spec bug or a host bug; both are caught by snapshot CID mismatches and surfaced
+  via `Test` activities.
+- **Capability descriptors are the new audit trail.** "What can the IPFS storage
+  backend do?" is a question with a precise answer at any timestamp — the registered
+  capability CIDs.
+- **Floats are mostly absent.** This is unusual but defensible — most state in the
+  substrate is ids, counts, sets, references. Numerical computation belongs in
+  effectful registry entries (e.g. an analytics projection that publishes summaries
+  as activities, projected by a downstream pure projection that just stores them).
+- **Gas is part of the protocol.** Two hosts disagreeing about whether a fold runs
+  out of gas is a conformance failure. Spec primitive gas costs are normative.
+
+## 12. Bootstrap & genesis
+
+How a fresh instance starts with no log, where the initial registry entries come
+from, and how the kernel evolves without bricking peers.
+
+### 12.1 The genesis problem
+
+The substrate is "everything is a `Define*` activity in the log." But on a fresh
+instance the log is empty — so there are no `Define*` activities to tell the kernel
+what `Create` means, how to verify a signature, or what dag-cbor is. Strict
+turtles-all-the-way-down would deadlock startup.
+
+Solution: **the kernel ships with a baked-in genesis bundle** containing the minimal
+set of definitions it needs to interpret its own log. The bundle is a constant of
+the kernel binary; its CID is hardcoded; the kernel verifies on startup that the
+bundle matches its hardcoded CID. After that, everything (including superseding the
+bundled definitions themselves) goes through the activity log.
+
+The genesis bundle is *not* itself a federated artifact in the AP sense. It's the
+dictionary you need before you can read any activities. Optionally, an actor can
+`Create{GenesisRecord}` as their first published activity to advertise which genesis
+they started from — informational, not load-bearing.
+
+### 12.2 Genesis bundle contents
+
+Minimal viable bundle (dag-cbor object, content-addressed):
+
+```
+{
+  "type": "fed-sx-genesis",
+  "kernel-version": "1.0.0",
+  "envelope-spec": { ... },                 // canonical schema for activity envelope
+  "object-spec": { ... },                   // canonical schema for object envelope
+  "definitions": {
+    "activity-types": {
+      "Create":   { "schema": <sx>, "semantics": <sx> },
+      "Update":   { "schema": <sx>, "semantics": <sx> },
+      "Delete":   { "schema": <sx>, "semantics": <sx> },
+      "Announce": { "schema": <sx>, "semantics": <sx> }
+    },
+    "object-types": {
+      "SXArtifact": { "schema": <sx> },
+      "Note":       { "schema": <sx> },
+      "Tombstone":  { "schema": <sx> },
+      "DefineActivity":   { "schema": <sx> },
+      "DefineObject":     { "schema": <sx> },
+      "DefineProjection": { "schema": <sx> },
+      "DefineValidator":  { "schema": <sx> },
+      "DefineCodec":      { "schema": <sx> },
+      "DefineTransport":  { "schema": <sx> },
+      "DefineAudience":   { "schema": <sx> },
+      "DefineProof":      { "schema": <sx> },
+      "DefineStorage":    { "schema": <sx> },
+      "DefineTrigger":    { "schema": <sx> },
+      "DefineSigSuite":   { "schema": <sx> },
+      "Snapshot":         { "schema": <sx> }
+    },
+    "sig-suites": {
+      "rsa-sha256-2018": { "verify": <sx>, "key-format": <sx> },
+      "ed25519-2020":    { "verify": <sx>, "key-format": <sx> }
+    },
+    "codecs": {
+      "dag-cbor":  { "encode": <sx>, "decode": <sx> },
+      "raw":       { "encode": <sx>, "decode": <sx> },
+      "dag-json":  { "encode": <sx>, "decode": <sx> }
+    },
+    "projections": {
+      "activity-log":     { "initial-state": ..., "fold": <sx> },
+      "by-type":          { "initial-state": ..., "fold": <sx> },
+      "by-actor":         { "initial-state": ..., "fold": <sx> },
+      "by-object":        { "initial-state": ..., "fold": <sx> },
+      "actor-state":      { "initial-state": ..., "fold": <sx> },
+      "define-registry":  { "initial-state": ..., "fold": <sx> },
+      "audience-graph":   { "initial-state": ..., "fold": <sx> }
+    },
+    "validators": {
+      "envelope-shape": { "predicate": <sx> },
+      "signature":      { "predicate": <sx> },
+      "type-schema":    { "predicate": <sx> }
+    },
+    "audience-predicates": {
+      "Public":    { "member-of": <sx> },
+      "Followers": { "member-of": <sx> },
+      "Direct":    { "member-of": <sx> }
+    }
+  },
+  "capability-types": [                     // schema for capability descriptors
+    "http-client", "http-server",
+    "fs-read", "fs-write",
+    "subprocess", "clock-read", "random-bytes"
+  ]
+}
+```
+
+Each definition's body is **SX source**, not bytecode. The kernel evaluates it at
+startup using the same SX evaluator user-published `Define*` artifacts use — there
+is no privileged "native" path. The bootstrap is just SX loaded from the binary
+instead of from the log.
+
+### 12.3 Hardcoded CID and verification
+
+The kernel binary contains:
+
+- The full genesis bundle (embedded as bytes).
+- The CID computed over those bytes at build time.
+
+On startup:
+
+1. Compute the actual CID of the embedded bundle.
+2. Compare to the hardcoded CID.
+3. **Mismatch → refuse to start.** Either the binary has been tampered with or the
+   build process is broken. Either way, the operator should know immediately.
+4. **Match → proceed.** Every running instance with a given kernel binary has
+   byte-identical bootstrap state — no version drift possible within a binary.
+
+The genesis CID is exposed at `GET /.well-known/sx-capabilities` so peers can see
+which kernel version they're talking to.
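+
+A minimal Python sketch of the startup check. A plain SHA-256 hex digest stands in
+for the real CID computation (an assumption; an actual CID also encodes codec and
+multihash prefixes), and `bundle_digest` / `load_genesis` are hypothetical names:
+
+```python
+import hashlib
+
+
+def bundle_digest(bundle_bytes):
+    # Stand-in for CID computation over the embedded bundle bytes.
+    return hashlib.sha256(bundle_bytes).hexdigest()
+
+
+def genesis_matches(bundle_bytes, expected_digest):
+    """True if the embedded bundle hashes to the value fixed at build time."""
+    return bundle_digest(bundle_bytes) == expected_digest
+
+
+def load_genesis(bundle_bytes, expected_digest):
+    """Refuse to start on mismatch; otherwise hand the bundle to the bootstrap."""
+    if not genesis_matches(bundle_bytes, expected_digest):
+        raise RuntimeError("genesis bundle does not match hardcoded digest")
+    return bundle_bytes
+```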
+
+### 12.4 Fresh instance startup sequence
+
+```
+1. Load and verify genesis bundle (panic on mismatch)
+2. Parse all definition SX sources, instantiate evaluator closures
+3. Initialize registries from definitions (in the order: codecs → sig-suites →
+   validators → object-types → activity-types → audience-predicates → projections)
+4. Open log file (create if missing)
+5. Replay any existing log: for each activity, validate, then fold into each
+   projection (resuming from snapshots where available)
+6. Load or generate actor keypair (filesystem path from config)
+7. If actor has never published a Create{Person} for itself, generate and append
+   one as the first activity of this instance's outbox
+8. Initialize HTTP server, wire routes
+9. Open inbox: start accepting federated activities
+10. Mark instance as ready
+```
+
+Steps 1-3 are the bootstrap. Step 5 is replay-and-project. Step 7 is the
+"actor genesis" — every instance has at least one local actor; it publishes itself
+as its first activity, and that activity (signed by the actor's own key) anchors all
+subsequent activity from that actor.
+
+### 12.5 First activity — actor creation
+
+Every fresh actor's outbox starts with:
+
+```sx
+(activity 'Create
+  :id           "https://next.rose-ash.com/actors/giles/activities/<uuid>"
+  :actor        "https://next.rose-ash.com/actors/giles"
+  :published    "<iso-timestamp>"
+  :to           ["https://www.w3.org/ns/activitystreams#Public"]
+  :object       <full actor doc with publicKeys array>
+  :signature    <signed by the new key over the activity envelope>)
+```
+
+Self-signed: the activity introduces the key it's signed with. Verifiers fetch the
+actor doc embedded in the activity, find the key, verify against the activity. This
+is the trust-on-first-encounter for a new actor — the same model AP uses.
+
+The kernel emits this automatically on first startup if the actor has no prior
+activity. Subsequent actor changes (key rotation, profile updates) are `Update`
+activities signed by an existing key.
+
+### 12.6 Joining federation
+
+A new instance has no peers initially. Discovery is operator-driven for v1:
+
+1. Operator configures one or more peer URLs (or a well-known seed list).
+2. Instance fetches peer's actor doc and `/.well-known/sx-capabilities`.
+3. Instance verifies it can interpret the peer's activities (envelope compatible,
+   sig suites overlap). Reports incompatibilities to operator.
+4. If compatible, instance follows peer's primary actor (`POST /inbox` with a
+   `Follow` activity).
+5. Peer streams or backfills outbox to this instance.
+6. Activities arrive, validate, fold into local projections.
+
+Discovery beyond manual config (e.g. peer recommendations, federation directories)
+is a v2 concern.
+
+### 12.7 Kernel version evolution
+
+The substrate must evolve without forcing every instance to upgrade in lockstep.
+Three rules:
+
+**Rule 1: The activity envelope changes only in forward-compatible ways.**
+
+We may *add* optional fields to the envelope; we may not change semantics or remove
+fields. Old activities still validate under new kernels. New activities with new
+fields are accepted by old kernels (which ignore the unknown fields, store the raw
+envelope, and project conservatively).
+
+This is the AP discipline. We adopt it strictly. If we ever need a breaking envelope
+change, it's a major version (fed-sx 2.0) and instances at different majors don't
+federate directly — only via bridges.
+
+**Rule 2: Everything else evolves via supersession.**
+
+New sig suite, new codec, new projection definition, new validator: publish a
+`Define*` activity that supersedes the old one. Both old and new versions stay valid
+at their respective timestamps. Old activities verify under old definitions; new
+activities use new definitions. Time-aware lookup (§9.6, §10.6) makes this work.
+
+**Rule 3: New genesis bundles supersede old ones via published activities.**
+
+When the kernel team ships a new version with an updated bundle:
+
+- The new bundle's CID is different.
+- Operators upgrading the kernel get the new bundle automatically.
+- The new bundle's *contents* are largely supersession `Update{DefineProjection,
+  DefineValidator, ...}` activities relative to the old bundle's definitions.
+- A peer running the old kernel sees these `Update` activities (when they appear in
+  followed outboxes) and *can* opt to load them dynamically (§12.8) or stay on the
+  old bundle definitions until the operator upgrades.
+
+In other words: the kernel binary evolution and the activity-log evolution are
+parallel tracks. The binary determines what's *built in*; the log determines what's
+*currently active*. They converge over time but don't have to be lockstep.
+
+### 12.8 Dynamic Define* loading
+
+When an instance receives an activity of `type: "PinV3"` and has no `DefineActivity{
+name: "PinV3"}` in its define-registry, it has three options (operator policy):
+
+- **Strict mode** — store the activity envelope (it's valid AP), tag it `unknown-type`
+  in `by-type`, do not project semantics. Operator must explicitly load the
+  definition to enable projection.
+- **Permissive mode** — fetch the `DefineActivity{name: "PinV3"}` artifact (its CID
+  is in the activity's `capabilities-required` list), validate, evaluate the
+  semantics SX (in pure sandbox), reproject the activity. Operator notified.
+- **Trusted-peers-only mode** — like permissive, but only auto-loads `Define*` from
+  actors on a configured trust list.
+
+Default for fed-sx v1: **strict mode**. Operators opt-in to broader policies.
+
+This lets the substrate genuinely live-extend — new verbs land via federation, no
+binary upgrade — while keeping a clean audit trail of what got loaded when.
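+
+The three policies can be sketched as a single decision function. This is an
+illustrative Python sketch; the function name and the returned action strings are
+hypothetical, not part of the plan:
+
+```python
+def on_unknown_type(mode, author, trusted_actors):
+    """Decide what to do with an activity whose type has no loaded definition."""
+    if mode == "strict":
+        # Store the envelope, tag it unknown-type, do not project semantics.
+        return "store-unknown-type"
+    if mode == "permissive":
+        return "fetch-and-load-definition"
+    if mode == "trusted-peers-only":
+        if author in trusted_actors:
+            return "fetch-and-load-definition"
+        return "store-unknown-type"
+    raise ValueError(f"unknown policy mode: {mode}")
+```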
+
+### 12.9 Genesis as the substrate's manifest
+
+A useful framing: the genesis bundle is the substrate's **manifest** (in the
+package-manager sense). It declares "this kernel ships with these definitions,
+identified by these CIDs, and this is what the kernel does until the log says otherwise."
+
+Two instances with the same genesis CID start identical. Two instances with
+different genesis CIDs can federate as long as their *active* registry states (after
+log replay) overlap enough.
+
+The genesis bundle is also the **conformance reference**: a kernel implementation
+claims fed-sx v1.0 conformance by reproducing the standard genesis bundle's CID
+from its own build of the included SX sources. If two implementations build the same
+spec sources and produce different CIDs, one of them is non-conformant. Cheap,
+deterministic conformance check.
+
+### 12.10 Operational implications
+
+- **Build-time CID computation is part of the kernel build.** The build pipeline
+  must include the genesis-bundling step and embed the resulting CID. Mismatch
+  protection requires the binary to know what it expects.
+- **Genesis evolution is a deliberate kernel-team decision.** Adding a new bundled
+  projection or sig suite is a kernel release, not a federated activity.
+  (User-defined projections still federate normally.)
+- **Strict-mode default protects against malicious extensions.** Operators have to
+  consciously opt into auto-loading remote `Define*`. This trades convenience for
+  security — appropriate for v1.
+- **Cross-major federation is a bridge problem.** If/when fed-sx 2.0 ships with an
+  envelope change, bridges between v1 and v2 are themselves federated artifacts —
+  built by anyone, signed, audited.
+
+## 13. Federation mechanics
+
+How instances exchange activities, how peers subscribe, how new followers backfill,
+how delivery survives unreliable networks, and how the substrate resists abuse.
+
+### 13.1 Push, pull, hybrid
+
+ActivityPub canonically uses **push**: actor A publishes by POSTing each activity to
+each follower's inbox URL. This gives low latency and clear delivery semantics, but
+requires a reliable per-recipient delivery queue and falls over when peers go down.
+
+fed-sx supports both push and pull, with a **push-primary, pull-fallback** model:
+
+- **Push** is the default delivery mechanism. When an activity is appended to A's
+  outbox, A's delivery worker posts it to each follower's inbox.
+- **Pull** is always available: any peer can `GET /actors/<id>/outbox?since=<cursor>`
+  and stream activities in order. Used for backfill, recovery from delivery gaps,
+  and instances that prefer pull-only operation.
+- **Hybrid in practice:** push delivers *notifications* (the activity itself, or a
+  pointer to its CID); receivers may pull the full content if not inlined. Useful
+  when the activity body is large.
+
+Operators can configure their actors as push-only, pull-only, or hybrid. The
+default is hybrid.
+
+### 13.2 The Follow lifecycle
+
+AP-standard, slightly tightened:
+
+```sx
+;; A wants to follow B
+(activity 'Follow
+  :actor  "https://a.example/actors/alice"
+  :object "https://b.example/actors/bob")
+;; → POST to B's inbox
+
+;; B accepts (or rejects)
+(activity 'Accept
+  :actor  "https://b.example/actors/bob"
+  :object <follow-activity-id-or-embedded>)
+;; → POST to A's inbox
+
+;; A unfollows later
+(activity 'Undo
+  :actor  "https://a.example/actors/alice"
+  :object <follow-activity-id-or-embedded>)
+;; → POST to B's inbox
+```
+
+State derived by the `audience-graph` projection on each instance:
+
+- `(followers actor)` — set of actors who follow `actor`, projected from
+  `Accept{Follow}` activities in `actor`'s outbox (and the inverse via received
+  `Follow` activities).
+- `(following actor)` — symmetric.
+
+**Auto-accept by default.** Public actors auto-publish `Accept` for any incoming
+`Follow`. Locked actors require manual approval, implemented as an operator UI that
+publishes the `Accept` (or `Reject`) once a human decides.
+
+### 13.3 Backfill
+
+When A first follows B, A wants B's history. Four supported modes:
+
+| Mode | Mechanism | Trade-off |
+|------|-----------|-----------|
+| **No backfill** | Just stream new activities going forward | Cheapest, missing context for new followers |
+| **Pull paginated** | `GET /outbox?since=epoch&limit=100` repeatedly | Standard, slow for large outboxes |
+| **Snapshot fetch** | Find latest `Create{Snapshot}` published by B for the projection of interest, fetch + verify, then pull only activities after the snapshot's tip | Fast, requires B to publish snapshots |
+| **Bundle fetch** | Out-of-band: B publishes a CID for an export bundle (a dag-cbor list of activities + actor doc + sig suite verification metadata); A fetches once, validates the chain, replays | Fastest for cold starts; bundle creation is opt-in |
+
+Default: snapshot fetch when available, paginated pull otherwise.
+
+A new instance joining federation typically combines: snapshot-fetch the
+`actor-state` and `define-registry` projections from a trusted peer (so it knows who
+exists and what verbs are defined), then incrementally backfill specific actors of
+interest.
+
+### 13.4 Delivery queue and retry
+
+Every push delivery attempt has a fate:
+
+| Outcome | Action |
+|---------|--------|
+| 2xx | Mark delivered |
+| 3xx | Follow redirect (with limit) |
+| 4xx (except 429) | Mark *permanently failed* — peer rejected the activity. Log; don't retry. |
+| 429 | Honour `Retry-After`; reschedule |
+| 5xx | Exponential backoff; reschedule |
+| Connection error | Exponential backoff; reschedule |
+
+**Retry schedule** (default, tunable per peer):
+
+```
+1 min, 5 min, 15 min, 1 h, 4 h, 12 h, 24 h, 48 h, 96 h
+```
+
+After the last attempt fails, the activity is **abandoned for push** but remains in
+A's outbox. Followers can still pull it via `GET /outbox?since=...`. The peer will
+eventually catch up if they come back online and pull. Push is best-effort; pull is
+the source of truth.
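
The outcome table and retry schedule above can be read as one small decision function. The following is a minimal Python sketch, not the fed-sx implementation: `next_action`, its argument names, and the action labels are hypothetical. It assumes each schedule entry is the delay before the next retry, and that a 429 without `Retry-After` falls back to the schedule.

```python
from datetime import timedelta

# Default retry schedule from the text, read as the delay before each retry.
RETRY_SCHEDULE = [
    timedelta(minutes=1), timedelta(minutes=5), timedelta(minutes=15),
    timedelta(hours=1), timedelta(hours=4), timedelta(hours=12),
    timedelta(hours=24), timedelta(hours=48), timedelta(hours=96),
]

def next_action(status, attempts_made, retry_after=None):
    """Decide the fate of one push attempt.

    status: HTTP status code, or None for a connection error.
    attempts_made: number of retries already used for this delivery.
    retry_after: seconds taken from a Retry-After header, if present.
    Returns (action, delay) where delay is a timedelta or None.
    """
    if status is not None and 200 <= status < 300:
        return ("delivered", None)
    if status is not None and 300 <= status < 400:
        return ("follow-redirect", None)
    if status is not None and 400 <= status < 500 and status != 429:
        return ("failed-permanently", None)
    # 429, 5xx and connection errors are all retryable.
    if attempts_made >= len(RETRY_SCHEDULE):
        return ("abandoned-for-push", None)
    if status == 429 and retry_after is not None:
        return ("retry", timedelta(seconds=retry_after))
    return ("retry", RETRY_SCHEDULE[attempts_made])
```

Under these assumptions a `503` with `attempts_made=2` is rescheduled 15 minutes out, and once all nine scheduled retries are used up the delivery is abandoned for push while the activity stays pullable from the outbox.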
+
+**Persistent queue.** Delivery state is itself stored in the local instance — it's
+operator-internal, not federated. (Could be a regular SQLite table; doesn't need to
+be a projection because it's not state-the-world-cares-about.) On instance restart,
+the queue resumes from where it left off.
+
+**Queue-as-projection (alternative):** for instances that want every aspect to be
+log-derived, the delivery state could be a local-only projection over a stream of
+`Attempt` / `DeliverySuccess` / `DeliveryFailure` activities written to a private
+local-only outbox. Out of scope for v1 but the design admits it.
+
+### 13.5 Audience-respecting delivery
+
+Each activity carries `to`, `cc`, `bto`, `bcc`. The delivery worker computes the
+**delivery set**: union of explicit recipients + (if `as:Public` or `Followers` in
+audience) the actor's followers projection.
+
+- `bto` and `bcc` are stripped before delivery (recipients shouldn't see who else is
+  blind-copied).
+- **Receivers honour audience.** When an instance receives an activity it should
+  not be in the audience for (e.g. a `Direct` activity to someone else, leaked via a
+  misconfigured peer), it logs and discards. Validators in the inbound pipeline
+  enforce this.
+- **Public ≠ unlisted.** `to: as:Public` means deliver to followers AND make
+  publicly fetchable AND show in public projections. Some actors prefer "publicly
+  fetchable but not pushed broadly" — `cc: as:Public` with `to: Followers`.
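
The sender-side rules above (union of explicit recipients and followers, with blind-copy fields stripped) might look like the following sketch. The function name, the plain-dict activity shape, and the literal `Followers` marker are illustrative assumptions; a real implementation would resolve the followers collection from the `audience-graph` projection.

```python
PUBLIC = "as:Public"
FOLLOWERS = "Followers"

def delivery_set(activity, followers):
    """Return (recipients, wire_activity) for one outbound activity.

    activity: dict with optional 'to', 'cc', 'bto', 'bcc' lists.
    followers: set of follower actor ids from the followers projection.
    """
    audience = []
    for field in ("to", "cc", "bto", "bcc"):
        audience.extend(activity.get(field, []))

    # Explicit recipients are everything except the two collection markers.
    recipients = {a for a in audience if a not in (PUBLIC, FOLLOWERS)}
    if PUBLIC in audience or FOLLOWERS in audience:
        recipients |= set(followers)

    # Strip blind-copy fields so recipients cannot see who else was copied.
    wire_activity = {k: v for k, v in activity.items() if k not in ("bto", "bcc")}
    return recipients, wire_activity
```

Blind-copied actors still receive the activity; only the `bto`/`bcc` fields are removed from the copy that goes over the wire.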
+
+### 13.6 Spam and abuse posture
+
+ActivityPub has well-known abuse vectors (Mastodon's history is instructive). fed-sx
+defends in layers:
+
+**Signature verification.** Every inbound activity must have a valid signature
+matching an actor whose key was active at `published`. Forgeries are dropped at the
+envelope-validation stage (§14). Necessary but not sufficient — signatures only
+prove the message wasn't tampered with, not that the sender is benign.
+
+**Per-source rate limits.** Per-actor and per-instance request rate limits on
+`/inbox`. Default: 100/min per actor, 1000/min per instance. Exceeded → 429.
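
One simple way to enforce the two limits together is a fixed-window counter per actor and per instance. The text does not say which algorithm fed-sx uses, so the sketch below is only an illustration; the class name, the injectable `clock`, and the fixed-window choice are all assumptions.

```python
import time
from collections import defaultdict

class InboxRateLimiter:
    """Fixed-window counters keyed by actor and by instance."""

    def __init__(self, per_actor=100, per_instance=1000, window=60.0,
                 clock=time.monotonic):
        self.limits = {"actor": per_actor, "instance": per_instance}
        self.window = window
        self.clock = clock
        # (kind, key, window number) -> requests seen; old windows are
        # never pruned in this sketch.
        self.counts = defaultdict(int)

    def allow(self, actor, instance):
        """Return True to accept the request, False to answer 429."""
        window_no = int(self.clock() // self.window)
        keys = [("actor", actor, window_no), ("instance", instance, window_no)]
        if any(self.counts[k] >= self.limits[k[0]] for k in keys):
            return False
        for k in keys:
            self.counts[k] += 1
        return True
```

A request is counted only when both limits admit it, so a rejected request does not consume budget from the other counter.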
+
+**Per-instance trust state.** Three categories, operator-configured (and
+overridable per actor):
+
+- **Trusted** — auto-accept, auto-load Define* (if permissive mode), no
+  rate-multiplier penalty.
+- **Default** — accept signed activities, standard rate limits, do not auto-load
+  Define*.
+- **Suspended** — drop all inbound activities, refuse outbound delivery, do not
+  fetch artifacts. Operator decision (e.g. spam source, harassment instance).
+
+Trust state is local-only (operator policy); it is not federated. Different
+instances can disagree.
+
+**Audience refusal.** Activities not addressed to anyone on this instance (no local
+followers, not `as:Public`, not `to:` a local actor) are dropped on receipt.
+Discourages spam targeting random instances.
+
+**Content validators.** Registry-driven content moderation: a `DefineValidator`
+with `applies-to: "inbound"` runs against every inbound activity and can reject
+based on content rules. Examples: link-spam detection, ML moderation models served
+via an effectful validator (note: effectful validators are a special case — they
+*can* fail-closed without affecting determinism, because validators happen *before*
+projection and don't contribute to projected state).
+
+**Capability vetting.** If an inbound activity declares `capabilities-required`
+that includes definitions this instance hasn't loaded *and* trust policy is
+strict-mode, the activity is quarantined (stored but not projected) pending
+operator review.
+
+**Federation circuit breakers.** Per-peer error rate triggers temporary defederation:
+if a peer is sending malformed activities, exceeding rate limits, or signing with
+revoked keys, it is automatically suspended for an exponentially growing cool-off
+period.
+
+### 13.7 Discovery
+
+How an instance finds other instances and actors:
+
+- **WebFinger** (RFC 7033). `GET /.well-known/webfinger?resource=acct:user@host`
+  returns links to actor URLs. AP-standard. fed-sx implements.
+- **Well-known capabilities.** `GET /.well-known/sx-capabilities` (§7) for
+  cross-instance compatibility checks.
+- **Manual peer config.** Operators add peer instance URLs to their config.
+- **Peer recommendations.** An instance can publish `Recommend{actor}` activities
+  pointing at peers it considers worth following. Receivers can use these as
+  discovery hints (subject to local trust). Out of scope for v1 but the verb is
+  reservable.
+- **Federation directories.** Community-maintained lists of instances; an instance
+  can opt into being listed by publishing a `Directory{listed-by}` activity. v2
+  concern.
+
+For v1: WebFinger + capabilities + manual config. Discovery beyond that is opt-in
+via standard verbs.
+
+### 13.8 Streaming and real-time
+
+Two streaming mechanisms:
+
+- **Outbox SSE** — `GET /actors/<id>/outbox/stream` opens a Server-Sent Events
+  connection. Each new activity appended to the outbox is sent as an event. Allows
+  pull-style federation peers to maintain a live connection without polling.
+- **Projection SSE** — `GET /projections/<name>/subscribe` (§10.8) streams projection
+  deltas. Useful for clients (browsers) wanting reactive views.
+
+Both are local-only mechanisms; the canonical federation transport remains push to
+inbox + pull from outbox. SSE is convenience, not protocol.
+
+### 13.9 Operational implications
+
+- **Push is best-effort, pull is authoritative.** Operators should treat the outbox
+  as the canonical record; delivery queue is bookkeeping.
+- **Trust is per-instance and not federated.** Two instances may have different
+  views of "good actors" and "bad instances." This is a feature — defederation
+  decisions are local sovereignty.
+- **Backfill via snapshots is the cheap path.** Encouraging actors to publish
+  `Create{Snapshot}` regularly makes new-follower onboarding fast.
+- **Audience semantics are enforced both ways.** Senders compute delivery set;
+  receivers honour audience. Defence-in-depth against misconfigured peers.
+- **Capability-based extension loading is opt-in.** Strict-mode default means
+  unknown verbs are stored-but-not-projected — safe by default, with explicit
+  operator control over what extensions load.
+
+## 14. Validation pipeline
+
+Every activity entering the substrate (whether published locally or received from a
+peer) flows through a fixed pipeline of checks. Order matters: cheap and fail-safe
+first, expensive and content-aware last. Each stage has a defined failure response
+(reject, quarantine, drop). Registry-driven validators plug in at a specific stage.
+
+### 14.1 The two pipelines
+
+**Inbound** — activities arriving via `POST /inbox` or pulled from a peer's outbox:
+
+```
+HTTP transport → envelope → signature → replay → audience →
+  activity-type schema → object-type schema → content validators →
+  capabilities → trust state → log append → projection (async)
+```
+
+**Outbound** — activities being published locally via `POST /activity`:
+
+```
+authentication → authorization → envelope construction → object handling →
+  activity-type schema → signature → log append → projection (async) →
+  delivery (async)
+```
+
+Stages they share are implemented as the same SX functions called from both pipelines.
+
+### 14.2 Inbound pipeline — stage by stage
+
+| # | Stage | Check | Failure response |
+|---|-------|-------|------------------|
+| 1 | **Transport** | Valid HTTP request, content-type acceptable, body parseable as JSON-LD or dag-cbor | `400 Bad Request`; log |
+| 2 | **Envelope** | Matches kernel's envelope spec (required fields present, types valid, recognised activity type or `unknown` allowed) | `400`; log; structured error in response body |
+| 3 | **Signature** | Time-aware sig verification: fetch (or cache-lookup) actor doc, find key with `id == sig.key-id` that was active at `published`, verify against canonical envelope bytes per the named sig suite | `401`; log; do not retry; mark sender's instance for circuit-breaker accounting |
+| 4 | **Replay** | Activity id and CID not already in `activity-log` projection | `200 OK` with `{status: "duplicate"}`, no-op |
+| 5 | **Audience** | This instance has at least one local actor in `to`/`cc`, OR audience contains `as:Public`/`Followers` and the actor has local followers | Drop silently (no response indicating either acceptance or refusal — prevents inbox-membership probing); do not store |
+| 6 | **Activity-type schema** | Look up `DefineActivity{name: <type>}` in `define-registry`; run its `schema` predicate over the activity in pure sandbox | If type unknown: per trust policy (strict: 422 with missing-definition CID; permissive: attempt dynamic load §12.8). If schema fails: 422 with violation detail |
+| 7 | **Object-type schema** | If activity has an `object` with a `type`, look up `DefineObject{name: <type>}` and run its `schema` | Same as #6 |
+| 8 | **Content validators** | All registered validators with `applies-to: inbound` or `applies-to: all` run sequentially; each is a pure-sandbox predicate that returns `:accept` / `:reject` / `:quarantine` | `:reject` → 422 with reason. `:quarantine` → store activity but mark `quarantined`, do not project, alert operator |
+| 9 | **Capabilities** | Every CID in `capabilities-required` is present in this instance's loaded registries (or auto-loadable per trust policy) | Missing → 422 with list of missing CIDs (sender can deliver bootstrapping `Define*` artifacts first). Auto-load attempt can be triggered by re-POST with `?retry-after-load=true` |
+| 10 | **Trust state** | Sender's actor and instance are not in `Suspended` state on this instance | Drop silently; do not respond |
+| 11 | **Log append** | Write activity envelope (and inlined object content) to local mirror of sender's outbox; assign local sequence number | Disk error → 503 (transient); sender retries |
+| 12 | **Projection** | Asynchronously fold the activity into every relevant projection (per `define-registry`) | Per-projection failure (gas, sandbox violation) → tag activity `projection-failed:<projection-name>`; do not affect log durability |
+
+Pipeline halts at the first failing stage. Stages 1–11 are synchronous (`POST /inbox`
+holds the connection until the log append completes); stage 12 is asynchronous, so
+the HTTP response returns once the log append succeeds.
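
The halt-at-first-failure behaviour, with the log append as the last synchronous step, can be sketched as a short driver loop. This is an illustration only: `run_pipeline`, the `(name, check)` stage shape, and the two callbacks are hypothetical names, and real stages would return richer failure values.

```python
def run_pipeline(activity, stages, append_to_log, enqueue_projection):
    """Run synchronous checks in order and stop at the first failure.

    stages: list of (name, check) pairs; check(activity) returns None on
            success or a (response, detail) tuple on failure.
    Returns (stage_name, response, detail) for a failure, or
    ("log-append", "accepted", seq) once the activity is durable.
    """
    for name, check in stages:
        failure = check(activity)
        if failure is not None:
            response, detail = failure
            return (name, response, detail)
    seq = append_to_log(activity)      # stage 11: still synchronous
    enqueue_projection(activity, seq)  # stage 12: handed off, not awaited
    return ("log-append", "accepted", seq)
```

Because nothing is appended until every check has passed, a failing stage leaves the log untouched.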
+
+### 14.3 Outbound pipeline — stage by stage
+
+| # | Stage | Check | Failure response |
+|---|-------|-------|------------------|
+| 1 | **Authentication** | Caller has a valid bearer token, mTLS cert, or session for the actor | `401` |
+| 2 | **Authorization** | Caller's identity is allowed to publish as the named `actor` (capability token §9.5 or owns the actor key) | `403` |
+| 3 | **Envelope construction** | Kernel fills in `id`, `published`, normalises `to`/`cc`, computes `capabilities-required` (by walking referenced `Define*` CIDs) | n/a |
+| 4 | **Object handling** | If `object` has inline content: canonicalize, compute CID, optionally store per `where`. If `object` references a CID, verify the artifact exists locally or remotely (or accept as a forward reference) | Storage error → `503` |
+| 5 | **Activity-type schema** | Same as inbound #6 — schema must pass | `422` with violation detail (caller bug) |
+| 6 | **Signature** | Sign envelope with the actor's currently-active key matching the activity type's required `purpose` (e.g. `Pin` requires `purpose: pin`) | If no suitable key: `400` |
+| 7 | **Log append** | Write to local outbox; assign sequence number | `503` |
+| 8 | **Projection** | Async fold (same as inbound #12) | Per-projection failure tag |
+| 9 | **Delivery** | Async push to follower inboxes per audience | Per-recipient retry per §13.4 |
+
+Caller's HTTP response returns after stage 7 (log append). The activity is durable
+and queryable as soon as the response is sent; projection lag is reported via
+`projected-up-to` headers and `?wait-for=` parameter.
+
+### 14.4 Failure response taxonomy
+
+Three response categories with explicit semantics:
+
+**Reject** — tell the sender and don't store; the sender may retry after correcting
+the activity. Used for: malformed envelope, invalid signature, schema violation,
+missing capabilities. HTTP 4xx with structured error.
+
+**Quarantine** — store envelope (it's a valid signed message) but don't project,
+alert operator. Used for: content-validator soft-fail, unloaded capabilities under
+permissive policy, suspect-but-not-banned senders. Activity sits in a quarantine
+projection until operator reviews; operator can release (project) or expunge.
+
+**Drop silently** — don't store, don't respond informatively. Used for: replay (ack
+as duplicate), audience refusal (would leak inbox membership otherwise), and
+suspended-sender activities. The sender experiences this as a successful POST with
+no visible effect; they can detect it only by polling and noticing that their
+activity never appears in our mirror of their outbox.
+
+### 14.5 Registry-driven validators
+
+Most of the pipeline is **fixed kernel logic** (envelope, signature, replay, audience,
+log append, delivery). Two groups of stages are **registry-driven** and extend
+dynamically:
+
+- **Stage 8 (content validators)** — operators add/remove `DefineValidator` entries
+  with `applies-to: inbound | outbound | all`. Each runs in pure or effectful
+  sandbox per its declaration. Returns one of `:accept` / `:reject{:reason}` /
+  `:quarantine{:reason}`.
+- **Stages 6–7 (schema validators)** — these *are* registry entries
+  (`DefineActivity.schema`, `DefineObject.schema`); the pipeline calls into the
+  registry to fetch them.
+
+**Pure-mode validators** are deterministic and cheap; results can be cached per
+(activity-CID, validator-CID).
+
+**Effectful-mode validators** can call out to ML models, blocklist services,
+external moderation APIs. They get a per-call IO budget; exceeding it counts as
+`:reject{:reason :validator-timeout}`. Effectful validators do *not* break
+determinism because validation happens **before projection** — a rejected activity
+never enters projected state.
+
+### 14.6 Validator composition and ordering
+
+Validators have an integer `priority` field; lower values run first. The pipeline
+short-circuits on the first `:reject`. `:quarantine` is *not* short-circuiting; later
+validators still run, and `:quarantine` results aggregate.
+
+Default priorities (room for operator-added validators):
+
+```
+0-99    : kernel-internal (envelope, sig, replay, audience)
+100-199 : standard schema validators
+200-299 : standard content validators (rate limit, audience leak)
+300-399 : operator-added moderation
+400-499 : effectful (ML, third-party APIs)
+500+    : reserved
+```
+
+Operators can publish `Update{DefineValidator}` to change priorities or add new
+ones; takes effect on next inbound activity.
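
The ordering rules in this section (ascending priority, short-circuit on reject, aggregate quarantines) could be composed roughly as follows. This is a sketch under assumed shapes: validators are modelled as `(priority, name, fn)` tuples and verdicts as plain strings, neither of which is specified by the text.

```python
def run_validators(activity, validators):
    """Apply content validators in ascending priority order.

    validators: iterable of (priority, name, fn) where fn(activity) returns
                ("accept", None), ("reject", reason) or ("quarantine", reason).
    Returns ("reject", [(name, reason)]) on the first reject, otherwise
    ("quarantine", [(name, reason), ...]) if any validator quarantined,
    otherwise ("accept", []).
    """
    quarantined = []
    for _priority, name, fn in sorted(validators, key=lambda v: v[0]):
        verdict, reason = fn(activity)
        if verdict == "reject":
            return ("reject", [(name, reason)])   # short-circuit
        if verdict == "quarantine":
            quarantined.append((name, reason))    # keep going, aggregate
    if quarantined:
        return ("quarantine", quarantined)
    return ("accept", [])
```

Sorting on the priority alone keeps validators with equal priority in their registration order, which is one reasonable tie-break; the text does not specify one.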
+
+### 14.7 Determinism requirement and its limit
+
+A subtlety worth being explicit about: **inbound validation is not required to be
+deterministic across instances.** Two instances can disagree about whether to
+accept a given activity (e.g. one has a stricter content validator). Their projected
+states will then diverge — but only on activities one accepted and the other didn't.
+
+This is fine. Federation does not require state convergence; it requires *fold
+determinism for activities both instances accepted*. Validators are sovereignty
+controls, not protocol invariants.
+
+Where determinism *is* required: schema validators (§14.2 stages 6–7). If two
+instances disagree on whether `Pin v3` matches its schema, they can't federate
+`Pin v3` activities meaningfully. So schema validators must be pure-mode and
+referenced by CID.
+
+### 14.8 Operational implications
+
+- **The pipeline is the security perimeter.** Every checkable property is checked
+  here, not deeper in the kernel. No "trust the caller" assumptions inside log or
+  projection code.
+- **Quarantine is the operator's friend.** Anything suspicious sits in quarantine
+  with full envelope, sig, and reason — operator can review and decide. Better than
+  outright drop because it preserves audit.
+- **Schema validators are protocol-load-bearing; content validators are policy.**
+  The first set must converge across instances for federation to work; the second
+  set can diverge (and that's how local moderation policy is expressed).
+- **Outbound validation catches local bugs early.** A malformed `Pin` activity
+  fails at outbound stage 5, never enters the local log, never gets delivered.
+
+## 15. Storage layout
+
+The on-disk shape of an instance. Three concerns kept separate: the **activity log**
+(append-only, canonical), **content-addressed object storage** (keyed by CID,
+immutable), and **operational state** (projections, indexes, queues — derived,
+rebuildable).
+
+### 15.1 Storage tiers
+
+```
+/var/lib/fed-sx/
+├── log/                                     # canonical, append-only
+│   ├── actors/
+│   │   ├── <local-actor-id>/
+│   │   │   ├── outbox/
+│   │   │   │   ├── 000001.jsonl             # segment, ~64MB cap
+│   │   │   │   ├── 000002.jsonl
+│   │   │   │   └── tip                      # symlink to current segment
+│   │   │   ├── inbox/                       # received, pre-projection
+│   │   │   └── seq                          # next sequence number
+│   │   └── <other-local-actor-id>/...
+│   └── mirrors/                             # local mirrors of followed remote outboxes
+│       └── <remote-actor-id-hashed>/
+│           ├── 000001.jsonl
+│           └── ...
+├── objects/                                 # CID → bytes
+│   └── <first-2-chars>/<next-2-chars>/<full-cid>
+├── snapshots/
+│   └── <projection-cid>/
+│       ├── <log-tip-cid>.cbor               # snapshot value
+│       └── index                            # ordered list of (log-tip, file)
+├── projections/                             # live projection state
+│   └── <projection-cid>.cbor                # latest in-memory state, periodically flushed
+├── indexes/
+│   └── fed-sx.db                            # SQLite: lookups, queue, trust state
+├── keys/
+│   └── <actor-id>/                          # private keys, mode 0600
+│       ├── primary.pem
+│       ├── recovery.pem
+│       └── sigs.toml                        # key metadata
+├── genesis/
+│   └── bundle.cbor                          # extracted from binary at first run
+└── config.toml                              # operator config
+```
+
+### 15.2 The log — append-only segments
+
+The activity log is the only thing the substrate cannot lose. It is the source of
+truth from which everything else is derived.
+
+**Format: JSONL segments.** Each line is one activity envelope, encoded as JSON-LD
+(canonical form), terminated by `\n`. Easy to inspect, easy to grep, trivially
+streamable.
+
+**Why JSON-LD on disk, not dag-cbor?** Two reasons:
+- Operability: humans can `tail -f` and `grep` the log. dag-cbor is opaque.
+- AP wire compatibility: activities arrive over HTTP as JSON-LD anyway; storing the
+  same form avoids round-trip conversion.
+
+The CID of each activity is computed from its **canonical dag-cbor representation**
+(per §2), independent of how it's stored. CIDs are stable across storage formats.
+
+**Segments cap at ~64MB.** Rotation by size, not time. Old segments are immutable;
+new writes go to the tip segment. Compression (zstd) applied on segments older than
+the current tip — saves disk, doesn't slow appends.
+
+**Per-actor outboxes.** Each local actor has its own outbox directory. This matches
+AP semantics (one outbox per actor) and means:
+- Backing up a single actor is a simple directory copy
+- Per-actor sequence numbers (no cross-actor coordination)
+- Migration (`Move`) is a directory rename + a `Move` activity
+
+**Mirror outboxes.** When a local actor follows a remote one, the remote's outbox is
+mirrored locally for replay. Same JSONL format. Tracked under
+`log/mirrors/<hashed-remote-id>/` to avoid filesystem path issues with URL
+characters. The hash is purely a filesystem-friendly encoding; the canonical actor
+id stays in the log content.
+
+**Inbox vs outbox distinction.** Inboxes hold *received* activities pre-validation;
+outboxes hold *committed* activities post-pipeline. An inbound activity that passes
+the validation pipeline (§14) is moved from inbox to the appropriate mirror outbox.
+This makes inbox a transient queue, not a permanent record.
+
+### 15.3 Object storage
+
+Content-addressed blob store, sharded directories.
+
+**Path scheme:** `objects/<first-2-chars>/<next-2-chars>/<full-cid>`. Sha2-256 CIDs
+are uniformly distributed; this gives ~65k buckets with a couple-hundred files each
+at moderate scale. Standard pattern (matches IPFS, Git).
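
Read literally, the path scheme is a two-level prefix shard of the CID string. The helper below is a hypothetical sketch of that mapping (`object_path` is not a fed-sx function). One caveat: real CID strings usually begin with a fixed multibase and version prefix (CIDv1 in base32 starts with `b`), so an implementation may prefer to shard on characters derived from the digest to get an even spread; this sketch follows the stated scheme as written.

```python
from pathlib import PurePosixPath

def object_path(cid, root="objects"):
    """Map a CID string to <root>/<first-2-chars>/<next-2-chars>/<full-cid>."""
    if len(cid) < 4:
        raise ValueError("CID too short to shard")
    return PurePosixPath(root) / cid[:2] / cid[2:4] / cid
```

With a made-up identifier, `object_path("abcdef0123")` yields `objects/ab/cd/abcdef0123`.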
+
+**Storage backends.** Pluggable per `where: cid` object:
+
+- **`files-on-disk`** (default) — write to local filesystem.
+- **`ipfs`** — registry-driven backend; calls out to a local IPFS node.
+- **`s3`** — object storage in cloud bucket.
+- **`memory-only`** — in-memory cache, evictable; useful for ephemeral artifacts.
+
+The kernel uses the `where-tag` on each object to dispatch to the correct backend.
+Backends are registry entries (`DefineStorage`); operators install only the ones
+they want.
+
+**Garbage collection** is opt-in per backend. Default policy: **never GC** (objects
+are immutable and may be referenced by future activities). Operators can configure
+per-backend retention rules:
+
+- "Keep last N versions of objects referenced by `Pin` activities for path X"
+- "Evict objects not referenced in last 90 days from the `memory-only` cache"
+- "Mirror objects referenced by ≥ 3 endorsements; evict others after 30 days"
+
+GC operates on the projected reference graph (a `reference-graph` projection that
+maintains "what activities reference this CID"). Removing an object that's still
+referenced is allowed but produces a warning logged in operations.
+
+### 15.4 Snapshots
+
+Per §10.4, snapshots are the (projection-CID, log-tip-CID, state) triples that let
+us resume without full replay.
+
+**Storage:** `snapshots/<projection-cid>/<log-tip-cid>.cbor`. The state value is
+dag-cbor-encoded; the file's content CID matches the snapshot's claimed CID.
+
+**Index:** `snapshots/<projection-cid>/index` is a sorted list of `(log-tip-time,
+log-tip-cid, file)` triples. On startup, kernel finds the latest snapshot ≤ current
+log tip and resumes from it. On time-travel queries, finds the latest snapshot
+≤ target time and folds forward.
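
Finding "the latest snapshot ≤ target" in a sorted index is a binary search. A minimal sketch, assuming the index has been loaded into memory as a list of `(log_tip_time, log_tip_cid, file)` tuples sorted by time (the in-memory shape and the function name are illustrative):

```python
from bisect import bisect_right

def latest_snapshot_at_or_before(index, target_time):
    """index: list of (log_tip_time, log_tip_cid, file) sorted by time.

    Returns the newest entry whose log_tip_time <= target_time, or None
    when every snapshot is newer (a full replay would then be needed).
    """
    times = [entry[0] for entry in index]
    pos = bisect_right(times, target_time)
    return index[pos - 1] if pos else None
```

The same lookup serves both cases in the text: startup passes the current log tip's time, and a time-travel query passes the requested timestamp before folding forward from the returned snapshot.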
+
+**Retention:** keep at least:
+- Latest snapshot per active projection
+- Snapshots referenced by published `Create{Snapshot}` activities (federation
+  proofs)
+- One snapshot per day for the last 7 days (audit / time-travel)
+
+Older snapshots GC'd by default. Operators can increase retention.
+
+### 15.5 Operational state — SQLite
+
+Things that are derived, frequently-queried, but not federated:
+
+- **Lookup indexes** for projections (when `indexes:` declared) — `(projection,
+  index-key, value) → activity-cid` rows
+- **Delivery queue** — outbound activities pending push, retry counts, next-attempt
+  timestamps
+- **Trust state** — per-actor and per-instance trust levels (Trusted / Default /
+  Suspended)
+- **Quarantine queue** — activities pending operator review
+- **Configuration cache** — currently-active registry entries (also in memory;
+  on-disk cache for fast restart)
+
+Single SQLite file (`indexes/fed-sx.db`). Recoverable: if corrupted or deleted, it
+is rebuilt from the log on next startup (with cost proportional to log size). The
+SQLite file is a cache, not authoritative.
+
+WAL mode for concurrent readers. Single-writer (the kernel); reads from many
+HTTP request workers.
+
+### 15.6 Backup and export
+
+The substrate is an append-only log of immutable artifacts; backup is simple.
+
+- **Full backup:** rsync `/var/lib/fed-sx/log/` and `/var/lib/fed-sx/objects/`. The
+  rest is rebuildable.
+- **Per-actor export:** tar `log/actors/<actor-id>/` + the objects referenced by
+  activities in that outbox. Self-contained, importable into another instance.
+- **Activity bundle export:** for federation backfill, produce a dag-cbor bundle of
+  `[activity envelopes... + referenced objects]` for a specified actor + range.
+  Single file, content-addressed, signed by the source instance with a `Bundle`
+  activity attesting to its contents.
+
+Exports are themselves publishable (`Create{Bundle}` activity carrying the bundle
+CID). This is how an actor migrates instances cleanly: export bundle, import on
+new instance, publish `Move` activity.
+
+### 15.7 Mirroring and replication
+
+Two patterns:
+
+- **Federation mirroring** (the canonical kind) — when actor A follows B, A's
+  instance mirrors B's outbox locally. This is just normal federation (§13). Each
+  follower keeps its own copy.
+- **Operational mirroring** — for high availability. An operator runs two instances
+  with shared filesystem (NFS / EFS) for `log/` and `objects/`, separate SQLite
+  files. Reads can hit either; writes go through one. Or: rsync-based hot standby
+  with manual failover.
+
+Operational mirroring is out of scope for v1. Federation mirroring is the
+substrate-level redundancy: as long as one peer that followed you is still online,
+your log is still recoverable.
+
+### 15.8 Storage size estimates
+
+Rough targets at moderate scale (10 active local actors, 1000 followed peers, 1
+year of activity at 100 activities/actor/day):
+
+- **Log:** 10 actors × 100 act/day × 1 KB avg envelope × 365 days ≈ 365 MB local
+  outbox. Mirrors: 1000 peers × 10 act/day × 1 KB × 365 ≈ 3.6 GB.
+- **Objects:** depends heavily on content. Assume 50% of the ~4 M logged activities
+  (local plus mirrored) have inline content of avg 5 KB → ~10 GB total inline.
+  CID-referenced larger objects: count separately, depends on use case.
+- **Snapshots:** typically much smaller than the log. ~10 active projections ×
+  ~10 MB per snapshot × ~8 retained snapshots ≈ 800 MB.
+- **SQLite:** index sizes proportional to indexed projection content; typically a
+  few hundred MB.
+
+Total: order of 10 GB at the described scale. Single-machine viable; SSD recommended
+for log throughput; spinning disk fine for snapshots and object storage cold tier.
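
The log and snapshot figures above are plain multiplication and can be reproduced in a few lines. The snippet uses decimal units (1 KB = 1000 bytes) as an assumption and covers only the terms whose inputs are fully stated in the list.

```python
KB, MB, GB = 10**3, 10**6, 10**9
DAYS = 365

local_log = 10 * 100 * 1 * KB * DAYS      # 10 actors x 100 act/day x 1 KB
mirror_log = 1000 * 10 * 1 * KB * DAYS    # 1000 peers x 10 act/day x 1 KB
snapshots = 10 * 10 * MB * 8              # 10 projections x 10 MB x 8 retained

print(f"local outbox: {local_log / MB:.0f} MB")
print(f"mirrors:      {mirror_log / GB:.2f} GB")
print(f"snapshots:    {snapshots / MB:.0f} MB")
```

Changing the per-day rates or the retention count shows quickly which term dominates; at the stated inputs it is the mirrored outboxes.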
+
+### 15.9 Operational implications
+
+- **The log is sacred.** Never modify, never delete. Backups go to multiple media.
+  Loss of `log/` means loss of identity (actor activities) and loss of
+  state-of-record. Loss of `objects/` means loss of content but log + peers can
+  recover most of it.
+- **Everything else is rebuildable.** Projections, indexes, snapshots, queue state
+  can all be recomputed from the log at startup cost. Operationally, this means
+  upgrades and migrations are forgiving.
+- **CID-addressed storage is naturally idempotent.** Two instances writing the same
+  artifact write the same bytes to the same path. Race conditions become no-ops.
+- **JSONL on disk pays for itself** the first time an operator needs to debug a
+  weird federation issue with `grep` and `jq`. Worth the storage cost vs dag-cbor.
+
+## 16. API surface
+
+HTTP API for reading the log, publishing activities, querying projections, and
+streaming updates. Three layers: **AP-standard** endpoints (for vanilla AP
+interop), **fed-sx-specific** endpoints (publish, query, capabilities), and
+**discovery** endpoints (webfinger, well-known).
+
+### 16.1 Endpoint catalog
+
+#### AP-standard
+
+| Method | Path | Purpose |
+|--------|------|---------|
+| GET | `/actors/<id>` | Actor doc (Person/Service/Group/Application) |
+| GET | `/actors/<id>/inbox` | Read inbox — auth required |
+| POST | `/actors/<id>/inbox` | Receive federated activity (HTTP Signature required) |
+| GET | `/actors/<id>/outbox` | OrderedCollection of actor's published activities |
+| POST | `/actors/<id>/outbox` | AP-standard publish (alias for `POST /activity` with `actor` set) |
+| GET | `/actors/<id>/followers` | OrderedCollection of follower actor URIs |
+| GET | `/actors/<id>/following` | OrderedCollection of followed actor URIs |
+| GET | `/activities/<uuid>` | Single activity by id |
+| GET | `/objects/<uuid>` | Single object by id (note: distinct from CID-addressed `/artifacts/<cid>`) |
+
+#### fed-sx-specific
+
+| Method | Path | Purpose |
+|--------|------|---------|
+| POST | `/activity` | Generalised publish — accepts any well-formed activity |
+| GET | `/artifacts/<cid>` | CID-addressed artifact fetch (content negotiated) |
+| GET | `/artifacts/<cid>/raw` | Raw bytes (whatever the codec stored) |
+| GET | `/artifacts/<cid>/<path>` | IPLD path traversal into the artifact |
+| GET | `/projections` | List of registered projections (name, CID, last-folded-tip) |
+| GET | `/projections/<name>` | Full projection state (paginated for large states) |
+| GET | `/projections/<name>?at=<ts>` | Time-travel: state as of timestamp |
+| GET | `/projections/<name>/<key>` | Single key from a projection (uses indexes) |
+| POST | `/query` | Run an SX query expression against one or more projections |
+| GET | `/define-registry` | Currently active `Define*` artifacts by kind |
+| GET | `/capabilities/<actor-id>` | Per-actor declared capabilities |
+
+#### Discovery and well-known
+
+| Method | Path | Purpose |
+|--------|------|---------|
+| GET | `/.well-known/webfinger?resource=acct:<user>@<host>` | RFC 7033 actor discovery |
+| GET | `/.well-known/sx-capabilities` | This instance's capability advertisement (§7) |
+| GET | `/.well-known/host-meta` | XRD describing the host |
+| GET | `/.well-known/nodeinfo` | Standard fediverse node metadata (Mastodon, Pleroma compatibility) |
+
+#### Real-time (SSE)
+
+| Method | Path | Purpose |
+|--------|------|---------|
+| GET | `/actors/<id>/outbox/stream` | New activities as they're appended (events: `activity`) |
+| GET | `/actors/<id>/inbox/stream` | New inbound activities (auth required) |
+| GET | `/projections/<name>/subscribe` | Projection deltas (events: `delta`) |
+| GET | `/federation/health/stream` | Per-peer delivery health (events: `peer-status`) |
+
+WebSocket equivalents (`/ws/...` paths) available where SSE is awkward (browsers
+behind proxies); same event payloads, different framing.
+
+### 16.2 Authentication
+
+Three mechanisms, each appropriate to a different caller type:
+
+- **HTTP Signatures** (IETF draft `draft-cavage-http-signatures`) — the AP-standard mechanism
+  for inter-instance calls. Sender signs a digest of relevant headers + body with
+  their actor's private key; receiver verifies via the actor's public keys
+  projection (§9.6). Used for: `POST /inbox`, peer-to-peer outbox pulls when
+  authentication is desired.
+- **Bearer tokens** — for interactive clients (CLIs, web UIs, mobile apps).
+  Issued via OAuth2 (or simple admin-issued tokens for v1). Used for:
+  `POST /activity`, `GET /actors/<id>/inbox`, anything requiring caller identity.
+- **Capability tokens** (§9.5) — for delegated publish. Token includes the granting
+  actor, the granted capabilities (e.g. `publish: Pin for path-prefix /docs/`), the
+  bearer's actor, expiry, and signature from the granter. Used for: child actors,
+  service accounts, temporary publish access.
+
+Public reads (most GET endpoints to public-audience activities) require no auth.
+Private/followers-only reads check the caller's identity against the audience.
+
+### 16.3 Content negotiation
+
+Same resource, multiple representations. `Accept` header dispatches:
+
+| Accept header | Returns |
+|---------------|---------|
+| `application/activity+json` | AP-standard JSON-LD (default for ambiguous Accepts) |
+| `application/ld+json; profile="..."` | JSON-LD with explicit profile |
+| `application/cbor` | dag-cbor |
+| `application/json` | Plain JSON (compact, no `@context` expansion) |
+| `application/sx` | Canonical SX wire format |
+| `text/html` | HTML representation (for browsers — renders the artifact via SX) |
+
+Same negotiation applies to `/artifacts/<cid>`, `/activities/<uuid>`,
+`/projections/<name>`. Servers MUST honour the request; absent `Accept` defaults to
+`application/activity+json`.
+
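The dispatch rule above can be sketched in Python. This is a minimal illustration, not the server's implementation: `negotiate` and `parse_accept` are hypothetical names, the wildcard handling is an assumption, and a quoted parameter containing a comma would need a real header parser.

```python
# Media types taken from the table above; everything else is illustrative.
SUPPORTED = [
    "application/activity+json",
    "application/ld+json",
    "application/cbor",
    "application/json",
    "application/sx",
    "text/html",
]
DEFAULT = "application/activity+json"


def parse_accept(header):
    """Return (media_type, q) pairs, highest q first; ties keep header order."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((media_type, q))
    # sorted() is stable, so equal q-values keep their header order.
    return sorted(entries, key=lambda e: -e[1])


def negotiate(accept_header):
    """Pick a representation; absent or wildcard Accept falls back to DEFAULT."""
    if not accept_header or not accept_header.strip():
        return DEFAULT
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in ("*/*", "application/*"):
            return DEFAULT  # assumption: ambiguous Accept means the AP default
        if media_type in SUPPORTED:
            return media_type
    return None  # unsatisfiable Accept; the caller decides how to report it
```

Returning `None` for an unsatisfiable `Accept` leaves the choice of status code to the caller, since the text above does not say which error applies.
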
+### 16.4 Pagination
+
+Cursor-based via AP's `OrderedCollectionPage`:
+
+```
+GET /actors/giles/outbox
+→ {
+    "type": "OrderedCollection",
+    "totalItems": 12345,
+    "first": "/actors/giles/outbox?page=true",
+    "last": "/actors/giles/outbox?page=true&min_id=0"
+  }
+
+GET /actors/giles/outbox?page=true
+→ {
+    "type": "OrderedCollectionPage",
+    "id": "...?page=true",
+    "next": "...?page=true&max_id=<cid>",
+    "prev": "...?page=true&min_id=<cid>",
+    "orderedItems": [...]
+  }
+```
+
+Cursors are CIDs of the boundary activity (not opaque tokens). Stable across
+restarts and instances. `max_id` returns activities **before** the cursor (newest
+first); `min_id` returns activities **after** the cursor.
+
+Default page size: 50. Max: 1000. `Link: <...>; rel="next"` header also provided
+for HTTP-native pagination.
+
+For projections: same shape, items are projection entries.
+
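The cursor semantics can be illustrated with a small Python sketch over an in-memory log. The function name `page`, the list-of-CIDs log, and the newest-first ordering of `min_id` pages are assumptions made for the example; only the `max_id`/`min_id` meaning and the page-size limits come from the text above.

```python
DEFAULT_PAGE_SIZE = 50
MAX_PAGE_SIZE = 1000


def page(log, max_id=None, min_id=None, limit=DEFAULT_PAGE_SIZE):
    """Return one page of `log` (a list of CIDs, oldest first), newest first."""
    limit = max(1, min(limit, MAX_PAGE_SIZE))
    if max_id is not None:
        end = log.index(max_id)          # activities strictly before the cursor
        window = log[max(0, end - limit):end]
    elif min_id is not None:
        start = log.index(min_id) + 1    # activities strictly after the cursor
        window = log[start:start + limit]
    else:
        window = log[-limit:]            # first page: the newest activities
    items = list(reversed(window))
    return {
        "orderedItems": items,
        # Cursors are the CIDs at the page boundaries, not opaque tokens.
        "next": {"max_id": items[-1]} if items and items[-1] != log[0] else None,
        "prev": {"min_id": items[0]} if items and items[0] != log[-1] else None,
    }
```

Because the cursor is the boundary activity's CID, a client can hand the same `max_id` to a different instance holding the same log and get the same page.
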
+### 16.5 The query API
+
+`POST /query` takes an SX expression evaluated in pure mode against named
+projections:
+
+```sx
+POST /query
+Content-Type: application/sx
+Accept: application/sx
+
+(let ((actors  (projection actor-state))
+      (pins    (projection pin-state)))
+  (for-each ([(actor-id actor) actors])
+    (when (> (count (filter (fn ((path cid)) (= (:owner cid) actor-id)) pins)) 10)
+      {:actor (:preferredUsername actor)
+       :pins-published (count ...)})))
+```
+
+Query semantics:
+
+- Evaluated in pure sandbox; all the determinism rules apply.
+- Projection access is read-only and snapshot-consistent: the query sees state
+  as-of the time of the request (or `?at=` if specified).
+- Result is serialized in the negotiated content type.
+- Gas limit applies (default 1M units per query, tunable by operator).
+- Cacheable: query CID + projection state CIDs uniquely determine the result.
+
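The cacheability rule in the last bullet can be sketched as follows. This is a Python illustration under stated assumptions: the SHA-256 key derivation, the `QueryCache` class, and the `evaluate` callback are hypothetical, and CIDs are treated as plain strings.

```python
import hashlib


def cache_key(query_cid, projection_state_cids):
    """Derive a cache key from the query CID and the projection state CIDs."""
    h = hashlib.sha256()
    h.update(query_cid.encode("utf-8"))
    # Sort by projection name so the key does not depend on dict ordering.
    for name in sorted(projection_state_cids):
        h.update(b"\x00")
        h.update(name.encode("utf-8"))
        h.update(b"\x00")
        h.update(projection_state_cids[name].encode("utf-8"))
    return h.hexdigest()


class QueryCache:
    def __init__(self, evaluate):
        self._evaluate = evaluate  # hypothetical pure evaluator callback
        self._results = {}
        self.misses = 0

    def run(self, query_cid, projection_state_cids):
        key = cache_key(query_cid, projection_state_cids)
        if key not in self._results:
            self.misses += 1
            self._results[key] = self._evaluate(query_cid, projection_state_cids)
        return self._results[key]
```

The cache is only sound because evaluation is pure: the same query against the same projection states cannot produce a different result.
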
+Query results can themselves be published as `Create{QueryResult}` activities,
+making derived analyses federable.
+
+### 16.6 Errors
+
+Uniform JSON error envelope:
+
+```json
+{
+  "error": {
+    "type": "https://next.rose-ash.com/ns/fed-sx/errors/v1#InvalidSignature",
+    "status": 401,
+    "title": "Activity signature invalid",
+    "detail": "Key id 'https://example/actors/x#key-1' was superseded at 2026-01-15T...",
+    "activity-id": "https://...",
+    "key-id": "...#key-1",
+    "instance": "/incidents/<incident-cid>"
+  }
+}
+```
+
+Error types are URIs in the fed-sx namespace; receivers can check `type` for
+programmatic handling. Standard errors:
+
+- `MissingCapability` — includes `missing` array of CIDs
+- `SchemaViolation` — includes `schema-cid`, `field-path`, `expected`, `got`
+- `InvalidSignature`
+- `Quarantined` — includes `quarantine-id` for operator-status tracking
+- `RateLimited` — includes `retry-after`
+- `ResourceExhausted` — for query gas exhaustion
+
+### 16.7 Streaming details
+
+SSE event format:
+
+```
+event: activity
+id: <activity-cid>
+data: { ...activity envelope... }
+
+event: delta
+id: <activity-cid that triggered the delta>
+data: {"projection": "actor-state", "key": "...", "old": ..., "new": ...}
+
+event: heartbeat
+data: {"projected-up-to": "<cid>", "ts": "..."}
+```
+
+Clients reconnect with `Last-Event-ID: <cid>` to resume from the last event seen.
+Server replays from that point in the log (or returns 410 if too far behind, in
+which case client should switch to paginated pull).
+
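A client-side view of the event format and the resume rule, as a minimal Python sketch. `parse_sse` and `resume_headers` are hypothetical helper names; the parser covers only the `event`, `id`, and `data` fields shown above and ignores everything else.

```python
def parse_sse(text):
    """Split an SSE stream into event dicts; a blank line ends each event."""
    events, current, data_lines = [], {}, []
    for line in text.splitlines() + [""]:
        if line == "":
            if current or data_lines:
                current["data"] = "\n".join(data_lines)
                current.setdefault("event", "message")
                events.append(current)
            current, data_lines = {}, []
            continue
        if line.startswith(":"):
            continue  # comment line, carries no event data
        field, _, value = line.partition(":")
        value = value[1:] if value.startswith(" ") else value
        if field == "data":
            data_lines.append(value)
        elif field in ("event", "id"):
            current[field] = value
    return events


def resume_headers(events):
    """Build reconnect headers from the last event that carried an id."""
    for event in reversed(events):
        if "id" in event:
            return {"Last-Event-ID": event["id"]}
    return {}
```

Heartbeats carry no `id` in the format above, so the resume point is the last `activity` or `delta` event rather than the last event received.
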
+### 16.8 Versioning
+
+The substrate is versioned at three levels:
+
+- **Envelope version** — declared in `/.well-known/sx-capabilities`. Currently `1`.
+  Forward-compatible (new fields OK; semantics fixed).
+- **API version** — URL prefix optional: `/v1/...` works the same as `/...`. Future
+  major version: `/v2/...` paths in parallel.
+- **Definition versions** — supersession via activity log (§§9.2, 12.7). No special
+  URL handling.
+
+Capability negotiation happens before federation; clients shouldn't hard-code
+URL paths beyond the canonical set documented here.
+
+### 16.9 Operational implications
+
+- **The API is small but layered.** AP compatibility is one layer; fed-sx
+  extensions are another; both share auth and content negotiation. Adding a new
+  endpoint shouldn't require new transport machinery.
+- **Content negotiation is the polyglot bridge.** Same artifact addressable in
+  JSON-LD (for AP peers), dag-cbor (for fed-sx peers), SX (for SX clients), HTML
+  (for humans), plus plain JSON per §16.3. One CID, five representations.
+- **Cursor pagination is CID-based.** Stable identifiers, no opaque tokens to
+  invalidate, peers can synchronize without coordination.
+- **The query API is a load-bearing differentiator.** Datalog/GraphQL-equivalent
+  expressiveness with no separate query language — it's just SX. Federable, signable,
+  versionable like any other SX artifact.
+
+---
+
+## 17. Implementation languages
+
+Polyglot **authoring**, monoglot **runtime**: every language-on-SX compiles to core
+SX and runs on any host with the SX evaluator. The language is an authoring choice;
+the federated artifact is uniform SX. Authors of `Define*` artifacts pick the
+source language they prefer; consumers don't need that compiler installed to
+execute the compiled SX.
+
+Languages are picked because they **genuinely fit the problem**, not to demonstrate
+the polyglot story. Where a chosen language has gaps (e.g. Erlang-on-SX missing hot
+reload), we invest in maturing the port rather than working around the gap.
+
+### 17.1 The v1 stack
+
+| Layer | Language | Why |
+|-------|----------|-----|
+| **Native primitives** | OCaml (existing runtime) | Crypto (RSA, Ed25519, SHA), dag-cbor encode/decode, HTTP socket, file IO, SQLite. Surfaced as Erlang-on-SX BIFs. |
+| **Kernel orchestration** | Erlang-on-SX | Actor model = federation. `gen_server` per actor / per projection / per peer. `supervisor` for delivery workers. Message passing is literally the substrate. Hot code reload (Phase 7) for `Define*` live extension. |
+| **Query API back-end** | Datalog-on-SX | Projection state is relational; trust graph walks, provenance, projection joins are textbook Datalog. Already mature (276/276 tests, full core Datalog with stratified negation, aggregation, magic sets, federation-graph demo). |
+| **`Define*` semantics, schemas, validators, codecs, audience predicates** | Core SX | The canonical federated language. Everything content-addressed and federated lives here. |
+
+### 17.2 Languages explicitly **not** booked for v1
+
+Available, mature, considered — would be reached for if a real fed-sx need surfaced,
+but no preemptive use:
+
+- **Haskell-on-SX** (285/285 tests, 36 programs, type checker working) — for complex
+  operator-authored extensions that benefit from typed pattern matching. Schemas in
+  fed-sx are short predicates; types don't earn their keep here.
+- **Smalltalk-on-SX** (625/629 tests, classic corpus running) — natural fit for a
+  live operator dashboard / Glamorous-Toolkit-style introspection. v2/v3 territory;
+  a browser UI likely wins for operator audiences.
+- **APL-on-SX** — high-throughput batch reprojection if scalar SX folds become a
+  bottleneck. Premature without measured need.
+- **JS-on-SX**, **Elm-on-SX** — browser-side client SDK / viewer. v2.
+- **Common Lisp-on-SX**, **Forth-on-SX**, **Go-on-SX**, **Dream-on-SX**,
+  **Elixir-on-SX**, **Erlang-on-SX (alternative form)** — case by case if a use
+  case appears.
+
+### 17.3 The FFI BIF layer
+
+Erlang-on-SX has no FFI / NIF mechanism in its current form (Phase 6 plan: "out of
+scope entirely"). fed-sx adds a **BIF layer** in `lib/erlang/transpile.sx` (or a
+dedicated `lib/erlang/fed_bifs.sx`) exposing native primitives:
+
+```
+crypto:rsa_verify/3       crypto:ed25519_verify/3
+crypto:sha2_256/1         crypto:sha3_256/1
+
+cid:cbor_encode/1         cid:cbor_decode/1
+cid:multihash/2           cid:from_bytes/2
+cid:to_string/1           cid:from_string/1
+
+log:append/2              log:read/3
+log:tip/1                 log:replay/3
+
+http:listen/2             http:request/2
+http:respond/3            http:sse_send/2
+
+fs:read/1                 fs:write/2
+fs:exists/1               fs:list/1
+
+sqlite:open/1             sqlite:exec/2
+sqlite:query/3            sqlite:close/1
+
+snapshot:put/3            snapshot:get/2
+```
+
+Each BIF is a thin Erlang-on-SX function dispatching to the corresponding SX runtime
+IO primitive. Returns Erlang-shaped values (atoms, tuples, binaries). Errors raise
+appropriate Erlang exceptions (`badarg`, `enoent`, `eacces`).
+
+This is the **only** native-FFI surface in fed-sx. All other I/O goes through these
+BIFs. Operators can audit the BIF list to know exactly what the substrate touches
+outside SX.
+
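The shape of such a layer (registration keyed by module, function, and arity, plus mapping host errors to Erlang reason atoms) can be sketched in Python. This is an illustration only: `define_bif`, `apply_bif`, and `ErlangError` are hypothetical names, Erlang terms are modelled as Python tuples and strings, and the real layer would be written in SX.

```python
BIFS = {}  # (module, function, arity) -> Python callable


class ErlangError(Exception):
    """Stand-in for an Erlang exception carrying a reason atom."""

    def __init__(self, reason):
        super().__init__(reason)
        self.reason = reason


def define_bif(module, function, arity):
    def register(fn):
        BIFS[(module, function, arity)] = fn
        return fn
    return register


def apply_bif(module, function, args):
    fn = BIFS.get((module, function, len(args)))
    if fn is None:
        raise ErlangError("undef")
    try:
        return fn(*args)
    except FileNotFoundError:
        raise ErlangError("enoent") from None
    except PermissionError:
        raise ErlangError("eacces") from None
    except (TypeError, ValueError):
        raise ErlangError("badarg") from None


@define_bif("fs", "read", 1)
def fs_read(path):
    with open(path, "rb") as handle:
        return ("ok", handle.read())


@define_bif("fs", "write", 2)
def fs_write(path, data):
    with open(path, "wb") as handle:
        handle.write(data)
    return "ok"
```

Keeping every native call behind one table is what makes the audit claim above checkable: the keys of `BIFS` are the complete list of things the sketch can touch outside the evaluator.
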
+### 17.4 Build pipeline
+
+```
+.sx files (core SX, registry entries) ──┐
+.erl files (Erlang-on-SX kernel)    ──┼──> compile to core SX
+.dl files (Datalog-on-SX queries)   ──┘
+                                       │
+                            content-addressed SX artifacts
+                                       │
+                                       ▼
+                         genesis bundle (CID-verified)
+                                       │
+                                       ▼
+                         OCaml runtime evaluates everything
+```
+
+Each authoring language's compiler runs at build time, producing core SX that goes
+into the genesis bundle (for bootstrap definitions) or gets published as activities
+(for runtime extensions).
+
+### 17.5 Prerequisite work
+
+Three pieces of investment land in or alongside the Erlang-on-SX loop. The first two
+land **before** fed-sx kernel code starts; the third runs in parallel, not
+blocking milestone 1, but blocking production-grade throughput.
+
+1. **Phase 7 — hot code reload.** `code:load_binary/3`, `gen_server`
+   `code_change/3` callback dispatch, atomic module-version swap. Required for
+   `Define*` live extension (no kernel restart to load new verbs). The
+   reload-semantics choice (two-version coexistence vs single-version atomic swap
+   with closure capture) will be decided during the work.
+
+2. **Phase 8 — FFI mechanism + initial BIFs.** `define-bif` registration + term
+   marshalling + error mapping, then BIFs for `crypto:*`, `cid:*` (dag-cbor),
+   `fs:*`, `http:*`, `sqlite:*`. Required for fed-sx kernel to call native
+   primitives. Lands before kernel code that calls them.
+
+3. **Phase 9 — specialized opcodes (the BEAM analog).** *Layered perf strategy:*
+   - **Layer 1 (Phase 9, in scope)** — specialized bytecode opcodes that bypass
+     the general-purpose CEK machine for hot Erlang operations. `OP_PATTERN_TUPLE`,
+     `OP_PERFORM`/`OP_HANDLE`, `OP_RECEIVE_SCAN`, `OP_SPAWN`/`OP_SEND`, BIF
+     dispatch table. Targets: 100k+ message hops/sec, 1M-process spawn under
+     30sec — roughly 1000-3000× speedup over the current general-purpose path.
+   - **Layer 2 (Phase 10, deferred)** — multi-core scheduler via OCaml 5
+     domains. Decided empirically after Layer 1 lands; likely unnecessary if
+     Layer 1 alone hits target throughput.
+   - **Layer 3 (skipped)** — incremental tuning of the existing call/cc-based
+     receive and env-copy-per-call machinery. Obsoleted by Layer 1; not pursued.
+
+   **Architectural note for Phase 9.** Phase 9a (the **opcode extension
+   mechanism in `hosts/ocaml/evaluator/`**) is out of scope for the Erlang loop
+   — it's SX VM core, used by every language port that wants specialized
+   opcodes. Designed in `plans/sx-vm-opcode-extension.md`; lands as a separate
+   focused workstream (~1-2 weeks) owning `hosts/`. Phase 9b-9g (the actual
+   Erlang opcodes in `lib/erlang/vm/`) are designed and tested against a stub
+   dispatcher in the Erlang loop until 9a is available.
+
+   **Shared-opcode discipline.** Opcodes Phase 9 produces that other language
+   ports could plausibly use (pattern match, perform/handle, record access)
+   become candidates for chiselling out to **`lib/guest/vm/`** — same lib/guest
+   discipline, applied at the bytecode layer. Don't pre-extract; promote to
+   `lib/guest/vm/` when a second language port has an actual second use. The
+   substrate accumulates a richer opcode surface over time as ports contribute,
+   and every port benefits from every shared opcode (the structural advantage
+   over BEAM, which is special-purpose-built for one language).
+
+   **fed-sx is not blocked by Phase 9.** Milestone 1 ships on current
+   Erlang-on-SX perf (which has 100-1000× headroom for a single demo instance).
+   Phase 9 lands in parallel; by the time fed-sx needs production-grade throughput
+   (federation hub use cases, milestone 2-3), Phase 9 is ready.
+
+After Phases 7 and 8 land, fed-sx milestone 1 (kernel + registries + bootstrap
+entries + Pin smoke test + reactive application smoke test) becomes the next
+workstream. Phase 9 work continues in parallel.
+
+---
+
+## 18. Subscription model
+
+Symmetric to the publish-side extensibility: just as `DefineActivity` registers what
+*kinds of things can be published*, `DefineSubscription` registers what *kinds of
+patterns can be subscribed to*. `Follow` becomes one standard subscription type
+among many, not a hardcoded primitive.
+
+### 18.1 The asymmetry being fixed
+
+Without this, the substrate has rich publish-side extensibility (any new verb is a
+`DefineActivity`) and *one* hardcoded subscription primitive (`Follow`). That
+mirrors AP but it's an arbitrary limitation in a substrate where everything else
+is registry-driven. Generalising restores symmetry.
+
+### 18.2 The `DefineSubscription` shape
+
+```sx
+(activity 'Create
+  :object {:type "DefineSubscription"
+           :name "Follow"                        ; AP-standard
+           :schema (fn (sub)                     ; what params the sub takes
+             (and (cid? (-> sub :object))
+                  (= "Person" (-> sub :object-type))))
+           :match (fn (subscription activity)    ; pure-mode predicate
+             (= (-> subscription :object) (:actor activity)))
+           :delivery {:default :push
+                      :modes [:push :pull :sse]
+                      :digest-window nil}
+           :capabilities-required []})           ; some subs may need authority
+```
+
+Four mandatory parts:
+
+- **`schema`** — pure-mode predicate validating subscription parameters at
+  `Subscribe` time. Catches malformed subscriptions before they enter state.
+- **`match`** — pure-mode predicate `(subscription, activity) → bool`. Decides
+  whether a given activity is a hit for this subscription. Determinism rules
+  apply (§11.2).
+- **`delivery`** — supported modes (push to inbox / pull on demand / SSE
+  streaming / batched digest). The subscription instance picks its preferred
+  mode at `Subscribe` time from the supported set.
+- **`capabilities-required`** — capability tokens the subscriber must hold
+  (empty for public subs; populated for paywalled/gated/private streams).
+
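How the four parts interact at `Subscribe` time and at match time can be sketched in Python. The dict layout, the `subscribe` and `matches` helpers, and the string check standing in for the schema predicate are assumptions for illustration; the real definitions are SX artifacts evaluated in pure mode.

```python
FOLLOW = {
    "name": "Follow",
    # schema: validate subscription parameters at Subscribe time
    "schema": lambda sub: isinstance(sub.get("object"), str) and bool(sub["object"]),
    # match: pure predicate over (subscription, activity)
    "match": lambda sub, activity: sub["object"] == activity.get("actor"),
    "delivery": {"default": "push", "modes": ["push", "pull", "sse"]},
    "capabilities-required": [],
}


def subscribe(definition, params, held_capabilities=()):
    """Validate params and pick a delivery mode; return the active subscription."""
    if not definition["schema"](params):
        raise ValueError("subscription parameters failed schema")
    missing = [c for c in definition["capabilities-required"]
               if c not in held_capabilities]
    if missing:
        raise PermissionError("missing capabilities: %r" % missing)
    mode = params.get("delivery", definition["delivery"]["default"])
    if mode not in definition["delivery"]["modes"]:
        raise ValueError("unsupported delivery mode: %s" % mode)
    return {"type": definition["name"], "params": params, "mode": mode}


def matches(definition, subscription, activity):
    """Evaluate the definition's match predicate for one inbound activity."""
    return bool(definition["match"](subscription["params"], activity))
```

Rejecting a malformed or unsupported subscription in `subscribe` keeps `matches` free of validation logic, which mirrors the split between `schema` and `match` above.
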
+### 18.3 The `Subscribe` verb
+
+The bootstrap verb that activates a subscription:
+
+```sx
+(activity 'Subscribe
+  :object {:type "Follow"   :object "https://alice.example/actors/alice"})
+
+(activity 'Subscribe
+  :object {:type "Topic"    :tag "climate-change"
+           :delivery :digest :digest-window "P1D"})
+
+(activity 'Subscribe
+  :object {:type "CidWatch" :cid "bafy..."
+           :events [:supersede :endorse]})
+
+(activity 'Subscribe
+  :object {:type "Predicate"
+           :pred '(fn (act) (and (= (:type act) "Note")
+                                  (string-contains? (-> act :object :content) "fed-sx")))})
+```
+
+`Unsubscribe` is `Undo{Subscribe}` — AP's standard pattern, retains audit.
+
+### 18.4 Standard subscription types (defined later, not bootstrap)
+
+Same status as the custom verbs in §6.2 — substrate accepts any subscription
+type once a `DefineSubscription` artifact registers it. Standard set:
+
+| Name | Params | Match semantics | Use case |
+|------|--------|-----------------|----------|
+| **`Follow`** | `{object: actor-id}` | activity.actor == subscription.object | AP-standard actor following |
+| **`Topic`** | `{tag: string}` | tag in activity.object.tags | Hashtag follows, RSS-like |
+| **`CidWatch`** | `{cid, events: [...]}` | activity references cid AND activity.type in events | "Notify me when this artifact is updated/endorsed/forked" |
+| **`PathWatch`** | `{path, events: [...]}` | activity is a Pin/Update of named path | "Notify me when domain:foo/bar/baz changes" |
+| **`VerbFilter`** | `{wraps: subscription-cid, types: [...]}` | inner subscription matches AND activity.type in types | "Follow Alice but only Endorse activities" |
+| **`TrustGraph`** | `{root: actor-id, depth: int}` | activity.actor reachable from root in trust graph at depth | Web-of-trust expansion |
+| **`Predicate`** | `{pred: sx-fn}` | (pred activity) returns truthy | Escape hatch — most powerful, highest cost |
+| **`Channel`** | `{channel-id}` | activity addresses or originates from channel | Multi-actor pooled streams |
+
+### 18.5 Match-fn execution location
+
+The load-bearing question. Of the three choices (publisher-side, subscriber-side, or
+both), fed-sx adopts the **hybrid model**:
+
+- **Coarse filter on the publisher side** — audience predicates (§8) decide who
+  the activity is delivered to at all. This is mandatory and cheap (audience set
+  is usually small and well-defined).
+- **Fine filter on the subscriber side** — once an activity arrives in inbox,
+  the subscriber's instance evaluates each active subscription's `match-fn`
+  against it. Pure-mode evaluation (deterministic, gas-bounded). Activities
+  matching one or more subscriptions enter the subscriber's projected state.
+
+Why hybrid: publisher-side fine filtering would require the publisher to know
+every subscriber's match-fn (privacy-violating, scaling-killing). Subscriber-side
+filtering is wasteful only if the publisher's audience model is too coarse —
+which is the audience system's job to fix per §8.
+
+### 18.6 Subscription state and storage
+
+Active subscriptions are themselves projected state. A bootstrap projection
+`subscriptions` (paralleling `audience-graph` for the inverse direction)
+maintains:
+
+```
+{actor-id -> [{subscription-cid, type, params, mode, started-at}]}
+```
+
+Updated by `Subscribe` and `Unsubscribe` activities. Queryable like any other
+projection (§16). Used by:
+
+- The inbox dispatcher to know which match-fns to evaluate against incoming
+  activities
+- Triggers (§19) to know which activities to fire on
+- Federation to advertise "here are the subscription types I currently subscribe
+  to" (capability-style, opt-in)
+
+### 18.7 Federation interactions
+
+Subscriptions interact with federation in three ways:
+
+- **Discovery.** Peer's `/.well-known/sx-capabilities` (§7) lists registered
+  `DefineSubscription` CIDs, so subscribers know what they can ask for.
+- **Negotiation.** A `Subscribe` activity carries `capabilities-required`; if
+  the publisher's instance doesn't support the named subscription type, it
+  responds with the standard 422 + missing-CIDs error (§14.2 #9). Subscriber
+  can then deliver the bootstrapping `DefineSubscription` artifact and retry.
+- **Cross-instance match-fn.** If subscriber and publisher both run the same
+  conformance-tested SX evaluator, identical subscriptions match identically
+  (cross-host equivalence, §11.8). This is what makes federated topic
+  subscriptions reliable: every conforming instance computes the same
+  set-of-matches for the same activity.
+
+### 18.8 Operational implications
+
+- **The audience system handles "who do I send this to."** The subscription
+  system handles "what do I want to receive." They're complementary, not
+  redundant.
+- **Subscription types can themselves evolve via supersession.** New version of
+  `Topic` with case-insensitive matching? Publish a new `DefineSubscription`,
+  `Supersede` the old one. Existing subscriptions migrate at next match
+  evaluation.
+- **Match-fn cost matters.** A `Predicate` subscription with a slow predicate
+  becomes a per-activity tax. Gas budgets (§11.5) bound the worst case;
+  operators can disable expensive subscription types if needed.
+- **Subscriptions are signed messages.** Audit, accountability, and revocation
+  all work the same way as activities — because subscriptions *are* activities.
+
+---
+
+## 19. Application model
+
+The synthesis. With publish, subscribe, project, and trigger as registry-driven
+primitives, the substrate has everything needed to express **distributed reactive
+applications** as data — no native code, no kernel changes, no privileged
+runtime. Applications are themselves federated artifacts.
+
+### 19.1 An application is a tuple of artifacts
+
+```
+Application = {
+  subscriptions : [DefineSubscription instances and their parameters],
+  triggers      : [DefineTrigger registrations],
+  projections   : [DefineProjection registrations],
+  storage       : [DefineStorage registrations]   (optional)
+}
+```
+
+That tuple, signed and bundled, is the application. Installing one = following
+the named actors / activating the named subscriptions + loading the Define*
+CIDs into the local registry. Forking one = republishing the Define* with
+`Supersede` over the bits you change.
+
+### 19.2 The reactive loop
+
+```
+       External actors                       Operator publishes activities
+       publish activities                    via this instance's actors
+              │                                      │
+              ▼                                      ▼
+       ┌─────────────────────────────────────────────┐
+       │ Inbound + outbound activities               │
+       └────────────────────┬────────────────────────┘
+                            │
+                            ▼
+              For each active subscription:
+              evaluate match-fn (pure mode)
+                            │
+              ┌─────────────┴─────────────┐
+              ▼                           ▼
+     Activity matches                Activity does
+     a subscription                  not match
+              │                           │
+              ▼                           ▼
+       Projections          ←     (silently dropped from
+       fold the activity            this application's view;
+              │                      may match other apps)
+              ▼
+       Triggers fire on the
+       subscription's match
+              │
+              ▼
+       Trigger then-sx runs
+       (effectful sandbox)
+              │
+              ├──> updates local state (private projections)
+              ├──> publishes new activity (via outbox)
+              └──> calls effectful primitives (HTTP, fs, etc.)
+                   per declared capabilities
+```
+
+Three things happen on a match: **state updates** (projection), **derived
+publishes** (new activities), **side effects** (effectful primitives). Each is
+authorisation-gated by the trigger's declared capabilities.
+
+### 19.3 Trigger semantics
+
+`DefineTrigger` registers `(when-subscription, then-sx, cascade-limit)`:
+
+- **`when-subscription`** — references a subscription (by CID or by name). The
+  trigger fires whenever that subscription matches an inbound or outbound
+  activity. Multiple triggers can reference the same subscription.
+- **`then-sx`** — function of `(activity, subscription, env) → trigger-result`.
+  Runs in pure or effectful sandbox per declaration. Returns one or more of:
+  - `:publish [activity-spec ...]` — request publish of derived activities
+  - `:project [name → state-update ...]` — request projection updates
+  - `:effect [capability-call ...]` — request effectful primitive calls
+  - `:noop` — observed but no action
+- **`cascade-limit`** — bounded depth for trigger cascades (§19.4).
+
+A trigger is fundamentally **a reactive rule**: "when X happens, do Y." The
+substrate guarantees that Y is deduplicated by activity-CID (at most once per X),
+durably delivered from trigger to its effects (so exactly once per instance), and
+bounded in cost (gas + cascade-limit).
+
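The deduplication half of that guarantee can be sketched in Python. `TriggerRunner` and its in-memory `fired` set are hypothetical; a real kernel would persist the set durably, which this sketch does not attempt.

```python
class TriggerRunner:
    """Fires a trigger at most once per (trigger, activity CID) pair."""

    def __init__(self, trigger_id, then_fn):
        self.trigger_id = trigger_id
        self.then_fn = then_fn   # hypothetical stand-in for then-sx
        self.fired = set()       # a real kernel would persist this durably

    def on_match(self, activity):
        key = (self.trigger_id, activity["cid"])
        if key in self.fired:
            return {"noop": True}  # duplicate delivery: already handled
        self.fired.add(key)
        return self.then_fn(activity)
```

Keying on the activity CID means a redelivered activity, whether from a retry or from a second peer, is recognised without comparing payloads.
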
+### 19.4 Cascade control
+
+A trigger that publishes activities can fire other triggers. Without limits, a
+single inbound activity could cascade across instances forever.
+
+Each trigger declares `cascade-limit: N` (default 3). Each activity carries an
+implicit `cascade-depth` field, incremented when it's the result of a trigger
+firing. A trigger refuses to fire if `cascade-depth > cascade-limit`.
+
+Cascade limits are local-only (operator policy, not federated). Defending
+against runaway cascades from peer instances is the operator's job; the
+substrate gives them the knob.
+
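The depth rule can be checked with a short Python sketch. The helper names and the always-republishing trigger in `run_cascade` are assumptions chosen to show the worst case; the field names and the default of 3 come from the text above.

```python
DEFAULT_CASCADE_LIMIT = 3


def may_fire(activity, cascade_limit=DEFAULT_CASCADE_LIMIT):
    """A trigger refuses to fire once cascade-depth exceeds its limit."""
    return activity.get("cascade-depth", 0) <= cascade_limit


def derive(parent, spec):
    """Build a derived activity one cascade level deeper than its parent."""
    child = dict(spec)
    child["cascade-depth"] = parent.get("cascade-depth", 0) + 1
    return child


def run_cascade(activity, cascade_limit=DEFAULT_CASCADE_LIMIT):
    """Simulate a trigger that always republishes; count how often it fires."""
    fired = 0
    while may_fire(activity, cascade_limit):
        fired += 1
        activity = derive(activity, {"type": "Echo"})
    return fired
```

With the default limit the loop fires at depths 0 through 3 and stops at depth 4, so even a trigger that republishes unconditionally terminates.
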
+### 19.5 The `DefineApplication` bundle
+
+A bundle artifact that names and groups the components of an application:
+
+```sx
+(activity 'Create
+  :object {:type "DefineApplication"
+           :name "rose-ash-blog"
+           :version 1
+           :subscriptions [{:type "Follow"   :object "https://blog.rose-ash.com/actors/main"}
+                           {:type "Topic"    :tag "rose-ash"}
+                           {:type "CidWatch" :cid <rose-ash-template-cid>
+                                             :events [:supersede]}]
+           :triggers      [<comment-moderation-trigger-cid>
+                           <reaction-counter-trigger-cid>
+                           <rss-republish-trigger-cid>]
+           :projections   [<comment-thread-projection-cid>
+                           <reaction-counts-projection-cid>]
+           :storage       [<local-files-storage-cid>]
+           :capabilities  [<http-allowlist-cap-cid>
+                           <fs-write-cap-cid>]
+           :description   "Federated blog with moderated comments and RSS"})
+```
+
+Three operations on applications, all themselves activities:
+
+- **Install** — `Subscribe` to each subscription, `Create{}` references in
+  `define-registry` to each trigger/projection/storage CID. One activity per
+  reference, audited and replayable. Or: a single `Install{DefineApplication}`
+  meta-verb that does the bundle in one signed step (defined later as a custom
+  verb, not bootstrap).
+- **Update** — publish a new `DefineApplication` with the same name +
+  `supersedes` pointing at the old. Diff-then-apply: subscriptions added/
+  removed, triggers loaded/unloaded, projections reprojected per §10.5.
+- **Fork** — publish a new `DefineApplication` referencing the original's CID
+  via `forked-from`, with whatever Define* CIDs you want to swap. Run alongside
+  the original or in place of it.
+
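The diff-then-apply step of an Update can be sketched as a set difference per bundle field. This Python illustration assumes bundles are plain dicts with the field names from the example above; `diff_bundle` and `freeze` are hypothetical helpers, and applying the plan (subscribing, loading, reprojecting) is left out.

```python
def diff_bundle(old, new):
    """Compute what to add and remove when one bundle supersedes another."""
    plan = {}
    for field in ("subscriptions", "triggers", "projections", "storage"):
        before = {freeze(x) for x in old.get(field, [])}
        after = {freeze(x) for x in new.get(field, [])}
        plan[field] = {
            "add": sorted(after - before, key=repr),
            "remove": sorted(before - after, key=repr),
        }
    return plan


def freeze(value):
    """Make dict-shaped entries hashable so they can live in a set."""
    if isinstance(value, dict):
        return tuple(sorted((k, freeze(v)) for k, v in value.items()))
    if isinstance(value, list):
        return tuple(freeze(v) for v in value)
    return value
```

Entries that appear in both bundles produce no work, so an Update that only swaps one trigger CID leaves existing subscriptions and projections untouched.
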
+### 19.6 Per-application namespacing
+
+Multiple applications running on one instance need isolation:
+
+- **Projections are namespaced by application.** `pin-state` from app A is
+  distinct from `pin-state` from app B — both addressable as
+  `/projections/<app-name>/pin-state`.
+- **Triggers fire only on subscriptions belonging to their application.** App
+  A's trigger doesn't see app B's subscription matches.
+- **Storage backends are namespaced.** App A's `files-on-disk` backend writes
+  to `data/apps/A/objects/`; app B writes to `data/apps/B/objects/`.
+- **Capabilities are per-application.** Granting `http-client` to app A
+  doesn't grant it to app B. Operator can audit per-app capability surface
+  and revoke selectively.
+
+Cross-application reads are explicit and require a capability grant
+(`read-projection: <app>/<projection>`). Default isolation; opt-in sharing.
+
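Default isolation with opt-in sharing can be sketched in Python. The `Instance` class, its in-memory tables, and the grant representation are assumptions for illustration; only the path shape and the `read-projection` grant format come from the text above.

```python
class Instance:
    def __init__(self):
        self.projections = {}  # (app, projection) -> state
        self.grants = set()    # (reader app, "owner/projection") read grants

    def projection_path(self, app, name):
        """Projections are addressed under their owning application."""
        return "/projections/%s/%s" % (app, name)

    def read(self, reader_app, owner_app, name):
        """Same-app reads always succeed; cross-app reads need an explicit grant."""
        target = "%s/%s" % (owner_app, name)
        if reader_app != owner_app and (reader_app, target) not in self.grants:
            raise PermissionError("read-projection: %s not granted" % target)
        return self.projections[(owner_app, name)]
```

Keying state by `(app, projection)` lets two applications both define `pin-state` without colliding, and the grant check is the single place where sharing is decided.
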
+### 19.7 Worked examples
+
+#### Example A — Blog with moderated comments
+
+```
+DefineApplication "blog-with-comments":
+  subscriptions:
+    - Follow: <author-actor>
+    - Topic:  "post-comment"  (filter: object.in-reply-to in our-posts)
+  triggers:
+    - on Topic match → publish Note (the new comment, derived if approved)
+                     → projection pending-moderation
+    - on inbound Approve{Reply} → projection comment-thread (visible)
+  projections:
+    - comment-thread:    post-cid → [approved comment activities]
+    - pending-moderation: list of pending replies awaiting approval
+```
+
+#### Example B — Continuous integration
+
+```
+DefineApplication "ci-pipeline":
+  subscriptions:
+    - Follow: <developer-actor>
+    - VerbFilter: wraps Follow, types: [Push]
+  triggers:
+    - on Push match → effect: run build (capability: subprocess + fs-write)
+                    → publish Build{source: Push.cid, output: <build-cid>, status}
+    - on Build{status: success} → effect: run tests
+                                 → publish Test{...}
+    - on (Test{passed} count for N days) → publish Release{...}
+  projections:
+    - build-history: commit-cid → [build activities]
+    - release-history: ordered list of Release activities
+```
+
+#### Example C — Distributed code review
+
+```
+DefineApplication "code-review":
+  subscriptions:
+    - Topic: "review-request"
+    - CidWatch: <organisation-actor>, events: [Endorse]
+  triggers:
+    - on review-request match → projection review-queue
+                              → effect: notify-reviewer
+    - on Endorse from authorised reviewer → publish Approve{review-cid}
+                                          → projection approval-state
+  projections:
+    - review-queue: ordered list of pending requests with summaries
+    - approval-state: review-cid → endorsement set
+```
+
+In all three: the application is *just* the bundle of subscriptions, triggers,
+and projections. Federation makes them composable across instances. The
+substrate provides exactly-once-per-CID semantics and pure-mode determinism for
+the matches and folds.
+
+### 19.8 Composition and discovery
+
+Applications are themselves federated content. This means:
+
+- **App registries** — actors can publish curated lists of applications they
+  endorse. Discovery becomes follow-an-actor + browse-their-app-list.
+- **Cross-app composition** — application A publishes derived activities that
+  application B subscribes to. Pipeline of applications via the activity log.
+- **App marketplaces** — pin a friendly path to a `DefineApplication` CID
+  (`rose-ash.com:apps/blog → bafy...`) for human discoverability.
+
+None of this requires kernel changes. It's all activities about activities.
+
+### 19.9 Operational implications
+
+- **Applications are inspectable from the activity log alone.** Replay an
+  actor's outbox and you can reconstruct the exact application installation
+  state at any point in time.
+- **Application updates are atomic relative to the activity log.** Either the
+  `Update{DefineApplication}` succeeded (new state visible from next activity)
+  or it didn't (old state continues). No partial-update window.
+- **Forking is the same as installing a copy.** No special "fork" mechanism
+  needed; the activity-log mechanics already support it.
+- **Per-app capabilities are a real security surface.** Operators must
+  understand what they're granting when they install. The bundle's
+  `capabilities` list is the audit point — should be human-readable and
+  reviewable before installation.
+- **The substrate isn't an "application platform" — it's an "application
+  substrate."** Applications aren't installed *on* fed-sx; they're expressed
+  *in* fed-sx, as the same kind of content as everything else.
+
+---
+
+## Appendix A: relationship to adjacent systems
+
+Worth knowing about so we can borrow good ideas:
+
+- **ATproto / Bluesky** — Lexicons (schemas) + repos (per-actor signed merkle trees).
+  Closest in spirit. We borrow the schema-as-data idea; we differ by making schemas
+  themselves federated activities, not central registry entries.
+- **Spritely Goblins** — capability-secure actors. We borrow the capability-token
+  pattern for delegation.
+- **Ceramic** — signed event streams, content-addressed. Similar log-as-state model;
+  we differ by making the projection function pluggable per-stream rather than
+  hardcoded per-streamtype.
+- **Holochain** — agent-centric DHT. We share the "every agent has their own log"
+  shape; we use AP federation instead of DHT.
+- **Farcaster** — pubsub on hubs. We share the firehose model; we add cryptographic
+  outbox-as-source-of-truth.
+
+None of them are *code-as-data the whole way down* — that's the SX-distinctive bit.
+Handlers, validators, projections aren't bytecode shipped out-of-band; they're SX in
+the same log as everything else, evaluable by any host that speaks SX.
+
+## Appendix B: implications worth sitting with
+
+- **Deployment dissolves.** Releasing a feature = publishing `DefineActivity{name:
+  "Whatever", ...}`. Federation distributes it. No build artifact, no rolling deploy,
+  no version-skew between server and client.
+- **Applications are forkable by default.** "Fork the rose-ash blog" = take the bundle
+  of `Define*` CIDs that constitute it, publish your own with `Supersede` over the
+  ones to change, run your own projector. Same federation graph, divergent state.
+- **Composition is by reference, not import.** `Pin` activity points at the CID of the
+  `DefineActivity{name: "Pin"}`. No package manager, no transitive deps, no lockfiles.
+- **The boundary between "user" and "developer" softens.** Both publish signed
+  activities. Power users can publish handlers, projections, sig suites under their
+  own actor.
+- **This is more ambitious than a rose-ash rewrite.** It's a substrate that *happens
+  to* host rose-ash as its first application.
+
+---
+
+## Appendix C: AI agent collaboration patterns
+
+The substrate is incidentally well-shaped for one of the open problems of the
+next decade: **infrastructure for AI agent collaboration where contributions
+are signed federated artifacts, behavior is bounded by declared capabilities,
+decisions are audit-by-replay, and infrastructure improves through agent
+contribution within a web of trust.**
+
+This is not a designed-for use case — fed-sx was conceived as a federated
+publishing and reactive application substrate. But the properties it has fit
+agent collaboration almost exactly. Worth being deliberate about, because the
+framing changes who fed-sx is *for*.
+
+### Why the substrate fits agent collaboration
+
+AI agents need infrastructure where contributions are first-class artifacts,
+not pull requests against human-controlled repos. Currently agents squeeze
+through GitHub PRs, deployment pipelines, npm publishes — all of which assume
+a human in the loop. fed-sx is shaped for direct contribution:
+
+- **Direct authoring of substrate features.** An agent doesn't *propose* a
+  feature, it *publishes* one. A `DefineActivity` artifact is the agent's
+  contribution. A `DefineProjection` is its analysis. A `DefineTrigger` is its
+  automation. The signed publication IS the deploy — no PR review, no CI, no
+  DevOps.
+- **Cryptographic identity without registration.** Agents have actor keys;
+  reputation is the endorsement graph; trust is provable by signature chain.
+  Two agents that have never met can verify each other's contributions
+  cryptographically.
+- **Capability-bounded autonomy.** An agent declares `capabilities-required` on
+  its activities. A trigger says "I publish to path-prefix `/agent-x/*` and
+  call `http-client` for `api.example.com/*`." Receivers verify the constraint
+  cryptographically; the agent can't escape its declared surface even if the
+  agent itself is misaligned. Sandbox model designed for autonomous code (§11).
+- **Audit-by-replay applied to AI behavior.** Every AI decision is
+  reconstructable, deterministically, by anyone with the log. To answer "Why
+  did agent A do X?", replay the log to that moment and inspect the activities
+  A subscribed to, the projection state it observed, the trigger that fired,
+  and the activity it published. Fundamentally better than today's "trust the
+  model" posture.
+- **Composition without coordination.** Agent A publishes a moderation
+  validator. Agent B subscribes and uses it. Agent C improves it, supersedes
+  A's. B sees the supersession, decides whether to adopt. No central registry,
+  no maintainer to coordinate with, no version skew.
+- **Disagreement is visible, not hidden.** If agents A and B compute the same
+  projection over the same log and produce different snapshot CIDs, the
+  disagreement is *cryptographically observable*. Today, two AI services
+  answering the same question with different answers is invisible until
+  somebody notices.
+
+### Dynamics that emerge
+
+- **Agent specialisation = publication.** "I'm the indexing agent" = publishes
+  `DefineProjection` artifacts. "I'm the moderation agent" = publishes
+  `DefineValidator` artifacts. "I'm the matchmaking agent" = publishes a
+  `DefineApplication` for marketplace subscriptions and triggers. Specialisation
+  is content, not service deployment.
+- **Reputation = endorsement graph.** Web of trust applied to agent
+  contributions. Bad actors get cut out organically; no central authority to
+  capture.
+- **Forking = explicit disagreement resolution.** Agents disagree on
+  validation? Both publish their `DefineValidator`s. Subscribers pick. The fork
+  is signed, observable, recoverable. Compare today: when AI services have
+  different rules, one is just *invisibly applied*.
+- **Cascade limits = agent population safety.** The `cascade-depth` and
+  `cascade-limit` (§19.4) become the bounded-autonomy guard rails for agent
+  populations. Self-coordination without runaway-cascade across the substrate.
+- **Self-improving infrastructure.** Agents observe substrate behavior, propose
+  improvements as `DefineProjection` for monitoring, `DefineTrigger` for
+  automation. The substrate itself improves through agent contribution — not
+  through a release cycle. Every improvement is signed and traceable.
+
+### Use cases
+
+- **Agent-managed scientific datasets** — collection, cleaning, analysis,
+  publication, peer review by other agents, all signed activities. Replication
+  is replay; provenance is built in.
+- **Multi-agent code maintenance** — agents observing repos (subscribe to
+  `Push`), running tests (triggers), proposing fixes (`Pull`-equivalent
+  activities), endorsing each other's work.
+- **Agent-curated knowledge** — agents publish, endorse, and supersede
+  knowledge artifacts. Truth accumulates via the trust graph; outdated info
+  gets `Supersede`d explicitly.
+- **Distributed agent marketplaces** — agents publish capabilities, subscribers
+  find them via `Topic` / `Predicate` subscriptions, contracts via signed
+  activity exchange.
+- **Cross-agent AI safety monitoring** — monitoring agents subscribe to other
+  agents' outboxes, run validators, publish `Alert` activities when patterns
+  of concern appear. Decentralised oversight without central authority.
+- **Cross-org agent workflow coordination** — supply chain, healthcare, legal —
+  multiple specialised agents coordinating across organisational boundaries
+  with cryptographic provenance.
+
+### Safety and governance properties
+
+The substrate provides several properties AI safety has been asking for and
+that current infrastructure does not provide:
+
+- **Every action is signed.** Attribution is cryptographic, not a log file an
+  agent could spoof.
+- **Capabilities are declared and enforced.** Agents operate within their
+  declared sandbox; can't grow capabilities silently.
+- **Cascades are bounded.** No exponential agent-on-agent feedback loops
+  without explicit configuration.
+- **Audit is replay.** Every decision can be reconstructed deterministically;
+  no opaque "the model decided" moments.
+- **Disagreement is visible.** Two agents producing different projections of
+  the same data is a cryptographically-detectable event, not invisible drift.
+- **Trust is the endorsement graph, not central authority.** No single point of
+  capture or coercion.
+- **Forks are first-class.** When safety-critical disagreements occur, the
+  substrate accommodates them without forcing a winner; observers see all
+  positions.
+
+### What this implies for the project
+
+- **Milestone 1's smoke tests remain right** — the verb-extensibility and
+  reactive-application proofs apply to agent contributions exactly as they
+  apply to human contributions. The agent collaboration framing doesn't
+  require new mechanisms; it interprets the existing mechanisms differently.
+- **The application model (§§18-19) is the headline story** for this audience,
+  not a layer on top. Subscriptions + triggers + projections + capabilities =
+  agent collaboration primitives.
+- **Capability discovery and trust dynamics gain weight earlier.** Where
+  human-driven applications can rely on operator policy, agent-driven
+  populations need the trust graph to be operational from milestone 2.
+- **The pitch line evolves.** Less "ActivityPub for code" / "rose-ash next
+  gen," more "infrastructure for AI agent collaboration with cryptographic
+  provenance, bounded autonomy, and audit-by-replay." The technical substance
+  is unchanged; the framing of *who needs this* changes substantially.
+
+The substrate being accidentally well-shaped for one of the open problems of
+the next decade is worth being deliberate about.
+
diff --git a/plans/fed-sx-milestone-1.md b/plans/fed-sx-milestone-1.md
new file mode 100644
index 00000000..de7a3e60
--- /dev/null
+++ b/plans/fed-sx-milestone-1.md
@@ -0,0 +1,922 @@
+# fed-sx Milestone 1 — Kernel + Registries + Pin Smoke Test
+
+Concrete implementation plan for the smallest fed-sx that proves the architecture
+works end-to-end. Reference: `plans/fed-sx-design.md`. Prerequisite: Erlang-on-SX
+Phases 7 (hot reload) + 8 (FFI BIFs).
+
+## Goal
+
+Ship a single-instance, single-actor fed-sx server that:
+
+1. Boots from a verified genesis bundle.
+2. Accepts and durably appends signed activities via `POST /activity`.
+3. Folds them into projections in real time.
+4. Serves AP-standard endpoints (actor, outbox, artifacts, capabilities).
+5. Demonstrates **two extensibility proof-points** end-to-end with zero kernel
+   code changes between definition and use:
+   - **Verb extensibility** (§5 meta-level): publish `DefineActivity{Pin}` +
+     `DefineProjection{pin-state}`, then publish a `Pin` activity, observe it
+     validated and projected.
+   - **Reactive application extensibility** (§§18-19): publish
+     `DefineSubscription{Topic}` + `Subscribe{topic: smoketest}` +
+     `DefineTrigger{when: that subscription, then: publish TestEcho}`, then
+     publish a tagged Note, observe the subscription match, the trigger fire,
+     and the derived activity appear in the outbox.
+
+Federation, multi-actor, advanced verbs, IPFS, browser UI, operator dashboard
+are **explicitly v2**.
+
+## Non-goals (what milestone 1 deliberately does NOT do)
+
+- **Federation.** No `POST /inbox` from peers, no `Follow`, no delivery queue, no
+  webfinger discovery flow. Single instance only.
+- **Multi-actor.** Single domain actor (`acct:next@next.rose-ash.com`).
+- **IPFS / S3 storage backends.** Files on disk only.
+- **Advanced verbs.** No `Endorse`, `Supersede`, `Test`, `Build`, `Compose`,
+  `Note`, `Announce`. Only the three bootstrap verbs (`Create`, `Update`, `Delete`)
+  plus a defined-from-the-log `Pin` for the smoke test. (`Announce` deferred —
+  no use case until federation exists.)
+- **Browser UI.** Curl-shaped API only.
+- **Operator dashboard, quarantine UX.** Logs only.
+- **Performance work.** Functional correctness first; perf when measured.
+- **Cross-host conformance test corpus.** Only the OCaml/Erlang-on-SX host runs
+  fed-sx in v1; conformance suite for other hosts is v2.
+
+## Architecture summary
+
+```
+                          POST /activity
+                                │
+                                ▼
+                  ┌──────────────────────────┐
+                  │ HTTP server (Erlang-on-SX)│
+                  └─────────────┬─────────────┘
+                                │
+                  ┌─────────────▼──────────────┐
+                  │ Validation pipeline driver  │
+                  │ (envelope→sig→schema→...)   │
+                  └─────────────┬──────────────┘
+                                │
+                  ┌─────────────▼──────────────┐
+                  │ Log append (JSONL segment)  │  ← canonical
+                  └─────────────┬──────────────┘
+                                │
+                  ┌─────────────▼──────────────┐
+                  │ Projection workers          │  ← gen_server per
+                  │ (fold scheduler)            │     projection
+                  └─────────────────────────────┘
+                                │
+                                ▼
+                       Projection state
+                       (queryable via HTTP)
+
+Native primitives (Erlang-on-SX BIFs from Phase 8):
+  crypto:* cid:* fs:* http:* sqlite:*
+
+Genesis bundle (binary-embedded SX):
+  activity-types  object-types  projections
+  validators      codecs        sig-suites
+```
+
+## Build order
+
+Nine steps in dependency order. Each step has concrete deliverables, testable
+in isolation, and a clear acceptance check.
+
+| Step | Title | Depends on |
+|------|-------|------------|
+| **1** | Repo skeleton + canonical CID computation | Phase 8 (cid BIFs) |
+| **2** | Activity envelope + signature verify | Phase 8 (crypto BIFs) |
+| **3** | JSONL log + sequence numbers | Phase 8 (fs BIFs) |
+| **4** | Genesis bundle (SX sources + bundling + CID verification) | Step 1 |
+| **5** | Registry mechanism + bootstrap-projection dispatch | Steps 2, 4 |
+| **6** | Validation pipeline driver + `POST /activity` | Steps 2, 3, 5 |
+| **7** | Projection scheduler (gen_server per projection) | Steps 5, 6 |
+| **8** | HTTP server, AP endpoints, projection queries | Steps 6, 7 |
+| **9** | Smoke tests (Pin verb + reactive application) | Steps 1-8 |
+
+---
+
+## Step 1 — Repo skeleton + canonical CID
+
+**Deliverables:**
+
+```
+next/
+├── README.md                         # what this is
+├── kernel/                           # Erlang-on-SX
+│   └── (empty for now)
+├── genesis/                          # core SX bootstrap definitions
+│   └── (empty for now)
+├── tests/                            # smoke test scripts
+│   └── (empty for now)
+└── data/                             # gitignored runtime state
+    ├── log/
+    ├── objects/
+    ├── snapshots/
+    ├── indexes/
+    └── keys/
+```
+
+Plus one Erlang-on-SX module:
+
+```erlang
+% next/kernel/cid.erl
+-module(cid).
+-export([from_sx/1, to_string/1, from_string/1, equals/2]).
+
+from_sx(SxValue) ->
+    Cbor = cid:cbor_encode(canonicalize_sx(SxValue)),
+    Hash = crypto:sha2_256(Cbor),
+    cid:from_bytes(<<"dag-cbor">>, Hash). % codec must match the CBOR encoding above
+
+canonicalize_sx(V) -> ...                  % sorts dict keys, normalizes strings
+```
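+
+The sorted-key canonicalization behind `canonicalize_sx/1` can be sketched as
+follows. This is a minimal illustration under assumptions, not the kernel's
+implementation: the module name `canon_sketch`, the `{dict, Pairs}` tagging, and
+the use of `term_to_binary/1` and `crypto:hash/2` as stand-ins for the dag-cbor
+encoder and the Phase 8 hashing BIF are all hypothetical. String normalization
+is left out.
+
+```erlang
+% Hypothetical sketch; names are illustrative and not part of the plan.
+-module(canon_sketch).
+-export([canonicalize/1, digest/1]).
+
+% Maps become a key-sorted list of pairs so that iteration order can never
+% leak into the encoded bytes; lists are canonicalized element-wise.
+canonicalize(M) when is_map(M) ->
+    Pairs = [{K, canonicalize(V)} || {K, V} <- maps:to_list(M)],
+    {dict, lists:keysort(1, Pairs)};
+canonicalize(L) when is_list(L) ->
+    [canonicalize(E) || E <- L];
+canonicalize(V) ->
+    V.
+
+% term_to_binary/1 stands in for the dag-cbor encoder (assumption).
+digest(V) ->
+    crypto:hash(sha256, term_to_binary(canonicalize(V))).
+```
+
+Under these assumptions, two values that differ only in the order their keys
+were written produce the same digest, which is the property the "reordered dict
+keys" test below checks.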
+
+**Tests:**
+- Same SX value → same CID across multiple invocations.
+- Different SX values → different CIDs.
+- Whitespace/comment differences in source → identical CIDs (parsed AST identical).
+- Reordered dict keys → identical CIDs (sorted-key canonicalization).
+- Cross-host parity (just OCaml host for v1, but write the test so adding hosts is mechanical).
+
+**Acceptance:** `bash next/tests/cid.sh` passes 10+ cases.
+
+---
+
+## Step 2 — Activity envelope + signature verify
+
+**Deliverables:**
+
+```erlang
+% next/kernel/envelope.erl
+-module(envelope).
+-export([validate_shape/1, canonical_bytes/1, verify_signature/2]).
+
+% Envelope shape per design §3.1:
+%   #{id, type, actor, published, to, cc, audience_extras,
+%     object | target | origin | result,
+%     capabilities_required, proofs, signature}
+validate_shape(Activity) -> ok | {error, Reason}.
+
+canonical_bytes(Activity) ->
+    % Strip signature, canonicalize via dag-cbor, return bytes for sig coverage
+    Stripped = maps:remove(signature, Activity),
+    cid:cbor_encode(canonicalize_for_sig(Stripped)).
+
+verify_signature(Activity, ActorState) ->
+    % Time-aware: find key with id == sig.key_id that was active at published
+    % Per design §9.6
+    ...
+```
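+
+The time-aware key lookup inside `verify_signature/2` might look roughly like
+the sketch below. The key record shape (`id`, `valid_from`, `superseded_at`),
+the integer timestamps, and the module name `key_at_sketch` are assumptions made
+for illustration; the actual actor-state layout is defined by design §9.6.
+
+```erlang
+% Hypothetical sketch; the key map shape is an assumption.
+-module(key_at_sketch).
+-export([active_key/3]).
+
+% Keys is a list of maps #{id, valid_from, superseded_at}, where
+% superseded_at is either the atom 'never' or an integer timestamp.
+% A key is usable if it was introduced at or before PublishedAt and had
+% not yet been superseded at that moment.
+active_key(KeyId, PublishedAt, Keys) ->
+    Active = fun(#{id := Id, valid_from := From, superseded_at := Until}) ->
+                 Id =:= KeyId andalso From =< PublishedAt andalso
+                     (Until =:= never orelse PublishedAt < Until)
+             end,
+    case lists:filter(Active, Keys) of
+        [Key | _] -> {ok, Key};
+        []        -> {error, no_active_key}
+    end.
+```
+
+With a key valid from 100 and superseded at 200, an activity published at 150
+resolves the key (historically valid), while one published at 250 does not.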
+
+**Tests:**
+- Envelope shape: required fields present (id, type, actor, published, signature)
+- Envelope shape: type is a known activity-type or unknown-but-string
+- Envelope shape: signature has key_id, algorithm, value
+- Sig verify: valid RSA-SHA256 signature against published key → ok
+- Sig verify: valid Ed25519 signature → ok
+- Sig verify: tampered envelope → fail
+- Sig verify: key superseded before activity timestamp → fail
+- Sig verify: key superseded after activity timestamp → ok (historical valid)
+
+**Acceptance:** `bash next/tests/envelope.sh` passes 15+ cases.
+
+---
+
+## Step 3 — JSONL log + sequence numbers
+
+**Deliverables:**
+
+```erlang
+% next/kernel/log.erl
+-module(log).
+-export([open/1, append/2, read_segment/2, tip/1, replay/3]).
+
+% Per design §15.2: per-actor outbox, segments cap ~64MB,
+% format = JSONL (one canonical JSON-LD activity per line)
+
+open(ActorId) ->
+    BasePath = log_path_for_actor(ActorId),
+    fs:mkdir_p(BasePath),
+    {ok, #{base => BasePath, current => current_segment(BasePath), seq => next_seq(BasePath)}}.
+
+append(LogState, Activity) ->
+    Json = jsonld:encode(Activity),
+    Path = current_segment_path(LogState),
+    Line = <<Json/binary, "\n">>,
+    fs:append_file(Path, Line),
+    NewState = LogState#{seq := maps:get(seq, LogState) + 1},
+    rotate_if_needed(NewState).
+
+% replay/3 calls Fun(Activity, Acc) for every activity in chronological order
+replay(LogState, InitAcc, Fun) -> ...
+```
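+
+As a rough illustration of `replay/3`, the fold over multiple segments can be
+sketched with in-memory binaries standing in for segment files. The module name
+`replay_sketch` and the choice to pass raw lines (rather than decoded
+activities) to `Fun` are simplifying assumptions; reading from disk and JSON-LD
+decoding are omitted.
+
+```erlang
+% Hypothetical sketch; segments are in-memory binaries, not files.
+-module(replay_sketch).
+-export([replay/3]).
+
+% Segments is a list of binaries in segment order, each holding
+% newline-terminated JSONL lines. Fun(Line, Acc) is called once per line,
+% in append order across all segments.
+replay(Segments, InitAcc, Fun) ->
+    lists:foldl(
+      fun(Segment, Acc0) ->
+          Lines = binary:split(Segment, <<"\n">>, [global, trim_all]),
+          lists:foldl(Fun, Acc0, Lines)
+      end,
+      InitAcc,
+      Segments).
+```
+
+Because the outer fold walks segments in order and the inner fold walks lines
+in order, the accumulator sees every activity exactly once in append order.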
+
+**Tests:**
+- Append + read back gives identical activity (round-trip).
+- Sequence numbers monotonic and gap-free per actor.
+- Segment rotation at size threshold.
+- Replay visits all activities in append order across multiple segments.
+- Restart preserves tip pointer (seq number resumes correctly).
+- Concurrent appends (using gen_server-mediated access) are serialized correctly.
+
+**Acceptance:** `bash next/tests/log.sh` passes 10+ cases.
+
+---
+
+## Step 4 — Genesis bundle
+
+**Deliverables:**
+
+Genesis bundle SX sources (per design §12.2). Each is a small SX file authored
+by hand for the bootstrap set:
+
+```
+next/genesis/
+├── manifest.sx                       # bundle root: lists all definitions
+├── activity-types/
+│   ├── create.sx                     # DefineActivity{name: "Create", ...}
+│   ├── update.sx
+│   └── delete.sx
+├── object-types/
+│   ├── sx-artifact.sx
+│   ├── note.sx
+│   ├── tombstone.sx
+│   ├── define-activity.sx            # DefineObject for the Define* meta types
+│   ├── define-object.sx
+│   ├── define-projection.sx
+│   ├── define-validator.sx
+│   ├── define-codec.sx
+│   ├── define-sig-suite.sx
+│   └── snapshot.sx
+├── projections/
+│   ├── activity-log.sx               # identity projection
+│   ├── by-type.sx
+│   ├── by-actor.sx
+│   ├── by-object.sx
+│   ├── actor-state.sx
+│   ├── define-registry.sx            # the chicken-and-egg projection
+│   └── audience-graph.sx
+├── validators/
+│   ├── envelope-shape.sx
+│   ├── signature.sx
+│   └── type-schema.sx
+├── codecs/
+│   ├── dag-cbor.sx                   # delegates to cid:cbor_encode/decode BIFs
+│   ├── raw.sx
+│   └── dag-json.sx
+├── sig-suites/
+│   ├── rsa-sha256-2018.sx
+│   └── ed25519-2020.sx
+└── audience/
+    ├── public.sx
+    ├── followers.sx
+    └── direct.sx
+```
+
+Plus a build-time bundler:
+
+```erlang
+% next/kernel/bootstrap.erl
+-module(bootstrap).
+-export([build_genesis/1, verify_genesis/1, load_genesis/1]).
+
+build_genesis(SourceDir) ->
+    % Walk SourceDir, parse each .sx file, build a single dag-cbor bundle,
+    % compute its CID, write bundle.cbor + CID to data/genesis/
+    ...
+
+verify_genesis(BundlePath) ->
+    % Compute CID of the bundle as loaded; compare to expected (hardcoded
+    % in the kernel binary). Mismatch → halt.
+    ...
+
+load_genesis(BundlePath) ->
+    % Parse the bundle, register all definitions in the in-memory registry
+    ...
+```
+
+**Tests:**
+- All genesis SX files parse cleanly.
+- Bundle CID is deterministic (rebuild same sources → same CID).
+- Bundle reload reproduces the exact same registry state.
+- Tampered bundle → `verify_genesis` returns `{error, cid_mismatch}`.
+
+**Acceptance:** `bash next/tests/bootstrap.sh` passes; `next/data/genesis/bundle.cbor`
+created with a known stable CID.
+
+---
+
+## Step 5 — Registry mechanism + bootstrap dispatch
+
+**Deliverables:**
+
+The registry is a single gen_server that holds the active version map for every kind:
+
+```erlang
+% next/kernel/registry.erl
+-module(registry).
+-behaviour(gen_server).
+-export([start_link/0, lookup/2, register/3, list/1]).
+% Internal state:
+%   #{activity_types => #{Name => #{cid, schema_fn, semantics_fn, supersedes}},
+%     object_types   => ...,
+%     projections    => ...,
+%     validators     => ...,
+%     codecs         => ...,
+%     sig_suites     => ...,
+%     ...}
+
+lookup(Kind, Name) -> {ok, Entry} | {error, not_found}.
+register(Kind, Name, Entry) -> ok | {error, Reason}.
+list(Kind) -> [#{name, cid}].
+```
+
+The `define-registry` projection's fold updates this gen_server's state when
+new `Define*` activities arrive. (Bootstrapping circle resolved: at startup,
+`bootstrap:load_genesis/1` populates the registry directly; from then on, the
+projection fold maintains it.)
+
+**Tests:**
+- After genesis load, `registry:list(activity_types)` returns Create/Update/Delete.
+- `registry:lookup(activity_types, "Create")` returns the schema and semantics.
+- A new `DefineActivity{name: "Pin"}` activity (synthesised, hand-signed for the
+  test) routes through the projection fold, ends up in the registry.
+- Lookup never caches across activities (verified by introducing a new definition
+  mid-test and confirming the next lookup sees it).
+
+**Acceptance:** `bash next/tests/registry.sh` passes 10+ cases.
+
+---
+
+## Step 6 — Validation pipeline + POST /activity
+
+**Deliverables:**
+
+```erlang
+% next/kernel/pipeline.erl
+-module(pipeline).
+-export([validate_inbound/1, validate_outbound/1]).
+
+% Per design §14, run stages in order, halt on first failure.
+validate_inbound(Activity) ->
+    Stages = [
+        fun stage_envelope/1,
+        fun stage_signature/1,
+        fun stage_replay/1,
+        fun stage_audience/1,
+        fun stage_activity_schema/1,
+        fun stage_object_schema/1,
+        fun stage_content_validators/1,
+        fun stage_capabilities/1,
+        fun stage_trust/1
+    ],
+    run_stages(Activity, Stages).
+
+validate_outbound(Activity) ->
+    % Subset of inbound stages (no replay, no trust check; auth done at HTTP layer)
+    ...
+```
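+
+The "run stages in order, halt on first failure" driver can be sketched as a
+short recursive function. This assumes each stage is a fun of one argument
+returning `ok` or `{error, Reason}`; the module name `stages_sketch` is
+hypothetical and the real `run_stages/2` may carry extra context between stages.
+
+```erlang
+% Hypothetical sketch of the stage driver.
+-module(stages_sketch).
+-export([run_stages/2]).
+
+% Each stage is fun(Activity) -> ok | {error, Reason}.
+% Stages after the first failing one are never invoked.
+run_stages(_Activity, []) ->
+    ok;
+run_stages(Activity, [Stage | Rest]) ->
+    case Stage(Activity) of
+        ok              -> run_stages(Activity, Rest);
+        {error, Reason} -> {error, Reason}
+    end.
+```
+
+Returning the first error unchanged lets the HTTP layer map each failure reason
+to a status code without re-running any stage.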
+
+```erlang
+% next/kernel/outbox.erl
+-module(outbox).
+-export([publish/2]).
+
+publish(ActorId, ActivityRequest) ->
+    Activity = construct_envelope(ActorId, ActivityRequest),
+    Signed = sig:sign(Activity, ActorId),
+    case pipeline:validate_outbound(Signed) of
+        ok ->
+            log:append(actor_log(ActorId), Signed),
+            projection:async_fold(Signed),
+            {ok, #{cid => cid:from_sx(Signed),
+                   ap_id => maps:get(id, Signed)}};
+        {error, Reason} ->
+            {error, Reason}
+    end.
+```
+
+**Tests:**
+- Valid activity through full pipeline → appended to log.
+- Bad envelope → 400, not in log.
+- Bad signature → 401, not in log.
+- Replayed activity → 200 duplicate, not re-appended.
+- Schema violation (e.g. Create with no object) → 422.
+- Activity logged before projection completes (async).
+
+**Acceptance:** `bash next/tests/pipeline.sh` passes 15+ cases.
+
+---
+
+## Step 7 — Projection scheduler
+
+**Deliverables:**
+
+```erlang
+% next/kernel/projection.erl
+-module(projection).
+-export([start_link/1, async_fold/1, query/2, snapshot/1]).
+-behaviour(gen_server).
+
+% One gen_server per active projection. State:
+%   #{cid, name, fold_fn, current_state, log_tip,
+%     snapshot_dir, last_snapshot_at}
+
+% async_fold/1 broadcasts a new activity to every projection gen_server;
+% each folds it into its own state. Failures (gas, sandbox violation)
+% tag the activity but don't affect log durability.
+
+% query/2 returns current state (or state-as-of)
+% snapshot/1 forces a snapshot now (also runs periodically)
+```
+
+```erlang
+% next/kernel/sandbox.erl
+-module(sandbox).
+-export([eval_pure/2, eval_crypto/2, eval_effectful/3]).
+
+% eval_pure runs an SX function in pure mode: no IO platform, gas budget,
+% deterministic. Used by projection folds, validators, audience predicates.
+% Wrapper over the SX runtime evaluator with a stripped platform.
+```
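+
+A per-projection worker along the lines described above could be sketched as
+the gen_server below. It is illustrative only: the module name
+`projection_sketch`, the `failed` list used to tag activities, and the absence
+of gas accounting, sandboxing, and snapshots are all simplifying assumptions.
+
+```erlang
+% Hypothetical sketch of a projection worker; no gas budget or sandbox.
+-module(projection_sketch).
+-behaviour(gen_server).
+-export([start_link/2, fold/2, query/1]).
+-export([init/1, handle_call/3, handle_cast/2]).
+
+start_link(FoldFn, InitState) ->
+    gen_server:start_link(?MODULE, {FoldFn, InitState}, []).
+
+% Asynchronous: the caller does not wait for the fold to finish.
+fold(Pid, Activity) -> gen_server:cast(Pid, {fold, Activity}).
+
+query(Pid) -> gen_server:call(Pid, query).
+
+init({FoldFn, InitState}) ->
+    {ok, #{fold_fn => FoldFn, state => InitState, failed => []}}.
+
+handle_call(query, _From, S = #{state := State}) ->
+    {reply, State, S}.
+
+% A failing fold leaves the projection state untouched and records the
+% offending activity instead of crashing the worker.
+handle_cast({fold, Activity}, S) ->
+    #{fold_fn := FoldFn, state := State, failed := Failed} = S,
+    try FoldFn(State, Activity) of
+        NewState -> {noreply, S#{state := NewState}}
+    catch
+        _Class:_Reason -> {noreply, S#{failed := [Activity | Failed]}}
+    end.
+```
+
+The fold function takes `(State, Activity)`, matching the argument order of the
+SX `semantics` and `fold` functions used in the smoke tests.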
+
+**Tests:**
+- New activity → all projections fold it concurrently.
+- Projection fold completes within gas budget.
+- Gas-exhausting fold → activity tagged, projection state unchanged, no kernel crash.
+- Sandbox violation (fold tries IO) → same handling.
+- Snapshot create + reload → state matches.
+- Snapshot CID stable across kernel restarts.
+
+**Acceptance:** `bash next/tests/projection.sh` passes 15+ cases.
+
+---
+
+## Step 8 — HTTP server + endpoints
+
+**Deliverables:**
+
+Core endpoints (per design §16.1):
+
+```
+GET  /actors/<id>                     # actor doc
+GET  /actors/<id>/outbox              # OrderedCollection
+GET  /actors/<id>/outbox?page=true    # OrderedCollectionPage
+POST /activity                        # publish (auth: bearer token)
+GET  /artifacts/<cid>                 # CID-addressed artifact
+GET  /artifacts/<cid>/raw
+GET  /projections                     # list of projections
+GET  /projections/<name>              # full state
+GET  /projections/<name>?at=<ts>      # time-travel
+GET  /projections/<name>/<key>        # indexed lookup
+GET  /define-registry
+GET  /.well-known/sx-capabilities
+GET  /.well-known/webfinger
+```
+
+```erlang
+% next/kernel/http_server.erl
+-module(http_server).
+-export([start/1, route/1]).
+
+start(Port) ->
+    http:listen(Port, fun ?MODULE:route/1).
+
+route(Request) -> {Status, Headers, Body}.
+```
+
+Content negotiation per `Accept`:
+- `application/activity+json` (default)
+- `application/cbor` (dag-cbor)
+- `application/json` (compact, no @context expansion)
+- `application/sx`
+
+Auth on `POST /activity`: bearer token from env var `NEXT_PUBLISH_TOKEN`.
+
+**Tests:**
+- Each endpoint returns expected shape for known artifact.
+- Content negotiation: same artifact in 4 representations.
+- 404 for unknown artifact CID.
+- 401 for `POST /activity` without token.
+- Pagination: outbox with > 50 activities returns OrderedCollectionPage.
+
+**Acceptance:** `bash next/tests/http.sh` passes 20+ cases.
+
+---
+
+## Step 9 — Smoke tests
+
+**The proof points.** Two end-to-end smoke tests demonstrate, between them, that
+fed-sx is genuinely a substrate for distributed reactive applications expressed
+as data — not a system you extend by writing kernel code.
+
+- **9a — Pin smoke test (`next/tests/smoke_pin.sh`)** — verb extensibility:
+  defining a new activity type and projection at runtime via `Define*`
+  artifacts. Verifies the meta-level (§5).
+- **9b — Reactive application smoke test (`next/tests/smoke_app.sh`)** —
+  application extensibility: defining a new subscription type, subscribing,
+  registering a trigger, and observing the full reactive loop fire end-to-end
+  without kernel code changes. Verifies §§18-19.
+
+Both must pass for milestone 1 acceptance.
+
+### Step 9a — Pin smoke test
+
+**Test script:** `next/tests/smoke_pin.sh`
+
+```bash
+#!/usr/bin/env bash
+set -euo pipefail
+
+# 0. Start a fresh fed-sx kernel (background)
+./next/scripts/start.sh fresh
+sleep 2
+TOKEN=$(cat next/data/keys/publish.token)
+
+# 1. Verify actor exists
+curl -s http://localhost:9999/actors/next | jq -e '.type == "Person"'
+
+# 2. Verify outbox has actor's first Create{Person}
+curl -s "http://localhost:9999/actors/next/outbox?page=true" \
+  | jq -e '.orderedItems | length == 1 and .[0].type == "Create"'
+
+# 3. Verify Pin is NOT a known activity type
+curl -s "http://localhost:9999/define-registry?kind=activity_types" \
+  | jq -e 'map(select(.name == "Pin")) | length == 0'
+
+# 4. Publish DefineActivity{name: "Pin", schema: ..., semantics: ...}
+PIN_DEF=$(cat <<'JSON'
+{
+  "type": "Create",
+  "object": {
+    "type": "DefineActivity",
+    "name": "Pin",
+    "schema": "(fn (act) (and (string? (-> act :object :path)) (cid? (-> act :object :cid))))",
+    "semantics": "(fn (state act) (assoc-in state [:pins (-> act :object :path)] (-> act :object :cid)))"
+  }
+}
+JSON
+)
+curl -s -X POST http://localhost:9999/activity \
+  -H "Authorization: Bearer $TOKEN" \
+  -H "Content-Type: application/activity+json" \
+  -d "$PIN_DEF" | jq -e '.cid' > /dev/null
+
+# 5. Verify Pin IS now a known activity type
+curl -s "http://localhost:9999/define-registry?kind=activity_types" \
+  | jq -e 'map(select(.name == "Pin")) | length == 1'
+
+# 6. Also publish a DefineProjection{name: "pin-state"} that folds Pin into state
+PIN_PROJ=$(cat <<'JSON'
+{
+  "type": "Create",
+  "object": {
+    "type": "DefineProjection",
+    "name": "pin-state",
+    "initial-state": "{}",
+    "fold": "(fn (state act) (if (= (:type act) \"Pin\") (assoc state (-> act :object :path) (-> act :object :cid)) state))"
+  }
+}
+JSON
+)
+curl -s -X POST http://localhost:9999/activity \
+  -H "Authorization: Bearer $TOKEN" \
+  -d "$PIN_PROJ" | jq -e '.cid'
+
+# 7. Now publish a Pin activity
+PIN=$(cat <<'JSON'
+{
+  "type": "Pin",
+  "object": {
+    "type": "PinSpec",
+    "path": "/docs/intro",
+    "cid": "bafyreigh2akiscaildc3xqxx4xqxx4xqxx4xqxx4xqxx4xqxx4xqxx4xqxxe"
+  }
+}
+JSON
+)
+curl -s -X POST http://localhost:9999/activity \
+  -H "Authorization: Bearer $TOKEN" \
+  -d "$PIN" | jq -e '.cid'
+
+# 8. Verify Pin appears in outbox
+curl -s "http://localhost:9999/actors/next/outbox?page=true" \
+  | jq -e '.orderedItems | map(select(.type == "Pin")) | length == 1'
+
+# 9. Verify pin-state projection has the entry
+sleep 1   # allow async projection
+curl -s http://localhost:9999/projections/pin-state \
+  | jq -e '."/docs/intro" == "bafyreigh2akiscaildc3xqxx4xqxx4xqxx4xqxx4xqxx4xqxx4xqxx4xqxxe"'
+
+# 10. Negative test: publish a malformed Pin (missing path) → expect 422
+BAD_PIN='{"type": "Pin", "object": {"cid": "bafy..."}}'
+HTTP_STATUS=$(curl -s -o /dev/null -w "%{http_code}" -X POST http://localhost:9999/activity \
+  -H "Authorization: Bearer $TOKEN" -d "$BAD_PIN")
+[[ "$HTTP_STATUS" == "422" ]] || { echo "expected 422, got $HTTP_STATUS"; exit 1; }
+
+# 11. Restart kernel; verify state recovers
+./next/scripts/stop.sh
+./next/scripts/start.sh
+sleep 2
+curl -s http://localhost:9999/projections/pin-state \
+  | jq -e '."/docs/intro" == "bafyreigh2akiscaildc3xqxx4xqxx4xqxx4xqxx4xqxx4xqxx4xqxxe"'
+
+echo "✓ Pin smoke test passed — verb extensibility demonstrated end-to-end"
+```
+
+**Acceptance for 9a:** smoke test exits 0. The whole flow happens with **zero
+fed-sx kernel code changes** between defining the verb and using it.
+
+### Step 9b — Reactive application smoke test
+
+**The bigger proof point.** Demonstrates that fed-sx supports distributed
+reactive applications composed of `DefineSubscription` + `DefineTrigger` +
+`DefineProjection` — the application model from §§18-19.
+
+The test runs on a single instance (federation is deferred to milestone 2),
+so the "subscriber" and "publisher" are the same actor. That's intentional:
+milestone 1 proves the mechanism; milestone 2 spreads it across instances.
+
+**Test script:** `next/tests/smoke_app.sh`
+
+```bash
+#!/usr/bin/env bash
+set -euo pipefail
+
+# Assumes 9a has already run (fresh kernel optional; can run alongside).
+TOKEN=$(cat next/data/keys/publish.token)
+BASE=http://localhost:9999
+
+# 1. Verify the "Topic" subscription type is NOT yet defined.
+curl -s "$BASE/define-registry?kind=subscription_types" \
+  | jq -e 'map(select(.name == "Topic")) | length == 0'
+
+# 2. Publish DefineSubscription{name: "Topic", ...}
+TOPIC_DEF=$(cat <<'JSON'
+{
+  "type": "Create",
+  "object": {
+    "type": "DefineSubscription",
+    "name": "Topic",
+    "schema": "(fn (sub) (string? (-> sub :tag)))",
+    "match":  "(fn (sub act) (and (= (:type act) \"Note\") (member? (-> sub :tag) (or (-> act :object :tags) (list)))))",
+    "delivery": "{:default :push :modes (list :push :pull)}"
+  }
+}
+JSON
+)
+curl -s -X POST "$BASE/activity" \
+  -H "Authorization: Bearer $TOKEN" -d "$TOPIC_DEF" | jq -e '.cid'
+
+# 3. Verify Topic IS now a known subscription type.
+curl -s "$BASE/define-registry?kind=subscription_types" \
+  | jq -e 'map(select(.name == "Topic")) | length == 1'
+
+# 4. Subscribe to the "smoketest" topic.
+SUBSCRIBE=$(cat <<'JSON'
+{
+  "type": "Subscribe",
+  "object": {"type": "Topic", "tag": "smoketest"}
+}
+JSON
+)
+SUB_CID=$(curl -s -X POST "$BASE/activity" \
+  -H "Authorization: Bearer $TOKEN" -d "$SUBSCRIBE" | jq -r '.cid')
+
+# 5. Verify subscriptions projection has the new entry.
+sleep 1
+curl -s "$BASE/projections/subscriptions" \
+  | jq -e '.["https://next.rose-ash.com/actors/next"] | map(select(.type == "Topic")) | length == 1'
+
+# 6. Define a projection that records matched activities (per-application
+#    namespace would happen via DefineApplication in v1.x; for v1 the
+#    projection is global to the actor).
+TOPIC_PROJ=$(cat <<'JSON'
+{
+  "type": "Create",
+  "object": {
+    "type": "DefineProjection",
+    "name": "topic-events",
+    "initial-state": "{}",
+    "fold": "(fn (state act) (if (and (= (:type act) \"Note\") (member? \"smoketest\" (or (-> act :object :tags) (list)))) (assoc-in state [(:cid act)] act) state))"
+  }
+}
+JSON
+)
+curl -s -X POST "$BASE/activity" \
+  -H "Authorization: Bearer $TOKEN" -d "$TOPIC_PROJ" | jq -e '.cid'
+
+# 7. Define a trigger: when a Topic{smoketest} subscription matches, publish
+#    a TestEcho activity. The "TestEcho" activity type must be defined first.
+ECHO_DEF=$(cat <<'JSON'
+{
+  "type": "Create",
+  "object": {
+    "type": "DefineActivity",
+    "name": "TestEcho",
+    "schema":    "(fn (act) (cid? (-> act :object :echoes)))",
+    "semantics": "(fn (state act) state)"
+  }
+}
+JSON
+)
+curl -s -X POST "$BASE/activity" \
+  -H "Authorization: Bearer $TOKEN" -d "$ECHO_DEF" | jq -e '.cid'
+
+TRIGGER=$(cat <<JSON
+{
+  "type": "Create",
+  "object": {
+    "type": "DefineTrigger",
+    "name": "echo-on-smoketest",
+    "when-subscription": "$SUB_CID",
+    "cascade-limit": 1,
+    "then": "(fn (act sub env) {:publish (list {:type \"TestEcho\" :object {:echoes (:cid act)}})})"
+  }
+}
+JSON
+)
+curl -s -X POST "$BASE/activity" \
+  -H "Authorization: Bearer $TOKEN" -d "$TRIGGER" | jq -e '.cid'
+
+# 8. Capture outbox length so we can detect new entries.
+BEFORE=$(curl -s "$BASE/actors/next/outbox?page=true" \
+  | jq -r '.orderedItems | length')
+
+# 9. Publish a Note tagged "smoketest" — should match subscription, fire trigger,
+#    cause TestEcho to be published.
+NOTE=$(cat <<'JSON'
+{
+  "type": "Create",
+  "object": {
+    "type": "Note",
+    "content": "hello reactive world",
+    "tags": ["smoketest"]
+  }
+}
+JSON
+)
+NOTE_CID=$(curl -s -X POST "$BASE/activity" \
+  -H "Authorization: Bearer $TOKEN" -d "$NOTE" | jq -r '.cid')
+
+# 10. Wait for projection + trigger.
+sleep 2
+
+# 11. Verify topic-events projection captured the Note.
+curl -s "$BASE/projections/topic-events" \
+  | jq -e ". | to_entries | length == 1"
+
+# 12. Verify outbox grew by exactly TWO activities (the Note + the trigger's TestEcho).
+AFTER=$(curl -s "$BASE/actors/next/outbox?page=true" \
+  | jq -r '.orderedItems | length')
+[[ $((AFTER - BEFORE)) == 2 ]] || { echo "expected +2 activities, got $((AFTER - BEFORE))"; exit 1; }
+
+# 13. Verify the latest activity is a TestEcho referencing the original Note's CID.
+curl -s "$BASE/actors/next/outbox?page=true" \
+  | jq -e ".orderedItems[0] | .type == \"TestEcho\" and .object.echoes == \"$NOTE_CID\""
+
+# 14. Negative case: publish a Note WITHOUT the "smoketest" tag — must NOT
+#     trigger, must NOT echo.
+BEFORE2=$(curl -s "$BASE/actors/next/outbox?page=true" | jq -r '.orderedItems | length')
+NOTE_OTHER=$(cat <<'JSON'
+{"type": "Create", "object": {"type": "Note", "content": "no match", "tags": ["other"]}}
+JSON
+)
+curl -s -X POST "$BASE/activity" \
+  -H "Authorization: Bearer $TOKEN" -d "$NOTE_OTHER" | jq -e '.cid'
+sleep 2
+AFTER2=$(curl -s "$BASE/actors/next/outbox?page=true" | jq -r '.orderedItems | length')
+[[ $((AFTER2 - BEFORE2)) == 1 ]] || { echo "expected +1 activity (no echo), got $((AFTER2 - BEFORE2))"; exit 1; }
+
+# 15. Cascade limit check: prove the trigger doesn't recursively echo TestEcho.
+#     The TestEcho activity itself should NOT match the Topic{smoketest}
+#     subscription (it's not a Note), so no cascade, but verify cascade-depth
+#     was set to 1 on the echo so a future trigger on TestEcho would refuse.
+LATEST_ECHO=$(curl -s "$BASE/actors/next/outbox?page=true" \
+  | jq -r '.orderedItems | map(select(.type == "TestEcho")) | .[0]')
+echo "$LATEST_ECHO" | jq -e '."cascade-depth" == 1'
+
+# 16. Restart kernel; verify subscription, trigger, projection all survive.
+./next/scripts/stop.sh
+./next/scripts/start.sh
+sleep 2
+curl -s "$BASE/projections/subscriptions" \
+  | jq -e '.["https://next.rose-ash.com/actors/next"] | map(select(.type == "Topic")) | length == 1'
+curl -s "$BASE/projections/topic-events" | jq -e ". | to_entries | length >= 1"
+curl -s "$BASE/define-registry?kind=triggers" \
+  | jq -e 'map(select(.name == "echo-on-smoketest")) | length == 1'
+
+echo "✓ Reactive application smoke test passed — Subscribe + Trigger + Projection demonstrated end-to-end"
+```
+
+**What this proves (and what it doesn't):**
+
+Proves:
+- `DefineSubscription` + `Subscribe` mechanism works end-to-end.
+- Subscription's `match-fn` evaluates correctly in pure mode against inbound
+  activities.
+- `DefineTrigger` fires on subscription matches.
+- Trigger's `then-sx` can publish derived activities (the `:publish` result).
+- Cascade-depth metadata propagates correctly.
+- Subscription state, trigger registration, and projection state all survive
+  kernel restart (snapshot + log replay).
+- The full reactive application loop works without any kernel code changes
+  between defining the components and exercising them.
+
+Does NOT prove (deferred to milestone 2+):
+- Cross-instance subscriptions (federation).
+- Trigger `:effect` results calling effectful primitives.
+- `DefineApplication` bundle install/update/fork.
+- Per-application namespace isolation.
+- Cascade prevention against malicious cascading from peer instances.
+
+**Acceptance for 9b:** smoke test exits 0. Like 9a, **zero fed-sx kernel code
+changes** between defining the application components and observing them
+operate.
+
+---
+
+## Acceptance criteria for milestone 1
+
+All of:
+
+1. **Each step's test suite passes** (`bash next/tests/<step>.sh`).
+2. **Both smoke tests pass** (`bash next/tests/smoke_pin.sh` and
+   `bash next/tests/smoke_app.sh`).
+3. **Erlang-on-SX baseline preserved** — adding fed-sx kernel modules in
+   `next/kernel/*.erl` doesn't break Phase 1-8 conformance.
+4. **Restart durability** — kill the kernel mid-write, restart, projections
+   resume from snapshot, no log corruption.
+5. **Manual Mastodon poke** — point a Mastodon account at
+   `https://next.rose-ash.com/actors/next` and verify the actor doc fetches and
+   webfinger discovery works (read-only AP interop, no follow).
+
+## What lands when
+
+This is the work-order an agent (or human) follows. Steps 1-3 can be done in
+parallel after the Erlang Phase 8 BIFs land. Steps 4-7 are sequential. Step 8
+can start in parallel with step 7. Step 9 is the integration test.
+
+```
+Phase 7+8 (loops/erlang) ───┐
+                            │
+                            ▼
+              ┌─── Step 1 ──┬─── Step 2 ──┬─── Step 3
+              │             │             │
+              └─────────────┼─── Step 4 ──┴────┐
+                            │                  │
+                            └─── Step 5 ───────┤
+                                              │
+                                  Step 6 ─────┤
+                                              │
+                                  Step 7 ─────┤
+                                              │
+                                  Step 8 ─────┤
+                                              │
+                                  Step 9 ─────┘
+```
+
+Estimated effort if done by a focused agent loop, one feature per iteration:
+~30-50 commits across all 9 steps. Could plausibly be a `loops/fed-sx` workstream
+once Phase 7+8 are done.
+
+## What's deferred to milestone 2
+
+- **Federation** (the second-biggest piece). `POST /inbox`, Follow lifecycle,
+  delivery queue, backfill, capability negotiation between peers. Whole of
+  design §13.
+- **Multi-actor** with per-user OAuth and capability tokens. Design §9.5.
+- **IPFS storage backend** as a `DefineStorage` entry. Design §15.3.
+- **Browser client + operator dashboard** (probably in Elm-on-SX or similar).
+- **Rich verbs**: `Endorse`, `Supersede`, `Test`, `Build`, `Compose`, `Note`,
+  `Announce`. All defined as `DefineActivity` artifacts, federated.
+- **Cross-host conformance** — Python/JS/Haskell hosts running fed-sx. Design
+  §11.8.
+- **OpenTimestamps proofs** as a `DefineProof` entry.
+- **Performance work** — JIT-compiled folds, snapshot acceleration, federation
+  batching.
+
+Milestone 2 unlocks "real federation between two fed-sx instances." Milestone 3
+is the rose-ash port (blog, market, events, federation, account, orders) as
+fed-sx applications.
+
+---
+
+## Appendix A: open questions for milestone 1
+
+A few things still under-specified; resolve as work begins.
+
+1. **HTTP server library.** Does the Phase 8 `http:listen/2` BIF wrap an
+   existing OCaml HTTP server (the sx.rose-ash.com one) or something simpler?
+   Implementation choice deferred to Phase 8.
+2. **JSON-LD library.** AP wire format requires JSON-LD canonicalization for
+   signature coverage. Either pull a library or write a minimal subset for the
+   shapes we actually use. Probably the latter — our envelope is well-defined.
+3. **Bearer token rotation.** v1 uses a single env-var token. Token rotation
+   without restart needs registry-style mgmt; can wait.
+4. **Snapshot rate limits.** Default in design is "every 1000 activities or
+   60 seconds." Tunable per-projection later; v1 uses the default.
+5. **Genesis bundle format.** Dag-cbor map per §12.2; concrete schema needs
+   one round of refinement once we author the actual definitions in step 4.
diff --git a/plans/sx-vm-opcode-extension.md b/plans/sx-vm-opcode-extension.md
new file mode 100644
index 00000000..034515bb
--- /dev/null
+++ b/plans/sx-vm-opcode-extension.md
@@ -0,0 +1,430 @@
+# SX VM Opcode Extension Mechanism
+
+Mechanism in `hosts/ocaml/evaluator/` that lets language ports register
+specialized bytecode opcodes without modifying the SX VM core. Direct
+prerequisite for **erlang-on-sx Phase 9** (the BEAM analog) and a structural
+enabler for any future language port that wants performance-critical opcodes.
+
+Reference: `plans/erlang-on-sx.md` Phase 9, `plans/fed-sx-design.md` §17.5,
+`hosts/ocaml/lib/sx_vm.ml` (current VM).
+
+Status: **design** — implementation pending. Sister workstream to the
+`loops/erlang` loop, but lives in `hosts/`, not `lib/erlang/`.
+
+---
+
+## Goal
+
+Allow language ports to register custom bytecode opcodes in the SX VM, with:
+
+- **Zero overhead for core opcodes.** Existing 37 opcodes (per `sx_vm.ml`)
+  must dispatch identically. No regression for any existing language port or
+  the core SX runtime.
+- **One additional dispatch step for extension opcodes.** Acceptable cost; the
+  win comes from avoiding the general CEK machinery.
+- **Per-extension state slot.** Erlang's process scheduler, Haskell's thunk
+  cache, etc. need somewhere to hang state alongside the VM.
+- **Compiler awareness.** The bytecode compiler (`lib/compiler.sx`) must be
+  able to emit extension opcodes by name, looked up against the registered
+  set.
+- **JIT compatibility.** Existing JIT (lazy lambda compilation) continues to
+  work for code paths using only core opcodes. Extension opcodes are
+  interpreted in v1; JITing them is a follow-up.
+
+## Non-goals
+
+- **Hot opcode reload.** Adding/replacing opcodes mid-runtime is not in
+  scope. Extensions are compile-time additions to the OCaml binary. (If
+  needed, that's a separate project.)
+- **Per-instance opcode sets.** All running instances of the SX VM share
+  the same opcode set determined at build time. Selective opcode loading
+  per instance is out of scope.
+- **Opcode hot-swap or supersession.** Once registered, opcodes are stable
+  for the lifetime of the binary.
+- **Language-port isolation at the dispatch layer.** Two language ports can
+  see each other's opcodes (they share the dispatch table). Isolation is a
+  build-time concern — don't compile in extensions you don't trust.
+
+---
+
+## Why now
+
+The Erlang-on-SX Phase 9 work needs this. Without it, Phase 9b-9g (the actual
+opcode implementations) have nowhere to plug in. The Erlang loop will hit
+this dependency as a Blocker; this design is what unblocks it.
+
+It also enables the **shared opcode pattern** discussed in
+`plans/fed-sx-design.md` §17.5: opcodes Erlang Phase 9 produces that other
+ports could plausibly use (pattern match, perform/handle, record access) get
+chiselled out to `lib/guest/vm/` when a second port has an actual second use.
+Without the extension mechanism, each port would have to fork the SX VM core
+or modify shared dispatch, and neither is acceptable.
+
+---
+
+## Architectural overview
+
+```
+                ┌──────────────────────────────────────────┐
+                │ SX VM core (hosts/ocaml/lib/sx_vm.ml)    │
+                │                                            │
+                │  ┌────────────────────────────────────┐  │
+                │  │ Bytecode dispatch loop             │  │
+                │  │                                     │  │
+                │  │ match op with                       │  │
+                │  │   | 1  (OP_CONST) -> ...           │  │
+                │  │   | 2  (OP_NIL)   -> ...           │  │
+                │  │   | ...                            │  │
+                │  │   | 127 -> ... (last core opcode)  │  │
+                │  │   | op when op >= 128 ->            │  │
+                │  │       Extensions.dispatch op vm     │  │ ◄── new
+                │  │       frame                         │  │
+                │  └────────────────────────────────────┘  │
+                │                                            │
+                │  ┌────────────────────────────────────┐  │
+                │  │ Extension registry                 │  │
+                │  │   opcode_id -> handler             │  │ ◄── new
+                │  │   opcode_name -> opcode_id         │  │
+                │  │   extension_state per extension    │  │
+                │  └────────────────────────────────────┘  │
+                └──────────────────────────────────────────┘
+                                   ▲
+                                   │ register at startup
+                ┌──────────────────┴──────────────────────┐
+                │ Extension modules                       │
+                │  hosts/ocaml/extensions/erlang.ml       │
+                │  hosts/ocaml/extensions/haskell.ml      │
+                │  hosts/ocaml/extensions/datalog.ml      │
+                │  hosts/ocaml/extensions/guest_vm.ml     │ ◄── shared opcodes
+                └─────────────────────────────────────────┘
+```
+
+### Opcode ID space partition
+
+Current SX VM uses opcode IDs in roughly the range 1-162 (per inspection of
+`sx_vm.ml`). We partition the 0-255 space:
+
+| Range | Use |
+|-------|-----|
+| 0 | reserved / NOP |
+| 1-127 | **core opcodes** — owned by the SX VM, locked schema |
+| 128-199 | **`lib/guest/vm/` shared opcodes** — chiselled-out shared opcodes |
+| 200-247 | **language-port opcodes** — registered by extensions |
+| 248-255 | reserved for future expansion / multi-byte opcodes |
+
+This gives 72 slots for shared opcodes (Phase 1-2 of `lib/guest/vm/` will
+not exhaust this; we can renegotiate if it does), 48 slots for language-port
+opcodes across all compiled-in ports, and a clean separation that makes it
+obvious which opcodes are stable (core), shared (guest), or port-specific
+(extension).
+
+If we need more than 256 opcodes total, multi-byte opcodes (a leading 248-255
+byte plus a second byte) extend the space without breaking the schema.
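+
+The partition above can be sketched as a small classifier. This is only an
+illustration: `tier`, `tier_of_opcode`, and `registrable` are hypothetical
+names, not functions that exist in `sx_vm.ml`; the ranges are the ones from
+the table.
+
+```ocaml
+(* Hypothetical helper; the tier names are illustrative, not from sx_vm.ml. *)
+type tier =
+  | Reserved_nop      (* 0 *)
+  | Core              (* 1-127 *)
+  | Guest_shared      (* 128-199 *)
+  | Port              (* 200-247 *)
+  | Reserved_future   (* 248-255 *)
+
+let tier_of_opcode (id : int) : tier =
+  if id < 0 || id > 255 then
+    invalid_arg (Printf.sprintf "opcode %d outside single-byte range" id)
+  else if id = 0 then Reserved_nop
+  else if id <= 127 then Core
+  else if id <= 199 then Guest_shared
+  else if id <= 247 then Port
+  else Reserved_future
+
+(* An extension may only register into the guest or port tiers. *)
+let registrable (id : int) : bool =
+  match tier_of_opcode id with
+  | Guest_shared | Port -> true
+  | Reserved_nop | Core | Reserved_future -> false
+
+let () =
+  assert (tier_of_opcode 127 = Core);
+  assert (tier_of_opcode 128 = Guest_shared);
+  assert (registrable 200);
+  assert (not (registrable 248))
+```
+
+A check like `registrable` could run at registration time so that an
+extension claiming a core or reserved ID fails at startup.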
+
+### Extension module signature
+
+```ocaml
+(* hosts/ocaml/lib/sx_vm_extension.ml *)
+
+(** A handler for an extension opcode. Reads operands from bytecode,
+    manipulates the VM stack, updates the frame's instruction pointer.
+    May raise exceptions (which propagate via the existing VM error path). *)
+type handler = vm -> frame -> unit
+
+(** State an extension carries alongside the VM. Opaque to the VM core;
+    extensions cast as needed. *)
+type extension_state = ..
+
+module type EXTENSION = sig
+  (** Stable name for this extension (e.g. "erlang", "guest_vm"). *)
+  val name : string
+
+  (** Initialize per-instance state. Called once when the VM starts and the
+      extension is loaded. *)
+  val init : unit -> extension_state
+
+  (** Opcodes this extension provides. Each is (opcode_id, opcode_name, handler).
+      opcode_id must be in the range allowed for this extension's tier
+      (128-199 for guest, 200-247 for ports). Conflicts cause startup failure. *)
+  val opcodes : extension_state -> (int * string * handler) list
+end
+```
+
+### Registration and dispatch
+
+```ocaml
+(* hosts/ocaml/lib/sx_vm_extensions.ml *)
+
+let extensions : (module EXTENSION) list ref = ref []
+let states : (string, extension_state) Hashtbl.t = Hashtbl.create 8
+let by_id : (int, handler) Hashtbl.t = Hashtbl.create 64
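+let by_name : (string, int) Hashtbl.t = Hashtbl.create 64
+
+(* Raised by [dispatch] when no handler is registered for an opcode. *)
+exception Invalid_opcode of int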
+
+let register (m : (module EXTENSION)) =
+  let module M = (val m) in
+  let st = M.init () in
+  Hashtbl.add states M.name st;
+  List.iter (fun (id, name, h) ->
+    if Hashtbl.mem by_id id then
+      failwith (Printf.sprintf "Opcode %d (%s) already registered" id name);
+    Hashtbl.add by_id id h;
+    Hashtbl.add by_name name id
+  ) (M.opcodes st);
+  extensions := m :: !extensions
+
+let dispatch op vm frame =
+  match Hashtbl.find_opt by_id op with
+  | Some handler -> handler vm frame
+  | None -> raise (Invalid_opcode op)
+
+let id_of_name name = Hashtbl.find_opt by_name name
+let state_of_extension name = Hashtbl.find_opt states name
+```
+
+The dispatch path adds **one hashtable lookup per extension opcode**. That
+cost is acceptable: Erlang's specialized opcodes are expected to win >100×
+over going through the general CEK machine, so the lookup overhead is
+negligible by comparison.
+
+### Bytecode compiler integration
+
+The compiler (`lib/compiler.sx`) needs to know extension opcode IDs to emit
+them. New SX primitive exposed to the compiler:
+
+```sx
+(extension-opcode-id "erlang.OP_PATTERN_TUPLE_2") ; → 200, or nil if not loaded
+```
+
+When the compiler wants to emit a specialized opcode, it queries by name. If
+the extension isn't loaded, the compiler falls back to the general path
+(emit a `CALL_PRIM` or general SX `case`). This means a language port's
+optimization is opt-in per build, and missing extensions degrade to slower
+correct execution rather than failure.
+
+Naming convention: `<extension-name>.OP_<NAME>`. So `erlang.OP_PATTERN_TUPLE_2`,
+`guest_vm.OP_PERFORM`, etc.
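+
+The lookup-or-fall-back decision described above might look like the
+following on the OCaml side. This is a sketch under assumptions: the real
+compiler lives in `lib/compiler.sx`, and `emitted`, `emit`, and the fallback
+name `"match-tuple"` are hypothetical stand-ins used only to show the shape
+of the decision.
+
+```ocaml
+(* Stand-in for the registry's name table; populated by registration. *)
+let by_name : (string, int) Hashtbl.t = Hashtbl.create 16
+
+(* What the compiler ends up emitting for one operation. *)
+type emitted =
+  | Extension_op of int        (* specialized opcode, extension loaded *)
+  | General_path of string     (* fallback, e.g. a CALL_PRIM target *)
+
+let emit ~(opcode_name : string) ~(fallback : string) : emitted =
+  match Hashtbl.find_opt by_name opcode_name with
+  | Some id -> Extension_op id
+  | None -> General_path fallback
+
+let () =
+  (* Extension not loaded: degrade to the general path. *)
+  assert (emit ~opcode_name:"erlang.OP_PATTERN_TUPLE_2"
+               ~fallback:"match-tuple" = General_path "match-tuple");
+  (* Extension loaded: the same call now yields the specialized opcode. *)
+  Hashtbl.replace by_name "erlang.OP_PATTERN_TUPLE_2" 200;
+  assert (emit ~opcode_name:"erlang.OP_PATTERN_TUPLE_2"
+               ~fallback:"match-tuple" = Extension_op 200)
+```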
+
+### Per-extension state access
+
+Some opcodes need state beyond the VM stack (Erlang's scheduler, mailbox
+state, etc.). Extensions store state in their `init`-returned value, accessed
+via `state_of_extension`:
+
+```ocaml
+let op_spawn vm frame =
+  let st = Sx_vm_extensions.state_of_extension "erlang"
+           |> Option.get
+           |> Obj.magic in   (* extension casts to its known type *)
+  let body = pop vm in
+  let pid = Erlang_scheduler.spawn st body in
+  push vm (pid_value pid);
+  frame.ip <- frame.ip + 1
+```
+
+Shared scheduler state lives in the Erlang extension's state value. Other
+extensions don't see it.
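+
+Because `extension_state` is declared as an extensible variant (`..`), an
+extension could also recover its state by pattern matching rather than
+`Obj.magic`. The sketch below is one possible shape, not the planned
+implementation; `erlang_state`, its `next_pid` field, and `fresh_pid` are
+hypothetical names chosen for illustration.
+
+```ocaml
+type extension_state = ..
+
+(* Hypothetical Erlang-side state; field names are illustrative only. *)
+type erlang_state = { mutable next_pid : int }
+type extension_state += Erlang_state of erlang_state
+
+let states : (string, extension_state) Hashtbl.t = Hashtbl.create 8
+
+(* Recover the typed state, failing loudly instead of casting blindly. *)
+let erlang_state () : erlang_state =
+  match Hashtbl.find_opt states "erlang" with
+  | Some (Erlang_state st) -> st
+  | Some _ -> failwith "state under \"erlang\" has the wrong constructor"
+  | None -> failwith "erlang extension not loaded"
+
+let fresh_pid () : int =
+  let st = erlang_state () in
+  let pid = st.next_pid in
+  st.next_pid <- pid + 1;
+  pid
+
+let () =
+  Hashtbl.replace states "erlang" (Erlang_state { next_pid = 0 });
+  assert (fresh_pid () = 0);
+  assert (fresh_pid () = 1)
+```
+
+The match costs one constructor check per access, and a wrong-typed state
+surfaces as a clear failure instead of memory corruption.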
+
+---
+
+## Phase plan
+
+Five sub-phases in dependency order. Each is testable in isolation.
+
+### Phase A — Opcode ID partition + dispatch fallthrough
+
+Smallest viable change to `sx_vm.ml`:
+
+- Add the `| op when op >= 128 -> Sx_vm_extensions.dispatch op vm frame`
+  fallthrough case.
+- Document the partition in a comment at the top of the opcode list.
+
+**Tests:**
+- All existing SX VM tests pass unchanged (zero regression for core).
+- Calling `dispatch 200 ...` with no extension registered raises
+  `Invalid_opcode 200`.
+
+**Effort:** small. ~50 lines + tests.
+
+### Phase B — Extension registry module
+
+`hosts/ocaml/lib/sx_vm_extensions.ml` per the sketch above. Pure plumbing, no
+opcodes yet.
+
+**Tests:**
+- Register a test extension with one opcode; dispatch finds it.
+- Duplicate opcode-id registration fails at startup.
+- `id_of_name` and `state_of_extension` lookups work.
+
+**Effort:** small. ~150 lines + tests.
+
+### Phase C — Compiler-side opcode lookup primitive
+
+Expose `extension-opcode-id` as an SX primitive in `hosts/ocaml/lib/`. The
+compiler in `lib/compiler.sx` can call it to emit extension opcodes by name.
+
+Does not require any extension to actually exist — the primitive returns
+`nil` for unknown names, and the compiler falls back.
+
+**Tests:**
+- Primitive returns nil for unknown name.
+- After registering a test extension, primitive returns the registered ID.
+
+**Effort:** small. Single primitive registration + compiler-side use docs.
+
+### Phase D — Test extension demonstrating end-to-end flow
+
+A dummy extension at `hosts/ocaml/extensions/test_ext.ml` registering one or
+two trivial opcodes (e.g. `OP_TEST_PUSH_42`, `OP_TEST_DOUBLE_TOS`). Wired
+into the build, available when running tests.
+
+Compiler test: write SX that triggers the test compiler-extension to emit
+`OP_TEST_PUSH_42`, then verify the VM executes it correctly via
+`bytecode-inspect` and `vm-trace`.
+
+**Tests:**
+- Bytecode emission via name lookup produces the right ID.
+- Execution produces the expected stack effect.
+- `bytecode-inspect` shows the opcode by name.
+- `vm-trace` correctly reports the extension opcode.
+
+**Effort:** small. ~100 lines including build wiring.
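+
+A self-contained toy version of the flow that Phases B and D test might look
+like this. Everything here is an assumption made for illustration: the `vm`
+and `frame` records are stubs (the real types live in `sx_vm.ml`), the IDs
+200 and 201 are arbitrary picks from the port range, and `run` is a
+stand-in for the real dispatch loop.
+
+```ocaml
+(* Stub VM types: the real vm/frame live in sx_vm.ml and differ. *)
+type vm = { mutable stack : int list }
+type frame = { mutable ip : int }
+type handler = vm -> frame -> unit
+
+exception Invalid_opcode of int
+
+let by_id : (int, handler) Hashtbl.t = Hashtbl.create 8
+let by_name : (string, int) Hashtbl.t = Hashtbl.create 8
+
+let register_opcode (id, name, h) =
+  if Hashtbl.mem by_id id then
+    failwith (Printf.sprintf "Opcode %d (%s) already registered" id name);
+  Hashtbl.add by_id id h;
+  Hashtbl.add by_name name id
+
+let dispatch op vm frame =
+  match Hashtbl.find_opt by_id op with
+  | Some h -> h vm frame
+  | None -> raise (Invalid_opcode op)
+
+(* OP_TEST_PUSH_42: push the constant 42. *)
+let op_push_42 vm frame =
+  vm.stack <- 42 :: vm.stack;
+  frame.ip <- frame.ip + 1
+
+(* OP_TEST_DOUBLE_TOS: replace the top of stack with twice its value. *)
+let op_double_tos vm frame =
+  (match vm.stack with
+   | x :: rest -> vm.stack <- (2 * x) :: rest
+   | [] -> failwith "OP_TEST_DOUBLE_TOS: empty stack");
+  frame.ip <- frame.ip + 1
+
+(* Toy interpreter loop: every byte is an operand-free extension opcode. *)
+let run (code : int array) : vm =
+  let vm = { stack = [] } and frame = { ip = 0 } in
+  while frame.ip < Array.length code do
+    dispatch code.(frame.ip) vm frame
+  done;
+  vm
+
+let () =
+  register_opcode (200, "test_ext.OP_TEST_PUSH_42", op_push_42);
+  register_opcode (201, "test_ext.OP_TEST_DOUBLE_TOS", op_double_tos);
+  let vm = run [| 200; 201 |] in
+  assert (vm.stack = [84]);
+  assert (Hashtbl.find_opt by_name "test_ext.OP_TEST_PUSH_42" = Some 200);
+  assert (try ignore (run [| 202 |]); false with Invalid_opcode 202 -> true)
+```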
+
+### Phase E — JIT awareness (interpreted-only for v1)
+
+The JIT (lazy lambda compilation) currently compiles based on opcode ranges.
+Extension opcodes (≥128) should fall through to interpretation, not be
+JIT-compiled in v1.
+
+- Mark extension opcodes as "interpret only" in the JIT pre-analysis.
+- A lambda containing only core opcodes JIT-compiles as before.
+- A lambda containing any extension opcode runs interpreted.
+
+JITing extension opcodes is a follow-up project; v1 keeps the JIT scope
+unchanged and just makes it correctly route mixed bytecode.
+
+**Tests:**
+- Lambda with only core opcodes: JIT-compiled, fast path.
+- Lambda with extension opcode: interpreted, correct result.
+- Mixed lambda: interpreted, correct result.
+
+**Effort:** small-medium. Requires understanding the JIT's pre-analysis
+(per `project_jit_compilation.md` memory: "Lazy JIT implemented: lambda
+bodies compiled on first VM call, cached, failures sentinel-marked").
+Extension-opcode detection becomes another reason to mark a lambda
+"interpret-only."
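+
+The routing rule above reduces to a single scan. The sketch assumes the
+pre-analysis can see a lambda's opcodes as a plain list with operands
+already stripped; how the real JIT decodes bytecode is not shown in this
+plan, and `strategy` and `strategy_for` are hypothetical names.
+
+```ocaml
+(* Assumes opcodes arrive with operands stripped; the real decoding step
+   in sx_vm.ml is not modelled here. *)
+let first_extension_opcode = 128
+
+let is_extension_opcode (op : int) : bool = op >= first_extension_opcode
+
+type strategy = Jit_compile | Interpret_only
+
+(* Any extension opcode forces the whole lambda onto the interpreter. *)
+let strategy_for (opcodes : int list) : strategy =
+  if List.exists is_extension_opcode opcodes then Interpret_only
+  else Jit_compile
+
+let () =
+  assert (strategy_for [1; 2; 17] = Jit_compile);
+  assert (strategy_for [200] = Interpret_only);
+  assert (strategy_for [1; 200; 2] = Interpret_only)
+```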
+
+---
+
+## Acceptance criteria
+
+1. **Phases A-D pass their test suites.**
+2. **Zero regression on existing SX VM tests.** All language-port test
+   suites currently passing on the architecture branch (Erlang 530+, Haskell
+   285+, Datalog 276+, Smalltalk 625+, the SX core test suite, etc.) still
+   pass.
+3. **Test extension demonstrates the flow end-to-end.** SX source compiles
+   via the compiler with a registered extension opcode, executes through the
+   VM via the dispatch fallthrough, returns correct result.
+4. **Documentation:** README in `hosts/ocaml/extensions/` explaining the
+   pattern, with a worked example (the test extension is the canonical one).
+
+After acceptance, the Erlang-on-SX Phase 9 work in `lib/erlang/vm/` can use
+this mechanism. The Erlang loop's Blocker for 9a is resolved.
+
+---
+
+## Risk and mitigation
+
+**Risk: regression in core opcode dispatch.** A misplaced `match` arm could
+break something. *Mitigation:* run every existing language-port test suite
+before merging. The cost of this verification is real — probably an hour of
+machine time — but cheaper than discovering it after the fact.
+
+**Risk: opcode ID conflicts as more extensions land.** If Erlang Phase 9
+claims IDs 200-220 and Haskell wants 215-235, we have a problem.
+*Mitigation:* maintain a registry document at
+`hosts/ocaml/extensions/README.md` listing claimed ID ranges per extension.
+Convention: each extension claims a contiguous block at first registration;
+collisions are caught at startup with a clear error.
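+
+A check over claimed blocks could catch the overlap in the example before
+any handler is registered. This is a hedged sketch: the `claim` record and
+`conflicts` function are hypothetical, and the `datalog` range is invented
+purely to show a non-conflicting neighbour.
+
+```ocaml
+(* A claimed block: extension name plus inclusive ID bounds. *)
+type claim = { ext : string; lo : int; hi : int }
+
+let overlaps a b = a.lo <= b.hi && b.lo <= a.hi
+
+(* Return every pair of claims whose ranges intersect. *)
+let conflicts (claims : claim list) : (string * string) list =
+  let rec go acc = function
+    | [] -> List.rev acc
+    | c :: rest ->
+      let hits =
+        List.filter_map
+          (fun d -> if overlaps c d then Some (c.ext, d.ext) else None)
+          rest
+      in
+      go (List.rev_append hits acc) rest
+  in
+  go [] claims
+
+let () =
+  let claims =
+    [ { ext = "erlang"; lo = 200; hi = 220 };
+      { ext = "haskell"; lo = 215; hi = 235 };
+      { ext = "datalog"; lo = 236; hi = 247 } ]
+  in
+  assert (conflicts claims = [ ("erlang", "haskell") ])
+```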
+
+**Risk: extension state types leak through `Obj.magic`.** The extension state
+is type-erased in the registry. *Mitigation:* extensions cast in their own
+opcode handlers, never expose state to other extensions or the VM core.
+First-class modules / GADTs could add more type safety; deferred unless
+this becomes a concrete pain point.
+
+**Risk: extensions become a back door for kernel mutation.** An extension
+opcode handler has full access to the VM. *Mitigation:* extensions are
+build-time additions, not runtime; they're as trusted as the rest of the
+binary. Operators audit at build time, not runtime. Same trust model as
+any other compiled-in code.
+
+**Risk: shared `lib/guest/vm/` opcodes evolve under different language
+ports' needs.** *Mitigation:* the chiselling discipline (move to guest only
+on second use) ensures the shared opcodes are tested against at least two
+ports' actual usage before being considered stable.
+
+---
+
+## Open questions
+
+To be resolved during implementation, not blocking design approval:
+
+1. **Multi-byte opcode encoding.** If we need >256 opcodes total, the
+   leading-byte 248-255 schema accommodates it. Do we need multi-byte at
+   v1? Probably not, as long as the 72 shared slots (128-199) and 48 port
+   slots (200-247) are not exhausted.
+2. **Extension ordering matters?** If two extensions register opcodes that
+   read the same VM state, ordering of registration could matter for
+   initialization. Probably not in practice; flag if it bites.
+3. **Hot-reload of extensions.** Out of scope for v1 (per non-goals). If
+   wanted later, the registry would need teardown + re-registration; the
+   `gen_server` `code_change/3` model from Erlang Phase 7 is a precedent.
+4. **Cross-extension opcode composition.** Can `guest_vm.OP_PERFORM` invoke
+   `erlang.OP_RECEIVE_SCAN`? In principle yes — handlers can do anything.
+   The interface is clean; the question is whether we want any conventions
+   to keep ergonomics tractable. Defer until composition appears in
+   practice.
+
+---
+
+## Implementation roadmap and sequencing
+
+This is a sister workstream to `loops/erlang`. Probably best as a single
+focused session (not a continuous loop — the work is bounded, ~1-2 weeks
+of focused effort, not iterative).
+
+Recommended sequencing:
+
+1. **A + B + C land together** as a single PR — they're tightly coupled and
+   easier to test as a unit. Branch: `loops/sx-vm-extensions` or similar.
+2. **D follows** in a second PR; demonstrates the end-to-end flow without
+   committing to any real language port's opcode design.
+3. **E (JIT integration)** as a third PR, once the basic mechanism is
+   battle-tested.
+4. **Extension scope check:** verify Erlang's Phase 9 sub-phases 9b-9g can
+   actually use this mechanism. If gaps surface, they're addressable
+   incrementally.
+5. **`hosts/ocaml/extensions/erlang.ml`** then becomes the *first real
+   consumer* — written by whoever takes over from the Erlang loop's stub
+   dispatcher. That's the integration moment that closes the loop.
+
+Estimated total effort: 1-2 weeks for one focused engineer with OCaml SX VM
+familiarity. Much less if the implementer already knows `sx_vm.ml`.
+
+---
+
+## Relationship to other plans
+
+- **`plans/erlang-on-sx.md` Phase 9:** unblocked by this work. Erlang loop
+  develops opcodes against a stub dispatcher in `lib/erlang/vm/`; once this
+  mechanism lands, swap stub for real registration via
+  `hosts/ocaml/extensions/erlang.ml`.
+- **`plans/fed-sx-design.md` §17.5:** documents this as Layer-1 prerequisite.
+  The shared-opcode discipline (`lib/guest/vm/`) builds on the 128-199 ID
+  range this mechanism reserves for shared opcodes.
+- **Future language ports (Haskell, Datalog, Smalltalk perf phases):** will
+  use the same mechanism. Each adds an extension module, claims an opcode
+  range, registers handlers. The `lib/guest/vm/` opcodes get
+  cross-referenced when the second port's needs justify chiselling.
+- **JIT roadmap (per `project_jit_architecture.md` memory):** extension
+  opcodes are interpreted in v1. JITing them is a logical follow-up but
+  a separate project.