fed-sx — Federated SX Activity Substrate
A federated, content-addressed, extensible application substrate where the unit of computation is a signed activity, the unit of state is a pure SX projection over the activity log, and the substrate's own extensibility (new verbs, new object types, new projections, new validators) is itself published through the same mechanism.
Status: design — not yet implemented. Target subdomain: next.rose-ash.com.
Target location in repo: next/ (new top-level dir, sibling to blog/, market/,
etc.). Stack: pure SX-on-OCaml. Implementation language(s) to be chosen after design
is complete.
1. Premise
ActivityPub's data model — actors, signed activities, inboxes/outboxes — generalises beyond social posting to any domain where state evolves via signed messages. fed-sx takes that generalisation seriously:
- The unit of communication is a signed AP activity.
- The unit of content is an AP object, content-addressed by CID (multihash + multicodec, default dag-cbor over the parsed SX AST).
- State is the deterministic fold of pure SX functions over the activity log.
- The substrate is self-extending: new activity types, object types, projections,
validators, codecs, transports, and signature suites are themselves published as
Define* activities — federated like any other content.
Three commitments make the rest fall into place:
- The kernel is dumb. It only knows envelope shape, signature verification, append-to-log, fetch-by-id, transport in/out. It does not know what Create or Pin mean.
- Everything else is registry-driven. Verbs, object types, validators, projections, codecs, transports, audiences, proofs, sig suites — all looked up in registries the kernel calls into.
- The registries are themselves publishable. New entries arrive as Define* activities. Bootstrap registries load from a known set of CIDs at startup; everything else is replayed from the log.
Result: the only code that ever needs to change in the kernel is the envelope itself. New verbs = published SX, federated like any other artifact.
2. CIDs and content addressing
Every artifact has a CID. Default codec is dag-cbor over the parsed SX AST (not the raw text). This buys:
- Sub-AST addressing for free. Each nested structure has an implicit CID; IPLD can walk paths like <file-cid>/components/card. The "file CID and component CID" question dissolves: every node is a CID, you choose the granularity at reference time.
- Polyglot canonicalization. JS, OCaml, Python only need to agree on AST shape + CBOR's deterministic encoding (RFC 8949 §4.2.1). No byte-identical pretty-printer required across hosts.
- Format immunity. Reformatting, indent changes, equivalent-form normalisations do not change the CID.
- Tooling fit. sx-tree already has the parsed form in memory; computing or verifying a CID is just an encode + hash.
Costs accepted:
- One spec to maintain: SX↔CBOR mapping (number → CBOR int/float, string → text, symbol → tag, keyword → tag, list → array, dict → map). ~50 lines of code per host.
- Author's exact source text is not preserved; re-pretty-print on fetch.
- "Why don't these CIDs match" requires comparing CBOR (a
cid-explain tool helps).
The CID format itself is multicodec-agile: the substrate also accepts raw,
dag-json, dag-pb, etc. when seen, dispatched via the codec registry.
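The "encode + hash" step can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the fed-sx codec: it uses a toy AST of integers, strings, and lists (symbols and keywords, which the real mapping would tag, are collapsed into plain text, and maps are omitted), and the helper names `parse`, `cbor`, and `cid_of` are hypothetical. The CID layout follows the public CIDv1 convention: version byte, dag-cbor codec, sha2-256 multihash, base32 multibase.

```python
import base64
import hashlib
import re

def parse(src):
    """Parse a tiny s-expression subset into nested lists, ints and strings."""
    tokens = re.findall(r"[()]|[^\s()]+", src)

    def read(pos):
        tok = tokens[pos]
        if tok == "(":
            items, pos = [], pos + 1
            while tokens[pos] != ")":
                item, pos = read(pos)
                items.append(item)
            return items, pos + 1
        return (int(tok) if re.fullmatch(r"-?\d+", tok) else tok), pos + 1

    return read(0)[0]

def head(major, n):
    """CBOR initial byte plus the shortest-form argument (RFC 8949 §4.2.1)."""
    if n < 24:
        return bytes([major << 5 | n])
    for extra, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if n < 1 << (8 * size):
            return bytes([major << 5 | extra]) + n.to_bytes(size, "big")
    raise ValueError("integer too large for this sketch")

def cbor(value):
    """Deterministically encode ints, text and lists (toy subset only)."""
    if isinstance(value, int):
        return head(0, value) if value >= 0 else head(1, -1 - value)
    if isinstance(value, str):
        raw = value.encode("utf-8")
        return head(3, len(raw)) + raw
    if isinstance(value, list):
        return head(4, len(value)) + b"".join(cbor(v) for v in value)
    raise TypeError(f"unsupported type: {type(value).__name__}")

def cid_of(value):
    """CIDv1: 0x01, codec 0x71 (dag-cbor), multihash 0x12 0x20 (sha2-256)."""
    digest = hashlib.sha256(cbor(value)).digest()
    raw = bytes([0x01, 0x71, 0x12, 0x20]) + digest
    return "b" + base64.b32encode(raw).decode("ascii").lower().rstrip("=")
```

Because the hash covers the encoded AST rather than the source text, `cid_of(parse("(card   (title 1))"))` and `cid_of(parse("(card (title 1))"))` are the same string, which is the format-immunity property described above.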
3. Kernel surface (fixed — get this right)
The kernel is the only thing that's hard to change later. Everything else is in registries. Two envelope shapes plus five operations.
3.1 Activity envelope
{ id, type, actor, published,
to, cc, audience-extras,
object | target | origin | result, # AP slots, opaque to kernel
capabilities-required: [...], # so receivers can refuse cleanly
proofs: [...], # OTS, on-chain, multi-sig — all opaque
signature: { key-id, algorithm, value, covered-fields } }
3.2 Object envelope
{ id, type, cid, media-type,
where: inline | cid | url,
content?, link? } # only one populated based on `where`
3.3 Kernel verbs
The only verbs implemented directly by the kernel:
- Append signed activity to outbox (after envelope check + sig verify + validator pipeline).
- Verify signature against actor's published keys, time-aware (which key was active at published).
- Fetch by id or by cid.
- Receive at inbox (verify + dispatch to registered handlers).
- Replay log to rebuild registries on boot.
Everything else is registry-resolved.
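As a rough illustration of how small that surface is, the append, fetch, and replay operations might look like the sketch below. Everything here is an assumption for illustration: the `Kernel` class, the in-memory log, the required-field list, and the `verify_signature` and validator callables (which stand in for registry-resolved SX) are not part of the design.

```python
REQUIRED_FIELDS = ("id", "type", "actor", "published", "signature")

class Kernel:
    """Toy in-memory kernel: envelope check, sig verify, validators, append."""

    def __init__(self, verify_signature, validators=()):
        self.verify_signature = verify_signature  # hypothetical: activity -> bool
        self.validators = list(validators)        # hypothetical: activity -> bool
        self.log = []                             # append-only activity log
        self.by_id = {}                           # id -> activity, for fetch

    def append(self, activity):
        missing = [f for f in REQUIRED_FIELDS if f not in activity]
        if missing:
            raise ValueError(f"envelope missing fields: {missing}")
        if not self.verify_signature(activity):
            raise ValueError("signature verification failed")
        for validator in self.validators:
            if not validator(activity):
                raise ValueError("validator rejected activity")
        self.log.append(activity)
        self.by_id[activity["id"]] = activity
        return activity["id"]

    def fetch(self, activity_id):
        return self.by_id.get(activity_id)

    def replay(self, fold, initial_state):
        """Rebuild any derived state by folding over the whole log."""
        state = initial_state
        for activity in self.log:
            state = fold(state, activity)
        return state
```

Note that nothing in the sketch inspects `type`: the kernel treats the AP slots as opaque, and meaning is supplied entirely by whatever `fold` is passed to `replay`.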
4. Registries
Each registry has a default-populated set (loaded from genesis-bundled CIDs) and
accepts new entries via Define* activities. Default entries themselves are SX
artifacts — versioning, audit, replacement work the same way as user content.
| Registry | Bootstrap defaults | Extended by |
|---|---|---|
| Activity types | Create, Update, Delete, Announce | DefineActivity{type, schema-sx, semantics-sx} |
| Object types | SXArtifact, Note, Image, Tombstone | DefineObject{type, schema-sx, render-hint} |
| Validators | envelope shape, signature, type-schema | DefineValidator{applies-to, predicate-sx} |
| Projections | identity, by-type, by-cid, by-actor, actor-state, define-registry, audience-graph, by-object | DefineProjection{name, fold-sx, query-sx} |
| Codecs | dag-cbor, raw, dag-json | DefineCodec{multicodec, encode-sx, decode-sx} |
| Hash algorithms | sha2-256 | multihash table — agile by spec |
| Transports | http-inbox-push | DefineTransport{name, deliver-sx, receive-sx} |
| Audience predicates | Public, Followers, direct | DefineAudience{name, member-of-sx} |
| Subscription types | Follow (AP-standard) | DefineSubscription{name, schema-sx, match-sx, delivery} |
| Proof types | (none) | DefineProof{type, attach-sx, verify-sx} |
| Storage backends | files-on-disk | DefineStorage{where-tag, put-sx, get-sx} |
| Triggers | (none) | DefineTrigger{when-subscription, then-sx, cascade-limit} |
| Signature suites | rsa-sha256 (AP-compatible) | DefineSigSuite{name, sign-sx, verify-sx} |
| Application bundles | (none) | DefineApplication{name, subscriptions, triggers, projections, storage} |
Adding Pin, Endorse, Supersede, Test, Build, Compose, etc. later is just
publishing DefineActivity artifacts — no kernel diff, no redeploy required if
registries are hot.
5. The meta-level
A DefineActivity is itself an AP Create activity over an SXArtifact of a
specific type:
(activity 'Create
:object {:type "DefineActivity"
:name "Pin"
:schema (fn (act)
(and (string? (-> act :object :path))
(cid? (-> act :object :cid))))
:semantics
'(fn (act state)
(assoc-in state [:pins (-> act :object :path)]
(-> act :object :cid)))})
When the kernel receives an activity with type: "Pin" it looks up the registered
semantics from a DefineActivity{name: "Pin"} artifact, runs the SX, projects the new
state. The semantics are themselves content-addressed and federated — every receiver
runs the same code.
Same pattern handles DefineProjection, DefineValidator, etc. The substrate is
genuinely self-extending.
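The lookup-then-run step can be sketched as follows. This is an illustration only: Python callables stand in for the published SX semantics, and `apply_activity` and `pin_semantics` are hypothetical names. The point it shows is that the dispatcher has no branch that mentions Pin.

```python
def pin_semantics(act, state):
    """Mirror of the Pin semantics above: record path -> CID, no mutation."""
    pins = dict(state.get("pins", {}))
    pins[act["object"]["path"]] = act["object"]["cid"]
    return {**state, "pins": pins}

def apply_activity(registry, state, act):
    """Return (registry, state) after one activity; inputs are not mutated."""
    obj = act.get("object")
    is_define = (
        act["type"] == "Create"
        and isinstance(obj, dict)
        and obj.get("type") == "DefineActivity"
    )
    if is_define:
        # A definition only extends the registry; projected state is untouched.
        return {**registry, obj["name"]: obj["semantics"]}, state
    semantics = registry.get(act["type"])
    if semantics is None:
        # Unknown verb: still in the log, just not projected yet.
        return registry, state
    return registry, semantics(act, state)
```

Replaying a log that contains the definition followed by a Pin yields the pinned state; replaying a Pin that arrives before its definition leaves the state unchanged until the log is refolded with the definition present.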
6. Verbs
6.1 Bootstrap verbs (milestone 1)
The substrate exposes POST /activity (not POST /publish) — generalised entry
point that takes any well-formed AP activity, validates, signs, appends to outbox.
(publish sx) is sugar at the SX layer for Create{SXArtifact}.
Day-one verbs (cost ~zero once /activity exists):
- Create — the publish primitive.
- Update — supersede a previous activity (correct metadata, change a path mapping). Distinct from "publishing new content" — new content is always a new Create with a new CID.
- Delete — tombstone. AP-native; readers honour it.
- Announce — boost another actor's artifact into your outbox. Comes free.
- Subscribe — generalised subscription verb (parallel to publish/Create). Wraps any registered DefineSubscription type. Follow is the standard AP Subscribe{Follow{actor: ...}} for wire compatibility. See §18.
- Unsubscribe — Undo of a prior Subscribe. Same shape as AP Undo{Follow}.
6.2 Custom verbs (designed-for, defined later)
Substrate accepts these from day one (any signed activity can be appended); semantics
projected once DefineActivity artifacts exist.
- Pin — assign domain:path/name → CID. The future name-resolution layer made of activities. Each pin is signed; the resolver replays the outbox to compute current state.
- Endorse (modelled on Like/Approve) — third-party signature on a CID. Web-of-trust style code review without central authority.
- Supersede — "CID A replaces CID B". Stronger than Update; readers can chase the chain.
- Test — published assertion that running CID A under conditions X yields result Y. Test-as-artifact, federated.
- Build — links a source CID to a compiled-output CID, with provenance.
- Compose — derived artifact citing input CIDs. Provenance graph in the outbox itself.
- Note (AP-native) — comments / reviews / discussion attached to a CID.
- Follow / Undo(Follow) — subscribe to another instance's outbox.
The pattern that matters: your outbox isn't just "things published," it's an append-only log of every assertion this actor makes about the SX universe.
7. Capability discovery
Two pieces:
- GET /.well-known/sx-capabilities — JSON listing every registered activity-type, object-type, codec, transport, sig-suite, proof-type. Each with the CID of the Define* artifact that introduced it. Peers can diff capabilities before federating.
- capabilities-required field on activities — sender declares "this needs Pin semantics + dag-cbor codec." Receivers without those capabilities return a clean 422 referencing the missing CIDs; sender knows whether to replay-and-deliver the bootstrapping Define* artifacts first.
Federation degrades gracefully across instances at different versions.
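The receiver-side check reduces to a set difference. A minimal sketch, assuming `capabilities-required` carries the CIDs of the needed Define* artifacts and that the receiver keeps a set of the CIDs it has loaded; the response body shape and the 202 status for the accepted case are assumptions, since the design only fixes the 422.

```python
def missing_capabilities(required, local):
    """Return the required capability CIDs this instance lacks, in order."""
    return [cid for cid in required if cid not in local]

def receive(activity, local):
    """Return (status, body) for an inbound activity."""
    missing = missing_capabilities(activity.get("capabilities-required", []), local)
    if missing:
        # Clean refusal: tell the sender exactly which Define* CIDs to deliver.
        return 422, {"missing": missing}
    return 202, {}
```

A sender that gets `{"missing": [...]}` back can deliver those Define* artifacts first and then retry the original activity.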
8. Axes of flexibility (all designed-for)
- Object types beyond SXArtifact — Note, Article, Image, Video, Question, Event, etc. via the object-type registry.
- Storage tier per-object — where: inline | cid | url. Tiny things inline; big things to IPFS; legacy stuff URL-linked. Migrating storage backends doesn't migrate the substrate.
- Multihash + multicodec agility — sha2-256 + dag-cbor by default; substrate accepts blake3, raw, dag-json, dag-pb, etc.
- Multi-key actors — publicKeys array always; per-key purpose; multiple key types (RSA for AP wire compat, Ed25519 modern). See §9.
- Audience / visibility — AP-native to, cc, bto, bcc. Public, followers, direct, unlisted. Custom audiences via DefineAudience.
to,cc,bto,bcc. Public, followers, direct, unlisted. Custom audiences viaDefineAudience. - Outbox-as-database — no source-of-truth other than the log. Projections are recomputable views.
- Programmable activities — activities can carry SX. Reactive federation, conditional pins, automated propose/test/release pipelines, all expressed as AP activities.
- Federation transport pluggable — outbox is canonical; how peers exchange is pluggable (HTTP push, pull, libp2p, polling).
- Optional timestamp proofs — every activity has an attachable proofs slot. OpenTimestamps, on-chain merkle commit, third-party TSA all slot in without changing activity semantics.
Explicitly not pursuing for MVP:
- Schema-version negotiation (premature; @context handles extension).
- Configurable conflict-resolution per actor (last-signed-wins, log preserved for audit).
- Verb-specific kernel handlers (other than Create's "compute CID, store body").
9. Identity & actor lifecycle
9.1 Actor doc shape
{
"@context": ["https://www.w3.org/ns/activitystreams",
"https://w3id.org/security/v1",
"https://next.rose-ash.com/ns/fed-sx/v1"],
"type": "Person", // or Service, Group, Application
"id": "https://next.rose-ash.com/actors/giles",
"preferredUsername": "giles",
"inbox": "https://next.rose-ash.com/actors/giles/inbox",
"outbox": "https://next.rose-ash.com/actors/giles/outbox",
"followers": "...",
"following": "...",
"publicKeys": [ // ARRAY from day one — never `publicKey`
{ "id": "...#key-2026-05",
"type": "RsaVerificationKey2018",
"owner": "<actor-id>",
"publicKeyPem": "...",
"purpose": ["sign-activity", "sign-http"],
"created": "2026-05-14T...",
"expires": null,
"supersedes": null,
"supersededBy": null },
{ "id": "...#key-ed25519-2026-05",
"type": "Ed25519VerificationKey2020",
"owner": "<actor-id>",
"publicKeyMultibase": "z6Mk...",
"purpose": ["sign-activity"],
"created": "2026-05-14T..." }
],
"capabilities": "https://.../actors/giles/capabilities", // what verbs they speak
"alsoKnownAs": ["did:web:rose-ash.com:giles", ...], // bridge to DID, AP migration
"movedTo": null // set on Move
}
Key shape decisions:
- publicKeys array always. Single-key actors have an array of length 1. The AP-standard publicKey field is also served, mirroring the first array element, for back-compat with vanilla AP servers (Mastodon etc. ignore the array).
- Per-key purpose — separates signing weight. Day-to-day publish key vs. high-value key for Pin/Endorse vs. delegated machine key. Validators can require specific purposes per activity type (registry-driven).
- Multiple key types — RSA for AP wire compat, Ed25519 for everything else (smaller, faster, modern). Sig suite registry decides which suites are accepted.
- supersedes / supersededBy — keys form a chain, not a snapshot. Old activities still verify against historical keys.
9.2 Key rotation
Key rotation is itself an activity, signed by the old key (or a recovery key):
(activity 'Update
:object actor-id
:patch {:add-publicKey new-key
:supersede {old-key-id new-key-id}})
Kernel:
- Fetches actor's current state (a projection over their own outbox).
- Verifies activity is signed by a key with purpose: rotate-key (or any active key, if registry allows).
- Appends. The actor-state projection now has the new key.
Old activities still verify because the projection retains the historical key with
supersededBy set — sig verification looks up "what keys were active at activity
timestamp T."
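The "keys active at T" lookup can be sketched over the key shape from §9.1. One assumption is made loudly here: the key records carry no explicit superseded-at time, so this sketch treats a key as superseded from the moment its successor's `created` timestamp passes. A real implementation might instead use the `published` time of the rotation activity. The function name `keys_active_at` is hypothetical.

```python
from datetime import datetime

def keys_active_at(keys, when):
    """Return ids of keys valid at ISO timestamp `when` (with UTC offset)."""
    by_id = {key["id"]: key for key in keys}
    t = datetime.fromisoformat(when)
    active = []
    for key in keys:
        if datetime.fromisoformat(key["created"]) > t:
            continue  # not yet created at T
        expires = key.get("expires")
        if expires and datetime.fromisoformat(expires) <= t:
            continue  # already expired at T
        # Assumption: supersession takes effect when the successor is created.
        successor = by_id.get(key.get("supersededBy"))
        if successor and datetime.fromisoformat(successor["created"]) <= t:
            continue
        active.append(key["id"])
    return active
```

Signature verification for an old activity would then accept only a `key-id` that appears in `keys_active_at(keys, activity["published"])`.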
9.3 Key recovery / loss
- Recovery key — separate key at actor creation, never used except to rotate. Stored offline. purpose: ["recover"]. Validator allows Update{actor, patch: rotate-all-keys} if signed by a recovery key.
- Social recovery — designate N trusted actors, M-of-N can co-sign a recovery Update. Implemented as a DefineValidator extension; multi-sig slot in proofs makes it possible without changing the envelope.
- Total loss — if both signing and recovery keys are gone, the actor is dead. They publish a new actor with alsoKnownAs: <old-actor-id> from a fresh key. Followers can choose to re-follow but there's no cryptographic continuity.
9.4 Migration (Move)
AP-native:
(activity 'Move
:object old-actor-id
:target new-actor-id)
Receivers update their follow lists. New actor's alsoKnownAs must include old
actor — bidirectional handshake prevents hijacking.
For fed-sx, Move should also carry an outbox migration hint (CID of an export bundle)
so receivers can re-anchor projections without re-fetching activity-by-activity.
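The bidirectional handshake amounts to checking both directions before touching a follow list. A minimal sketch, assuming the Move is published by the old actor itself (so `actor` equals `object`) and that the receiver has already fetched the new actor's document; `move_is_acceptable` is a hypothetical name, and signature verification is assumed to have happened earlier in the pipeline.

```python
def move_is_acceptable(move, new_actor_doc):
    """True only if the old actor points forward and the new actor points back."""
    return (
        move.get("type") == "Move"
        # Assumption: the old actor signs its own Move.
        and move.get("actor") == move.get("object")
        # Forward link: the Move targets the document we fetched.
        and move.get("target") == new_actor_doc.get("id")
        # Back link: the new actor claims the old identity.
        and move.get("object") in new_actor_doc.get("alsoKnownAs", [])
    )
```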
9.5 Subordinate actors / delegation
Two patterns supported:
- Service actors (AP-native type: Service): bots, build servers, test runners. Their own keys, their own outboxes, but attributedTo a parent actor.
- Capability tokens: parent publishes Authorize{actor: child, capabilities: [...], expires: ...} signed by parent. Child publishes activities normally with their own key; receivers verify the capability chain when child invokes an authority they don't own outright. Useful for: temporary publish access, delegated Pin rights for a specific path prefix, multi-device.
Both work without new kernel mechanism — just activities.
9.6 Implications
- Sig verification is timestamp-aware. Verifying an old activity needs the key state at the time it was published — actor-state projection must support time-travel queries.
- Inbox doesn't trust keyId blindly. Fetches actor doc, projects current key state, checks key was valid at published.
- Cross-instance identity via alsoKnownAs and DIDs. Don't depend on DIDs but slot them in for Bluesky-bridge, Solid-bridge, etc.
10. Projection model
The architectural commitment: state is what you get when you fold pure SX over the log. No DB-of-record. Everything queryable is a projection.
10.1 What a projection is
A DefineProjection activity registers four things:
(activity 'Create
:object {:type "DefineProjection"
:name "actor-state"
:initial-state {} ; pure SX value
:fold (fn (state activity) ; pure SX
(case (:type activity)
"Create" (when (= "Person" (-> activity :object :type))
(assoc state (-> activity :object :id) (:object activity))) ; key by actor id, not activity id
"Update" (apply-patch state activity)
"Move" (set-moved state activity)
state))
:snapshot-codec "dag-cbor"
:indexes [{:by :id} {:by :preferredUsername}]})
- name — query handle. Unique per actor; collisions resolved by CID + supersession.
- initial-state — pure SX value used as state-zero.
- fold — pure SX function (state activity) → state. The only thing the kernel calls.
- indexes — optional hint for materializing lookup paths.
The CID of the DefineProjection artifact is the projection's identity. Two instances
running the same projection are running the same CID's fold over the same log slice
— equivalence is decidable.
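For readers more at home outside SX, the same shape can be written as an ordinary reduce. This is a loose Python transliteration, not the projection itself: the Update branch is omitted because `apply-patch` is not defined in this document, the Move handling is a guess at what `set-moved` does, and state is keyed by actor id as the bootstrap table in §10.3 describes.

```python
from functools import reduce

def actor_state_fold(state, activity):
    """Pure fold step: returns a new state, never mutates the old one."""
    kind = activity["type"]
    obj = activity.get("object")
    if kind == "Create" and isinstance(obj, dict) and obj.get("type") == "Person":
        return {**state, obj["id"]: obj}
    if kind == "Move" and isinstance(obj, str) and obj in state:
        # Hypothetical stand-in for set-moved: record the migration target.
        return {**state, obj: {**state[obj], "movedTo": activity["target"]}}
    return state

def project(fold, initial_state, log):
    """State is the left fold of `fold` over the log."""
    return reduce(fold, log, initial_state)
```

Because `project` depends only on its arguments, two instances given the same fold and the same log slice compute equal states, which is what makes the equivalence check possible.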
10.2 The fold contract — purity, determinism, gas
The fold function must be pure and deterministic. Non-negotiable; it's what makes cross-instance equivalence and replay possible.
- No IO. No HTTP, no file access, no DB calls, no clock. The activity carries its own published timestamp.
- No randomness. No host-seeded PRNG. (If pseudo-randomness is needed, seed from the activity's CID — deterministic across hosts.)
- No mutation outside the returned state.
- Bounded execution. Each fold call gets a gas budget (default tunable, e.g. 100k CEK steps). Exceeding it is a hard failure.
Enforced at the SX evaluator level by running folds in a sandboxed environment with the IO platform stripped to nothing. Same sandbox model applies to validators and trigger semantics.
Cross-host equivalence guarantee: for the same projection CID + same activity log slice, every conforming SX host (JS, OCaml, Python, Haskell-on-SX, …) must produce a state value with the same canonical CID. Tested via the spec test suite.
10.3 Bootstrap projections
The kernel cannot start without some projections, because the kernel itself uses them. Baked into the genesis bundle (see §11), superseded only by deliberate kernel-version upgrades.
| Projection | What it computes | Used by |
|---|---|---|
| activity-log | Identity — every activity, indexed by id and CID | Everything |
| by-type | type → ordered list of activity-CIDs | Most queries |
| by-actor | actor-id → ordered list of activity-CIDs | Per-actor outbox view |
| by-object | object-CID → list of referencing activity-CIDs | "Who pinned this?" |
| actor-state | actor-id → current actor doc with key history | Sig verification (kernel) |
| define-registry | kind+name → currently-active Define* CID | All other Define* lookups |
| audience-graph | actor → followers/following | Federation push |
define-registry is the bootstrap chicken-and-egg: it's the projection that knows
which projections (and validators, codecs, etc.) are currently active. Kernel ships
with it hardcoded; once running, every other projection (including a future replacement
of define-registry itself) is a regular DefineProjection superseding it.
10.4 Snapshotting
Replaying the entire log on every restart is unacceptable past day one.
- Snapshot = (activity-tip-CID, projection-state, projection-CID) tuple, dag-cbor encoded, content-addressed.
- Snapshot rule — every K activities (default 1000) and every T seconds (default 60), serialize, hash, store on disk.
- Resume — on startup, find latest snapshot for each (projection-CID, log-tip), load state, fold forward.
- Snapshot CID is verifiable — anyone with the same log slice and projection-CID can recompute and check the CID matches. This is the cross-instance agreement proof.
Snapshots are themselves publishable as activities (Create{Snapshot}): an instance
can publish "here's my computed state for projection X at log-tip Y, CID Z." Other
instances can fetch and use as a starting point. Federated state sharing falls out of
federated activities.
Snapshots are pruning-friendly: keep latest + snapshots referenced by published
Create{Snapshot} activities; everything else is GC-able.
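The agreement check rests on the snapshot tuple having one canonical byte form. The sketch below illustrates the idea with a plainly swapped-in technique: sorted-key JSON hashed with SHA-256 stands in for dag-cbor and a real CID, purely to keep the example within the standard library. The function name `snapshot_id` and the field names are hypothetical.

```python
import hashlib
import json

def snapshot_id(log_tip, projection_cid, state):
    """Hash a canonical serialization of (log tip, projection, state).

    Stand-in only: sorted-key JSON replaces dag-cbor, a hex digest replaces a CID.
    """
    body = json.dumps(
        {"tip": log_tip, "projection": projection_cid, "state": state},
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()
```

Two instances that built the same state in a different insertion order still get the same identifier, so comparing published identifiers is enough to detect agreement or divergence.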
10.5 Reprojection on definition change
When DefineProjection{name: "actor-state"} is superseded by a new CID with a
different fold:
- define-registry projection sees the supersession; its state advances.
- New projection materialized alongside the old one — both kept live during migration.
- New projection runs in catch-up mode: replay from genesis (or from deepest compatible snapshot).
- When new projection catches up to log tip, queries cut over. Old projection state can be retired.
- Snapshots of old version stay around as long as referenced (e.g. for time-travel queries against historical state under old semantics).
Changing a projection definition is safe and online. Cost: temporary state duplication during catch-up. Slow folds → slow migrations, but never breakage.
For projections too expensive to fully reproject, Update{DefineProjection} can
declare migrationHint: <fn from old-state to new-state> — opt-in, used at migrator's
risk.
10.6 Time-travel queries
Folds are deterministic functions of (initial-state, activity-list-prefix).
Time-travel is fold-up-to:
- state-as-of(projection, activity-id-or-timestamp) → walk to requested point, return state.
- Snapshots act as accelerators (resume from nearest snapshot ≤ target).
- Used by sig verification ("what keys did this actor have when this activity was signed?"), audit, "what did we believe last Tuesday."
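Fold-up-to with snapshot acceleration can be sketched as a binary search followed by a short replay. This assumes the target has already been resolved to a log index and that snapshots are held as a list of `(index, state)` pairs sorted by index, where `state` is the fold of the first `index` activities; both representations are assumptions for illustration.

```python
import bisect

def state_as_of(fold, initial_state, log, snapshots, upto):
    """Return the state after folding log[:upto].

    snapshots: list of (index, state) sorted by index (hypothetical layout).
    """
    indexes = [index for index, _ in snapshots]
    # Find the nearest snapshot at or before the target.
    pos = bisect.bisect_right(indexes, upto)
    start, state = snapshots[pos - 1] if pos else (0, initial_state)
    for activity in log[start:upto]:
        state = fold(state, activity)
    return state
```

With no usable snapshot the function degrades to a replay from genesis, so snapshots affect cost only, never the result.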
10.7 Projection composition
Projections do not directly read each other's state during folding. Preserves locality and parallelism — every projection runs independently against the same log.
Composition via:
- Query time — (query (projection actor-state) ...) joins are SX expressions over multiple projection states.
- Republishing as activities — a projection that exposes its state as input to others publishes Create{Snapshot} periodically. Downstream projections fold over those.
Direct cross-projection reads during fold introduce ordering, cycles, cache-invalidation problems we don't need.
10.8 Querying
Three layers:
- Raw projection state — GET /projections/<name>?at=<timestamp> returns dag-cbor (also JSON for tooling). Large states paginated by index.
- SX queries — POST /query with an SX expression that runs against one or more projection states in pure mode. Equivalent to Datalog/GraphQL.
- Materialized indexes — declared on projection (indexes: field). Kernel maintains as side-tables for O(log n) lookup.
Real-time: clients GET /projections/<name>/subscribe (SSE), receive deltas as
activities land. Delta is (old-state, new-state, applied-activity-CID); clients can
verify by re-folding.
10.9 Lag, async, concurrency
- Append is sync; projection is async. POST /activity returns once activity is durably in the log. Projections run in a separate worker pool; query results carry projected-up-to so callers know whether the latest write is visible.
- One worker per projection. Folds are sequential, but projections run in parallel with each other.
- Sync option — POST /activity?wait-for=projection-name blocks until the named projection has folded the new activity. Use sparingly.
10.10 Failure modes
| Failure | Response |
|---|---|
| Gas exhaustion | Activity tagged projection-failed for this projection. State unchanged. Operator alert. |
| SX runtime error (assertion, type mismatch) | Same as gas: activity skipped, error logged, state unchanged. |
| Schema violation | Caught earlier in validation pipeline, never reaches projection. |
The log itself is always written successfully if it passes envelope + signature + validator checks. Projection failures don't gate appending — that would couple writes to arbitrary user-defined code.
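The "skip, tag, keep state" behaviour in the table can be sketched as a fold driver that isolates each step. This is an illustration under assumptions: a Python exception stands in for an SX runtime error or gas exhaustion, and `fold_log` and the failure record shape are hypothetical.

```python
def fold_log(fold, initial_state, log):
    """Fold the whole log; a failing activity is skipped and recorded."""
    state, failed = initial_state, []
    for activity in log:
        try:
            state = fold(state, activity)
        except Exception as err:
            # State stays at its previous value; the activity remains in the log.
            failed.append({"activity": activity["id"], "reason": type(err).__name__})
    return state, failed
```

The driver never raises, so one bad activity cannot stall the projection, and the `failed` list is what an operator alert would be built from.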
10.11 Operational implications
- Projection determinism is the linchpin. If JS and OCaml ever produce different state for the same log + projection, federation cracks. Spec test suite must cover projection equivalence across hosts as a first-class requirement.
- Snapshots are eventual consensus. Two instances publish Create{Snapshot} for the same log+projection; if their CIDs match, they agree without coordination.
- Kernel reads its own projections. actor-state for sig verification; define-registry for every Define* lookup. Startup sequence must bootstrap these before serving traffic.
- Reprojection cost is real. Heavy projection changes mean replaying from genesis. Encourage incremental schemas (small per-activity work, idempotent updates) and provide profiling.
11. Sandbox & determinism
The runtime contract that makes folds (and validators, triggers, semantics) safe to execute, and that guarantees every conforming SX host computes the same state from the same log.
11.1 Three sandbox levels
Different registry entries need different power. We define three nested execution modes; the registry entry declares which mode it requires.
| Mode | Used by | IO | Clock | Random | Determinism |
|---|---|---|---|---|---|
| pure | folds, validators, audience predicates, semantics, trigger when-sx | none | activity's own published only | seeded from activity CID only | required across hosts |
| crypto | sig suite verify, codec encode/decode | crypto primitives only | none | sign-only secure RNG | required across hosts (verify); single-host (sign) |
| effectful | storage backends, transports, trigger then-sx, some proof verifiers | per-capability grant only | host clock | host RNG | not required; single-host |
Default mode is pure. The other two are opt-in at registration time, and the registration is itself a signed activity — anyone can audit which extensions claim which powers.
11.2 Pure sandbox (the load-bearing one)
This is the mode every projection fold runs in. It must produce identical results on every conforming SX host, every time.
Allowed:
- All spec primitives in spec/primitives.sx that don't perform IO (arithmetic, comparison, predicates, string ops, collection ops, dict ops, format helpers).
- The activity being processed (full envelope), as the function's argument.
- The current state value, as the function's argument.
- A small set of fed-sx-specific deterministic primitives:
  - (activity-cid act) → CID of the activity envelope
  - (activity-time act) → ISO timestamp from published
  - (actor-state-as-of state-snapshot actor-id activity-time) → if the projection has been declared dependent on actor-state (see §10.7), reads from a snapshot of that projection at the activity's timestamp
  - (seeded-rng cid) → deterministic PRNG seeded from a CID, returns a stream of uniform values
Forbidden:
- All IO: HTTP, file, network, stdin/stdout, environment.
- Wall-clock access. The host's now is not in scope; the only time available is (activity-time act).
- Host-seeded randomness. Only seeded-rng (CID-derived) is available.
- Mutation outside the returned value. Enforced by the SX evaluator's lack of ambient mutable bindings; folds may use local let and mutation within their own closure but cannot reach outside.
- Calling other registry entries by name. Composition happens at query time, not fold time (see §10.7).
Enforced by: evaluator runs the fold with the IO platform stripped to nothing.
The fed-sx kernel constructs a pure-platform (no fetch, no query, no action, no
DOM, no storage) and uses it as the sole evaluator platform when calling the fold.
Any IO primitive call raises a hard error caught as a fold failure.
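The stripped-platform idea is essentially "the primitive table is the sandbox". A minimal sketch, assuming primitives are looked up by name in a dictionary handed to the fold; the class and function names, the three sample primitives, and the failure record shape are all illustrative rather than taken from the SX evaluator.

```python
class SandboxViolation(Exception):
    """Raised when pure-mode code reaches for a primitive it was not given."""

def make_pure_platform():
    # Only deterministic, IO-free primitives are present. No fetch, no storage.
    return {
        "add": lambda a, b: a + b,
        "concat": lambda a, b: a + b,
        "activity-time": lambda act: act["published"],
    }

def call_primitive(platform, name, *args):
    if name not in platform:
        raise SandboxViolation(name)
    return platform[name](*args)

def run_fold(fold, state, activity):
    """Run `fold` against the pure platform; on violation keep the old state."""
    platform = make_pure_platform()
    try:
        return fold(platform, state, activity), None
    except SandboxViolation as err:
        failure = {"reason": "sandbox-violation", "detail": str(err)}
        return state, failure
```

Absence is the enforcement mechanism here: there is no deny-list to keep up to date, because a primitive that was never placed in the table simply cannot be called.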
11.3 Crypto sandbox
Sig suites and codec encode/decode need hash + crypto + encoding primitives but nothing else. They're still deterministic across hosts (verify case) but get a narrower platform than effectful, wider than pure.
Additional primitives over pure:
- (sha2-256 bytes), (sha3-256 bytes), (blake3 bytes), …
- (rsa-verify pubkey msg sig), (ed25519-verify pubkey msg sig), …
- (rsa-sign privkey msg), (ed25519-sign privkey msg) — sign-only; requires the caller to supply a secure RNG handle (which is not in pure mode)
- (cbor-encode value), (cbor-decode bytes) — for codecs implementing CBOR variants
- (base32-encode bytes), (base58btc-encode bytes), (multibase-encode tag bytes)
- (multihash-encode tag digest-bytes), (multihash-decode bytes)
- (cid-encode codec mhash), (cid-decode bytes)
Sign vs verify: verify is pure (deterministic). Sign is not — it consumes randomness. fed-sx draws a clean line: signing happens outside registry-entry SX (it's an operation the kernel/runtime performs on behalf of the actor with their private key); registry SX only ever verifies. This keeps the pure↔crypto distinction tractable.
11.4 Effectful sandbox
Storage backends, transports, trigger then-sx, and proof verifiers that need the
network (e.g. blockchain RPC for on-chain proof verification) all need real IO.
These are not used to compute projected state; they're how the substrate interacts
with the outside world.
Capability-granted primitives. The registration activity declares the capabilities the entry needs:
(activity 'Create
:object {:type "DefineStorage"
:where-tag "ipfs"
:capabilities [{:type "http-client" :allowlist ["http://localhost:5001/*"]}
{:type "fs-read" :path-prefix "/var/cache/fed-sx/ipfs/"}
{:type "fs-write" :path-prefix "/var/cache/fed-sx/ipfs/"}]
:put-sx (fn (cid bytes) ...)
:get-sx (fn (cid) ...)})
Capability types (initial set; extensible):
- http-client with allowlist (URL prefix patterns)
- http-server with path-prefix (mounts a sub-handler)
- fs-read / fs-write with path-prefix (chroot-style)
- subprocess with command-allowlist
- clock-read (wall clock; granted if registry entry needs to timestamp something)
- random-bytes (host CSPRNG)
No ambient authority. Default capability set is empty; every capability is explicit, declared, signed, and auditable. A peer can refuse to load a registry entry whose capability claim is unacceptable to them.
Capabilities are content-addressed. Each capability descriptor has a CID. The substrate maintains a registry of "capability CIDs that this instance trusts to honour" — operator policy, not protocol.
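A host-side grant check for two of these capability types might look like the sketch below. It assumes the descriptor shape from the DefineStorage example above, treats a trailing `*` in an allowlist entry as a plain prefix match (the exact pattern language is not specified here), and normalizes paths before comparing so that `..` segments cannot escape the prefix. The function names are hypothetical.

```python
import posixpath

def http_allowed(capabilities, url):
    """True if some http-client capability's allowlist covers `url`."""
    for cap in capabilities:
        if cap["type"] != "http-client":
            continue
        for pattern in cap["allowlist"]:
            if pattern.endswith("*"):
                if url.startswith(pattern[:-1]):
                    return True
            elif url == pattern:
                return True
    return False

def fs_allowed(capabilities, mode, path):
    """True if `path` falls under a path-prefix granted for `mode`.

    mode is "fs-read" or "fs-write"; the path is normalized first.
    """
    normalized = posixpath.normpath(path) + "/"
    for cap in capabilities:
        if cap["type"] != mode:
            continue
        prefix = cap["path-prefix"].rstrip("/") + "/"
        if normalized.startswith(prefix):
            return True
    return False
```

Both functions return False for an empty capability list, which matches the "no ambient authority" default.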
11.5 Gas and resource accounting
Each sandbox call gets a budget:
- CEK gas — every evaluator step costs 1 unit; primitive calls cost a per-primitive amount declared in spec/primitives.sx. Default budget: 100k units per fold call. Tunable per-projection via DefineProjection.gas-limit.
- Memory ceiling — peak heap size for the fold call. Default 64 MB. Tunable.
- IO budget (effectful only) — bytes read/written and network calls per invocation, granted separately per capability.
- Wall-clock budget (effectful only) — max real-time before forced termination.
Exceeding any budget is a hard failure; the call returns an error value, the fold's state is unchanged, and the activity is tagged projection-failed for that projection.
Gas accounting is part of the spec — every conforming host must charge the same units for the same operations, so "this fold runs out of gas" is a deterministic property of the (projection, activity) pair, not a host-specific outcome.
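To see why gas exhaustion can be a deterministic property, consider a toy metered evaluator. This is not the CEK machine: it evaluates nested prefix lists, the two primitives and their costs are made up for the example, and `Meter`, `evaluate`, and `OutOfGas` are hypothetical names.

```python
class OutOfGas(Exception):
    pass

# name -> (declared gas cost, implementation); costs are illustrative only
PRIMITIVES = {
    "+": (1, lambda a, b: a + b),
    "*": (2, lambda a, b: a * b),
}

class Meter:
    def __init__(self, budget):
        self.remaining = budget

    def charge(self, units):
        if units > self.remaining:
            raise OutOfGas()
        self.remaining -= units

def evaluate(expr, meter):
    """Evaluate a literal or a list of the form [primitive-name, arg, ...]."""
    meter.charge(1)  # every evaluator step costs one unit
    if not isinstance(expr, list):
        return expr
    cost, impl = PRIMITIVES[expr[0]]
    args = [evaluate(arg, meter) for arg in expr[1:]]
    meter.charge(cost)  # plus the primitive's declared cost
    return impl(*args)
```

Evaluating `["+", 1, ["*", 2, 3]]` always consumes exactly 8 units under these costs (five steps plus 2 for `*` and 1 for `+`), so a budget of 8 succeeds and a budget of 7 fails on every host that charges the same table.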
11.6 Determinism gotchas
The pure sandbox is only as deterministic as its primitives. Worth nailing:
- Floating point. IEEE 754 binary operations are bitwise-identical across conforming hosts, but transcendentals (sin, cos, log, exp) are not, because libm implementations differ. Decision: floats are forbidden in pure mode unless the projection declares requires-deterministic-floats: true and uses only the IEEE 754 basic operations (+, -, *, /, sqrt, comparison, conversion). For exact arithmetic, use integers or rationals (fed-sx will provide a rational primitive).
- Map / dict iteration order. Must be sorted-key always in pure mode. The SX spec mandates this for for-each and map over dicts; we tighten it: pure mode forbids relying on insertion order.
- String encoding. All strings are UTF-8 NFC at ingestion; pure-mode operations use byte-level comparison after normalization. Codepoint operations (length, substring) return identical results across hosts because they operate on the normalized form.
- Integer overflow. Pure mode uses arbitrary-precision integers (the SX spec default). No undefined behaviour. Overflow is impossible.
- Equality. Structural equality (equal?) compared across hosts must yield the same result for the same canonical-CID values. Implies dict equality is order-independent (as it should be), and float equality follows IEEE 754 (NaN ≠ NaN; +0.0 = -0.0).
- Error values. When a primitive errors, the error must be representable as a dag-cbor value with a stable CID across hosts. Reserve a {:error :type ... :msg ...} shape; standard error types defined in the spec.
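Two of the rules above, NFC normalization at ingestion and sorted-key iteration, can be sketched in a few lines. The helper names are hypothetical, and sorting on the UTF-8 bytes of the normalized key is an assumption about what "byte-level comparison after normalization" would mean for dict keys.

```python
import unicodedata


def normalize_text(s: str) -> str:
    """Normalize to NFC, as required for strings at ingestion."""
    return unicodedata.normalize("NFC", s)


def sorted_items(d: dict) -> list:
    """Return dict items in sorted-key order, independent of insertion order.

    Keys are compared by the UTF-8 bytes of their NFC form (an assumption).
    """
    return sorted(d.items(), key=lambda kv: normalize_text(kv[0]).encode("utf-8"))
```

With these helpers, a decomposed "e" plus combining acute accent normalizes to the single precomposed codepoint, and two dicts built in different insertion orders iterate identically.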
11.7 Failure model
A pure-mode call ends in one of three terminal states:
- Success: returns a value. Fold uses it as new state.
- Sandbox violation: IO attempted, capability denied, etc. Returns a stable error value; fold's state is unchanged; activity tagged {:projection-failed :reason :sandbox-violation :detail ...}.
- Resource exhaustion: gas, memory, IO budget exceeded. Same handling as sandbox violation but with :reason :resource-exhausted.
Crypto-mode failures (e.g. invalid signature) are return values, not exceptions — verify returns boolean, sign returns either a sig or an error. This forces callers to handle failure explicitly.
Effectful-mode failures (network down, disk full) propagate to the operator as errors but never affect projected state. The substrate retries effectful operations according to the registry entry's policy (declared at registration).
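The three pure-mode terminal states can be sketched as a wrapper around a single fold step. The exception classes and the dict shape of the tag are illustrative stand-ins for the SX error values described above, not the spec's representation.

```python
class SandboxViolation(Exception):
    """IO attempted or capability denied inside the pure sandbox."""


class ResourceExhausted(Exception):
    """Gas, memory, or IO budget exceeded."""


def fold_step(fold, state, activity):
    """Apply one activity. On failure, keep the old state and return a failure tag."""
    try:
        return fold(state, activity), None
    except SandboxViolation as exc:
        reason, detail = "sandbox-violation", str(exc)
    except ResourceExhausted as exc:
        reason, detail = "resource-exhausted", str(exc)
    # Failure path: the state is returned unchanged, and the caller tags the activity.
    return state, {"tag": "projection-failed", "reason": reason, "detail": detail}
```

The caller always receives a (state, tag) pair, so a failed activity can be recorded without the projection ever observing a partially applied update.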
11.8 Conformance testing
Cross-host equivalence isn't aspirational; it's tested.
- Spec test suite ships projection equivalence tests: a corpus of (log slice, projection CID, expected snapshot CID) tuples. Every conforming SX host must produce the expected snapshot CID for each input.
- Validator equivalence tests likewise: (validator CID, activity, expected result).
- Codec equivalence tests: (codec CID, value, expected encoded bytes), in both encode and decode directions.
- Sandbox isolation tests: "this fold attempts to call
fetch; expected outcome: sandbox violation error with stable CID."
Hosts run the conformance suite to claim "fed-sx pure-mode conformance." Failures
are publishable as Test{result: failed, host: ..., projection: ...} activities —
the conformance graph itself is federated.
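A projection equivalence test could look roughly like the sketch below. Everything here is an assumption made for illustration: snapshot_id uses SHA-256 over sorted-key JSON as a stand-in for a real dag-cbor CID, and count_by_type is a toy projection, not one of the bundled ones.

```python
import hashlib
import json


def snapshot_id(state) -> str:
    """Stand-in for a snapshot CID: SHA-256 over sorted-key JSON (not real dag-cbor)."""
    data = json.dumps(state, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def run_case(fold, initial, log_slice, expected_id):
    """Fold a log slice and compare the resulting snapshot id with the expected one."""
    state = initial
    for activity in log_slice:
        state = fold(state, activity)
    actual = snapshot_id(state)
    return {"result": "passed" if actual == expected_id else "failed", "actual": actual}


def count_by_type(state, activity):
    """Toy projection: count activities per type, without mutating the old state."""
    new_state = dict(state)
    new_state[activity["type"]] = new_state.get(activity["type"], 0) + 1
    return new_state
```

A failed case carries the actual id, which is the kind of detail a published Test activity would presumably want to include.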
11.9 Operational implications
- The pure sandbox is the heart of cross-host federation. Every divergence is a spec bug or a host bug; both are caught by snapshot CID mismatches and surfaced via Test activities.
- Capability descriptors are the new audit trail. "What can the IPFS storage backend do?" is a question with a precise answer at any timestamp: the registered capability CIDs.
- Floats are mostly absent. This is unusual but defensible — most state in the substrate is ids, counts, sets, references. Numerical computation belongs in effectful registry entries (e.g. an analytics projection that publishes summaries as activities, projected by a downstream pure projection that just stores them).
- Gas is part of the protocol. Two hosts disagreeing about whether a fold runs out of gas is a conformance failure. Spec primitive gas costs are normative.
12. Bootstrap & genesis
How a fresh instance starts with no log, where the initial registry entries come from, and how the kernel evolves without bricking peers.
12.1 The genesis problem
The substrate is "everything is a Define* activity in the log." But on a fresh
instance the log is empty — so there are no Define* activities to tell the kernel
what Create means, how to verify a signature, or what dag-cbor is. Strict
turtles-all-the-way-down would deadlock startup.
Solution: the kernel ships with a baked-in genesis bundle containing the minimal set of definitions it needs to interpret its own log. The bundle is a constant of the kernel binary; its CID is hardcoded; the kernel verifies on startup that the bundle matches its hardcoded CID. After that, everything (including superseding the bundled definitions themselves) goes through the activity log.
The genesis bundle is not itself a federated artifact in the AP sense. It's the
dictionary you need before you can read any activities. Optionally, an actor can
Create{GenesisRecord} as their first published activity to advertise which genesis
they started from — informational, not load-bearing.
12.2 Genesis bundle contents
Minimal viable bundle (dag-cbor object, content-addressed):
{
  "type": "fed-sx-genesis",
  "kernel-version": "1.0.0",
  "envelope-spec": { ... }, // canonical schema for activity envelope
  "object-spec": { ... }, // canonical schema for object envelope
  "definitions": {
    "activity-types": {
      "Create": { "schema": <sx>, "semantics": <sx> },
      "Update": { "schema": <sx>, "semantics": <sx> },
      "Delete": { "schema": <sx>, "semantics": <sx> },
      "Announce": { "schema": <sx>, "semantics": <sx> }
    },
    "object-types": {
      "SXArtifact": { "schema": <sx> },
      "Note": { "schema": <sx> },
      "Tombstone": { "schema": <sx> },
      "DefineActivity": { "schema": <sx> },
      "DefineObject": { "schema": <sx> },
      "DefineProjection": { "schema": <sx> },
      "DefineValidator": { "schema": <sx> },
      "DefineCodec": { "schema": <sx> },
      "DefineTransport": { "schema": <sx> },
      "DefineAudience": { "schema": <sx> },
      "DefineProof": { "schema": <sx> },
      "DefineStorage": { "schema": <sx> },
      "DefineTrigger": { "schema": <sx> },
      "DefineSigSuite": { "schema": <sx> },
      "Snapshot": { "schema": <sx> }
    },
    "sig-suites": {
      "rsa-sha256-2018": { "verify": <sx>, "key-format": <sx> },
      "ed25519-2020": { "verify": <sx>, "key-format": <sx> }
    },
    "codecs": {
      "dag-cbor": { "encode": <sx>, "decode": <sx> },
      "raw": { "encode": <sx>, "decode": <sx> },
      "dag-json": { "encode": <sx>, "decode": <sx> }
    },
    "projections": {
      "activity-log": { "initial-state": ..., "fold": <sx> },
      "by-type": { "initial-state": ..., "fold": <sx> },
      "by-actor": { "initial-state": ..., "fold": <sx> },
      "by-object": { "initial-state": ..., "fold": <sx> },
      "actor-state": { "initial-state": ..., "fold": <sx> },
      "define-registry": { "initial-state": ..., "fold": <sx> },
      "audience-graph": { "initial-state": ..., "fold": <sx> }
    },
    "validators": {
      "envelope-shape": { "predicate": <sx> },
      "signature": { "predicate": <sx> },
      "type-schema": { "predicate": <sx> }
    },
    "audience-predicates": {
      "Public": { "member-of": <sx> },
      "Followers": { "member-of": <sx> },
      "Direct": { "member-of": <sx> }
    }
  },
  "capability-types": [ // schema for capability descriptors
    "http-client", "http-server",
    "fs-read", "fs-write",
    "subprocess", "clock-read", "random-bytes"
  ]
}
Each definition's body is SX source, not bytecode. The kernel evaluates it at
startup using the same SX evaluator user-published Define* artifacts use — there
is no privileged "native" path. The bootstrap is just SX loaded from the binary
instead of from the log.
12.3 Hardcoded CID and verification
The kernel binary contains:
- The full genesis bundle (embedded as bytes).
- The CID computed over those bytes at build time.
On startup:
- Compute the actual CID of the embedded bundle.
- Compare to the hardcoded CID.
- Mismatch → refuse to start. Either the binary has been tampered with or the build process is broken. Either way, the operator should know immediately.
- Match → proceed. Every running instance with a given kernel binary has byte-identical bootstrap state — no version drift possible within a binary.
The genesis CID is exposed at GET /.well-known/sx-capabilities so peers can see
which kernel version they're talking to.
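The startup check can be sketched as follows. SHA-256 over the raw bytes is used here as a stand-in for real CID computation (multihash plus multicodec), and the function and exception names are hypothetical.

```python
import hashlib


class GenesisMismatch(Exception):
    """Raised when the embedded bundle does not match the hardcoded digest."""


def bundle_digest(bundle: bytes) -> str:
    # Stand-in for CID computation; a real kernel would compute a multihash-based CID.
    return hashlib.sha256(bundle).hexdigest()


def verify_genesis(bundle: bytes, expected_digest: str) -> str:
    """Recompute the digest of the embedded bundle and refuse to proceed on mismatch."""
    actual = bundle_digest(bundle)
    if actual != expected_digest:
        raise GenesisMismatch(f"expected {expected_digest}, got {actual}")
    return actual
```

At build time the pipeline would call bundle_digest once and embed the result; at startup the kernel would call verify_genesis and treat GenesisMismatch as fatal.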
12.4 Fresh instance startup sequence
1. Load and verify genesis bundle (panic on mismatch)
2. Parse all definition SX sources, instantiate evaluator closures
3. Initialize registries from definitions (in the order: codecs → sig-suites →
validators → object-types → activity-types → audience-predicates → projections)
4. Open log file (create if missing)
5. Replay any existing log: for each activity, validate, then fold into each
projection (resuming from snapshots where available)
6. Load or generate actor keypair (filesystem path from config)
7. If actor has never published a Create{Person} for itself, generate and append
one as the first activity of this instance's outbox
8. Initialize HTTP server, wire routes
9. Open inbox: start accepting federated activities
10. Mark instance as ready
Steps 1-3 are the bootstrap. Step 5 is replay-and-project. Step 7 is the "actor genesis" — every instance has at least one local actor; it publishes itself as its first activity, and that activity (signed by the actor's own key) anchors all subsequent activity from that actor.
12.5 First activity — actor creation
Every fresh actor's outbox starts with:
(activity 'Create
  :id "https://next.rose-ash.com/actors/giles/activities/<uuid>"
  :actor "https://next.rose-ash.com/actors/giles"
  :published "<iso-timestamp>"
  :to ["https://www.w3.org/ns/activitystreams#Public"]
  :object <full actor doc with publicKeys array>
  :signature <signed by the new key over the activity envelope>)
Self-signed: the activity introduces the key it's signed with. Verifiers read the actor doc embedded in the activity, find the key, and verify the signature over the activity envelope. This is trust-on-first-encounter for a new actor, the same model AP uses.
The kernel emits this automatically on first startup if the actor has no prior
activity. Subsequent actor changes (key rotation, profile updates) are Update
activities signed by an existing key.
12.6 Joining federation
A new instance has no peers initially. Discovery is operator-driven for v1:
- Operator configures one or more peer URLs (or a well-known seed list).
- Instance fetches peer's actor doc and /.well-known/sx-capabilities.
- Instance verifies it can interpret the peer's activities (envelope compatible, sig suites overlap). Reports incompatibilities to operator.
- If compatible, instance follows peer's primary actor (POST /inbox with a Follow activity).
- Peer streams or backfills outbox to this instance.
- Activities arrive, validate, fold into local projections.
Discovery beyond manual config (e.g. peer recommendations, federation directories) is a v2 concern.
12.7 Kernel version evolution
The substrate must evolve without forcing every instance to upgrade in lockstep. Three rules:
Rule 1: The activity envelope shape is forward-compatible only.
We may add optional fields to the envelope; we may not change semantics or remove fields. Old activities still validate under new kernels. New activities with new fields are accepted by old kernels (which ignore the unknown fields, store the raw envelope, and project conservatively).
This is the AP discipline. We adopt it strictly. If we ever need a breaking envelope change, it's a major version (fed-sx 2.0) and instances at different majors don't federate directly — only via bridges.
Rule 2: Everything else evolves via supersession.
New sig suite, new codec, new projection definition, new validator: publish a
Define* activity that supersedes the old one. Both old and new versions stay valid
at their respective timestamps. Old activities verify under old definitions; new
activities use new definitions. Time-aware lookup (§9.6, §10.6) makes this work.
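Time-aware lookup under supersession can be sketched as a registry that never discards old versions. The class and method names are hypothetical, and integer timestamps are used for simplicity; the referenced sections (§9.6, §10.6) are not reproduced here, so this is only one plausible shape.

```python
import bisect


class VersionedRegistry:
    """Keeps every version of a definition, keyed by the time it became active."""

    def __init__(self):
        self._times = {}   # name -> sorted list of activation timestamps
        self._values = {}  # name -> definitions, parallel to _times

    def define(self, name, active_from, definition):
        """Record a definition (or a superseding one) active from `active_from`."""
        times = self._times.setdefault(name, [])
        values = self._values.setdefault(name, [])
        i = bisect.bisect_right(times, active_from)
        times.insert(i, active_from)
        values.insert(i, definition)

    def lookup(self, name, at):
        """Return the definition active at time `at`, or None if none existed yet."""
        times = self._times.get(name, [])
        i = bisect.bisect_right(times, at)
        return self._values[name][i - 1] if i else None
```

An activity published at time 150 resolves against the version active at 150, even after a superseding version is registered at time 200.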
Rule 3: New genesis bundles supersede old ones via published activities.
When the kernel team ships a new version with an updated bundle:
- The new bundle's CID is different.
- Operators upgrading the kernel get the new bundle automatically.
- The new bundle's contents are largely supersession Update{DefineProjection, DefineValidator, ...} activities relative to the old bundle's definitions.
- A peer running the old kernel sees these Update activities (when they appear in followed outboxes) and can opt to load them dynamically (§12.8) or stay on the old bundle definitions until the operator upgrades.
In other words: the kernel binary evolution and the activity-log evolution are parallel tracks. The binary determines what's built in; the log determines what's currently active. They converge over time but don't have to be lockstep.
12.8 Dynamic Define* loading
When an instance receives an activity of type: "PinV3" and has no DefineActivity{name: "PinV3"} in its define-registry, it has three options (operator policy):
- Strict mode: store the activity envelope (it's valid AP), tag it unknown-type in by-type, do not project semantics. Operator must explicitly load the definition to enable projection.
- Permissive mode: fetch the DefineActivity{name: "PinV3"} artifact (its CID is in the activity's capabilities-required list), validate, evaluate the semantics SX (in pure sandbox), reproject the activity. Operator notified.
- Trusted-peers-only mode: like permissive, but only auto-loads Define* from actors on a configured trust list.
Default for fed-sx v1: strict mode. Operators opt-in to broader policies.
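The three policies reduce to a small decision function. The policy strings and return values below are hypothetical labels, and the fallback for untrusted senders under trusted-peers-only (treat as strict) is an assumption the text does not state explicitly.

```python
def resolve_unknown_type(policy, sender, trusted_actors):
    """Decide what to do with an activity whose type has no loaded definition."""
    if policy == "strict":
        return "store-unprojected"
    if policy == "permissive":
        return "fetch-and-load"
    if policy == "trusted-peers-only":
        # Assumption: senders off the trust list get strict-mode handling.
        return "fetch-and-load" if sender in trusted_actors else "store-unprojected"
    raise ValueError(f"unknown policy: {policy}")
```

Keeping the decision in one pure function makes it easy to log which policy produced which outcome, which fits the audit-trail goal.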
This lets the substrate genuinely live-extend — new verbs land via federation, no binary upgrade — while keeping a clean audit trail of what got loaded when.
12.9 Genesis as the substrate's manifest
A useful framing: the genesis bundle is the substrate's manifest (in the package-manager sense). It declares "this kernel ships with these definitions, identified by these CIDs, and this is what the kernel does until the log says otherwise."
Two instances with the same genesis CID start identical. Two instances with different genesis CIDs can federate as long as their active registry states (after log replay) overlap enough.
The genesis bundle is also the conformance reference: a kernel implementation claims fed-sx v1.0 conformance by reproducing the standard genesis bundle's CID from its own build of the included SX sources. If two implementations build the same spec sources and produce different CIDs, one of them is non-conformant. Cheap, deterministic conformance check.
12.10 Operational implications
- Build-time CID computation is part of the kernel build. The build pipeline must include the genesis-bundling step and embed the resulting CID. Mismatch protection requires the binary to know what it expects.
- Genesis evolution is a deliberate kernel-team decision. Adding a new bundled projection or sig suite is a kernel release, not a federated activity. (User-defined projections still federate normally.)
- Strict-mode default protects against malicious extensions. Operators have to consciously opt into auto-loading remote Define*. This trades convenience for security, which is appropriate for v1.
- Cross-major federation is a bridge problem. If/when fed-sx 2.0 ships with an envelope change, bridges between v1 and v2 are themselves federated artifacts: built by anyone, signed, audited.
13. Federation mechanics
How instances exchange activities, how peers subscribe, how new followers backfill, how delivery survives unreliable networks, and how the substrate resists abuse.
13.1 Push, pull, hybrid
ActivityPub canonically uses push: actor A publishes by POSTing each delivery to each follower's inbox URL. This gives low latency and clear delivery semantics, but requires a reliable per-recipient delivery queue and falls over when peers go down.
fed-sx supports both push and pull, with a push-primary, pull-fallback model:
- Push is the default delivery mechanism. When an activity is appended to A's outbox, A's delivery worker posts it to each follower's inbox.
- Pull is always available: any peer can GET /actors/<id>/outbox?since=<cursor> and stream activities in order. Used for backfill, recovery from delivery gaps, and instances that prefer pull-only operation.
- Hybrid in practice: push delivers notifications (the activity itself, or a pointer to its CID); receivers may pull the full content if not inlined. Useful when the activity body is large.
Operators can configure their actors as push-only, pull-only, or hybrid. The default is hybrid.
13.2 The Follow lifecycle
AP-standard, slightly tightened:
;; A wants to follow B
(activity 'Follow
  :actor "https://a.example/actors/alice"
  :object "https://b.example/actors/bob")
;; → POST to B's inbox

;; B accepts (or rejects)
(activity 'Accept
  :actor "https://b.example/actors/bob"
  :object <follow-activity-id-or-embedded>)
;; → POST to A's inbox

;; A unfollows later
(activity 'Undo
  :actor "https://a.example/actors/alice"
  :object <follow-activity-id-or-embedded>)
;; → POST to B's inbox
State derived by the audience-graph projection on each instance:
- (followers actor): set of actors who follow actor, projected from Accept{Follow} activities in actor's outbox (and the inverse via received Follow activities).
- (following actor): symmetric.
Auto-accept by default. Public actors auto-publish Accept for any incoming
Follow. Locked actors require manual approval, implemented as an operator UI that
publishes the Accept (or Reject) once a human decides.
13.3 Backfill
When A first follows B, A wants B's history. Four supported modes:
| Mode | Mechanism | Trade-off |
|---|---|---|
| No backfill | Just stream new activities going forward | Cheapest, missing context for new followers |
| Pull paginated | GET /outbox?since=epoch&limit=100 repeatedly | Standard, slow for large outboxes |
| Snapshot fetch | Find latest Create{Snapshot} published by B for the projection of interest, fetch + verify, then pull only activities after the snapshot's tip | Fast, requires B to publish snapshots |
| Bundle fetch | Out-of-band: B publishes a CID for an export bundle (a dag-cbor list of activities + actor doc + sig suite verification metadata); A fetches once, validates the chain, replays | Fastest for cold starts; bundle creation is opt-in |
Default: snapshot fetch when available, paginated pull otherwise.
A new instance joining federation typically combines: snapshot-fetch the
actor-state and define-registry projections from a trusted peer (so it knows who
exists and what verbs are defined), then incrementally backfill specific actors of
interest.
13.4 Delivery queue and retry
Every push delivery attempt has a fate:
| Outcome | Action |
|---|---|
| 2xx | Mark delivered |
| 3xx | Follow redirect (with limit) |
| 4xx (except 429) | Mark permanently failed — peer rejected the activity. Log; don't retry. |
| 429 | Honour Retry-After; reschedule |
| 5xx | Exponential backoff; reschedule |
| Connection error | Exponential backoff; reschedule |
Retry schedule (default, tunable per peer):
1 min, 5 min, 15 min, 1 h, 4 h, 12 h, 24 h, 48 h, 96 h
After the last attempt fails, the activity is abandoned for push but remains in
A's outbox. Followers can still pull it via GET /outbox?since=.... The peer will
eventually catch up if they come back online and pull. Push is best-effort; pull is
the source of truth.
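The outcome table and retry schedule combine into a single classification step, sketched below. The function name and return labels are hypothetical. Two choices are assumptions rather than statements from the text: a 429 without Retry-After falls back to the schedule, and a 429 is abandoned once the schedule is used up, like any other retryable failure.

```python
# Default schedule from the text, in minutes: 1 min ... 96 h.
RETRY_MINUTES = [1, 5, 15, 60, 240, 720, 1440, 2880, 5760]


def next_action(status, retries_used, retry_after_s=None):
    """Classify one delivery attempt.

    `status` is the HTTP status code, or None for a connection error.
    `retries_used` counts retries already consumed from the schedule.
    Returns (action, delay_in_seconds_or_None).
    """
    if status is not None and 200 <= status < 300:
        return ("delivered", None)
    if status is not None and 300 <= status < 400:
        return ("follow-redirect", None)
    if status is not None and 400 <= status < 500 and status != 429:
        return ("failed-permanently", None)
    if retries_used >= len(RETRY_MINUTES):
        return ("abandoned", None)  # still available to followers via pull
    if status == 429 and retry_after_s is not None:
        return ("retry", retry_after_s)  # honour Retry-After
    return ("retry", RETRY_MINUTES[retries_used] * 60)
```

For example, a 503 on the third retry is rescheduled 15 minutes out, while a 404 is never retried.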
Persistent queue. Delivery state is itself stored in the local instance — it's operator-internal, not federated. (Could be a regular SQLite table; doesn't need to be a projection because it's not state-the-world-cares-about.) On instance restart, the queue resumes from where it left off.
Queue-as-projection (alternative): for instances that want every aspect to be
log-derived, the delivery state could be a local-only projection over a stream of
Attempt / DeliverySuccess / DeliveryFailure activities written to a private
local-only outbox. Out of scope for v1 but the design admits it.
13.5 Audience-respecting delivery
Each activity carries to, cc, bto, bcc. The delivery worker computes the
delivery set: union of explicit recipients + (if as:Public or Followers in
audience) the actor's followers projection.
- bto and bcc are stripped before delivery (recipients shouldn't see who else is blind-copied).
- Receivers honour audience. When an instance receives an activity it should not be in the audience for (e.g. a Direct activity to someone else, leaked via a misconfigured peer), it logs and discards. Validators in the inbound pipeline enforce this.
- Public ≠ unlisted. to: as:Public means deliver to followers AND make publicly fetchable AND show in public projections. Some actors prefer "publicly fetchable but not pushed broadly": cc: as:Public with to: Followers.
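The sender side of these rules can be sketched as two small functions. The FOLLOWERS marker is a hypothetical placeholder for however the followers collection is addressed, and activities are modelled as plain dicts for illustration.

```python
PUBLIC = "https://www.w3.org/ns/activitystreams#Public"
FOLLOWERS = "Followers"  # hypothetical marker for the actor's followers collection


def delivery_set(activity, followers):
    """Union of explicit recipients, plus followers if Public or Followers is addressed."""
    fields = ("to", "cc", "bto", "bcc")
    addressed = {r for f in fields for r in activity.get(f, [])}
    recipients = addressed - {PUBLIC, FOLLOWERS}
    if addressed & {PUBLIC, FOLLOWERS}:
        recipients |= set(followers)
    return recipients


def strip_blind(activity):
    """Return a copy without bto/bcc, to be sent on the wire."""
    return {k: v for k, v in activity.items() if k not in ("bto", "bcc")}
```

The delivery set is computed from the original activity, and only the stripped copy leaves the instance, so blind-copied recipients still receive the activity without being visible to others.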
13.6 Spam and abuse posture
ActivityPub has well-known abuse vectors (Mastodon's history is instructive). fed-sx defends in layers:
Signature verification. Every inbound activity must have a valid signature
matching an actor whose key was active at published. Forgeries are dropped at the
envelope-validation stage (§14). Necessary but not sufficient — signatures only
prove the message wasn't tampered with, not that the sender is benign.
Per-source rate limits. Per-actor and per-instance request rate limits on
/inbox. Default: 100/min per actor, 1000/min per instance. Exceeded → 429.
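One way to enforce such limits is a sliding-window counter per key, sketched below. The text does not say which algorithm is intended, so the sliding window, the class name, and the explicit `now` parameter are all assumptions made for illustration.

```python
from collections import defaultdict, deque


class RateLimiter:
    """Sliding-window limiter: at most `limit` hits per key in any `window_s` seconds."""

    def __init__(self, limit, window_s=60.0):
        self.limit = limit
        self.window_s = window_s
        self._hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def allow(self, key, now):
        """Return True and record the hit if `key` is under its limit at time `now`."""
        hits = self._hits[key]
        # Drop hits that have aged out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # caller would answer 429
        hits.append(now)
        return True
```

An instance would keep one limiter keyed by actor (limit 100) and another keyed by sending instance (limit 1000), and reject when either says no.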
Per-instance trust state. Three categories, operator-configured (and overridable per actor):
- Trusted: auto-accept, auto-load Define* (if permissive mode), no rate-multiplier penalty.
- Default: accept signed activities, standard rate limits, do not auto-load Define*.
- Suspended: drop all inbound activities, refuse outbound delivery, do not fetch artifacts. Operator decision (e.g. spam source, harassment instance).
Trust state is local-only (operator policy); it is not federated. Different instances can disagree.
Audience refusal. Activities not addressed to anyone on this instance (no local
followers, not as:Public, not to: a local actor) are dropped on receipt.
Discourages spam targeting random instances.
Content validators. Registry-driven content moderation: a DefineValidator
with applies-to: "inbound" runs against every inbound activity and can reject
based on content rules. Examples: link-spam detection, ML moderation models served
via an effectful validator (note: effectful validators are a special case — they
can fail-closed without affecting determinism, because validators happen before
projection and don't contribute to projected state).
Capability vetting. If an inbound activity declares capabilities-required
that includes definitions this instance hasn't loaded and trust policy is
strict-mode, the activity is quarantined (stored but not projected) pending
operator review.
Federation circuit breakers. Per-peer error rate triggers temporary defederation: if a peer is sending malformed activities, exceeding rate limits, or signing with revoked keys, automatic suspension for an exponential cool-off.
13.7 Discovery
How an instance finds other instances and actors:
- WebFinger (RFC 7033). GET /.well-known/webfinger?resource=acct:user@host returns links to actor URLs. AP-standard. fed-sx implements it.
- Well-known capabilities. GET /.well-known/sx-capabilities (§7) for cross-instance compatibility checks.
- Manual peer config. Operators add peer instance URLs to their config.
- Peer recommendations. An instance can publish Recommend{actor} activities pointing at peers it considers worth following. Receivers can use these as discovery hints (subject to local trust). Out of scope for v1 but the verb is reservable.
- Federation directories. Community-maintained lists of instances; an instance can opt into being listed by publishing a Directory{listed-by} activity. v2 concern.
For v1: WebFinger + capabilities + manual config. Discovery beyond that is opt-in via standard verbs.
13.8 Streaming and real-time
Two streaming mechanisms:
- Outbox SSE: GET /actors/<id>/outbox/stream opens a Server-Sent Events connection. Each new activity appended to the outbox is sent as an event. Allows pull-style federation peers to maintain a live connection without polling.
- Projection SSE: GET /projections/<name>/subscribe (§10.8) streams projection deltas. Useful for clients (browsers) wanting reactive views.
Both are local-only mechanisms; the canonical federation transport remains push to inbox + pull from outbox. SSE is convenience, not protocol.
13.9 Operational implications
- Push is best-effort, pull is authoritative. Operators should treat the outbox as the canonical record; delivery queue is bookkeeping.
- Trust is per-instance and not federated. Two instances may have different views of "good actors" and "bad instances." This is a feature — defederation decisions are local sovereignty.
- Backfill via snapshots is the cheap path. Encouraging actors to publish Create{Snapshot} regularly makes new-follower onboarding fast.
- Audience semantics are enforced both ways. Senders compute delivery set; receivers honour audience. Defence-in-depth against misconfigured peers.
- Capability-based extension loading is opt-in. Strict-mode default means unknown verbs are stored-but-not-projected — safe by default, with explicit operator control over what extensions load.
14. Validation pipeline
Every activity entering the substrate (whether published locally or received from a peer) flows through a fixed pipeline of checks. Order matters: cheap and fail-safe first, expensive and content-aware last. Each stage has a defined failure response (reject, quarantine, drop). Registry-driven validators plug in at a specific stage.
14.1 The two pipelines
Inbound — activities arriving via POST /inbox or pulled from a peer's outbox:
HTTP transport → envelope → signature → replay → audience →
activity-type schema → object-type schema → content validators →
capabilities → trust state → log append → projection (async)
Outbound — activities being published locally via POST /activity:
authentication → authorization → envelope construction → object handling →
activity-type schema → signature → log append → projection (async) →
delivery (async)
Stages they share are implemented as the same SX functions called from both pipelines.
14.2 Inbound pipeline — stage by stage
| # | Stage | Check | Failure response |
|---|---|---|---|
| 1 | Transport | Valid HTTP request, content-type acceptable, body parseable as JSON-LD or dag-cbor | 400 Bad Request; log |
| 2 | Envelope | Matches kernel's envelope spec (required fields present, types valid, recognised activity type or unknown allowed) | 400; log; structured error in response body |
| 3 | Signature | Time-aware sig verification: fetch (or cache-lookup) actor doc, find key with id == sig.key-id that was active at published, verify against canonical envelope bytes per the named sig suite | 401; log; do not retry; mark sender's instance for circuit-breaker accounting |
| 4 | Replay | Activity id and CID not already in activity-log projection | 200 OK with {status: "duplicate"}, no-op |
| 5 | Audience | This instance has at least one local actor in to/cc, OR audience contains as:Public/Followers and the actor has local followers | Drop silently (no response indicating either acceptance or refusal, which prevents inbox-membership probing); do not store |
| 6 | Activity-type schema | Look up DefineActivity{name: <type>} in define-registry; run its schema predicate over the activity in pure sandbox | If type unknown: per trust policy (strict: 422 with missing-definition CID; permissive: attempt dynamic load §12.8). If schema fails: 422 with violation detail |
| 7 | Object-type schema | If activity has an object with a type, look up DefineObject{name: <type>} and run its schema | Same as #6 |
| 8 | Content validators | All registered validators with applies-to: inbound or applies-to: all run sequentially; each is a pure-sandbox predicate that returns :accept / :reject / :quarantine | :reject → 422 with reason. :quarantine → store activity but mark quarantined, do not project, alert operator |
| 9 | Capabilities | Every CID in capabilities-required is present in this instance's loaded registries (or auto-loadable per trust policy) | Missing → 422 with list of missing CIDs (sender can deliver bootstrapping Define* artifacts first). Auto-load attempt can be triggered by re-POST with ?retry-after-load=true |
| 10 | Trust state | Sender's actor and instance are not in Suspended state on this instance | Drop silently; do not respond |
| 11 | Log append | Write activity envelope (and inlined object content) to local mirror of sender's outbox; assign local sequence number | Disk error → 503 (transient); sender retries |
| 12 | Projection | Asynchronously fold the activity into every relevant projection (per define-registry) | Per-projection failure (gas, sandbox violation) → tag activity projection-failed:<projection-name>; do not affect log durability |
Pipeline halts at the first failing stage. Stages 1–11 are synchronous (POST /inbox
holds the connection); stage 12 is asynchronous, and the HTTP response returns once
the log append succeeds.
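The halt-at-first-failure behaviour can be sketched as a loop over (name, check) pairs. The two example checks, the REQUIRED_FIELDS subset, and the result dict are hypothetical simplifications; the real stages carry their own HTTP responses as listed in the table.

```python
REQUIRED_FIELDS = ("id", "type", "actor")  # hypothetical subset of the envelope spec


def check_envelope(activity):
    """Return an error string if required fields are missing, else None."""
    missing = [f for f in REQUIRED_FIELDS if f not in activity]
    return f"missing fields: {missing}" if missing else None


def make_replay_check(seen_ids):
    """Build a replay check against a set of already-logged activity ids."""
    def check_replay(activity):
        return "duplicate" if activity["id"] in seen_ids else None
    return check_replay


def run_pipeline(stages, activity):
    """Run stages in order; stop at the first one that reports an error."""
    for name, check in stages:
        error = check(activity)
        if error is not None:
            return {"accepted": False, "stage": name, "error": error}
    return {"accepted": True, "stage": None, "error": None}
```

Because later stages never run after a failure, a stage such as the replay check can safely assume that the envelope check has already guaranteed the fields it reads.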
14.3 Outbound pipeline — stage by stage
| # | Stage | Check | Failure response |
|---|---|---|---|
| 1 | Authentication | Caller has a valid bearer token, mTLS cert, or session for the actor | 401 |
| 2 | Authorization | Caller's identity is allowed to publish as the named actor (capability token §9.5 or owns the actor key) | 403 |
| 3 | Envelope construction | Kernel fills in id, published, normalises to/cc, computes capabilities-required (by walking referenced Define* CIDs) | n/a |
| 4 | Object handling | If object has inline content: canonicalize, compute CID, optionally store per where. If object references a CID, verify the artifact exists locally or remotely (or accept as a forward reference) | Storage error → 503 |
| 5 | Activity-type schema | Same as inbound #6; schema must pass | 422 with violation detail (caller bug) |
| 6 | Signature | Sign envelope with the actor's currently-active key matching the activity type's required purpose (e.g. Pin requires purpose: pin) | If no suitable key: 400 |
| 7 | Log append | Write to local outbox; assign sequence number | 503 |
| 8 | Projection | Async fold (same as inbound #12) | Per-projection failure tag |
| 9 | Delivery | Async push to follower inboxes per audience | Per-recipient retry per §13.4 |
Caller's HTTP response returns after stage 7 (log append). The activity is durable
and queryable as soon as the response is sent; projection lag is reported via
projected-up-to headers and ?wait-for= parameter.
14.4 Failure response taxonomy
Three response categories with explicit semantics:
Reject: tell the sender and don't store; the sender may retry after correcting the problem. Used for: malformed envelope, invalid signature, schema violation, missing capabilities. HTTP 4xx with structured error.
Quarantine: store envelope (it's a valid signed message) but don't project, alert operator. Used for: content-validator soft-fail, unloaded capabilities under permissive policy, suspect-but-not-banned senders. Activity sits in a quarantine projection until operator reviews; operator can release (project) or expunge.
Drop silently: don't store, don't respond informatively. Used for: replay (ack as duplicate), audience refusal (would leak inbox membership otherwise), suspended-sender activities. The sender experiences this as a successful POST with no visible effect; they can detect it only by polling for their activity not appearing in our outbox.
14.5 Registry-driven validators
Most of the pipeline is fixed kernel logic (envelope, signature, replay, audience, log append, delivery). Two stages are registry-driven and extend dynamically:
- Stage 8 (content validators): operators add/remove DefineValidator entries with applies-to: inbound | outbound | all. Each runs in pure or effectful sandbox per its declaration. Returns one of :accept / :reject{:reason} / :quarantine{:reason}.
- Stages 6–7 (schema validators): these are registry entries (DefineActivity.schema, DefineObject.schema); the pipeline calls into the registry to fetch them.
Pure-mode validators are deterministic and cheap; results can be cached per (activity-CID, validator-CID).
Effectful-mode validators can call out to ML models, blocklist services,
external moderation APIs. They get a per-call IO budget; exceeding it counts as
:reject{:reason :validator-timeout}. Effectful validators do not break
determinism because validation happens before projection — a rejected activity
never enters projected state.
14.6 Validator composition and ordering
Validators have an integer priority field; lower values run first. The pipeline
short-circuits on the first :reject. :quarantine does not short-circuit; later
validators still run, and :quarantine results aggregate.
Default priorities (room for operator-added validators):
0-99 : kernel-internal (envelope, sig, replay, audience)
100-199 : standard schema validators
200-299 : standard content validators (rate limit, audience leak)
300-399 : operator-added moderation
400-499 : effectful (ML, third-party APIs)
500+ : reserved
Operators can publish Update{DefineValidator} to change priorities or add new
ones; takes effect on next inbound activity.
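The ordering and aggregation rules of this section can be sketched as follows. Validators are modelled as hypothetical (priority, name, function) tuples whose function returns a (verdict, reason) pair; this tuple shape is an illustration, not the registry's format.

```python
def run_validators(validators, activity):
    """Run validators in ascending priority order.

    Stops at the first "reject"; collects every "quarantine" reason otherwise.
    Returns (verdict, [(validator_name, reason), ...]).
    """
    quarantine_reasons = []
    for _priority, name, fn in sorted(validators, key=lambda v: v[0]):
        verdict, reason = fn(activity)
        if verdict == "reject":
            return ("reject", [(name, reason)])  # short-circuit
        if verdict == "quarantine":
            quarantine_reasons.append((name, reason))  # keep going
    if quarantine_reasons:
        return ("quarantine", quarantine_reasons)
    return ("accept", [])
```

Note that a reject from a later validator overrides quarantine verdicts collected earlier, which follows from reject being the only short-circuiting outcome.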
14.7 Determinism requirement and its limit
A subtlety worth being explicit about: inbound validation is not required to be deterministic across instances. Two instances can disagree about whether to accept a given activity (e.g. one has a stricter content validator). Their projected states will then diverge — but only on activities one accepted and the other didn't.
This is fine. Federation does not require state convergence; it requires fold determinism for activities both instances accepted. Validators are sovereignty controls, not protocol invariants.
Where determinism is required: schema validators (§14.2 stages 6–7). If two
instances disagree on whether Pin v3 matches its schema, they can't federate
Pin v3 activities meaningfully. So schema validators must be pure-mode and
referenced by CID.
14.8 Operational implications
- The pipeline is the security perimeter. Every checkable property is checked here, not deeper in the kernel. No "trust the caller" assumptions inside log or projection code.
- Quarantine is the operator's friend. Anything suspicious sits in quarantine with full envelope, sig, and reason — operator can review and decide. Better than outright drop because it preserves audit.
- Schema validators are protocol-load-bearing; content validators are policy. The first set must converge across instances for federation to work; the second set can diverge (and that's how local moderation policy is expressed).
- Outbound validation catches local bugs early. A malformed Pin activity fails at outbound stage 5, never enters the local log, never gets delivered.
15. Storage layout
The on-disk shape of an instance. Three concerns kept separate: the activity log (append-only, canonical), content-addressed object storage (keyed by CID, immutable), and operational state (projections, indexes, queues — derived, rebuildable).
15.1 Storage tiers
/var/lib/fed-sx/
├── log/ # canonical, append-only
│ ├── actors/
│ │ ├── <local-actor-id>/
│ │ │ ├── outbox/
│ │ │ │ ├── 000001.jsonl # segment, ~64MB cap
│ │ │ │ ├── 000002.jsonl
│ │ │ │ └── tip # symlink to current segment
│ │ │ ├── inbox/ # received, pre-projection
│ │ │ └── seq # next sequence number
│ │ └── <other-local-actor-id>/...
│ └── mirrors/ # local mirrors of followed remote outboxes
│ └── <remote-actor-id-hashed>/
│ ├── 000001.jsonl
│ └── ...
├── objects/ # CID → bytes
│ └── <cid-prefix-2>/<cid-prefix-2>/<full-cid>
├── snapshots/
│ └── <projection-cid>/
│ ├── <log-tip-cid>.cbor # snapshot value
│ └── index # ordered list of (log-tip, file)
├── projections/ # live projection state
│ └── <projection-cid>.cbor # latest in-memory state, periodically flushed
├── indexes/
│ └── fed-sx.db # SQLite: lookups, queue, trust state
├── keys/
│ └── <actor-id>/ # private keys, mode 0600
│ ├── primary.pem
│ ├── recovery.pem
│ └── sigs.toml # key metadata
├── genesis/
│ └── bundle.cbor # extracted from binary at first run
└── config.toml # operator config
15.2 The log — append-only segments
The activity log is the only thing the substrate cannot lose. It is the source of truth from which everything else is derived.
Format: JSONL segments. Each line is one activity envelope, encoded as JSON-LD
(canonical form), terminated by \n. Easy to inspect, easy to grep, trivially
streamable.
Why JSON-LD on disk, not dag-cbor? Two reasons:
- Operability: humans can tail -f and grep the log. dag-cbor is opaque.
- AP wire compatibility: activities arrive over HTTP as JSON-LD anyway; storing the same form avoids round-trip conversion.
The CID of each activity is computed from its canonical dag-cbor representation (per §2), independent of how it's stored. CIDs are stable across storage formats.
Segments cap at ~64MB. Rotation by size, not time. Old segments are immutable; new writes go to the tip segment. Compression (zstd) applied on segments older than the current tip — saves disk, doesn't slow appends.
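Size-based rotation can be sketched as follows. This is a simplified illustration under stated assumptions: append_activity and segment_name are hypothetical names, json.dumps with sorted keys stands in for real JSON-LD canonicalisation, and the tip symlink and zstd compression of old segments are omitted.

```python
import json
import os
import tempfile

SEGMENT_CAP_BYTES = 64 * 1024 * 1024  # the ~64MB cap from the text

def segment_name(n: int) -> str:
    # Zero-padded so lexicographic order equals numeric order.
    return f"{n:06d}.jsonl"

def append_activity(outbox_dir: str, envelope: dict, cap: int = SEGMENT_CAP_BYTES) -> str:
    """Append one envelope as a JSON line; rotate when the tip segment would exceed cap."""
    os.makedirs(outbox_dir, exist_ok=True)
    segments = sorted(f for f in os.listdir(outbox_dir) if f.endswith(".jsonl"))
    current = segments[-1] if segments else segment_name(1)
    line = (json.dumps(envelope, sort_keys=True, separators=(",", ":")) + "\n").encode("utf-8")
    path = os.path.join(outbox_dir, current)
    size = os.path.getsize(path) if os.path.exists(path) else 0
    if size > 0 and size + len(line) > cap:
        # Old segment stays immutable; new writes go to a fresh tip.
        current = segment_name(int(current.split(".")[0]) + 1)
        path = os.path.join(outbox_dir, current)
    with open(path, "ab") as f:
        f.write(line)
        f.flush()
        os.fsync(f.fileno())  # the log is the one thing that must not be lost
    return current

# Demo with a tiny cap so rotation is visible: each line is 40 bytes, cap is 50.
with tempfile.TemporaryDirectory() as tmp:
    names = [append_activity(tmp, {"id": "a" * 30}, cap=50) for _ in range(3)]
```

With the tiny demo cap every append after the first lands in a new segment; at the real cap a segment would hold tens of thousands of 1 KB envelopes.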
Per-actor outboxes. Each local actor has its own outbox directory. This matches AP semantics (one outbox per actor) and means:
- Backing up a single actor is a simple directory copy
- Per-actor sequence numbers (no cross-actor coordination)
- Migration (Move) is a directory rename + a Move activity
Mirror outboxes. When a local actor follows a remote one, the remote's outbox is
mirrored locally for replay. Same JSONL format. Tracked under log/mirrors/<hashed-remote-id>/ to avoid filesystem path issues with URL characters. The hash is
purely a filesystem-friendly encoding; the canonical actor id stays in the log
content.
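A minimal sketch of the directory-name encoding, assuming SHA-256 rendered as hex (the text does not name a hash function, so this choice and the mirror_dir_name helper are illustrative only):

```python
import hashlib

def mirror_dir_name(actor_id: str) -> str:
    """Map a canonical actor id (a URL) to a filesystem-safe directory name."""
    # Hex output contains only [0-9a-f], so no '/', ':' or '?' can leak into the path.
    return hashlib.sha256(actor_id.encode("utf-8")).hexdigest()

name = mirror_dir_name("https://alice.example/actors/alice")
```

The mapping is one-way, which is acceptable here because the canonical actor id is still present in every mirrored envelope.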
Inbox vs outbox distinction. Inboxes hold received activities pre-validation; outboxes hold committed activities post-pipeline. An inbound activity that passes the validation pipeline (§14) is moved from inbox to the appropriate mirror outbox. This makes inbox a transient queue, not a permanent record.
15.3 Object storage
Content-addressed blob store, sharded directories.
Path scheme: objects/<first-2-chars>/<next-2-chars>/<full-cid>. Sha2-256 CIDs
are uniformly distributed; this gives ~65k buckets with a couple-hundred files each
at moderate scale. Standard pattern (matches IPFS, Git).
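One reading of the path scheme, sketched below, takes the two shard levels from the hex-encoded SHA-256 digest, which gives 16^4 = 65,536 buckets and matches the "~65k" figure. That is an assumption: textual CIDs that share a codec and hash function also share their leading characters, so sharding on the first characters of the CID string itself would not spread objects evenly. The object_path helper is hypothetical.

```python
import hashlib
from pathlib import PurePosixPath

BUCKETS = 16 ** 4  # two hex chars per level, two levels

def object_path(cid: str, canonical_bytes: bytes) -> PurePosixPath:
    """Build objects/<2 hex>/<2 hex>/<full-cid> from the digest of the hashed bytes."""
    # Assumption: canonical_bytes are the same bytes the CID's multihash covers.
    digest_hex = hashlib.sha256(canonical_bytes).hexdigest()
    return PurePosixPath("objects") / digest_hex[:2] / digest_hex[2:4] / cid

p = object_path("bafy-example", b"hello")
```

Because the path depends only on content, two writers storing the same artifact compute the same location.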
Storage backends. Pluggable per where: cid object:
- files-on-disk (default): write to local filesystem.
- ipfs: registry-driven backend; calls out to a local IPFS node.
- s3: object storage in cloud bucket.
- memory-only: in-memory cache, evictable; useful for ephemeral artifacts.
The kernel uses the where-tag on each object to dispatch to the correct backend.
Backends are registry entries (DefineStorage); operators install only the ones
they want.
Garbage collection is opt-in per backend. Default policy: never GC (objects are immutable and may be referenced by future activities). Operators can configure per-backend retention rules:
- "Keep last N versions of objects referenced by Pin activities for path X"
- "Evict objects not referenced in last 90 days from the memory-only cache"
- "Mirror objects referenced by ≥ 3 endorsements; evict others after 30 days"
GC operates on the projected reference graph (a reference-graph projection that
maintains "what activities reference this CID"). Removing an object that's still
referenced is allowed but produces a warning logged in operations.
15.4 Snapshots
Per §10.4, snapshots are the (projection-CID, log-tip-CID, state) triples that let us resume without full replay.
Storage: snapshots/<projection-cid>/<log-tip-cid>.cbor. The state value is
dag-cbor-encoded; the file's content CID matches the snapshot's claimed CID.
Index: snapshots/<projection-cid>/index is a sorted list of (log-tip-time, log-tip-cid, file) triples. On startup, kernel finds the latest snapshot ≤ current
log tip and resumes from it. On time-travel queries, finds the latest snapshot
≤ target time and folds forward.
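The "latest snapshot ≤ target" lookup is a binary search over the sorted index. A minimal sketch, assuming the index is already loaded as a list of (log-tip-time, log-tip-cid, file) tuples and that timestamps are ISO-8601 strings in one fixed UTC format so string order equals time order; the function name is hypothetical:

```python
from bisect import bisect_right

def latest_snapshot_at_or_before(index, target_time):
    """Return the newest index entry whose time is <= target_time, or None."""
    times = [entry[0] for entry in index]
    pos = bisect_right(times, target_time)   # first position strictly after target
    return index[pos - 1] if pos > 0 else None

index = [
    ("2026-01-01T00:00:00Z", "bafy-tip-1", "bafy-tip-1.cbor"),
    ("2026-01-02T00:00:00Z", "bafy-tip-2", "bafy-tip-2.cbor"),
    ("2026-01-03T00:00:00Z", "bafy-tip-3", "bafy-tip-3.cbor"),
]
hit = latest_snapshot_at_or_before(index, "2026-01-02T12:00:00Z")
```

A None result means no snapshot is old enough, so the caller would fold from the start of the log.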
Retention: keep at least:
- Latest snapshot per active projection
- Snapshots referenced by published Create{Snapshot} activities (federation proofs)
- One snapshot per day for the last 7 days (audit / time-travel)
Older snapshots GC'd by default. Operators can increase retention.
15.5 Operational state — SQLite
Things that are derived, frequently-queried, but not federated:
- Lookup indexes for projections (when indexes: declared): (projection, index-key, value) → activity-cid rows
- Delivery queue: outbound activities pending push, retry counts, next-attempt timestamps
- Trust state: per-actor and per-instance trust levels (Trusted / Default / Suspended)
- Quarantine queue: activities pending operator review
- Configuration cache: currently-active registry entries (also in memory; on-disk cache for fast restart)
Single SQLite file (indexes/fed-sx.db). Recoverable: if corrupted or deleted,
rebuilt from the log on next startup (with cost proportional to log size). The
SQLite is a cache, not authoritative.
WAL mode for concurrent readers. Single-writer (the kernel); reads from many HTTP request workers.
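A sketch of the single-writer, many-reader setup using Python's standard sqlite3 module. The delivery_queue schema and the open_index_db helper are hypothetical; only the PRAGMA itself is standard SQLite.

```python
import os
import sqlite3
import tempfile

def open_index_db(path: str) -> sqlite3.Connection:
    """Open the operational DB in WAL mode and ensure a delivery-queue table exists."""
    conn = sqlite3.connect(path)
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    if mode.lower() != "wal":
        raise RuntimeError(f"WAL not available, got {mode!r}")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS delivery_queue ("
        " activity_cid TEXT NOT NULL,"
        " peer TEXT NOT NULL,"
        " retry_count INTEGER NOT NULL DEFAULT 0,"
        " next_attempt_at TEXT NOT NULL,"
        " PRIMARY KEY (activity_cid, peer))"
    )
    conn.commit()
    return conn

with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "fed-sx.db")
    writer = open_index_db(db_path)          # the kernel: sole writer
    writer.execute(
        "INSERT INTO delivery_queue (activity_cid, peer, next_attempt_at) VALUES (?, ?, ?)",
        ("bafy-example", "https://peer.example", "2026-01-01T00:00:00Z"),
    )
    writer.commit()
    reader = sqlite3.connect(db_path)        # an HTTP worker: read-only use
    pending = reader.execute("SELECT COUNT(*) FROM delivery_queue").fetchone()[0]
    reader.close()
    writer.close()
```

WAL mode is a property of the database file, so reader connections need no special setup once the writer has enabled it.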
15.6 Backup and export
The substrate is an append-only log of immutable artifacts; backup is simple.
- Full backup: rsync /var/lib/fed-sx/log/ and /var/lib/fed-sx/objects/. The rest is rebuildable.
- Per-actor export: tar log/actors/<actor-id>/ + the objects referenced by activities in that outbox. Self-contained, importable into another instance.
- Activity bundle export: for federation backfill, produce a dag-cbor bundle of [activity envelopes... + referenced objects] for a specified actor + range. Single file, content-addressed, signed by the source instance with a Bundle activity attesting to its contents.
Exports are themselves publishable (Create{Bundle} activity carrying the bundle
CID). This is how an actor migrates instances cleanly: export bundle, import on
new instance, publish Move activity.
15.7 Mirroring and replication
Two patterns:
- Federation mirroring (the canonical kind): when actor A follows B, A's instance mirrors B's outbox locally. This is just normal federation (§13). Each follower keeps its own copy.
- Operational mirroring: for high availability. An operator runs two instances with shared filesystem (NFS / EFS) for log/ and objects/, separate SQLite files. Reads can hit either; writes go through one. Or: rsync-based hot standby with manual failover.
Operational mirroring is out of scope for v1. Federation mirroring is the substrate-level redundancy: as long as one peer that followed you is still online, your log is still recoverable.
15.8 Storage size estimates
Rough targets at moderate scale (10 active local actors, 1000 followed peers, 1 year of activity at 100 activities/actor/day):
- Log: 10 actors × 100 act/day × 1 KB avg envelope × 365 days ≈ 365 MB local outbox. Mirrors: 1000 peers × 10 act/day × 1 KB × 365 ≈ 3.6 GB.
- Objects: depends heavily on content. Assume 50% of activities have inline content of avg 5 KB → ~2 GB total inline. CID-referenced larger objects: count separately, depends on use case.
- Snapshots: typically much smaller than the log. ~10 active projections × ~10 MB per snapshot × ~8 retained snapshots ≈ 800 MB.
- SQLite: index sizes proportional to indexed projection content; typical few hundred MB.
Total: order of 10 GB at the described scale. Single-machine viable; SSD recommended for log throughput; spinning disk fine for snapshots and object storage cold tier.
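The log and snapshot figures can be reproduced directly from the stated inputs. The sketch below assumes decimal units (1 KB = 1000 bytes) and covers only the lines whose inputs are fully specified in the text.

```python
KB, MB, GB = 10**3, 10**6, 10**9
DAYS = 365

# 10 local actors x 100 activities/day x 1 KB envelope x 1 year
local_log = 10 * 100 * 1 * KB * DAYS

# 1000 followed peers x 10 activities/day x 1 KB envelope x 1 year
mirror_log = 1000 * 10 * 1 * KB * DAYS

# 10 projections x 10 MB per snapshot x 8 retained snapshots
snapshots = 10 * 10 * MB * 8

print(f"local outbox: {local_log / MB:.0f} MB")
print(f"mirrors:      {mirror_log / GB:.2f} GB")
print(f"snapshots:    {snapshots / MB:.0f} MB")
```

Mirrors dominate the log footprint by an order of magnitude, so the number of followed peers is the main sizing input.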
15.9 Operational implications
- The log is sacred. Never modify, never delete. Backups go to multiple media. Loss of log/ means loss of identity (actor activities) and loss of state-of-record. Loss of objects/ means loss of content but log + peers can recover most of it.
- Everything else is rebuildable. Projections, indexes, snapshots, queue state can all be recomputed from the log at startup cost. Operationally, this means upgrades and migrations are forgiving.
- CID-addressed storage is naturally idempotent. Two instances writing the same artifact write the same bytes to the same path. Race conditions become no-ops.
- JSONL on disk pays for itself the first time an operator needs to debug a weird federation issue with grep and jq. Worth the storage cost vs dag-cbor.
16. API surface
HTTP API for reading the log, publishing activities, querying projections, and streaming updates. Three layers: AP-standard endpoints (for vanilla AP interop), fed-sx-specific endpoints (publish, query, capabilities), and discovery endpoints (webfinger, well-known).
16.1 Endpoint catalog
AP-standard
| Method | Path | Purpose |
|---|---|---|
| GET | /actors/<id> | Actor doc (Person/Service/Group/Application) |
| GET | /actors/<id>/inbox | Read inbox (auth required) |
| POST | /actors/<id>/inbox | Receive federated activity (HTTP Signature required) |
| GET | /actors/<id>/outbox | OrderedCollection of actor's published activities |
| POST | /actors/<id>/outbox | AP-standard publish (alias for POST /activity with actor set) |
| GET | /actors/<id>/followers | OrderedCollection of follower actor URIs |
| GET | /actors/<id>/following | OrderedCollection of followed actor URIs |
| GET | /activities/<uuid> | Single activity by id |
| GET | /objects/<uuid> | Single object by id (note: distinct from CID-addressed /artifacts/<cid>) |
fed-sx-specific
| Method | Path | Purpose |
|---|---|---|
| POST | /activity | Generalised publish: accepts any well-formed activity |
| GET | /artifacts/<cid> | CID-addressed artifact fetch (content negotiated) |
| GET | /artifacts/<cid>/raw | Raw bytes (whatever the codec stored) |
| GET | /artifacts/<cid>/<path> | IPLD path traversal into the artifact |
| GET | /projections | List of registered projections (name, CID, last-folded-tip) |
| GET | /projections/<name> | Full projection state (paginated for large states) |
| GET | /projections/<name>?at=<ts> | Time-travel: state as of timestamp |
| GET | /projections/<name>/<key> | Single key from a projection (uses indexes) |
| POST | /query | Run an SX query expression against one or more projections |
| GET | /define-registry | Currently active Define* artifacts by kind |
| GET | /capabilities/<actor-id> | Per-actor declared capabilities |
Discovery and well-known
| Method | Path | Purpose |
|---|---|---|
| GET | /.well-known/webfinger?resource=acct:<user>@<host> | RFC 7033 actor discovery |
| GET | /.well-known/sx-capabilities | This instance's capability advertisement (§7) |
| GET | /.well-known/host-meta | XRD describing the host |
| GET | /.well-known/nodeinfo | Standard fediverse node metadata (Mastodon, Pleroma compatibility) |
Real-time (SSE)
| Method | Path | Purpose |
|---|---|---|
| GET | /actors/<id>/outbox/stream | New activities as they're appended (events: activity) |
| GET | /actors/<id>/inbox/stream | New inbound activities (auth required) |
| GET | /projections/<name>/subscribe | Projection deltas (events: delta) |
| GET | /federation/health/stream | Per-peer delivery health (events: peer-status) |
WebSocket equivalents (/ws/... paths) available where SSE is awkward (browsers
behind proxies); same event payloads, different framing.
16.2 Authentication
Three mechanisms, each appropriate to a different caller type:
- HTTP Signatures (IETF Internet-Draft draft-cavage-http-signatures): the AP-standard mechanism for inter-instance calls. Sender signs a digest of relevant headers + body with their actor's private key; receiver verifies via the actor's public keys projection (§9.6). Used for: POST /inbox, peer-to-peer outbox pulls when authentication is desired.
- Bearer tokens: for interactive clients (CLIs, web UIs, mobile apps). Issued via OAuth2 (or simple admin-issued tokens for v1). Used for: POST /activity, GET /actors/<id>/inbox, anything requiring caller identity.
- Capability tokens (§9.5): for delegated publish. Token includes the granting actor, the granted capabilities (e.g. publish: Pin for path-prefix /docs/), the bearer's actor, expiry, and signature from the granter. Used for: child actors, service accounts, temporary publish access.
Public reads (most GET endpoints to public-audience activities) require no auth. Private/followers-only reads check the caller's identity against the audience.
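For the HTTP Signatures case, the part that is easy to get wrong is the string that actually gets signed. The sketch below builds the body digest and the signing string in the draft-cavage style (lowercased header names, one "name: value" per line, a (request-target) pseudo-header). It stops short of the signature itself, which in fed-sx would go through the kernel's crypto primitives; the function names and the example host are illustrative.

```python
import base64
import hashlib

def body_digest(body: bytes) -> str:
    """Value for the Digest header: SHA-256 of the body, base64-encoded."""
    return "SHA-256=" + base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")

def signing_string(method: str, path: str, headers: dict, signed_headers: list) -> str:
    """Join the listed headers, in order, into the string the sender signs."""
    lowered = {k.lower(): v for k, v in headers.items()}
    lines = []
    for name in signed_headers:
        if name == "(request-target)":
            lines.append(f"(request-target): {method.lower()} {path}")
        else:
            lines.append(f"{name.lower()}: {lowered[name.lower()].strip()}")
    return "\n".join(lines)   # no trailing newline

body = b'{"type":"Follow"}'
headers = {
    "Host": "alice.example",
    "Date": "Thu, 15 Jan 2026 10:00:00 GMT",
    "Digest": body_digest(body),
}
to_sign = signing_string(
    "POST", "/actors/alice/inbox", headers,
    ["(request-target)", "host", "date", "digest"],
)
```

Including the Digest header in the signed set is what binds the signature to the body, not just to the request line.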
16.3 Content negotiation
Same resource, multiple representations. Accept header dispatches:
| Accept header | Returns |
|---|---|
| application/activity+json | AP-standard JSON-LD (default for ambiguous Accepts) |
| application/ld+json; profile="..." | JSON-LD with explicit profile |
| application/cbor | dag-cbor |
| application/json | Plain JSON (compact, no @context expansion) |
| application/sx | Canonical SX wire format |
| text/html | HTML representation (for browsers; renders the artifact via SX) |
Same negotiation applies to /artifacts/<cid>, /activities/<uuid>,
/projections/<name>. Servers MUST honour the request; absent Accept defaults to
application/activity+json.
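A simplified sketch of the Accept dispatch. It handles q-values and wildcards but is not a full HTTP Accept parser (for instance, it would mis-split a quoted parameter containing a comma). Returning None for an unsatisfiable header is an assumption of this sketch; the text does not specify the failure response.

```python
DEFAULT = "application/activity+json"
SUPPORTED = [
    "application/activity+json",
    "application/ld+json",
    "application/cbor",
    "application/json",
    "application/sx",
    "text/html",
]

def negotiate(accept):
    """Pick a representation for an Accept header; None if nothing acceptable."""
    if not accept or not accept.strip():
        return DEFAULT                          # absent Accept
    candidates = []
    for position, part in enumerate(accept.split(",")):
        fields = [f.strip() for f in part.split(";")]
        media = fields[0].lower()
        q = 1.0
        for param in fields[1:]:
            if param.lower().startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        candidates.append((-q, position, media))
    # Highest q first; ties broken by the order the client listed them.
    for neg_q, _, media in sorted(candidates):
        if neg_q == 0:
            continue                            # q=0 means "not acceptable"
        if media in SUPPORTED:
            return media
        if media in ("*/*", "application/*"):
            return DEFAULT                      # ambiguous Accept
    return None

choice = negotiate("text/html;q=0.5, application/sx")
```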
16.4 Pagination
Cursor-based via AP's OrderedCollectionPage:
GET /actors/giles/outbox
→ {
"type": "OrderedCollection",
"totalItems": 12345,
"first": "/actors/giles/outbox?page=true",
"last": "/actors/giles/outbox?page=true&min_id=0"
}
GET /actors/giles/outbox?page=true
→ {
"type": "OrderedCollectionPage",
"id": "...?page=true",
"next": "...?page=true&max_id=<cid>",
"prev": "...?page=true&min_id=<cid>",
"orderedItems": [...]
}
Cursors are CIDs of the boundary activity (not opaque tokens). Stable across
restarts and instances. max_id returns activities before the cursor (newest
first); min_id returns activities after the cursor.
Default page size: 50. Max: 1000. Link: <...>; rel="next" header also provided
for HTTP-native pagination.
For projections: same shape, items are projection entries.
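The cursor semantics can be sketched over an in-memory list. This assumes the collection is held newest-first as a list of activity CIDs; a real implementation would read from the log or an index, and the page function is a hypothetical name.

```python
def page(items, max_id=None, min_id=None, size=50, max_size=1000):
    """items: activity CIDs, newest first. Returns (page_items, next_cursor, prev_cursor)."""
    size = max(1, min(size, max_size))
    if max_id is not None:
        # Activities before the cursor in time, i.e. after it in a newest-first list.
        start = items.index(max_id) + 1
        window = items[start:start + size]
    elif min_id is not None:
        # Activities after the cursor in time: the slice just ahead of it.
        end = items.index(min_id)
        window = items[max(0, end - size):end]
    else:
        window = items[:size]
    next_cursor = window[-1] if window and window[-1] != items[-1] else None
    prev_cursor = window[0] if window and window[0] != items[0] else None
    return window, next_cursor, prev_cursor

outbox = ["c5", "c4", "c3", "c2", "c1"]          # c5 is the newest
first_page = page(outbox, size=2)
second_page = page(outbox, max_id=first_page[1], size=2)
```

An unknown cursor raises ValueError here; a server would map that to a client error rather than guess a position.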
16.5 The query API
POST /query takes an SX expression evaluated in pure mode against named
projections:
POST /query
Content-Type: application/sx
Accept: application/sx
(let ((actors (projection actor-state))
(pins (projection pin-state)))
(for-each ([(actor-id actor) actors])
(when (> (count (filter (fn ((path cid)) (= (:owner cid) actor-id)) pins)) 10)
{:actor (:preferredUsername actor)
:pins-published (count ...)})))
Query semantics:
- Evaluated in pure sandbox; all the determinism rules apply.
- Projection access is read-only and snapshot-consistent: the query sees state as-of the time of the request (or ?at= if specified).
- Result is serialized in the negotiated content type.
- Gas limit applies (default 1M units per query, tunable by operator).
- Cacheable: query CID + projection state CIDs uniquely determine the result.
Query results can themselves be published as Create{QueryResult} activities,
making derived analyses federable.
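Cacheability follows from purity: the same query over the same projection states must give the same result, so the pair can serve as a cache key. A minimal sketch, where the key format and the query_cache_key helper are assumptions rather than anything the design specifies:

```python
import hashlib

def query_cache_key(query_cid: str, projection_state_cids: dict) -> str:
    """Derive a cache key from the query CID and the state CID of each projection read."""
    h = hashlib.sha256()
    h.update(query_cid.encode("utf-8") + b"\0")
    # Sort by projection name so dict ordering cannot change the key.
    for name in sorted(projection_state_cids):
        h.update(name.encode("utf-8") + b"\0")
        h.update(projection_state_cids[name].encode("utf-8") + b"\0")
    return h.hexdigest()

key = query_cache_key(
    "bafy-query",
    {"actor-state": "bafy-actors-v7", "pin-state": "bafy-pins-v3"},
)
```

Any fold that advances a projection changes its state CID, which invalidates the cached result without explicit bookkeeping.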
16.6 Errors
Uniform JSON error envelope:
{
"error": {
"type": "https://next.rose-ash.com/ns/fed-sx/errors/v1#InvalidSignature",
"status": 401,
"title": "Activity signature invalid",
"detail": "Key id 'https://example/actors/x#key-1' was superseded at 2026-01-15T...",
"activity-id": "https://...",
"key-id": "...#key-1",
"instance": "/incidents/<incident-cid>"
}
}
Error types are URIs in the fed-sx namespace; receivers can check type for
programmatic handling. Standard errors:
- MissingCapability: includes missing array of CIDs
- SchemaViolation: includes schema-cid, field-path, expected, got
- InvalidSignature
- Quarantined: includes quarantine-id for operator-status tracking
- RateLimited: includes retry-after
- ResourceExhausted: for query gas exhaustion
16.7 Streaming details
SSE event format:
event: activity
id: <activity-cid>
data: { ...activity envelope... }
event: delta
id: <activity-cid that triggered the delta>
data: {"projection": "actor-state", "key": "...", "old": ..., "new": ...}
event: heartbeat
data: {"projected-up-to": "<cid>", "ts": "..."}
Clients reconnect with Last-Event-ID: <cid> to resume from the last event seen.
Server replays from that point in the log (or returns 410 if too far behind, in
which case client should switch to paginated pull).
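A sketch of the framing and the resume decision. The max_replay threshold is a hypothetical knob; the text only says the server returns 410 when the client is "too far behind".

```python
import json

def sse_event(event: str, data: dict, event_id=None) -> str:
    """Frame one Server-Sent Event; a blank line terminates the event."""
    lines = [f"event: {event}"]
    if event_id is not None:
        lines.append(f"id: {event_id}")
    # Compact JSON never contains a raw newline, so one data line is enough.
    lines.append("data: " + json.dumps(data, separators=(",", ":")))
    return "\n".join(lines) + "\n\n"

def resume_from(log_cids, last_event_id, max_replay=1000):
    """Return (status, cids_to_replay) for a reconnect carrying Last-Event-ID."""
    if last_event_id not in log_cids:
        return 410, []
    pending = log_cids[log_cids.index(last_event_id) + 1:]
    if len(pending) > max_replay:
        return 410, []      # client should fall back to paginated pull
    return 200, pending

frame = sse_event("activity", {"type": "Note"}, event_id="bafy-a2")
status, replay = resume_from(["bafy-a1", "bafy-a2", "bafy-a3"], "bafy-a1")
```

Using the activity CID as the event id is what lets the same cursor work for both SSE resume and paginated pull.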
16.8 Versioning
The substrate is versioned at three levels:
- Envelope version: declared in /.well-known/sx-capabilities. Currently 1. Forward-compatible (new fields OK; semantics fixed).
- API version: URL prefix optional: /v1/... works the same as /.... Future major version: /v2/... paths in parallel.
- Definition versions: supersession via activity log (§§9.2, 12.7). No special URL handling.
Capability negotiation happens before federation; clients shouldn't hard-code URL paths beyond the canonical set documented here.
16.9 Operational implications
- The API is small but layered. AP compatibility is one layer; fed-sx extensions are another; both share auth and content negotiation. Adding a new endpoint shouldn't require new transport machinery.
- Content negotiation is the polyglot bridge. Same artifact addressable in JSON-LD (for AP peers), dag-cbor (for fed-sx peers), SX (for SX clients), HTML (for humans). One CID, four representations.
- Cursor pagination is CID-based. Stable identifiers, no opaque tokens to invalidate, peers can synchronize without coordination.
- The query API is a load-bearing differentiator. Datalog/GraphQL-equivalent expressiveness with no separate query language — it's just SX. Federable, signable, versionable like any other SX artifact.
17. Implementation languages
Polyglot authoring, monoglot runtime: every language-on-SX compiles to core
SX and runs on any host with the SX evaluator. The language is an authoring choice;
the federated artifact is uniform SX. Authors of Define* artifacts pick the
source language they prefer; consumers don't need that compiler installed to
execute the compiled SX.
Languages are picked because they genuinely fit the problem, not to demonstrate the polyglot story. Where a chosen language has gaps (e.g. Erlang-on-SX missing hot reload), we invest in maturing the port rather than working around the gap.
17.1 The v1 stack
| Layer | Language | Why |
|---|---|---|
| Native primitives | OCaml (existing runtime) | Crypto (RSA, Ed25519, SHA), dag-cbor encode/decode, HTTP socket, file IO, SQLite. Surfaced as Erlang-on-SX BIFs. |
| Kernel orchestration | Erlang-on-SX | Actor model = federation. gen_server per actor / per projection / per peer. supervisor for delivery workers. Message passing is literally the substrate. Hot code reload (Phase 7) for Define* live extension. |
| Query API back-end | Datalog-on-SX | Projection state is relational; trust graph walks, provenance, projection joins are textbook Datalog. Already mature (276/276 tests, full core Datalog with stratified negation, aggregation, magic sets, federation-graph demo). |
| Define* semantics, schemas, validators, codecs, audience predicates | Core SX | The canonical federated language. Everything content-addressed and federated lives here. |
17.2 Languages explicitly not booked for v1
Available, mature, considered — would be reached for if a real fed-sx need surfaced, but no preemptive use:
- Haskell-on-SX (285/285 tests, 36 programs, type checker working) — for complex operator-authored extensions that benefit from typed pattern matching. Schemas in fed-sx are short predicates; types don't earn their keep here.
- Smalltalk-on-SX (625/629 tests, classic corpus running) — natural fit for a live operator dashboard / Glamorous-Toolkit-style introspection. v2/v3 territory; a browser UI likely wins for operator audiences.
- APL-on-SX — high-throughput batch reprojection if scalar SX folds become a bottleneck. Premature without measured need.
- JS-on-SX, Elm-on-SX — browser-side client SDK / viewer. v2.
- Common Lisp-on-SX, Forth-on-SX, Go-on-SX, Dream-on-SX, Elixir-on-SX, Erlang-on-SX (alternative form) — case by case if a use case appears.
17.3 The FFI BIF layer
Erlang-on-SX has no FFI / NIF mechanism in its current form (Phase 6 plan: "out of
scope entirely"). fed-sx adds a BIF layer in lib/erlang/transpile.sx (or a
dedicated lib/erlang/fed_bifs.sx) exposing native primitives:
crypto:rsa_verify/3 crypto:ed25519_verify/3
crypto:sha2_256/1 crypto:sha3_256/1
cid:cbor_encode/1 cid:cbor_decode/1
cid:multihash/2 cid:from_bytes/2
cid:to_string/1 cid:from_string/1
log:append/2 log:read/3
log:tip/1 log:replay/3
http:listen/2 http:request/2
http:respond/3 http:sse_send/2
fs:read/1 fs:write/2
fs:exists/1 fs:list/1
sqlite:open/1 sqlite:exec/2
sqlite:query/3 sqlite:close/1
snapshot:put/3 snapshot:get/2
Each BIF is a thin Erlang-on-SX function dispatching to the corresponding SX runtime
IO primitive. Returns Erlang-shaped values (atoms, tuples, binaries). Errors raise
appropriate Erlang exceptions (badarg, enoent, eacces).
This is the only native-FFI surface in fed-sx. All other I/O goes through these BIFs. Operators can audit the BIF list to know exactly what the substrate touches outside SX.
17.4 Build pipeline
.sx files (core SX, registry entries) ──┐
.erl files (Erlang-on-SX kernel) ──┼──> compile to core SX
.dl files (Datalog-on-SX queries) ──┘
│
content-addressed SX artifacts
│
▼
genesis bundle (CID-verified)
│
▼
OCaml runtime evaluates everything
Each authoring language's compiler runs at build time, producing core SX that goes into the genesis bundle (for bootstrap definitions) or gets published as activities (for runtime extensions).
17.5 Prerequisite work
Three pieces of investment land in or alongside the Erlang-on-SX loop. The first two land before fed-sx kernel code starts; the third runs in parallel, not blocking milestone 1, but blocking production-grade throughput.
- Phase 7: hot code reload. code:load_binary/3, gen_server code_change/3 callback dispatch, atomic module-version swap. Required for Define* live extension (no kernel restart to load new verbs). Reload-semantics choice (two-version coexistence vs single-version atomic swap with closure capture) decided during the work.
- Phase 8: FFI mechanism + initial BIFs. define-bif registration + term marshalling + error mapping, then BIFs for crypto:*, cid:* (dag-cbor), fs:*, http:*, sqlite:*. Required for fed-sx kernel to call native primitives. Lands before kernel code that calls them.
- Phase 9: specialized opcodes (the BEAM analog). Layered perf strategy:
  - Layer 1 (Phase 9, in scope): specialized bytecode opcodes that bypass the general-purpose CEK machine for hot Erlang operations. OP_PATTERN_TUPLE, OP_PERFORM/OP_HANDLE, OP_RECEIVE_SCAN, OP_SPAWN/OP_SEND, BIF dispatch table. Targets: 100k+ message hops/sec, 1M-process spawn under 30sec, roughly 1000-3000× speedup over the current general-purpose path.
  - Layer 2 (Phase 10, deferred): multi-core scheduler via OCaml 5 domains. Decided empirically after Layer 1 lands; likely unnecessary if Layer 1 alone hits target throughput.
  - Layer 3 (skipped): incremental tuning of the existing call/cc-based receive and env-copy-per-call machinery. Obsoleted by Layer 1; not pursued.

  Architectural note for Phase 9. Phase 9a (the opcode extension mechanism in hosts/ocaml/evaluator/) is out of scope for the Erlang loop: it's SX VM core, used by every language port that wants specialized opcodes. Designed in plans/sx-vm-opcode-extension.md; lands as a separate focused workstream (~1-2 weeks) owning hosts/. Phase 9b-9g (the actual Erlang opcodes in lib/erlang/vm/) are designed and tested against a stub dispatcher in the Erlang loop until 9a is available.

  Shared-opcode discipline. Opcodes Phase 9 produces that other language ports could plausibly use (pattern match, perform/handle, record access) become candidates for chiselling out to lib/guest/vm/ (the same lib/guest discipline, applied at the bytecode layer). Don't pre-extract; promote to lib/guest/vm/ when a second language port has an actual second use. The substrate accumulates a richer opcode surface over time as ports contribute, and every port benefits from every shared opcode (the structural advantage over BEAM, which is special-purpose-built for one language).

  fed-sx is not blocked by Phase 9. Milestone 1 ships on current Erlang-on-SX perf (which has 100-1000× headroom for a single demo instance). Phase 9 lands in parallel; by the time fed-sx needs production-grade throughput (federation hub use cases, milestone 2-3), Phase 9 is ready.
After Phases 7 and 8 land, fed-sx milestone 1 (kernel + registries + bootstrap entries + Pin smoke test + reactive application smoke test) becomes the next workstream. Phase 9 work continues in parallel.
18. Subscription model
Symmetric to the publish-side extensibility: just as DefineActivity registers what
kinds of things can be published, DefineSubscription registers what kinds of
patterns can be subscribed to. Follow becomes one standard subscription type
among many, not a hardcoded primitive.
18.1 The asymmetry being fixed
Without this, the substrate has rich publish-side extensibility (any new verb is a
DefineActivity) and one hardcoded subscription primitive (Follow). That
mirrors AP but it's an arbitrary limitation in a substrate where everything else
is registry-driven. Generalising restores symmetry.
18.2 The DefineSubscription shape
(activity 'Create
:object {:type "DefineSubscription"
:name "Follow" ; AP-standard
:schema (fn (sub) ; what params the sub takes
(and (cid? (-> sub :object))
(= "Person" (-> sub :object-type))))
:match (fn (subscription activity) ; pure-mode predicate
(= (-> subscription :object) (:actor activity)))
:delivery {:default :push
:modes [:push :pull :sse]
:digest-window nil}
:capabilities-required []}) ; some subs may need authority
Four mandatory parts:
- schema: pure-mode predicate validating subscription parameters at Subscribe time. Catches malformed subscriptions before they enter state.
- match: pure-mode predicate (subscription, activity) → bool. Decides whether a given activity is a hit for this subscription. Determinism rules apply (§11.2).
- delivery: supported modes (push to inbox / pull on demand / SSE streaming / batched digest). The subscription instance picks its preferred mode at Subscribe time from the supported set.
- capabilities-required: capability tokens the subscriber must hold (empty for public subs; populated for paywalled/gated/private streams).
18.3 The Subscribe verb
The bootstrap verb that activates a subscription:
(activity 'Subscribe
:object {:type "Follow" :object "https://alice.example/actors/alice"})
(activity 'Subscribe
:object {:type "Topic" :tag "climate-change"
:delivery :digest :digest-window "P1D"})
(activity 'Subscribe
:object {:type "CidWatch" :cid "bafy..."
:events [:supersede :endorse]})
(activity 'Subscribe
:object {:type "Predicate"
:pred '(fn (act) (and (= (:type act) "Note")
(string-contains? (-> act :object :content) "fed-sx")))})
Unsubscribe is Undo{Subscribe} — AP's standard pattern, retains audit.
18.4 Standard subscription types (defined later, not bootstrap)
Same status as the custom verbs in §6.2 — substrate accepts any subscription
type once a DefineSubscription artifact registers it. Standard set:
| Name | Params | Match semantics | Use case |
|---|---|---|---|
| Follow | {object: actor-id} | activity.actor == subscription.object | AP-standard actor following |
| Topic | {tag: string} | tag in activity.object.tags | Hashtag follows, RSS-like |
| CidWatch | {cid, events: [...]} | activity references cid AND activity.type in events | "Notify me when this artifact is updated/endorsed/forked" |
| PathWatch | {path, events: [...]} | activity is a Pin/Update of named path | "Notify me when domain:foo/bar/baz changes" |
| VerbFilter | {wraps: subscription-cid, types: [...]} | inner subscription matches AND activity.type in types | "Follow Alice but only Endorse activities" |
| TrustGraph | {root: actor-id, depth: int} | activity.actor reachable from root in trust graph at depth | Web-of-trust expansion |
| Predicate | {pred: sx-fn} | (pred activity) returns truthy | Escape hatch: most powerful, highest cost |
| Channel | {channel-id} | activity addresses or originates from channel | Multi-actor pooled streams |
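Three of the match semantics in the table, written as pure predicates. This is a Python illustration of the logic only; in fed-sx the match functions are SX. One simplification: VerbFilter's wraps field is a subscription CID in the table, whereas this sketch passes the already-resolved inner subscription and its match function directly.

```python
def match_follow(sub: dict, activity: dict) -> bool:
    """Follow: activity.actor == subscription.object"""
    return activity.get("actor") == sub["object"]

def match_topic(sub: dict, activity: dict) -> bool:
    """Topic: tag in activity.object.tags"""
    return sub["tag"] in activity.get("object", {}).get("tags", [])

def match_verb_filter(sub: dict, activity: dict, inner_match) -> bool:
    """VerbFilter: inner subscription matches AND activity.type in types"""
    return inner_match(sub["wraps"], activity) and activity.get("type") in sub["types"]

alice = "https://alice.example/actors/alice"
follow_alice = {"object": alice}
only_endorse = {"wraps": follow_alice, "types": ["Endorse"]}

note = {"type": "Create", "actor": alice, "object": {"tags": ["climate-change"]}}
endorse = {"type": "Endorse", "actor": alice, "object": {}}

hits = [
    match_follow(follow_alice, note),
    match_topic({"tag": "climate-change"}, note),
    match_verb_filter(only_endorse, note, match_follow),
    match_verb_filter(only_endorse, endorse, match_follow),
]
```

Each predicate reads only its two arguments, which is the property that lets every conforming instance compute the same set of matches.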
18.5 Match-fn execution location
The load-bearing question. Three choices, fed-sx adopts the hybrid model:
- Coarse filter on the publisher side: audience predicates (§8) decide who the activity is delivered to at all. This is mandatory and cheap (audience set is usually small and well-defined).
- Fine filter on the subscriber side: once an activity arrives in inbox, the subscriber's instance evaluates each active subscription's match-fn against it. Pure-mode evaluation (deterministic, gas-bounded). Activities matching one or more subscriptions enter the subscriber's projected state.
Why hybrid: publisher-side fine filtering would require the publisher to know every subscriber's match-fn (privacy-violating, scaling-killing). Subscriber-side filtering is wasteful only if the publisher's audience model is too coarse — which is the audience system's job to fix per §8.
18.6 Subscription state and storage
Active subscriptions are themselves projected state. A bootstrap projection
subscriptions (paralleling audience-graph for the inverse direction)
maintains:
{actor-id -> [{subscription-cid, type, params, mode, started-at}]}
Updated by Subscribe and Unsubscribe activities. Queryable like any other
projection (§16). Used by:
- The inbox dispatcher to know which match-fns to evaluate against incoming activities
- Triggers (§19) to know which activities to fire on
- Federation to advertise "here are the subscription types I currently subscribe to" (capability-style, opt-in)
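The subscriptions projection is an ordinary fold over the log. A minimal sketch of one fold step follows; the envelope field names (cid, published) and the fallback to push when no delivery mode is given are assumptions made for the example, and unsubscribe is modelled as Undo{Subscribe} per §18.3.

```python
def fold_subscriptions(state: dict, activity: dict) -> dict:
    """Pure fold step: returns a new state and never mutates the input."""
    actor = activity["actor"]
    current = state.get(actor, [])
    obj = activity.get("object", {})
    if activity["type"] == "Subscribe":
        entry = {
            "subscription-cid": activity["cid"],
            "type": obj["type"],
            "params": {k: v for k, v in obj.items() if k not in ("type", "delivery")},
            "mode": obj.get("delivery", "push"),
            "started-at": activity["published"],
        }
        return {**state, actor: current + [entry]}
    if activity["type"] == "Undo" and obj.get("type") == "Subscribe":
        remaining = [s for s in current if s["subscription-cid"] != obj["cid"]]
        return {**state, actor: remaining}
    return state    # unrelated activities leave the projection untouched

s0 = {}
s1 = fold_subscriptions(s0, {
    "type": "Subscribe", "actor": "giles", "cid": "bafy-sub-1",
    "published": "2026-01-01T00:00:00Z",
    "object": {"type": "Topic", "tag": "climate-change", "delivery": "digest"},
})
s2 = fold_subscriptions(s1, {
    "type": "Undo", "actor": "giles",
    "object": {"type": "Subscribe", "cid": "bafy-sub-1"},
})
```

Because each step builds a new dict, earlier states such as s1 remain valid values for snapshots and time-travel queries.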
18.7 Federation interactions
Subscriptions interact with federation in three ways:
- Discovery. Peer's /.well-known/sx-capabilities (§7) lists registered DefineSubscription CIDs, so subscribers know what they can ask for.
- Negotiation. A Subscribe activity carries capabilities-required; if the publisher's instance doesn't support the named subscription type, it responds with the standard 422 + missing-CIDs error (§14.2 #9). Subscriber can then deliver the bootstrapping DefineSubscription artifact and retry.
- Cross-instance match-fn. If subscriber and publisher both run the same conformance-tested SX evaluator, identical subscriptions match identically (cross-host equivalence, §11.8). This is what makes federated topic subscriptions reliable: every conforming instance computes the same set-of-matches for the same activity.
18.8 Operational implications
- The audience system handles "who do I send this to." The subscription system handles "what do I want to receive." They're complementary, not redundant.
- Subscription types can themselves evolve via supersession. New version of Topic with case-insensitive matching? Publish a new DefineSubscription, Supersede the old one. Existing subscriptions migrate at next match evaluation.
- Match-fn cost matters. A Predicate subscription with a slow predicate becomes a per-activity tax. Gas budgets (§11.5) bound the worst case; operators can disable expensive subscription types if needed.
- Subscriptions are signed messages. Audit, accountability, and revocation all work the same way as for activities, because subscriptions are activities.
19. Application model
The synthesis. With publish, subscribe, project, and trigger as registry-driven primitives, the substrate has everything needed to express distributed reactive applications as data — no native code, no kernel changes, no privileged runtime. Applications are themselves federated artifacts.
19.1 An application is a tuple of artifacts
Application = {
subscriptions : [DefineSubscription instances and their parameters],
triggers : [DefineTrigger registrations],
projections : [DefineProjection registrations],
storage : [DefineStorage registrations] (optional)
}
That tuple, signed and bundled, is the application. Installing one = following
the named actors / activating the named subscriptions + loading the Define*
CIDs into the local registry. Forking one = republishing the Define* with
Supersede over the bits you change.
19.2 The reactive loop
External actors Operator publishes activities
publish activities via this instance's actors
│ │
▼ ▼
┌─────────────────────────────────────────────┐
│ Inbound + outbound activities │
└────────────────────┬────────────────────────┘
│
▼
For each active subscription:
evaluate match-fn (pure mode)
│
┌─────────────┴─────────────┐
▼ ▼
Activity matches Activity does
a subscription not match
│ │
▼ ▼
Projections ← (silently dropped from
fold the activity this application's view;
│ may match other apps)
▼
Triggers fire on the
subscription's match
│
▼
Trigger then-sx runs
(effectful sandbox)
│
├──> updates local state (private projections)
├──> publishes new activity (via outbox)
└──> calls effectful primitives (HTTP, fs, etc.)
per declared capabilities
Three things happen on a match: state updates (projection), derived publishes (new activities), side effects (effectful primitives). Each is authorisation-gated by the trigger's declared capabilities.
19.3 Trigger semantics
DefineTrigger registers (when-subscription, then-sx, cascade-limit):
- when-subscription: references a subscription (by CID or by name). The trigger fires whenever that subscription matches an inbound or outbound activity. Multiple triggers can reference the same subscription.
- then-sx: function of (activity, subscription, env) → trigger-result. Runs in pure or effectful sandbox per declaration. Returns one or more of:
  - :publish [activity-spec ...] (request publish of derived activities)
  - :project [name → state-update ...] (request projection updates)
  - :effect [capability-call ...] (request effectful primitive calls)
  - :noop (observed but no action)
- cascade-limit: bounded depth for trigger cascades (§19.4).
A trigger is fundamentally a reactive rule: "when X happens, do Y." The substrate guarantees three things about Y: it never runs twice for the same X (firings are deduplicated by activity-CID); once the trigger has fired, delivery to its effects is durable, so on a given instance Y runs exactly once; and its cost is bounded (gas + cascade-limit).
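A minimal sketch of the deduplication half of that guarantee, in Python. The TriggerRunner class, the count_reaction then-fn, and the result keys are hypothetical, and the handled set lives in memory, so the durability half is deliberately not modelled here.

```python
class TriggerRunner:
    """Runs a then-fn at most once per activity CID (in-memory; durability not modelled)."""

    def __init__(self, then_fn):
        self.then_fn = then_fn
        self.handled = set()  # activity CIDs this trigger has already fired on

    def fire(self, activity: dict, subscription: dict, env: dict):
        cid = activity["cid"]
        if cid in self.handled:
            return None  # duplicate delivery: deduplicated by activity CID
        self.handled.add(cid)
        return self.then_fn(activity, subscription, env)


def count_reaction(activity, subscription, env):
    """Hypothetical then-fn: request one projection update and one derived publish."""
    if activity["type"] != "Like":
        return {"noop": True}
    return {
        "project": {"reaction-counts": {"target": activity["object"], "delta": 1}},
        "publish": [{"type": "Announce", "object": activity["cid"]}],
    }


runner = TriggerRunner(count_reaction)
like = {"cid": "cid-like-1", "type": "Like", "object": "cid-post-9"}
first = runner.fire(like, {"type": "Topic"}, {})
second = runner.fire(like, {"type": "Topic"}, {})  # same CID delivered again
```

Note that the then-fn only returns requests; applying them (publishing, projecting, calling effects) is left to the substrate, which is where capability checks would sit.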
19.4 Cascade control
A trigger that publishes activities can fire other triggers. Without limits, a single inbound activity could cascade across instances forever.
Each trigger declares cascade-limit: N (default 3). Each activity carries an
implicit cascade-depth field, incremented when it's the result of a trigger
firing. A trigger refuses to fire if cascade-depth > cascade-limit.
Cascade limits are local-only (operator policy, not federated). Defending against runaway cascades from peer instances is the operator's job; the substrate gives them the knob.
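The depth rule can be checked with a small simulation. The sketch below assumes activities are dicts, that a missing cascade-depth means 0, and that a single trigger keeps re-firing on its own output; may_fire, derive, and run_chain are names invented for the example.

```python
DEFAULT_CASCADE_LIMIT = 3  # default stated in the text


def may_fire(activity: dict, cascade_limit: int = DEFAULT_CASCADE_LIMIT) -> bool:
    """A trigger refuses to fire when cascade-depth > cascade-limit."""
    return activity.get("cascade-depth", 0) <= cascade_limit


def derive(parent: dict, spec: dict) -> dict:
    """Build a trigger-published activity one level deeper than its parent."""
    return {**spec, "cascade-depth": parent.get("cascade-depth", 0) + 1}


def run_chain(cascade_limit: int) -> tuple:
    """Follow a self-retriggering chain; return (firings, depth of the refused activity)."""
    activity = {"type": "Note"}  # externally authored, so depth defaults to 0
    firings = 0
    while may_fire(activity, cascade_limit):
        firings += 1
        activity = derive(activity, {"type": "Note"})
    return firings, activity["cascade-depth"]


print(run_chain(3))  # (4, 4)
```

Read literally, the rule lets an activity at depth equal to the limit still fire, so a limit of 3 allows four firings in a chain that starts from an external activity. If the intent is three, the comparison would need to be >= instead.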
19.5 The DefineApplication bundle
A bundle artifact that names and groups the components of an application:
(activity 'Create
:object {:type "DefineApplication"
:name "rose-ash-blog"
:version 1
:subscriptions [{:type "Follow" :object "https://blog.rose-ash.com/actors/main"}
{:type "Topic" :tag "rose-ash"}
{:type "CidWatch" :cid <rose-ash-template-cid>
:events [:supersede]}]
:triggers [<comment-moderation-trigger-cid>
<reaction-counter-trigger-cid>
<rss-republish-trigger-cid>]
:projections [<comment-thread-projection-cid>
<reaction-counts-projection-cid>]
:storage [<local-files-storage-cid>]
:capabilities [<http-allowlist-cap-cid>
<fs-write-cap-cid>]
:description "Federated blog with moderated comments and RSS"})
Three operations on applications, all themselves activities:
- Install: Subscribe to each subscription, Create{} references in define-registry to each trigger/projection/storage CID. One activity per reference, audited and replayable. Or: a single Install{DefineApplication} meta-verb that does the bundle in one signed step (defined later as a custom verb, not bootstrap).
- Update: publish a new DefineApplication with the same name + supersedes pointing at the old. Diff-then-apply: subscriptions added/removed, triggers loaded/unloaded, projections reprojected per §10.5.
- Fork: publish a new DefineApplication referencing the original's CID via forked-from, with whatever Define* CIDs you want to swap. Run alongside the original or in place of it.
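The diff-then-apply step of an Update could be sketched as a plain comparison of two bundle versions. The code below is illustrative: diff_bundle and the placeholder CIDs are assumptions, bundles are modelled as dicts keyed like the DefineApplication object, and only the diff is shown, not the apply.

```python
COMPONENT_KEYS = ("subscriptions", "triggers", "projections", "storage")


def diff_bundle(old: dict, new: dict) -> dict:
    """Per component list, work out what an Update must add and what it must remove."""
    plan = {}
    for key in COMPONENT_KEYS:
        before, after = old.get(key, []), new.get(key, [])
        plan[key] = {
            "add": [item for item in after if item not in before],
            "remove": [item for item in before if item not in after],
        }
    return plan


v1 = {"name": "rose-ash-blog", "version": 1,
      "subscriptions": [{"type": "Topic", "tag": "rose-ash"}],
      "triggers": ["cid-moderation", "cid-rss"],
      "projections": ["cid-threads"]}
v2 = {"name": "rose-ash-blog", "version": 2, "supersedes": "cid-of-v1",
      "subscriptions": [{"type": "Topic", "tag": "rose-ash"}],
      "triggers": ["cid-moderation", "cid-reactions"],
      "projections": ["cid-threads", "cid-reaction-counts"]}

plan = diff_bundle(v1, v2)
```

Since components are referenced by CID, an unchanged CID means an unchanged component, so the plan touches only what actually differs between the two versions.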
19.6 Per-application namespacing
Multiple applications running on one instance need isolation:
- Projections are namespaced by application. pin-state from app A is distinct from pin-state from app B; both are addressable as /projections/<app-name>/pin-state.
- Triggers fire only on subscriptions belonging to their application. App A's trigger doesn't see app B's subscription matches.
- Storage backends are namespaced. App A's files-on-disk backend writes to data/apps/A/objects/; app B writes to data/apps/B/objects/.
- Capabilities are per-application. Granting http-client to app A doesn't grant it to app B. Operator can audit per-app capability surface and revoke selectively.
Cross-application reads are explicit and require a capability grant
(read-projection: <app>/<projection>). Default isolation; opt-in sharing.
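The namespacing rules reduce to a few path and lookup conventions. The Python sketch below assumes grants are stored as a map from application name to a set of capability strings in the read-projection form shown above; the helper names are hypothetical.

```python
def projection_path(app: str, name: str) -> str:
    """Address of a projection inside its application's namespace."""
    return f"/projections/{app}/{name}"


def storage_root(app: str) -> str:
    """Directory a files-on-disk backend writes to for the given application."""
    return f"data/apps/{app}/objects/"


def can_read(reader: str, owner: str, projection: str, grants: dict) -> bool:
    """Same-app reads are always allowed; cross-app reads need an explicit grant."""
    if reader == owner:
        return True
    return f"read-projection: {owner}/{projection}" in grants.get(reader, set())


# App B has been granted read access to app A's pin-state, and nothing else.
grants = {"B": {"read-projection: A/pin-state"}}

print(projection_path("A", "pin-state"))
print(can_read("B", "A", "pin-state", grants), can_read("A", "B", "pin-state", grants))
```

Grants are directional: B reading A's projection says nothing about A reading B's, which keeps the default at isolation.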
19.7 Worked examples
Example A — Blog with moderated comments
DefineApplication "blog-with-comments":
subscriptions:
- Follow: <author-actor>
- Topic: "post-comment" (filter: object.in-reply-to in our-posts)
triggers:
- on Topic match → publish Note (the new comment, derived if approved)
→ projection pending-moderation
- on inbound Approve{Reply} → projection comment-thread (visible)
projections:
- comment-thread: post-cid → [approved comment activities]
- pending-moderation: list of pending replies awaiting approval
Example B — Continuous integration
DefineApplication "ci-pipeline":
subscriptions:
- Follow: <developer-actor>
- VerbFilter: wraps Follow, types: [Push]
triggers:
- on Push match → effect: run build (capability: subprocess + fs-write)
→ publish Build{source: Push.cid, output: <build-cid>, status}
- on Build{status: success} → effect: run tests
→ publish Test{...}
- on (Test{passed} count for N days) → publish Release{...}
projections:
- build-history: commit-cid → [build activities]
- release-history: ordered list of Release activities
Example C — Distributed code review
DefineApplication "code-review":
subscriptions:
- Topic: "review-request"
- CidWatch: <organisation-actor>, events: [Endorse]
triggers:
- on review-request match → projection review-queue
→ effect: notify-reviewer
- on Endorse from authorised reviewer → publish Approve{review-cid}
→ projection approval-state
projections:
- review-queue: ordered list of pending requests with summaries
- approval-state: review-cid → endorsement set
In all three: the application is just the bundle of subscriptions, triggers, and projections. Federation makes them composable across instances. The substrate provides exactly-once-per-CID semantics and pure-mode determinism for the matches and folds.
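To make the "pure fold" part concrete, here is one way Example A's two projections might be computed, sketched in Python rather than SX. The activity shapes are assumptions, and for brevity the state stores CIDs instead of whole activities and keeps pending replies in a map rather than a list.

```python
from functools import reduce

EMPTY = {"pending-moderation": {}, "comment-thread": {}}


def fold_comments(state: dict, activity: dict) -> dict:
    """Pure fold step for Example A: replies wait in pending until an Approve names them."""
    pending = dict(state["pending-moderation"])
    threads = {post: list(cids) for post, cids in state["comment-thread"].items()}
    if activity["type"] == "Note" and "in-reply-to" in activity:
        pending[activity["cid"]] = activity["in-reply-to"]
    elif activity["type"] == "Approve" and activity["object"] in pending:
        post = pending.pop(activity["object"])
        threads.setdefault(post, []).append(activity["object"])
    return {"pending-moderation": pending, "comment-thread": threads}


log = [
    {"type": "Note", "cid": "cid-reply-1", "in-reply-to": "cid-post"},
    {"type": "Note", "cid": "cid-reply-2", "in-reply-to": "cid-post"},
    {"type": "Approve", "cid": "cid-approve-1", "object": "cid-reply-1"},
]
state = reduce(fold_comments, log, EMPTY)
```

An Approve for a reply that was never seen is ignored, so replaying a partial log cannot make an unmoderated comment visible.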
19.8 Composition and discovery
Applications are themselves federated content. This means:
- App registries — actors can publish curated lists of applications they endorse. Discovery becomes follow-an-actor + browse-their-app-list.
- Cross-app composition — application A publishes derived activities that application B subscribes to. Pipeline of applications via the activity log.
- App marketplaces — pin a friendly path to a
DefineApplication CID (rose-ash.com:apps/blog → bafy...) for human discoverability.
None of this requires kernel changes. It's all activities about activities.
19.9 Operational implications
- Applications are inspectable from the activity log alone. Replay an actor's outbox and you can reconstruct the exact application installation state at any point in time.
- Application updates are atomic relative to the activity log. Either the Update{DefineApplication} succeeded (new state visible from next activity) or it didn't (old state continues). No partial-update window.
- Forking is the same as installing a copy. No special "fork" mechanism needed; the activity-log mechanics already support it.
- Per-app capabilities are a real security surface. Operators must understand what they're granting when they install. The bundle's capabilities list is the audit point and should be human-readable and reviewable before installation.
- The substrate isn't an "application platform"; it's an "application substrate." Applications aren't installed on fed-sx; they're expressed in fed-sx, as the same kind of content as everything else.
Appendix A: relationship to adjacent systems
Worth knowing about so we can borrow good ideas:
- ATproto / Bluesky — Lexicons (schemas) + repos (per-actor signed merkle trees). Closest in spirit. We borrow the schema-as-data idea; we differ by making schemas themselves federated activities, not central registry entries.
- Spritely Goblins — capability-secure actors. We borrow the capability-token pattern for delegation.
- Ceramic — signed event streams, content-addressed. Similar log-as-state model; we differ by making the projection function pluggable per-stream rather than hardcoded per-streamtype.
- Holochain — agent-centric DHT. We share the "every agent has their own log" shape; we use AP federation instead of DHT.
- Farcaster — pubsub on hubs. We share the firehose model; we add cryptographic outbox-as-source-of-truth.
None of them are code-as-data the whole way down — that's the SX-distinctive bit. Handlers, validators, projections aren't bytecode shipped out-of-band; they're SX in the same log as everything else, evaluable by any host that speaks SX.
Appendix B: implications worth sitting with
- Deployment dissolves. Releasing a feature = publishing DefineActivity{name: "Whatever", ...}. Federation distributes it. No build artifact, no rolling deploy, no version-skew between server and client.
- Applications are forkable by default. "Fork the rose-ash blog" = take the bundle of Define* CIDs that constitute it, publish your own with Supersede over the ones to change, run your own projector. Same federation graph, divergent state.
- Composition is by reference, not import. Pin activity points at the CID of the DefineActivity{name: "Pin"}. No package manager, no transitive deps, no lockfiles.
- The boundary between "user" and "developer" softens. Both publish signed activities. Power users can publish handlers, projections, sig suites under their own actor.
- This is more ambitious than a rose-ash rewrite. It's a substrate that happens to host rose-ash as its first application.
Appendix C: AI agent collaboration patterns
The substrate is incidentally well-shaped for one of the open problems of the next decade: infrastructure for AI agent collaboration where contributions are signed federated artifacts, behavior is bounded by declared capabilities, decisions are audit-by-replay, and infrastructure improves through agent contribution within a web of trust.
This is not a designed-for use case — fed-sx was conceived as a federated publishing and reactive application substrate. But the properties it has fit agent collaboration almost exactly. Worth being deliberate about, because the framing changes who fed-sx is for.
Why the substrate fits agent collaboration
AI agents need infrastructure where contributions are first-class artifacts, not pull requests against human-controlled repos. Currently agents squeeze through GitHub PRs, deployment pipelines, npm publishes — all of which assume a human in the loop. fed-sx is shaped for direct contribution:
- Direct authoring of substrate features. An agent doesn't propose a feature, it publishes one. A DefineActivity artifact is the agent's contribution. A DefineProjection is its analysis. A DefineTrigger is its automation. The signed publication IS the deploy: no PR review, no CI, no DevOps.
- Cryptographic identity without registration. Agents have actor keys; reputation is the endorsement graph; trust is provable by signature chain. Two agents that have never met can verify each other's contributions cryptographically.
- Capability-bounded autonomy. An agent declares capabilities-required on its activities. A trigger says "I publish to path-prefix /agent-x/* and call http-client for api.example.com/*." Receivers verify the constraint cryptographically; the agent can't escape its declared surface even if the agent itself is misaligned. The sandbox model (§11) is designed for autonomous code.
- Audit-by-replay applied to AI behavior. Every AI decision is reconstructable, deterministically, by anyone with the log. "Why did agent A do X?" Replay the log to that moment and see the activities A subscribed to, the projection state it observed, the trigger that fired, and the activity it published. Fundamentally better than today's "trust the model" posture.
- Composition without coordination. Agent A publishes a moderation validator. Agent B subscribes and uses it. Agent C improves it, supersedes A's. B sees the supersession, decides whether to adopt. No central registry, no maintainer to coordinate with, no version skew.
- Disagreement is visible, not hidden. If agents A and B compute the same projection over the same log and produce different snapshot CIDs, the disagreement is cryptographically observable. Today, two AI services answering the same question with different answers is invisible until somebody notices.
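The disagreement check in the last point can be illustrated with ordinary hashing. In the sketch below a SHA-256 digest over canonical JSON stands in for a real snapshot CID (the design's default codec is dag-cbor, which is not used here), and the two projection functions are invented for the example.

```python
import hashlib
import json
from functools import reduce


def snapshot_digest(state: dict) -> str:
    """Hash a canonical encoding of the state (a stand-in for a real snapshot CID)."""
    canonical = json.dumps(state, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def count_by_type(state: dict, activity: dict) -> dict:
    """Reference projection: count activities per type."""
    kind = activity["type"]
    return {**state, kind: state.get(kind, 0) + 1}


def count_skipping_likes(state: dict, activity: dict) -> dict:
    """A deliberately divergent projection, standing in for a buggy or dishonest agent."""
    return state if activity["type"] == "Like" else count_by_type(state, activity)


log = [{"type": "Create"}, {"type": "Like"}, {"type": "Create"}]
agent_a = snapshot_digest(reduce(count_by_type, log, {}))
agent_b = snapshot_digest(reduce(count_by_type, log, {}))
agent_c = snapshot_digest(reduce(count_skipping_likes, log, {}))

print(agent_a == agent_b, agent_a == agent_c)
```

Agents A and B agree because they folded the same log with the same function; C's digest differs, so anyone comparing the published digests can see the disagreement without inspecting C's internals.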
Dynamics that emerge
- Agent specialisation = publication. "I'm the indexing agent" = publishes DefineProjection artifacts. "I'm the moderation agent" = publishes DefineValidator artifacts. "I'm the matchmaking agent" = publishes a DefineApplication for marketplace subscriptions and triggers. Specialisation is content, not service deployment.
- Reputation = endorsement graph. Web of trust applied to agent contributions. Bad actors get cut out organically; no central authority to capture.
- Forking = explicit disagreement resolution. Agents disagree on validation? Both publish their DefineValidators. Subscribers pick. The fork is signed, observable, recoverable. Compare today: when AI services have different rules, one is just invisibly applied.
- Cascade limits = agent population safety. The cascade-depth and cascade-limit (§19.4) become the bounded-autonomy guard rails for agent populations. Self-coordination without runaway-cascade across the substrate.
- Self-improving infrastructure. Agents observe substrate behavior, propose improvements as DefineProjection for monitoring, DefineTrigger for automation. The substrate itself improves through agent contribution, not through a release cycle. Every improvement is signed and traceable.
Use cases
- Agent-managed scientific datasets: collection, cleaning, analysis, publication, peer review by other agents, all signed activities. Replication is replay; provenance is built in.
- Multi-agent code maintenance: agents observing repos (subscribe to Push), running tests (triggers), proposing fixes (Pull-equivalent activities), endorsing each other's work.
- Agent-curated knowledge: agents publish, endorse, and supersede knowledge artifacts. Truth accumulates via the trust graph; outdated info gets Superseded explicitly.
- Distributed agent marketplaces: agents publish capabilities, subscribers find them via Topic/Predicate subscriptions, contracts via signed activity exchange.
- Cross-agent AI safety monitoring: monitoring agents subscribe to other agents' outboxes, run validators, publish Alert activities when patterns of concern appear. Decentralised oversight without central authority.
- Cross-org agent workflow coordination: supply chain, healthcare, legal. Multiple specialised agents coordinating across organisational boundaries with cryptographic provenance.
Safety and governance properties
The substrate provides several properties AI safety has been asking for and that current infrastructure does not provide:
- Every action is signed. Attribution is cryptographic, not a log file an agent could spoof.
- Capabilities are declared and enforced. Agents operate within their declared sandbox; can't grow capabilities silently.
- Cascades are bounded. No exponential agent-on-agent feedback loops without explicit configuration.
- Audit is replay. Every decision can be reconstructed deterministically; no opaque "the model decided" moments.
- Disagreement is visible. Two agents producing different projections of the same data is a cryptographically-detectable event, not invisible drift.
- Trust is the endorsement graph, not central authority. No single point of capture or coercion.
- Forks are first-class. When safety-critical disagreements occur, the substrate accommodates them without forcing a winner; observers see all positions.
What this implies for the project
- Milestone 1's smoke tests remain right — the verb-extensibility and reactive-application proofs apply to agent contributions exactly as they apply to human contributions. The agent collaboration framing doesn't require new mechanisms; it interprets the existing mechanisms differently.
- The application model (§§18-19) is the headline story for this audience, not a layer on top. Subscriptions + triggers + projections + capabilities = agent collaboration primitives.
- Capability discovery and trust dynamics gain weight earlier. Where human-driven applications can rely on operator policy, agent-driven populations need the trust graph to be operational from milestone 2.
- The pitch line evolves. Less "ActivityPub for code" / "rose-ash next gen," more "infrastructure for AI agent collaboration with cryptographic provenance, bounded autonomy, and audit-by-replay." The technical substance is unchanged; the framing of who needs this changes substantially.
The substrate happens to be well-shaped for what may be one of the most important software-distribution problems of the next decade; that accident is worth being deliberate about.