fed-sx-m2: Step 8b-pure — retry-time bookkeeping + 11 tests + 2 Blockers

delivery_worker state shape gains :next_retry proplist alongside
the existing :attempts:

  [{peer, _}, {pending, _}, {attempts, [{Cid, N}]},
   {next_retry, [{Cid, NextRetryAt}]}, {dead_letter, _},
   {dispatch_fn, _}]

New pure-functional exports:
  record_failure_pure/3(Cid, Now, State)
      Bumps :attempts for Cid. On the 6th failure
      (backoff_for returns dead_letter) moves the matching
      activity from :pending to :dead_letter and clears the
      :next_retry entry. Otherwise sets next_retry to
      Now + backoff_for(NewAttempts).
  record_success_pure/2(Cid, State)
      Clears both :attempts and :next_retry for Cid.
  next_due_pure/2(Now, State)
      Returns cids whose retry time has passed (insertion
      order preserved so the worker drains in FIFO retry
      order).
  attempts_for/2, next_retry_at/2, dead_letter_list/1
      Read-side accessors.

Internal helpers move_to_dead_letter/2 and take_by_cid/4 walk
:pending to find the matching activity by cid.
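
A minimal sketch of the bookkeeping described above, written against
stock Erlang/OTP rather than the port. It is not the code in this
commit. The module name retry_sketch, the new/1 constructor, the
{Cid, Payload} activity shape, the lists:partition pass standing in for
take_by_cid/4, the backoff values for attempts 3 to 5, and the inclusive
At =< Now comparison are all assumptions. Only the 30 s and 300 s slots
and the dead-letter on the 6th failure come from this commit.

```erlang
-module(retry_sketch).
-export([new/1, record_failure_pure/3, record_success_pure/2,
         next_due_pure/2, attempts_for/2, next_retry_at/2,
         dead_letter_list/1]).

%% Backoff in seconds per attempt count. 30 and 300 are taken from the
%% test list in this commit; attempts 3..5 are placeholder values.
backoff_for(1) -> 30;
backoff_for(2) -> 300;
backoff_for(3) -> 1800;    % assumed
backoff_for(4) -> 7200;    % assumed
backoff_for(5) -> 43200;   % assumed
backoff_for(_) -> dead_letter.

%% Fresh state. Activities are assumed to be {Cid, Payload} tuples.
new(Pending) ->
    [{peer, undefined}, {pending, Pending}, {attempts, []},
     {next_retry, []}, {dead_letter, []}, {dispatch_fn, undefined}].

%% Read / replace one top-level field of the state proplist.
field(Key, State) -> proplists:get_value(Key, State, []).
set_field(Key, Val, State) -> lists:keystore(Key, 1, State, {Key, Val}).

attempts_for(Cid, State) ->
    proplists:get_value(Cid, field(attempts, State), 0).

next_retry_at(Cid, State) ->
    proplists:get_value(Cid, field(next_retry, State), undefined).

dead_letter_list(State) -> field(dead_letter, State).

%% Bump the counter, then either schedule the next retry or dead-letter.
%% lists:keystore/4 replaces an existing entry in place and appends a
%% new one, so first-insertion order of cids is preserved.
record_failure_pure(Cid, Now, State) ->
    N = attempts_for(Cid, State) + 1,
    Attempts = lists:keystore(Cid, 1, field(attempts, State), {Cid, N}),
    S1 = set_field(attempts, Attempts, State),
    case backoff_for(N) of
        dead_letter ->
            move_to_dead_letter(Cid, S1);
        Delay ->
            Retry = lists:keystore(Cid, 1, field(next_retry, S1),
                                   {Cid, Now + Delay}),
            set_field(next_retry, Retry, S1)
    end.

%% Clear both the attempt counter and the retry time for Cid.
record_success_pure(Cid, State) ->
    S1 = set_field(attempts,
                   lists:keydelete(Cid, 1, field(attempts, State)), State),
    set_field(next_retry,
              lists:keydelete(Cid, 1, field(next_retry, S1)), S1).

%% Cids whose retry time is at or before Now, in :next_retry order.
next_due_pure(Now, State) ->
    [Cid || {Cid, At} <- field(next_retry, State), At =< Now].

%% Move the matching activity out of :pending and drop its retry entry.
move_to_dead_letter(Cid, State) ->
    {Hit, Rest} = lists:partition(fun({C, _}) -> C =:= Cid end,
                                  field(pending, State)),
    S1 = set_field(pending, Rest, State),
    S2 = set_field(dead_letter, field(dead_letter, S1) ++ Hit, S1),
    set_field(next_retry,
              lists:keydelete(Cid, 1, field(next_retry, S2)), S2).
```

Starting from retry_sketch:new([{c1, a}]), one failure at Now = 1000
gives attempts_for(c1, S) = 1 and next_retry_at(c1, S) = 1030. Six
failures in a row leave {c1, a} in dead_letter_list(S) with no
:next_retry entry for c1.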

11/11 in next/tests/delivery_retry.sh covering:
  - fresh state: 0 attempts / undefined retry / [] dead_letter
  - record_failure bumps to 1
  - record_failure sets next_retry_at = Now + 30 (slot 1)
  - second failure: attempts=2, NextRetryAt = Now + 300 (slot 2)
  - record_success clears both
  - next_due returns due cids
  - next_due empty before due
  - 6th failure -> dead-letter; activity out of :pending
  - dead-lettered cid removed from :next_retry
  - per-cid isolation: success on one doesn't disturb another

delivery_worker.sh 17/17 unchanged (new exports are additive).

Blockers added:
  #2 — Native http-request primitive missing in bin/sx_server.ml
       (briefing assumed it existed; only http-listen exists).
       Belongs to loops/fed-prims. Step 8e wrapper waits for
       the native.
  #3 — erlang:send_after-style timer primitive missing. Needed
       for the real retry loop. Belongs to loops/erlang. 8b-pure
       captures the semantics so 8b-timer is a 1-shot wiring
       when the primitive lands.

Conformance preserved at 761/761.
2026-06-07 02:04:23 +00:00
parent dda967e060
commit 8bf2b45cf9
3 changed files with 278 additions and 9 deletions

@@ -548,11 +548,24 @@ a dead-letter list visible via `/admin/dead-letter`.
Step 8f plugs in the live httpc call without touching the
queue logic. No actual HTTP yet; no retry timer wiring yet.
17/17 in `delivery_worker.sh`.
- [ ] **8b** — Retry / backoff scheduler. Wire `schedule_for/1`
into a private retry loop: `flush/1` returns deliveries that
failed; the worker schedules a self-cast via Erlang `after`
timer for the next retry slot. Tests fake-time via a Cfg
`:now_fn`.
- [x] **8b-pure** — Retry-time bookkeeping (pure-functional).
State shape gains `{next_retry, [{Cid, NextRetryAt}]}` alongside
the existing `:attempts`. New exports:
`record_failure_pure/3(Cid, Now, State)`,
`record_success_pure/2(Cid, State)`,
`next_due_pure/2(Now, State)`, `attempts_for/2`,
`next_retry_at/2`, `dead_letter_list/1`.
`record_failure_pure` bumps the attempt counter and computes
`Now + backoff_for(NewAttempts)` as the next retry; on the 6th
failure (`backoff_for` returns `dead_letter`) the matching
activity moves from `:pending` to `:dead_letter` and the cid
is cleared from `:next_retry`. `record_success_pure` clears
both. `next_due_pure` returns cids whose retry time has
passed. 11 cases in `delivery_retry.sh`.
- [ ] **8b-timer** — Erlang-side timer wiring (`erlang:send_after`
self-cast or equivalent). Needs the same substrate primitive
that `gen_server` uses for `timeout` returns. Deferred until
the substrate gap is closed; see Blockers.
- [ ] **8c** — Delivery-state projection so the queue survives
kernel restart. New `next/kernel/delivery_state.erl` fold maps
enqueue / delivered / failed events to the worker's persistent
@@ -569,10 +582,12 @@ a dead-letter list visible via `/admin/dead-letter`.
in `delivery_dispatch.sh` covering single-peer enqueue,
two-peer fan-out, missing-worker skip, no-flag no-op,
FIFO append across two publishes, empty delivery_set no-op.
- [ ] **8e** — `httpc:request/4` BIF wrapper in
`lib/erlang/runtime.sx` (the briefing's allowed scope
exception for Step 8). Marshalling: SX dict ↔ Erlang proplist
shape with `{ok, Status, Headers, Body}` / `{error, Reason}`.
- [ ] **8e** — `httpc:request/4` BIF wrapper. **Blocker:** the
briefing assumed a native `http-request` primitive existed in
`bin/sx_server.ml`; on inspection there's only `http-listen`.
The native http-CLIENT primitive belongs to `loops/fed-prims`
(host primitives loop); see the Blockers entry below. m2 work
continues with the in-process flow until the native lands.
- [ ] **8f** — Real HTTP dispatch through the BIF + content-type
wiring. dispatch_fn for live use becomes a closure over the
peer URL that calls `httpc:request/4` with the signed envelope
@@ -893,12 +908,52 @@ proceed.
until resolved. Confirmed pre-existing by stashing 1a's changes and
re-running on the unmodified m1 closeout HEAD.
2. **Native `http-request` (HTTP client) primitive missing** —
discovered during Step 8e prep. The fed-sx-m2 briefing
("Substrate available to you" §) claimed: "Native HTTP client
primitive (registered in `bin/sx_server.ml`): `http-request` —
exposed at the SX layer, currently native-only." On inspection
`bin/sx_server.ml` only registers `http-listen`; there is no
`http-request` registration. The HTTP client primitive belongs
to `loops/fed-prims` (host primitives loop) per the
one-primitive-loop-per-substrate convention. m2's Step 8e
wrapper (`httpc:request/4` BIF in `lib/erlang/runtime.sx`)
can land in a 1-line follow-up once the native exists; m2
work continues with 8b-pure / 8c / 8d in the in-process flow.
3. **`erlang:send_after`-style timer primitive** — discovered
during Step 8b prep. The retry loop needs a way for the
delivery_worker to wake itself up after `backoff_for(N)`
seconds. Erlang's `erlang:send_after/3` is the standard
primitive; this port doesn't seem to register it (looked at
how `gen_server` handles `timeout` returns — it's a
message-loop self-cast that needs a delayed send). Belongs
to `loops/erlang` (Erlang runtime substrate). m2 captures the
retry semantics pure-functionally in 8b-pure so 8b-timer
becomes a 1-shot wiring when the primitive lands.
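
For orientation, a hedged sketch of what the 8b-timer wiring in Blocker 3 could look like on stock Erlang/OTP, where `erlang:send_after/3` takes a delay in milliseconds. The module name `retry_timer_sketch`, the `{retry_due, NextRetryAt}` message shape, and the clamp to zero for overdue retries are assumptions, not part of this plan; the port may expose the primitive differently once it lands.

```erlang
-module(retry_timer_sketch).
-export([delay_ms/2, schedule_retry/3]).

%% Convert a retry deadline (seconds) into a non-negative delay in
%% milliseconds. Overdue deadlines fire immediately (assumed policy).
delay_ms(Now, NextRetryAt) ->
    max(0, NextRetryAt - Now) * 1000.

%% Ask the runtime to deliver a wake-up message to Worker when the
%% retry is due. Returns the timer reference from erlang:send_after/3.
schedule_retry(Worker, Now, NextRetryAt) ->
    erlang:send_after(delay_ms(Now, NextRetryAt), Worker,
                      {retry_due, NextRetryAt}).
```

On receiving `{retry_due, _}`, the worker would call `next_due_pure/2` with the current time and re-dispatch the returned cids, so the timer carries no state of its own.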
---
## Progress log
Newest first.
- **2026-06-07** — Step 8b-pure: retry-time bookkeeping.
`delivery_worker` state shape gains `:next_retry` proplist
alongside `:attempts`. `record_failure_pure/3(Cid, Now, State)`
bumps the per-cid counter and computes the next retry as
`Now + backoff_for(NewAttempts)`. On the 6th failure
(`backoff_for` returns `dead_letter`) the matching activity
moves from `:pending` to `:dead_letter`. `record_success_pure/2`
clears both `:attempts` and `:next_retry` for the cid.
`next_due_pure/2(Now, State)` returns the cids whose retry
time has passed (insertion order preserved). 11/11 in
`delivery_retry.sh`. 8b-timer (real timer wiring via
`erlang:send_after`-style primitive) and 8e
(`httpc:request/4` BIF) hit substrate gaps — Blockers entries
added pointing to loops/erlang + loops/fed-prims. Conformance
preserved at 761/761.
- **2026-06-07** — Step 8d: outbox dispatches delivery_set to
workers. `outbox:publish/2` gained `dispatch_deliveries/3` and
`enqueue_each/2`: after `log:append` + projection broadcast,