Merge loops/fed-sx-m2 into architecture: 8b-timer + send_after wiring

Closes m2's last open box. The delivery_worker now drives its retry
loop with erlang:send_after / cancel_timer self-casts: a failing flush
arms a per-Cid backoff timer; handle_info({retry, Cid}) redrives
that one Cid through deliver_one_pure; success clears its state, and
failure schedules the next slot or dead-letters on attempt 6.
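
For orientation, the shape of that loop can be sketched in stock
Erlang/OTP roughly as below. This is an illustrative sketch, not the
code in delivery_worker.erl: the module name retry_sketch, the fail/2
entry point, the map-based state, the injected DeliverFun and the
doubling backoff_ms/1 schedule are all assumptions made for the
example. The real worker keeps a `:timers [{Cid, Ref}]` field and
splits arming across arm_retry_timer and schedule_retry_for.

```erlang
-module(retry_sketch).
-behaviour(gen_server).

-export([start_link/1, fail/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

-define(MAX_ATTEMPTS, 6).

%% DeliverFun is a hypothetical stand-in for deliver_one_pure:
%% fun(Cid) -> ok | {error, Reason}.
start_link(DeliverFun) ->
    gen_server:start_link(?MODULE, DeliverFun, []).

%% Hypothetical entry point: report that the first delivery of Cid failed.
fail(Pid, Cid) ->
    gen_server:call(Pid, {failed, Cid}).

init(DeliverFun) ->
    {ok, #{deliver => DeliverFun, attempts => #{}, timers => #{}, dead => []}}.

handle_call({failed, Cid}, _From, State) ->
    {reply, ok, fail_and_rearm(Cid, State)}.

handle_cast(_Msg, State) ->
    {noreply, State}.

%% The timer message arrives as a plain message, so it lands in handle_info.
handle_info({retry, Cid}, #{deliver := Deliver} = State) ->
    case Deliver(Cid) of
        ok         -> {noreply, clear(Cid, State)};
        {error, _} -> {noreply, fail_and_rearm(Cid, State)}
    end;
handle_info(_Other, State) ->
    {noreply, State}.

%% Bump the attempt counter, cancel any previous timer for this Cid, then
%% either arm the next backoff slot or move the Cid to the dead-letter list.
fail_and_rearm(Cid, #{attempts := Att, timers := Timers, dead := Dead} = State) ->
    N = maps:get(Cid, Att, 0) + 1,
    cancel_timer_for(Cid, Timers),
    case N >= ?MAX_ATTEMPTS of
        true ->
            State#{attempts := maps:remove(Cid, Att),
                   timers   := maps:remove(Cid, Timers),
                   dead     := [Cid | Dead]};
        false ->
            Ref = erlang:send_after(backoff_ms(N), self(), {retry, Cid}),
            State#{attempts := Att#{Cid => N},
                   timers   := Timers#{Cid => Ref}}
    end.

%% Success path: drop all bookkeeping for the Cid.
clear(Cid, #{attempts := Att, timers := Timers} = State) ->
    cancel_timer_for(Cid, Timers),
    State#{attempts := maps:remove(Cid, Att),
           timers   := maps:remove(Cid, Timers)}.

%% Cancelling an already-fired ref is harmless; cancel_timer/1 returns false.
cancel_timer_for(Cid, Timers) ->
    case maps:find(Cid, Timers) of
        {ok, Ref} -> _ = erlang:cancel_timer(Ref), ok;
        error     -> ok
    end.

%% Assumed schedule for illustration only: 1s, 2s, 4s, ...
backoff_ms(N) ->
    1000 * (1 bsl (N - 1)).
```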

m2 carries three cherry-picks of the send_after substrate work
(originally landed on loops/erlang via 3709460d/98b0104c/b10e55f0).
Those same commits are already on architecture via the earlier
loops/erlang merge (154681a4); merging m2's duplicates is a
mechanical conflict resolution that keeps whichever copy git picks first.

Highlights since the previous m2 merge (2bafb4f7):
- 8b-timer wiring + 5 new tests in delivery_retry_timer.sh
- :timers state field tracks live refs; cancel_timer_for before
  re-arming so stale timers don't keep the scheduler alive
- state_srv/1 + timer_ref_for/2 for test introspection
- merge-prep note documenting the duplicate-fix rebase strategy

m2 is now feature-complete. Conformance gate 771/771.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

# Conflicts:
#	lib/erlang/conformance.sh
#	lib/erlang/scoreboard.md
2026-06-30 14:10:40 +00:00
3 changed files with 327 additions and 9 deletions

@@ -562,10 +562,24 @@ a dead-letter list visible via `/admin/dead-letter`.
is cleared from `:next_retry`. `record_success_pure` clears
both. `next_due_pure` returns cids whose retry time has
passed. 11 cases in `delivery_retry.sh`.
- [ ] **8b-timer** — Erlang-side timer wiring (`erlang:send_after`
self-cast or equivalent). Needs the same substrate primitive
that `gen_server` uses for `timeout` returns. Defer behind
substrate gap discovery for now — see Blockers.
- [x] **8b-timer** — Erlang-side timer wiring on the
`delivery_worker` gen_server. handle_call(flush) drains then
arms a `send_after` self-cast per retried Cid (backoff from
the now-bumped attempt counter); handle_info({retry, Cid})
redrives that single Cid through deliver_one_pure. Success
clears bookkeeping via record_success; failure bumps attempts
via record_failure_pure and arms the next backoff slot — or
promotes to dead-letter on the 6th attempt and stops arming.
A `:timers [{Cid, Ref}]` state field tracks live refs so
schedule_retry_for can cancel the previous one before arming
the next (otherwise stale timers keep the scheduler's run
loop alive long after the work is done). 5/5 in
`delivery_retry_timer.sh`: T1 timer scheduled, T2 attempts=1,
T3 retry fires + attempts=2, T4 next timer rearmed, T5 ets-
counter dispatch (fail/fail/ok) lands in 3 attempts and
clears state. Substrate dependency landed via cherry-pick
from `loops/erlang` (3709460d / 98b0104c / 779e53b2) until
`loops/erlang` → architecture catches up.
- [x] **8c** — Delivery-state projection
(`next/kernel/delivery_state.erl`). Folds delivery events into
per-peer worker-shaped snapshots so the outbound queue survives
@@ -1105,8 +1119,16 @@ proceed.
through `delivery_worker`) and Step 10c (peer-actor doc
fetch in `peer_actors`) are now unblocked.
3. **`erlang:send_after`-style timer primitive** — discovered
during Step 8b prep. The retry loop needs a way for the
3. **`erlang:send_after`-style timer primitive** — ~~discovered
during Step 8b prep~~ **RESOLVED 2026-06-30** via the
`loops/erlang` `send_after`/`cancel_timer`/`monotonic_time`
work landing on `origin/loops/erlang` (commits 3709460d,
98b0104c, b10e55f0; 766/766 → 771/771). m2 cherry-picked all
three onto this branch so 8b-timer could land without waiting
for `loops/erlang` → architecture; the cherry-picks fall away
as no-op duplicates when architecture catches up. Original
diagnosis preserved below for the audit trail.
The retry loop needs a way for the
delivery_worker to wake itself up after `backoff_for(N)`
seconds. Erlang's `erlang:send_after/3` is the standard
primitive; this port doesn't seem to register it (looked at
@@ -1241,6 +1263,31 @@ proceed.
Newest first.
- **2026-06-30** — Step 8b-timer closed. Cherry-picked the three
`loops/erlang` send_after commits onto m2 (3709460d, 98b0104c,
779e53b2 — the substrate landed standalone on origin/loops/erlang
earlier and hadn't propagated to origin/architecture yet). Wired
the live timer loop in `next/kernel/delivery_worker.erl`: a
`:timers [{Cid, Ref}]` state field; `handle_call(flush)` drains
then arms a `send_after` self-cast per retried Cid; the new
`handle_info({retry, Cid})` callback redrives that one Cid through
`deliver_one_pure` and either records success / clears state, or
bumps and arms the next backoff slot (or dead-letters on the 6th
attempt). Two arm-paths split — `arm_retry_timer` (post-drain,
attempts already bumped) vs `schedule_retry_for` (post-retry
attempt, needs to bump). `cancel_timer_for/1` clears the previous
timer before arming the next so stale timers don't keep the
scheduler's run loop alive after the work is done. Two new public
APIs for tests: `state_srv/1` returns the worker's full state,
`timer_ref_for/2` looks up a Cid's live ref. 5/5 in new
`delivery_retry_timer.sh` (T1 timer scheduled, T2 attempts=1, T3
retry fires + attempts=2, T4 next timer rearmed, T5 ets-counter
dispatch fail/fail/ok lands in 3 attempts and clears state).
Existing `delivery_worker.sh` 17/17 and `delivery_retry.sh` 11/11
still green. Conformance gate 771/771 (was 761/761; the +10 is
the cherry-picked send_after suite). Blockers #3 RESOLVED.
Reply shape of `flush` unchanged; no caller updates needed.
- **2026-06-28** — Merge-prep pass. Conformance 761/761 still green
on m2 tip `cd0de8cb`. Both smoke tests still pass cold:
`next/tests/smoke_kernel_route.sh` 6/6 (port 54471, listener up