rose-ash/plans/agent-briefings/erlang-send-after.md
giles 4c0a48834e preserve fed-sx-m1 loop briefings before pruning its worktree
4 untracked agent-briefing docs from the fed-sx-m1 worktree (merged branch loops/fed-sx-m2),
saved here so they survive the worktree cleanup.
2026-07-02 12:25:50 +00:00


; -*- mode: markdown -*-

loops/erlang — erlang:send_after substrate primitive

Scoped briefing for a single focused iteration loop on loops/erlang. Not a replacement for the general erlang-loop.md; this is the load-bearing-blocker task that, when done, unblocks plans/fed-sx-milestone-2.md Step 8b-timer (delivery retry loop) and the standard gen_server timeout return.

description: loops/erlang — send_after primitive
subagent_type: general-purpose
run_in_background: true
isolation: worktree     # worktree at /root/rose-ash-loops/erlang

The goal

Implement the standard Erlang timer primitives so a process can schedule a message-to-self (or to another pid) after N milliseconds:

Ref = erlang:send_after(Time, Dest, Msg).    %% Time: non-negative integer, ms. Dest: pid or registered name (atom).
TimeLeft = erlang:cancel_timer(Ref).         %% integer ms remaining, or false if the timer already fired or was cancelled.

These are the primitives behind gen_server's {noreply, S, Timeout} return: when a handle_* callback returns {noreply, NewState, T}, the gen_server schedules {timeout} to itself after T ms via send_after, and handle_info({timeout}, S) fires only if no other message arrives first. (OTP itself delivers the bare atom timeout; keep whichever shape the shipped gen_server library already expects.)
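The timeout behaviour described above can be modelled over virtual time. This is a minimal Python sketch, not the shipped gen_server: `run_server`, `TIMEOUT`, and the `(time_ms, msg)` arrival list are hypothetical names for illustration. The `deadline` variable stands in for the send_after timer, and overwriting it when a message arrives first corresponds to cancelling and re-arming that timer.

```python
from collections import deque

TIMEOUT = ("timeout",)  # stands in for the {timeout} tuple used in the text


def run_server(handle_info, state, arrivals, timeout_ms, until_ms):
    """Drive a toy server loop over virtual time (hypothetical helper).

    arrivals: list of (time_ms, msg), sorted by time.
    handle_info(msg, state) -> (new_state, timeout_ms or None).
    """
    now = 0
    pending = deque(arrivals)
    # Arming the timeout is the moral equivalent of send_after(T, self(), {timeout}).
    deadline = None if timeout_ms is None else now + timeout_ms
    while True:
        next_arrival = pending[0][0] if pending else None
        if deadline is not None and (next_arrival is None or deadline < next_arrival):
            # Nothing arrived before the deadline: the timeout message is delivered.
            now, msg = deadline, TIMEOUT
        elif next_arrival is not None:
            # A real message arrived first: the old deadline is dropped (cancel_timer).
            now, msg = pending.popleft()
        else:
            break  # no timer armed and nothing queued
        if now > until_ms:
            break
        state, timeout_ms = handle_info(msg, state)
        deadline = None if timeout_ms is None else now + timeout_ms
    return state


def logging_handler(msg, log):
    """Records every message; re-arms a 100 ms timeout except after a timeout."""
    log = log + [msg]
    return (log, None) if msg == TIMEOUT else (log, 100)
```

With messages arriving at 30 ms and 60 ms and a 100 ms timeout, the model delivers the timeout only once, 100 ms after the last message, which is the behaviour T6 is meant to check.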

Without send_after, anything that wants a delayed self-cast has to busy-loop or block, and neither is acceptable for the kernel.

Acceptance — single SX file lib/erlang/tests/send_after.sx

The standard test suite must pass along with this file. Run the conformance gate (bash lib/erlang/conformance.sh) after each commit.

  • T1: erlang:send_after(50, self(), hello) returns a Ref; after 60ms a receive picks up hello. Round-trip latency stays under 100ms in steady state.
  • T2: Ref = erlang:send_after(1000, self(), late), then erlang:cancel_timer(Ref) returns an integer ~1000 (remaining ms), and a subsequent receive late -> got after 50 -> none end returns none: the cancelled message never arrives.
  • T3: multiple in-flight timers fire in deadline order, not schedule order: after send_after(80, self(), b) followed by send_after(20, self(), a), a is received first.
  • T4: cancel_timer on an already-fired timer returns false.
  • T5: send_after to a registered atom: register a process, queue a delayed message to its name, and it lands in that process's mailbox.
  • T6: a gen_server callback returns {noreply, S, 100}; handle_info({timeout}, S) fires after 100ms when no other message arrives. (Sanity check that the currently shipped gen_server library hooks up correctly.)

Conformance gate stays green (currently 725/725 on loops/erlang per lib/erlang/scoreboard.json, 761/761 after the m2 BIFs are in — both numbers will move when this lands).

Implementation shape (suggested, not prescribed)

The OCaml host has no event loop today: bin/sx_server.ml is a single-threaded epoch protocol. But the Erlang scheduler in lib/erlang/runtime.sx is already an SX-level event loop: er-sched holds the global dict; er-sched-step-alive! advances one runnable process; er-sched-run-all! drains the runnable queue until all processes are waiting / exiting / dead. Two options:

Option A — pure-SX timer wheel. Extend the scheduler dict with :timers (sorted-list-of {:deadline-ms _ :pid _ :msg _ :ref _ :alive true}). er-sched-run-all! already loops; when it finds the runnable queue empty but processes still alive in receive blocks, walk the timer list and check whether any deadline has passed (need a monotonic clock — see below). For deadlines in the future, sleep until the next one or until the runnable queue gets work. Cancel = set :alive false on the timer entry; cull on next scan.
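The bookkeeping Option A describes can be sketched in a few lines. The following is a Python model under stated assumptions, not SX and not the scheduler's real data layout: `TimerList`, `pop_due`, and `next_deadline` are hypothetical names, and the current time is passed in explicitly so the behaviour is deterministic.

```python
import bisect
import itertools


class TimerList:
    """Toy model of a :timers list kept sorted by deadline (hypothetical)."""

    def __init__(self):
        self._entries = []             # sorted list of (deadline_ms, ref, timer)
        self._refs = itertools.count() # unique refs; also break deadline ties
        self._by_ref = {}              # ref -> timer, for cancel_timer

    def send_after(self, now_ms, delay_ms, pid, msg):
        ref = next(self._refs)
        timer = {"deadline": now_ms + delay_ms, "pid": pid, "msg": msg,
                 "ref": ref, "alive": True}
        bisect.insort(self._entries, (timer["deadline"], ref, timer))
        self._by_ref[ref] = timer
        return ref

    def cancel_timer(self, now_ms, ref):
        """Remaining ms, or False if the timer already fired or was cancelled."""
        timer = self._by_ref.pop(ref, None)
        if timer is None:
            return False
        timer["alive"] = False         # entry stays in the list; culled on a later scan
        return max(0, timer["deadline"] - now_ms)

    def pop_due(self, now_ms):
        """Remove expired entries; return (pid, msg) for the live ones, in deadline order."""
        due = []
        while self._entries and self._entries[0][0] <= now_ms:
            _, ref, timer = self._entries.pop(0)
            if timer["alive"]:
                timer["alive"] = False
                self._by_ref.pop(ref, None)
                due.append((timer["pid"], timer["msg"]))
        return due

    def next_deadline(self):
        """Earliest live deadline, or None; culls cancelled entries at the head."""
        while self._entries and not self._entries[0][2]["alive"]:
            self._entries.pop(0)
        return self._entries[0][0] if self._entries else None
```

In this model the scheduler would call `pop_due` whenever the runnable queue is empty, deliver the returned messages, and use `next_deadline` to decide how long it may sleep.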

Option B — OCaml-side Unix.setitimer / Unix.select wakeup. Heavier, lets the SX scheduler park properly on Unix.select against a self-pipe. Right shape for a long-running kernel that may sit idle for minutes between events. Probably the production answer, but significantly more work than A.
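The parking pattern behind Option B is the classic self-pipe trick: block in select with a timeout equal to the time until the next deadline, and let anything that produces work write a byte to wake the sleeper early. The sketch below shows the pattern in Python on a Unix host; `Parker`, `park`, and `wake` are hypothetical names, and the real implementation would live on the OCaml side.

```python
import os
import select


class Parker:
    """Parks on select() until a timeout elapses or wake() is called (hypothetical)."""

    def __init__(self):
        self._r, self._w = os.pipe()
        os.set_blocking(self._r, False)  # lets the drain loop below stop cleanly

    def wake(self):
        """Make a current or future park() return early."""
        os.write(self._w, b"\0")

    def park(self, timeout_ms):
        """Block up to timeout_ms (None means forever). True if woken by wake()."""
        timeout_s = None if timeout_ms is None else max(0, timeout_ms) / 1000.0
        readable, _, _ = select.select([self._r], [], [], timeout_s)
        if not readable:
            return False               # timed out: a timer deadline is due
        try:
            while os.read(self._r, 4096):  # drain so the next park() blocks again
                pass
        except BlockingIOError:
            pass                       # pipe is empty
        return True

    def close(self):
        os.close(self._r)
        os.close(self._w)
```

The timeout passed to `park` would be the gap between the monotonic clock and the earliest timer deadline, so an idle kernel sleeps in the OS instead of polling.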

Recommend A first for the unblock; B as a follow-up if the kernel needs to idle gracefully.

Clock primitive

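erlang:monotonic_time/0,1 and erlang:system_time/0,1 ought to exist for completeness; send_after needs at least a monotonic millisecond clock. If the substrate doesn't have one, add erlang:monotonic_time(millisecond) as a small native BIF in bin/sx_server.ml and expose it via er-bif-erlang-monotonic-time. Time-travel safety: DO NOT use wall-clock time for timer deadlines. Note that Unix.gettimeofday, the obvious candidate, is itself wall-clock and can jump when the system clock is adjusted, and OCaml's Unix module does not expose a monotonic clock; prefer a clock_gettime(CLOCK_MONOTONIC) binding and treat Unix.gettimeofday as a stopgap at most. Because bin/sx_server.ml is outside this loop's commit scope (see Test discipline), the native BIF likely needs the same separate-task treatment as the Option B native bits.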
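The arithmetic a timer needs from the clock is small: a millisecond reading whose origin is arbitrary but which never goes backwards, plus deadline and remaining-time helpers built on it. A minimal Python sketch follows; `monotonic_ms`, `deadline_after`, and `remaining_ms` are hypothetical names, and Python's `time.monotonic_ns` stands in for whatever monotonic source the host ends up exposing.

```python
import time


def monotonic_ms():
    """Milliseconds from a monotonic clock; only differences are meaningful."""
    return time.monotonic_ns() // 1_000_000


def deadline_after(delay_ms, now_ms=None):
    """Absolute deadline for a timer scheduled delay_ms from now."""
    if now_ms is None:
        now_ms = monotonic_ms()
    return now_ms + delay_ms


def remaining_ms(deadline_ms, now_ms=None):
    """Time left before the deadline, clamped at zero (what cancel_timer reports)."""
    if now_ms is None:
        now_ms = monotonic_ms()
    return max(0, deadline_ms - now_ms)
```

Deadlines computed this way are unaffected by NTP steps or manual clock changes, which is the property the wall-clock warning above is protecting.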

Files in play (loops/erlang scope)

  • lib/erlang/runtime.sx — scheduler state, BIFs, primitive registration. Scheduler dict at er-scheduler (single-element list). Step function: er-sched-step-alive! (line ~708 on m2; check line numbers freshly — recent commits moved code).
  • lib/erlang/tests/send_after.sx — new file, the test suite above.
  • lib/erlang/scoreboard.json + scoreboard.md — bump counts.
  • (If Option B is taken) bin/sx_server.ml — out of loops/erlang scope per the loop's standing rules. Surface a separate loops/fed-prims task for the native bits and keep send_after's SX wrapper minimal.

Base branch / rebase note

The substrate change for the kernel mutex (Blockers #4 in fed-sx-milestone-2.md) lands on architecture via the m2 merge. The key change is in lib/erlang/runtime.sx:

  • er-sched-step-alive! reads :pending-args from the process record when first invoked (line 736 on m2; was hardcoded to (list)).
  • er-bif-http-listen spawns the user's handler as a real er-process with :pending-args (list req-pl) instead of er-apply-fun-inline.

If loops/erlang rebases onto origin/architecture and that merge hasn't landed yet (still sitting on local /root/rose-ash at 089ed88f-or-later), fetch origin/loops/fed-sx-m2 and cherry-pick or rebase against 29e4234b to pick up the :pending-args field shape. The work area (er-sched-step-alive!) is the same surface this loop will touch — without the m2 change, the timer wakeup path won't dovetail cleanly with the spawn-from-BIF pattern. Conflict risk: medium. Conflict surface: small (~5 lines around the :pending-args read).

Test discipline

  • bash lib/erlang/conformance.sh green before and after every commit.
  • Commits scoped to lib/erlang/**. No edits to next/**, bin/sx_server.ml, spec/**, or other lib/<lang>/** from this loop.
  • One commit per acceptance test cluster — T1+T2, T3+T4, T5+T6 is a reasonable cadence.
  • Push to origin/loops/erlang after each commit. The fed-sx-m2 loop is dormant waiting on this.

Done when

  • 6/6 new send_after tests green.
  • conformance gate green at the new total.
  • single line ticked in plans/erlang-on-sx.md Progress log + a tick on whatever roadmap entry covers timers.
  • Then update the m2 plan (plans/fed-sx-milestone-2.md Blockers #3) to RESOLVED and unblock Step 8b-timer; that update belongs to the fed-sx-m2 loop, not this one.