rose-ash/plans/lib-guest-scheduler.md
giles 1d3021d206
go: after(d) timer stub + 13 pattern tests → runtime 40/40, Phase 5 closed [shapes-scheduler]
Acceptance bar hit (40 runtime, 497 total). Tests: timer ready,
select-with-timeout, fan-in (3 producers), worker queue, pipeline,
fan-out-then-fan-in, select source-order, fallback case, default,
producer-consumer, two-stage pipeline, channel-counter, after+default,
tick-collector.

Shape chiselled: timer collapses "after duration" into
"channel ready immediately" — select needs only ready? from each
case. Real time is when the flip happens, not what the protocol is.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 22:24:13 +00:00


lib/guest/scheduler — extraction plan

Two distinct concurrency models — Erlang's addressed processes + mailboxes, and Go's anonymous channels + goroutines — sit on the same underlying machinery: a fork/yield/block/resume scheduler over CEK io-suspended continuations. This plan captures that machinery as lib/guest/scheduler/ so language N+1 with a new concurrency model costs ~200 lines of model-specific code instead of re-inventing the scheduler.

Reference: plans/lib-guest.md (parent — two-language rule, stratification), plans/erlang-on-sx.md (first consumer, in production), Go-on-SX (second consumer, see plans/go-on-sx.md once that lands).

Branch: architecture. SX files via sx-tree MCP only.

Thesis

The substrate already provides what a scheduler needs: CEK io-suspension (make-cek-suspended, cek-resume) gives suspendable execution; first-class environments give each unit of execution its own scope; the trampolined evaluator means we never blow the host stack. What every guest with concurrency re-implements on top of this is the fork/yield/block/resume protocol — the bookkeeping that decides which suspended computation runs next.
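As a rough analogy only, the suspend/resume shape can be modelled with a Python generator standing in for a CEK io-suspended continuation. This is not SX, and it does not describe the real signatures of make-cek-suspended or cek-resume; the request tuples and the `run_to_completion` driver are hypothetical names for illustration.

```python
def guest_computation():
    # Each `yield` models an io-suspension: control returns to the driver with a
    # request, and the rest of the function body is the suspended continuation.
    name = yield ("read", "name")
    greeting = yield ("read", "greeting")
    return f"{greeting}, {name}"


def run_to_completion(gen, answers):
    """Trampoline: a flat loop resumes the computation, so the host stack never grows."""
    trace = []
    try:
        request = next(gen)                          # run until the first suspension
        while True:
            trace.append(request)
            request = gen.send(answers[request[1]])  # resume with the IO result
    except StopIteration as done:
        return done.value, trace


result, trace = run_to_completion(
    guest_computation(), {"name": "SX", "greeting": "hello"}
)
```

A scheduler is then the bookkeeping that decides which of many such suspended computations gets the next `send`.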

Two concrete consumers, two different concurrency vocabularies, sharing one underlying scheduler, is the proof. If only Erlang lives on it, "scheduler kit" is a euphemism for "Erlang scheduler with a Go skin." The two-language rule is the gate.

Current state (2026-05-26)

  • Erlang-on-SX has the full pattern in production: 729/729 conformance, spawn/send/receive, selective receive, monitor/link, hot reload. The scheduler logic is currently coupled to Erlang-shaped concepts (PIDs, mailboxes, links) — extraction-blocking but not extraction-defeating.
  • Go-on-SX does not exist yet. plans/go-on-sx.md is the umbrella plan (TBD); this scheduler plan is a sibling/dependency.
  • lib/guest/scheduler/ does not exist. The two-language rule blocks extraction until Go-on-SX independently implements its scheduler.

Status: Phase 0 (Erlang shape capture). No code change in this plan yet.

Why the two models actually share a kit

The non-obvious claim is that Erlang processes and Go goroutines really do share machinery beneath their different vocabularies. The mapping:

| Concept | Erlang | Go | Common kit name |
| --- | --- | --- | --- |
| Unit of execution | process (PID-addressed) | goroutine (anonymous) | task |
| Spawn | spawn(Fun) → PID | go expr → nothing | task-spawn |
| Block target | mailbox match | channel send/recv | task-block |
| Wake condition | message arrives | counterpart ready | task-resume predicate |
| Yield | receive with no match | channel blocked | scheduler hands off |
| Termination | exit reason → linked tasks | panic / return | task lifecycle |
| Selection | selective receive | select statement | both = "wait for any of N predicates" |

What the kit owns:

  • The task table (token → suspended CEK continuation + status).
  • The runnable queue + scheduling policy (round-robin v1; pluggable).
  • The block→resume protocol: a blocked task registers a predicate; when any task changes state, blocked tasks are re-polled; first whose predicate fires becomes runnable.
  • The fairness/preemption budget — gas per step before forced yield.
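A minimal Python model of the pieces listed above, assuming generators as the suspended continuations. The class and method names are hypothetical, a task blocks by yielding `("block", predicate)`, and `None` plays the role of #f. Gas accounting is left out, and every blocked task is re-polled on every step, which is the naive v1 behaviour rather than a tuned one.

```python
from collections import deque


class Scheduler:
    """Toy kit: task table, runnable queue, block->resume re-polling."""

    def __init__(self):
        self.tasks = {}           # token -> {"gen", "status", "pred", "inbox"}
        self.runnable = deque()   # tokens in round-robin order
        self.next_token = 0

    def spawn(self, body):
        token = self.next_token
        self.next_token += 1
        self.tasks[token] = {"gen": body(), "status": "runnable",
                             "pred": None, "inbox": None}
        self.runnable.append(token)
        return token

    def _repoll_blocked(self):
        for token, task in self.tasks.items():
            if task["status"] == "blocked":
                value = task["pred"]()          # None means "still not ready"
                if value is not None:
                    task.update(status="runnable", pred=None, inbox=value)
                    self.runnable.append(token)

    def step(self):
        self._repoll_blocked()
        if not self.runnable:
            done = all(t["status"] == "finished" for t in self.tasks.values())
            return "all-done" if done else "idle"
        token = self.runnable.popleft()
        task = self.tasks[token]
        try:
            request = task["gen"].send(task["inbox"])   # resume the continuation
        except StopIteration as finished:
            task.update(status="finished", result=finished.value)
            return "ran"
        task["inbox"] = None
        if request is None:                             # voluntary yield
            self.runnable.append(token)
        else:                                           # ("block", predicate)
            task.update(status="blocked", pred=request[1])
        return "ran"

    def run(self):
        while self.step() == "ran":
            pass


box, log = [], []


def consumer():
    item = yield ("block", lambda: box.pop(0) if box else None)
    log.append(("got", item))


def producer():
    log.append("producing")
    box.append("msg")
    yield                       # voluntary yield
    log.append("producer done")


sched = Scheduler()
sched.spawn(consumer)
sched.spawn(producer)
sched.run()
```

Nothing in `Scheduler` knows what `box` is; the predicate closure is the only coupling between the kit and the thing being waited on.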

What each language owns:

  • The semantics layer on top: Erlang's PID→task map + mailbox per task + selective-receive predicates; Go's channel value → blocked-task list per channel + send/recv pairing + select multiplexing.
  • The language-visible API (spawn/!/receive vs go/<-/select).

This is exactly the lib/guest pattern: extract the dispatch skeleton, keep the rules in the language layer.
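To make the split concrete, here is a hedged Python sketch in which one compact stand-in for the kit (`run`) drives both an Erlang-flavoured mailbox layer and a Go-flavoured channel layer. All names are hypothetical, sends never block (as in the v0 Go runtime described in the progress log), and `None` cannot be sent as a value because it doubles as "not ready".

```python
from collections import deque


def run(tasks):
    """Kit stand-in: round-robin over generators. A task blocks by yielding a
    predicate and resumes with the predicate's first non-None value."""
    runnable = deque((gen, None) for gen in tasks)
    blocked = []                                  # (gen, predicate) pairs
    while runnable or blocked:
        still_blocked = []
        for gen, pred in blocked:                 # re-poll blocked tasks
            value = pred()
            if value is None:
                still_blocked.append((gen, pred))
            else:
                runnable.append((gen, value))
        blocked = still_blocked
        if not runnable:
            raise RuntimeError("deadlock: every task is blocked")
        gen, value = runnable.popleft()
        try:
            request = gen.send(value)
        except StopIteration:
            continue
        if request is None:
            runnable.append((gen, None))          # voluntary yield
        else:
            blocked.append((gen, request))        # request is the resume predicate


# Erlang-flavoured layer: PID -> mailbox, selective receive.
mailboxes = {}


def send(pid, msg):
    mailboxes.setdefault(pid, []).append(msg)


def receive(pid, matches):
    def pred():
        box = mailboxes.setdefault(pid, [])
        for i, msg in enumerate(box):
            if matches(msg):
                return box.pop(i)                 # take the first matching message
        return None
    return pred


# Go-flavoured layer: anonymous channel value.
class Chan:
    def __init__(self):
        self.buf = deque()


def chan_send(ch, value):
    ch.buf.append(value)


def chan_recv(ch):
    return lambda: ch.buf.popleft() if ch.buf else None


log = []
ch = Chan()


def erlang_proc():
    msg = yield receive("p1", lambda m: isinstance(m, tuple) and m[0] == "ok")
    log.append(("erlang", msg))


def erlang_sender():
    send("p1", "noise")
    send("p1", ("ok", 42))
    yield


def goroutine_recv():
    value = yield chan_recv(ch)
    log.append(("go", value))


def goroutine_send():
    chan_send(ch, "tick")
    yield


run([erlang_proc(), erlang_sender(), goroutine_recv(), goroutine_send()])
```

The selective receive skips "noise" and leaves it in the mailbox, while the channel receive simply takes the head of the buffer; both reach the kit as nothing more than a predicate.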

API surface (proposed — design only, not yet implemented)

(make-scheduler &key gas-per-step ;; default 1000
                     policy)      ;; :round-robin | :fifo
  -> scheduler-handle

(task-spawn sched body-thunk) -> task-token
  ;; body-thunk is a 0-arg fn whose body runs as the task.
  ;; Returns immediately; task is enqueued runnable.

(task-current sched) -> task-token
  ;; Inside a task, the token of the running task. Useful for self-reference.

(task-yield sched) -> nil
  ;; Voluntary yield. Caller is re-enqueued at the tail of runnable.

(task-block sched resume-predicate) -> any
  ;; Caller suspends. Predicate is (fn () -> resume-value-or-#f).
  ;; When predicate returns non-#f, caller resumes with that value.
  ;; Predicate is polled on every scheduler tick when there's nothing
  ;; obviously runnable. (Optimisation: language layer can wake explicitly —
  ;; see task-wake.)

(task-wake sched task) -> nil
  ;; Hint to the scheduler: re-poll this task's resume-predicate now.
  ;; Used by sender-side when a receiver might unblock.

(task-status sched task) -> :runnable | :blocked | :finished | :crashed

(task-result sched task) -> value | {:error reason}
  ;; After :finished or :crashed.

(scheduler-step sched) -> :ran | :idle | :all-done
  ;; Run at most gas-per-step instructions of one task. Caller drives the
  ;; loop.

(scheduler-run sched) -> nil
  ;; Run until :all-done. Equivalent to (until (= :all-done (scheduler-step
  ;; sched))).

Notes on the design:

  • task-block with a resume-predicate is the universal blocking primitive. Erlang's receive is (task-block sched (fn () (mailbox-match self pat))). Go's <-ch is (task-block sched (fn () (channel-recv-ready ch))).
  • task-wake is the optimisation: instead of polling every blocked task every step, the language layer wakes the specific task whose predicate is now likely true. v1 can omit it; performance work later.
  • gas-per-step gives fairness without true preemption. Tasks that don't yield within their gas budget are force-yielded by the CEK loop. (CEK io-suspension already does this for IO; gas budget extends to plain instructions.)
  • No priority/affinity in v1. Both Erlang and Go default to non-priority scheduling; specialised cases (Erlang's high-priority processes) are language-layer concerns.
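The gas-per-step note can be illustrated with a small Python model, under the assumption that each bare `yield` in a task counts as one instruction. The function name and the `slices` return value are hypothetical; a real implementation would count CEK steps inside the evaluator loop instead.

```python
from collections import deque


def scheduler_run(bodies, gas_per_step):
    """Round-robin with a gas budget: a task that has not finished after
    gas_per_step instructions is force-yielded to the tail of the queue."""
    runnable = deque(body() for body in bodies)
    slices = []                         # instructions executed in each step
    while runnable:
        gen = runnable.popleft()
        used, finished = 0, False
        while used < gas_per_step:
            try:
                next(gen)               # run one instruction
                used += 1
            except StopIteration:
                finished = True
                break
        slices.append(used)
        if not finished:
            runnable.append(gen)        # forced yield
    return slices


log = []


def busy():
    for i in range(5):
        log.append(("busy", i))
        yield


def short():
    for i in range(2):
        log.append(("short", i))
        yield


slices = scheduler_run([busy, short], gas_per_step=2)
```

With a budget of 2, `short` gets to run after `busy` has executed only two of its five instructions, even though `busy` never yields voluntarily in any sense the task itself controls.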

Build order — phases

This is a long-running plan paced against Go-on-SX. Phases are not loop-style "one commit per phase" — they're milestone gates.

Phase 0 — Erlang shape capture (doc-only)

  • Read lib/erlang/runtime.sx scheduler code (currently coupled to Erlang vocabulary).
  • Write a 1-page summary of what's actually a scheduler and what's actually Erlang. Identify the boundary.
  • Acceptance: summary committed to this plan as a new section "Erlang scheduler shape (captured 2026-MM-DD)". No code change.
  • Output: clear-eyed mental model. Without this, we'll merge Erlang's scheduler shape into the kit and pretend it generalises.

Phase 1 — Go scheduler independent implementation

  • During Go-on-SX, implement lib/go/sched.sx from scratch. Do NOT look at Erlang's scheduler while doing this. (Or read it once, then close it.)
  • Pass Go's channel + goroutine + select conformance tests.
  • Acceptance: Go scheduler green, lib/go/scoreboard.json includes scheduler tests, two-consumer rule now passable.
  • Output: two independent, working implementations of the same idea.

Phase 2 — Diff and proposed kit

  • Side-by-side diff: Erlang's scheduler vs Go's scheduler. Where do they agree? Where does each have language-specific bookkeeping?
  • The diff is the kit. Things in both go in lib/guest/scheduler/; things in only one stay in lib/erlang/ or lib/go/.
  • Draft lib/guest/scheduler/api.sx (signatures only, no body) reflecting the proposed surface.
  • Acceptance: API draft circulated for review; agreement that the surface covers both consumers; no merge yet.

Phase 3 — Implement lib/guest/scheduler/

  • Implement the kit per the agreed API. New file(s) in lib/guest/scheduler/.
  • The kit has its own tests in lib/guest/scheduler/tests/ — agnostic of any particular language vocabulary.
  • Acceptance: kit tests pass. Erlang and Go conformance scoreboards unchanged (the language implementations still use their own scheduler — we haven't refactored yet).

Phase 4 — Refactor Erlang to use the kit

  • lib/erlang/runtime.sx scheduler logic deleted; replaced with calls into lib/guest/scheduler/. Erlang's PID table, mailbox-per-PID, selective receive stay in lib/erlang/.
  • No-regression gate: Erlang conformance holds at current pass count (currently 729/729). Hard requirement.
  • Acceptance: Erlang scoreboard unchanged; lib/erlang/runtime.sx meaningfully smaller (the scheduler code is gone).

Phase 5 — Refactor Go to use the kit

  • Same exercise for Go. lib/go/sched.sx shrinks to channel/goroutine bookkeeping + delegation.
  • No-regression gate: Go conformance scoreboard at its current pass count.
  • Acceptance: Go scoreboard unchanged; lib/go/sched.sx meaningfully smaller.

Phase 6 — Documentation + design-diary close

  • Document lib/guest/scheduler/ API in lib/guest/README.md (or wherever the lib/guest API index lives).
  • Capture the chiselling diary: what almost went in the kit but ended up language-specific, and why. This is the load-bearing knowledge for the third consumer when it arrives.
  • Acceptance: API documented; diary section added to this plan.

Two-language rule — gating

The rule is hard. No code in lib/guest/scheduler/ lands until BOTH Phase 1 (Go independent) AND Phase 0 (Erlang capture) are complete AND a review confirms the two implementations actually share machinery in a way the kit captures.

If, during the Phase 2 diff, we discover that the agreement is shallow (e.g., both have a runnable queue but the policies are fundamentally incompatible), the right outcome is to NOT extract. Add a "rejected extraction" note to this plan documenting what we learned and close it. That outcome is fine: it tells us the two concurrency models aren't actually siblings, which is a real result.

Open questions

  • Preemption. v1 is cooperative; gas-per-step gives fairness but not hard preemption. Erlang's BEAM preempts by reduction counting, which is the closest precedent for a gas budget. Go has used asynchronous, signal-based preemption since 1.14, which has no counterpart in a cooperative scheduler over CEK. Is gas-per-step + voluntary yield enough? Probably for v1; revisit if a guest needs hard real-time.
  • Priority/affinity. Both Erlang and Go can run without it. Defer.
  • Distribution. Erlang nodes, or a networked-channel layer for Go (the Go language itself has no distributed channels): both are language-specific layers on top of the local scheduler. Out of scope.
  • Cancellation. Go has context.Context; Erlang has exit/2. Both bottom out at "deliver an exception to a task." Worth modelling? Probably as a kit primitive (task-cancel sched task reason) that delivers an exception via CEK exception machinery, language layer wraps it.
  • Third consumer. If/when JS-on-SX gets a proper async/await + Promise scheduler, that'd be a great third consumer to validate the kit didn't over-fit to Erlang+Go.
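For the cancellation question above, one possible shape of a task-cancel primitive, sketched in Python with generators standing in for tasks. `Cancelled`, `task_cancel`, and the returned status tuples are hypothetical; the sketch relies only on the standard generator `throw` method to deliver an exception at the task's current suspension point.

```python
class Cancelled(Exception):
    """Hypothetical exception the kit delivers into a cancelled task."""

    def __init__(self, reason):
        super().__init__(reason)
        self.reason = reason


def task_cancel(task, reason):
    """Deliver Cancelled at the task's suspension point and report what happened."""
    try:
        task.throw(Cancelled(reason))
    except Cancelled as exc:
        return ("crashed", exc.reason)      # task did not handle the exception
    except StopIteration as done:
        return ("finished", done.value)     # task caught it and returned
    return ("runnable", None)               # task caught it and suspended again


def worker(log):
    try:
        while True:
            yield
    finally:
        log.append("cleanup")               # cleanup still runs on cancellation


def trapping_worker():
    try:
        while True:
            yield
    except Cancelled as exc:
        return ("trapped", exc.reason)      # language layer chose to handle it


log = []
w = worker(log)
next(w)                                     # advance to the first suspension
t = trapping_worker()
next(t)

r1 = task_cancel(w, "shutdown")
r2 = task_cancel(t, "shutdown")
```

Whether an unhandled cancellation counts as a crash, and whether a task may trap it at all, would stay a language-layer decision; the kit only delivers the exception.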

Progress log

Newest first. Append one dated entry per milestone landed.

  • 2026-05-27 — Go-on-SX Phase 5 acceptance crossed (40 runtime tests); this is the Go-on-SX plan's Phase 5, not Phase 5 of this plan. Final shape observation: time-as-readiness-flip. The Go side added an after(d) builtin that returns a channel already holding a tick value; the duration is ignored in v0. The select loop doesn't care that the channel got its value "via time"; it only consults ready?. This separates two concerns the eventual kit had been conflating:

    1. The wake-up protocol — what select asks of every case: "are you ready right now?" Channel-recv answers via "buffer non-empty or closed"; channel-send via "buffer has room"; timer via "deadline reached." All three flatten to a single ready? predicate.

    2. The scheduling oracle: when a case's ready? flips from false to true. For channels this is driven by other goroutines sending/receiving; for timers it's driven by a wall-clock or monotonic source.

    v0 collapses #2 (timer = ready immediately, sends always ready, recvs ready iff buffer non-empty) and exposes #1 as the only thing the dispatcher needs to know. Phase 5b refines #2 with blocking semantics and real time, but #1 stays the same shape.

    Concretely: the kit's select-case should take :ready?-fn per case, not three different "is-this-a-send-or-recv-or-timer" tags. Send/recv/timer become factory functions that produce a (:ready? FN :commit! FN) record: the dispatcher walks cases, picks the first whose ready? returns true, and calls commit! to extract the value (with its side effect: drain the buffer, fire the timer). This is the same shape as an STM transaction over the case set, and matches Erlang's receive clauses too (each pattern is a ready-predicate + commit-action over the mailbox head).

    Ping-pong remains impossible in v0 because the synchronous spawn collapses the ready?-flip oracle to "always immediate" — the spawned goroutine can never park waiting for the parent to send. Phase 5b must restore the wake-up dimension; until then the kit spec should encode the readiness-protocol design even though the oracle is degenerate.

  • 2026-05-27 — From Go-on-SX Phase 5 first slice: the channel primitive landed as closures-over-mutable-state in lib/go/sched.sx. Concrete shape:

    (list :go-chan SEND-FN RECV-FN CLOSED?-FN CLOSE!-FN)
    

    Each closure captures a shared buf (a mutable list) and closed flag (a let-bound boolean mutated via set!). Identity: two make() calls produce distinct closures, consistent with the Go spec's comparison rule that two channel values are equal only if they were created by the same call to make (or are both nil).

    Design insight for the kit: the channel-as-closure-bundle shape is the right scheduler-kit primitive. Hiding the buffer behind opaque accessor closures lets the underlying storage be swapped (linked list → ring buffer → segmented array) without changing the API. Erlang's mailboxes will need the same trick.

    v0 limitation logged: no real preemption. SX doesn't expose first-class continuations to guest code, so v0 runs go f() synchronously and relies on the spawned goroutine completing before the main goroutine receives. Real concurrent semantics — blocking send on full buffer, blocking recv on empty — needs the scheduler kit to ship the suspension/resumption machinery (or for Phase 5b to bake CEK-style trampolining into the eval layer).

    Cross-ref: the :select-case uniform shape from the parser-side diary entry pairs with this. The kit's sched-select should accept a list of channel-op cases (built from the closures-over-state primitives logged here) and pick a ready one. Source: Go-on-SX commit landing lib/go/sched.sx first cut.

  • 2026-05-27 — Follow-up from same Phase 2 work: select AST shape landed. Each case is (list :select-case COMM-STMT BODY) where COMM-STMT is one of :send, :short-decl (recv into new var), :assign (recv into existing var), or a bare receive expression (:app (:var "<-") [chan]). The shape is uniform across all four comm-stmt kinds — the kit's sched-select primitive should accept a list of cases each described by (direction chan value-target?) and let the kit's runtime pick a ready case. That uniformity is what makes a single kit primitive cover all four Go case shapes.

    Also: Go's select with default makes the multiplexer non-blocking; without default it blocks until a case is ready. The kit primitive should mirror this — present-or-absent default determines blocking semantics. Erlang's receive ... after Timeout -> ... is a similar pattern with a timeout case rather than default; the kit primitive should handle both as instances of "non-blocking-fallback case." Source: Go-on-SX commit parse.sx — switch + select.

  • 2026-05-27 — From Go-on-SX Phase 2 (parser side, ahead of scheduler implementation): the parsed AST shapes for Go's concurrency primitives have landed and are worth recording before Phase 5 builds the scheduler.

    go EXPR              → (list :go EXPR)
    defer EXPR           → (list :defer EXPR)
    ch <- v              → (list :send CHAN VALUE)
    <-ch                 → (list :app (:var "<-") [CHAN])   ; unary recv
    for range COLL { }   → (list :range-for nil nil nil COLL BODY)
    for k, v := range C  → (list :range-for :short-decl KEY VAL COLL BODY)
    

    Design insight for the kit: the :go and :defer shapes are pleasingly minimal; both wrap a single expression. Erlang's spawn(Mod, Fun, Args) will produce something more elaborate; the scheduler kit's spawn primitive (written sched-spawn here; task-spawn in the API surface above) should accept a thunk so both languages reduce to a uniform spawn API.

    The :send shape carries CHAN + VALUE, symmetric with channel-recv as the unary <- form. Once the scheduler has channel primitives, both shapes reduce to a single (chan-op direction chan value) abstraction.

    Range over channels (for v := range ch) is currently parsed as range-for with coll = ch; the scheduler kit will dispatch on the type of coll at execution time (channels yield via receive, collections via iteration). This dispatch is the right place for the scheduler kit to express the channel-receive ⇄ iteration polymorphism. Source: Go-on-SX commit parse.sx — go/defer/send/range.

  • 2026-05-26 — Plan drafted. Phase 0 unstarted. Awaiting Go-on-SX to begin Phase 1.
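The 2026-05-27 entries on the channel closure bundle and on the ready?/commit! case record can be combined into one hedged Python sketch. Everything here is illustrative: the dictionary keys, the `send_ready`/`recv_ready` accessors, and the `BLOCK` sentinel are assumptions, not the shape in lib/go/sched.sx. The dispatcher picks the first ready case in source order, as the log describes; Go itself chooses pseudo-randomly among ready cases.

```python
def make_chan(capacity):
    """Channel as a bundle of closures over hidden state (buffer + closed flag)."""
    buf, closed = [], [False]

    def send(value):
        buf.append(value)

    def recv():
        return buf.pop(0) if buf else None   # None stands in for the zero value

    def close():
        closed[0] = True

    return {
        "send": send,
        "recv": recv,
        "close": close,
        "send_ready": lambda: not closed[0] and len(buf) < capacity,
        "recv_ready": lambda: bool(buf) or closed[0],
    }


def recv_case(ch, body):
    return {"ready": ch["recv_ready"], "commit": lambda: body(ch["recv"]())}


def send_case(ch, value, body):
    def commit():
        ch["send"](value)
        return body()
    return {"ready": ch["send_ready"], "commit": commit}


def after_case(body):
    # v0 timer stub: the duration is ignored, so the case is always ready.
    return {"ready": lambda: True, "commit": body}


BLOCK = object()   # no case ready and no default: the caller would task-block


def select(cases, default=None):
    """Walk cases in source order and commit the first one that is ready."""
    for case in cases:
        if case["ready"]():
            return case["commit"]()
    return default() if default is not None else BLOCK


a, b = make_chan(1), make_chan(1)

r1 = select([recv_case(a, lambda v: ("a", v))], default=lambda: "default")
b["send"]("x")
r2 = select([recv_case(a, lambda v: ("a", v)), recv_case(b, lambda v: ("b", v))])
r3 = select([recv_case(a, lambda v: ("a", v)), after_case(lambda: "timeout")])
r4 = select([recv_case(a, lambda v: ("a", v))])
r5 = select([send_case(a, 1, lambda: "sent"), after_case(lambda: "timeout")])
```

The dispatcher never inspects whether a case is a send, a receive, or a timer. Swapping the immediate timer for a real deadline would change only the `ready` closure that `after_case` returns.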