
# Ring Benchmark Results

Generated by `lib/erlang/bench_ring.sh` against `sx_server.exe` on the
synchronous Erlang-on-SX scheduler.

| N (processes) | Hops | Wall-clock | Throughput |
|---|---|---|---|
| 10 | 10 | 907ms | 11 hops/s |
| 50 | 50 | 2107ms | 24 hops/s |
| 100 | 100 | 3827ms | 26 hops/s |
| 500 | 500 | 17004ms | 29 hops/s |
| 1000 | 1000 | 29832ms | 34 hops/s |

(Each row spawns N processes connected in a ring and passes a
single token N hops in total, i.e. the token completes one full lap.)
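
The Throughput column follows from the other two: hops divided by wall-clock seconds, rounded to the nearest integer. A minimal Python sketch that recomputes it from the table (the `ROWS` list and `throughput` helper are illustrative names, not part of `bench_ring.sh`):

```python
# (hops, wall-clock ms) copied from the table above.
ROWS = [(10, 907), (50, 2107), (100, 3827), (500, 17004), (1000, 29832)]


def throughput(hops: int, wall_ms: int) -> float:
    """Hops per second for one benchmark row."""
    return hops / (wall_ms / 1000.0)


for hops, wall_ms in ROWS:
    print(f"N={hops:>4}  {throughput(hops, wall_ms):5.1f} hops/s  "
          f"{wall_ms / hops:5.1f} ms/hop")
```

Per-hop time falls from about 91 ms at N = 10 to about 30 ms at N = 1000. That would be consistent with a fixed startup cost being amortized over more hops, though the table alone cannot confirm it.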

## Status of the 1M-process target

Phase 3's stretch goal in `plans/erlang-on-sx.md` is a million-process
ring benchmark. **That target is not met** in the current synchronous
scheduler: at the ~34 hops/s measured for N = 1000, 1M hops would take
~30 000 s (roughly 8 hours). Correctness is fine, since the program runs
at every measured size, but throughput is bound by per-hop overhead.
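
The ~30 000 s figure can be reproduced with a simple linear extrapolation. The sketch below assumes the per-hop cost measured at N = 1000 stays flat up to 1M hops. That is an assumption, not a measurement, and the cost of keeping a million processes alive is not modeled.

```python
# Measured at N = 1000 (last row of the results table).
MEASURED_HOPS = 1000
MEASURED_WALL_MS = 29832
TARGET_HOPS = 1_000_000

# Assumption: per-hop cost is constant between N = 1000 and N = 1M.
ms_per_hop = MEASURED_WALL_MS / MEASURED_HOPS
estimated_seconds = TARGET_HOPS * ms_per_hop / 1000.0
estimated_hours = estimated_seconds / 3600.0

print(f"{ms_per_hop:.1f} ms/hop -> ~{estimated_seconds:,.0f} s "
      f"(~{estimated_hours:.1f} h) for {TARGET_HOPS:,} hops")
```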

Per-hop cost is dominated by:
- `er-env-copy` per fun clause attempt (whole-dict copy each time)
- `call/cc` capture + `raise`/`guard` unwind on every `receive`
- `er-q-delete-at!` rebuilds the mailbox backing list on every match
- `dict-set!`/`dict-has?` lookups in the global processes table
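
To illustrate the first bullet, here is a toy Python model of why a whole-dict copy per clause attempt is expensive, next to a parent-linked alternative. This is not the SX implementation: `try_clause_copying` and `Frame` are hypothetical names, and a parent-linked frame is a simpler stand-in for a persistent environment, not a path-copying map.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Optional


def try_clause_copying(env: dict[str, Any],
                       bindings: dict[str, Any]) -> dict[str, Any]:
    """Model of a whole-dict copy: cost grows with len(env) on every attempt."""
    attempt = dict(env)        # copies every existing binding
    attempt.update(bindings)   # then adds the clause's new bindings
    return attempt


@dataclass(frozen=True)
class Frame:
    """Hypothetical parent-linked env: extending costs only the new bindings."""
    bindings: dict[str, Any]
    parent: Optional[Frame] = None

    def lookup(self, name: str) -> Any:
        frame: Optional[Frame] = self
        while frame is not None:
            if name in frame.bindings:
                return frame.bindings[name]
            frame = frame.parent
        raise KeyError(name)

    def extend(self, bindings: dict[str, Any]) -> Frame:
        # The parent frame is shared, never copied or mutated.
        return Frame(dict(bindings), self)


base = {"X": 1, "Y": 2}
copied = try_clause_copying(base, {"Z": 3})
linked = Frame(base).extend({"Z": 3})
print(copied["Z"], linked.lookup("X"), linked.lookup("Z"))
```

The trade-off is that lookups in the linked version walk the parent chain, so it only pays off when failed clause attempts are more common than deep lookups.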

Reaching 1M-process throughput in this architecture would need at
least: persistent (path-copying) envs, an inline scheduler that
doesn't `call/cc` on the common path (message already in the mailbox),
and a linked-list mailbox. None of those are in scope for the Phase 3
checkbox; the numbers above are recorded here as the baseline we're
starting from.
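
Two of those changes can be sketched together: a receive that checks the mailbox before suspending, and a linked-list mailbox that unlinks a matched message without rebuilding the queue. The Python model below is a sketch under those assumptions. `LinkedMailbox` and `try_receive` are hypothetical names, and the real scheduler would suspend the process where this code returns `(False, None)`.

```python
from typing import Any, Callable, Optional, Tuple


class _Node:
    __slots__ = ("msg", "next")

    def __init__(self, msg: Any) -> None:
        self.msg = msg
        self.next: Optional["_Node"] = None


class LinkedMailbox:
    """Singly linked FIFO mailbox with in-place unlink on match."""

    def __init__(self) -> None:
        self._head: Optional[_Node] = None
        self._tail: Optional[_Node] = None

    def send(self, msg: Any) -> None:
        node = _Node(msg)
        if self._tail is None:
            self._head = self._tail = node
        else:
            self._tail.next = node
            self._tail = node

    def try_receive(self, matches: Callable[[Any], bool]) -> Tuple[bool, Any]:
        """Fast path: scan queued messages; no continuation capture needed.

        Returns (True, msg) for the first match, else (False, None), at
        which point a real scheduler would fall back to suspending.
        """
        prev: Optional[_Node] = None
        node = self._head
        while node is not None:
            if matches(node.msg):
                # Unlink this node only; the rest of the list is untouched.
                if prev is None:
                    self._head = node.next
                else:
                    prev.next = node.next
                if node is self._tail:
                    self._tail = prev
                return True, node.msg
            prev, node = node, node.next
        return False, None


mailbox = LinkedMailbox()
mailbox.send(("noise", 1))
mailbox.send(("token", 7))
print(mailbox.try_receive(lambda m: m[0] == "token"))
```

In the ring benchmark the token is normally the only queued message, so the scan ends at the head and the slow path is only taken when the mailbox is empty.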

## Phase 9 status (2026-05-14)

Specialized opcodes 9b–9f landed as **stub dispatchers** in
`lib/erlang/vm/dispatcher.sx`: `OP_PATTERN_TUPLE/LIST/BINARY`,
`OP_PERFORM/HANDLE`, `OP_RECEIVE_SCAN`, `OP_SPAWN/SEND`, and ten
`OP_BIF_*` hot dispatch entries. Each opcode's handler is a thin
wrapper over the existing `er-match-*` / `er-bif-*` / runtime impls,
so **the perf numbers above are unchanged** — same per-hop cost, same
scheduler. The stubs exist to nail down opcode IDs, operand contracts,
and tests against `er-match!` parity *before* 9a (the OCaml
opcode-extension mechanism in `hosts/ocaml/evaluator/`) lands.
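
The stub pattern described above (an opcode ID mapped to a thin wrapper over an existing implementation, guarded by a parity test) can be modeled in a few lines of Python. Everything here is illustrative: the opcode value `0x80`, the `HANDLERS` table, and the toy `match_tuple` are assumptions, not the contents of `dispatcher.sx`.

```python
from typing import Any, Callable, Dict, Tuple

WILDCARD = object()  # toy stand-in for an unbound pattern variable


def match_tuple(pattern: tuple, value: Any) -> bool:
    """Toy stand-in for an existing matcher (the er-match-* family)."""
    if not isinstance(value, tuple) or len(value) != len(pattern):
        return False
    return all(p is WILDCARD or p == v for p, v in zip(pattern, value))


# Hypothetical opcode ID; the real numbering is defined by the project.
OP_PATTERN_TUPLE = 0x80

# Each handler is a thin wrapper: unpack operands, call the existing impl.
HANDLERS: Dict[int, Callable[[Tuple[Any, ...]], Any]] = {
    OP_PATTERN_TUPLE: lambda operands: match_tuple(operands[0], operands[1]),
}


def dispatch(opcode: int, operands: Tuple[Any, ...]) -> Any:
    handler = HANDLERS.get(opcode)
    if handler is None:
        raise ValueError(f"unknown opcode {opcode:#x}")
    return handler(operands)


# Parity check: the opcode path must agree with the direct call.
CASES = [
    (("ok", WILDCARD), ("ok", 42)),
    (("ok", WILDCARD), ("error", 42)),
    (("ok",), ("ok", 42)),
]
for pattern, value in CASES:
    assert dispatch(OP_PATTERN_TUPLE, (pattern, value)) == match_tuple(pattern, value)
```

Because the wrapper adds a table lookup on top of the same underlying call, a stub like this cannot be faster than the code it wraps, which matches the unchanged numbers reported here.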

When 9a integrates and the bytecode compiler can emit these opcodes
at hot call sites, the real speedup story (~3000× ring throughput,
~1000× spawn) starts. Until then, this file documents the
pre-integration baseline. 72 vm-suite tests guard stub correctness;
full conformance is **709/709** with the stub infrastructure loaded.