Files
rose-ash/lib/erlang/conformance.sh
giles c361946974 perf: deadline tweaks (tcl 2400→300s, erlang 120→600s); plan + Phase 1 findings
Phase 1 of the jit-perf-regression plan reproduced and quantified the alleged
30× substrate slowdown across 5 guests (tcl, lua, erlang, prolog, haskell). On
a quiet machine all five suites pass cleanly:

  tcl test.sh         57.8s wall, 16.3s user, 376/376 ✓
  lua test.sh         27.3s wall,  4.2s user, 185/185 ✓
  erlang conformance  3m25s wall, 36.8s user, 530/530 ✓ (needs ≥600s budget)
  prolog conformance  3m54s wall, 1m08s user, 590/590 ✓
  haskell conformance 6m59s wall, 2m37s user, 156/156 ✓

Per-test user-time at architecture HEAD vs pre-substrate-merge baseline
(83dbb595) is essentially flat (tcl 0.83×, lua 1.4×, prolog 0.82×). The
symptoms reported in the plan (test timeouts, OOMs, 30-min hangs) were heavy
CPU contention from concurrent loops + one undersized internal `timeout 120`
in erlang's conformance script. There is no substrate regression to bisect.

Changes:

* lib/tcl/test.sh: `timeout 2400` → `timeout 300`. The original 180s deadline
  is comfortable on a quiet machine (3.1× headroom); 300s gives some safety
  margin for moderate contention without masking real regressions.
* lib/erlang/conformance.sh: `timeout 120` → `timeout 600`. The 120s budget
  was actually too tight for the full 9-suite chain even before this work.
* lib/erlang/scoreboard.{json,md}: 0/0 → 530/530 — populated by a successful
  conformance run with the new deadline. The previous 0/0 was a stale
  artefact of the run timing out before parsing any markers.
* plans/jit-perf-regression.md: full Phase 1 progress log including
  per-guest perf table, quiet-machine re-measurement, and conclusion.

Phases 2–4 (bisect, diagnose, fix) skipped — there is no substrate regression
to find. Phase 6 (perf-regression alarm) still planned to catch the next
quadratic blow-up early instead of via watchdog bumps.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 14:05:29 +00:00

154 lines
4.0 KiB
Bash
Executable File

#!/usr/bin/env bash
# Erlang-on-SX conformance runner.
#
# Loads every erlang test suite via the epoch protocol, collects
# pass/fail counts, and writes lib/erlang/scoreboard.json + .md.
#
# Usage:
# bash lib/erlang/conformance.sh # run all suites
# bash lib/erlang/conformance.sh -v # verbose per-suite
set -uo pipefail
cd "$(git rev-parse --show-toplevel)"
SX_SERVER="${SX_SERVER:-hosts/ocaml/_build/default/bin/sx_server.exe}"
if [ ! -x "$SX_SERVER" ]; then
SX_SERVER="/root/rose-ash/hosts/ocaml/_build/default/bin/sx_server.exe"
fi
if [ ! -x "$SX_SERVER" ]; then
echo "ERROR: sx_server.exe not found." >&2
exit 1
fi
VERBOSE="${1:-}"
TMPFILE=$(mktemp)
OUTFILE=$(mktemp)
trap "rm -f $TMPFILE $OUTFILE" EXIT
# Each suite: name | counter pass | counter total
SUITES=(
"tokenize|er-test-pass|er-test-count"
"parse|er-parse-test-pass|er-parse-test-count"
"eval|er-eval-test-pass|er-eval-test-count"
"runtime|er-rt-test-pass|er-rt-test-count"
"ring|er-ring-test-pass|er-ring-test-count"
"ping-pong|er-pp-test-pass|er-pp-test-count"
"bank|er-bank-test-pass|er-bank-test-count"
"echo|er-echo-test-pass|er-echo-test-count"
"fib|er-fib-test-pass|er-fib-test-count"
)
cat > "$TMPFILE" << 'EPOCHS'
(epoch 1)
(load "lib/erlang/tokenizer.sx")
(load "lib/erlang/parser.sx")
(load "lib/erlang/parser-core.sx")
(load "lib/erlang/parser-expr.sx")
(load "lib/erlang/parser-module.sx")
(load "lib/erlang/transpile.sx")
(load "lib/erlang/runtime.sx")
(load "lib/erlang/tests/tokenize.sx")
(load "lib/erlang/tests/parse.sx")
(load "lib/erlang/tests/eval.sx")
(load "lib/erlang/tests/runtime.sx")
(load "lib/erlang/tests/programs/ring.sx")
(load "lib/erlang/tests/programs/ping_pong.sx")
(load "lib/erlang/tests/programs/bank.sx")
(load "lib/erlang/tests/programs/echo.sx")
(load "lib/erlang/tests/programs/fib_server.sx")
(epoch 100)
(eval "(list er-test-pass er-test-count)")
(epoch 101)
(eval "(list er-parse-test-pass er-parse-test-count)")
(epoch 102)
(eval "(list er-eval-test-pass er-eval-test-count)")
(epoch 103)
(eval "(list er-rt-test-pass er-rt-test-count)")
(epoch 104)
(eval "(list er-ring-test-pass er-ring-test-count)")
(epoch 105)
(eval "(list er-pp-test-pass er-pp-test-count)")
(epoch 106)
(eval "(list er-bank-test-pass er-bank-test-count)")
(epoch 107)
(eval "(list er-echo-test-pass er-echo-test-count)")
(epoch 108)
(eval "(list er-fib-test-pass er-fib-test-count)")
EPOCHS
timeout 600 "$SX_SERVER" < "$TMPFILE" > "$OUTFILE" 2>&1
# Parse "(N M)" from the line after each "(ok-len <epoch> ...)" marker.
parse_pair() {
local epoch="$1"
local line
line=$(grep -A1 "^(ok-len $epoch " "$OUTFILE" | tail -1)
echo "$line" | sed -E 's/[()]//g'
}
TOTAL_PASS=0
TOTAL_COUNT=0
JSON_SUITES=""
MD_ROWS=""
idx=0
for entry in "${SUITES[@]}"; do
name="${entry%%|*}"
epoch=$((100 + idx))
pair=$(parse_pair "$epoch")
pass=$(echo "$pair" | awk '{print $1}')
count=$(echo "$pair" | awk '{print $2}')
if [ -z "$pass" ] || [ -z "$count" ]; then
pass=0
count=0
fi
TOTAL_PASS=$((TOTAL_PASS + pass))
TOTAL_COUNT=$((TOTAL_COUNT + count))
status="ok"
marker="✅"
if [ "$pass" != "$count" ]; then
status="fail"
marker="❌"
fi
if [ "$VERBOSE" = "-v" ]; then
printf " %-12s %s/%s\n" "$name" "$pass" "$count"
fi
if [ -n "$JSON_SUITES" ]; then JSON_SUITES+=","; fi
JSON_SUITES+=$'\n '
JSON_SUITES+="{\"name\":\"$name\",\"pass\":$pass,\"total\":$count,\"status\":\"$status\"}"
MD_ROWS+="| $marker | $name | $pass | $count |"$'\n'
idx=$((idx + 1))
done
printf '\nErlang-on-SX conformance: %d / %d\n' "$TOTAL_PASS" "$TOTAL_COUNT"
# scoreboard.json
cat > lib/erlang/scoreboard.json <<JSON
{
"language": "erlang",
"total_pass": $TOTAL_PASS,
"total": $TOTAL_COUNT,
"suites": [$JSON_SUITES
]
}
JSON
# scoreboard.md
cat > lib/erlang/scoreboard.md <<MD
# Erlang-on-SX Scoreboard
**Total: ${TOTAL_PASS} / ${TOTAL_COUNT} tests passing**
| | Suite | Pass | Total |
|---|---|---|---|
$MD_ROWS
Generated by \`lib/erlang/conformance.sh\`.
MD
if [ "$TOTAL_PASS" -eq "$TOTAL_COUNT" ]; then
exit 0
else
exit 1
fi