The OCaml suite's permanent ~273-failure band (in-progress hs-* + the r7rs radix shadow) is normalized, so real regressions hide in red noise (conformance.md F-10). A runner skip-list would rewrite the hs loops' scoreboards mid-flight — instead, pin the band: scripts/test-suite-baseline.sh runs the full suite and diffs its FAIL set against spec/tests/known-failures.txt (273 entries, identity = "suite > name", error text stripped). Red on a NEW failure (regression) AND red on a vanished failure (fix landed — delete it from the baseline, locking in the win). The band still prints as FAIL lines for the teams working through it; nothing in the runner changes. Bonus capture: 2 of the 273 have EMPTY suite labels (can-map-an-array, string->number) — live evidence for C9, the next checklist item. Validated end-to-end: GREEN on current tree (5800p/273f — 38 net passes above dc7aa709's 5762 from this loop's added pins). Runtime ~12 min. Test-only: no semantics edits, no push. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2.3 KiB
Executable File
2.3 KiB
Executable File