Commit Graph

11 Commits

Author SHA1 Message Date
0061db393c conformance: exclude tcl (foreign *.tcl programs vs expected annotations) — A1 worklist complete
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 55s
tcl conformance.sh walks foreign lib/tcl/tests/programs/*.tcl files, reads each
first line's '# expected: VALUE' annotation, uses python3 to escape the Tcl
source into an SX helper, evaluates via (tcl-eval-string ...), and string-compares
got vs expected in bash. No SX test suites and no SX counter/dict scoreboard, so
the shared driver can't drive it (same category as lua/js/forth). Left
conformance.sh untouched; recorded the exclusion.

This completes the A1 worklist: 4 migrated onto the shared driver (common-lisp,
erlang, feed, go) and 5 excluded as foreign runners (forth, js, ocaml,
smalltalk, tcl).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 13:03:45 +00:00
31603e636b conformance: exclude smalltalk (scrapes test.sh + foreign *.st corpus)
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 1m11s
smalltalk conformance.sh catalogs foreign lib/smalltalk/tests/programs/*.st
programs, runs 'bash lib/smalltalk/test.sh -v', and scrapes its output (the
'OK 403/403' summary plus per-file pass counts via awk). It loads no SX test
suites directly and emits no SX counter/dict scoreboard. This is the briefing's
own classification example ('smalltalk runs *.st via test.sh') and the same
'scrapes a test.sh' exclusion as ocaml/lua. Left conformance.sh untouched;
recorded the exclusion.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 12:42:44 +00:00
0309e3b5d5 conformance: exclude ocaml (scrapes lib/ocaml/test.sh + foreign .ml baseline)
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 54s
ocaml conformance.sh runs 'bash lib/ocaml/test.sh -v', scrapes its
human-readable ok/FAIL lines, and re-classifies each test into suites via bash
description-matching heuristics; it also scrapes lib/ocaml/baseline/run.sh
(foreign .ml programs). The underlying test.sh is a per-assertion epoch runner
(hundreds of individual (ocaml-test-...) evals, one epoch each) with no
suite-level counter variables or dict runners, so the driver's
counter/dict-scoreboard model has nothing to point at without rewriting the test
harness. 'Scrapes a test.sh' is the briefing's named exclusion criterion (test.sh
even notes it mirrors lib/lua/test.sh, the canonical excluded case). Left
conformance.sh untouched; recorded the exclusion.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 12:20:59 +00:00
93b27c74b5 conformance: exclude js (foreign test262 fixtures vs .expected files)
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 58s
js conformance.sh walks lib/js/test262-slice/**/*.js (foreign test262
fixtures), escapes each with python3, evals via (js-eval), and compares output
to a sibling .expected file by substring match — counting pass/fail in bash
against a >=50% target. It loads no SX test suites and emits no SX counter/dict
scoreboard (no scoreboard.json). The shared driver only epoch-loads SX preloads
and evals SX test suites emitting a scoreboard — it cannot drive a
foreign-fixture-vs-expected comparison harness (same category as
lua/forth/smalltalk). Left conformance.sh untouched; recorded the exclusion.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 11:58:45 +00:00
c00cca45ff conformance: migrate go onto shared driver (dict, 609/609 parity)
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 56s
Go has the same structure as erlang: suites load into one session and each
exposes a pass counter plus a *count* (total) counter rather than a fail
counter. MODE=dict fits — each suite's runner is a dict literal
{:passed P :failed (- count P) :total count}. No driver change; conformance.conf
+ 3-line shim, historical scoreboard schema preserved.

Parity verified 609/609 (0 fail), every suite matching baseline.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 11:37:46 +00:00
4b31828641 conformance: exclude forth (foreign Forth corpus via awk+python preprocessing)
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 40s
forth's conformance.sh reads a foreign Forth test corpus (Hayes Core core.fr),
preprocesses it with awk + an external python3 chunk-splitter that generates a
chunks.sx of raw source strings, then runs them through the interpreter via
(hayes-run-all). The shared driver only epoch-loads SX preloads and evals SX
test suites emitting a counter/dict scoreboard — it cannot reproduce the
external preprocessing pipeline over a foreign .fr corpus (same category as
lua/smalltalk). No SX tests/*.sx suites exist to migrate. Left conformance.sh
untouched; recorded the exclusion.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 11:11:49 +00:00
b4ecadaad9 conformance: migrate feed onto shared driver (counters, 189/189 parity)
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 34s
Feed is the canonical MODE=counters shape: each suite runs in a fresh session
with shared preloads and a single feed-test-pass/feed-test-fail pair. Lifted the
old script's inline epoch-2 counter + feed-test helper defs into
lib/feed/test-harness.sx (preloaded last) so the driver can load them before
each suite. conformance.conf + 3-line shim; historical scoreboard schema
preserved. No driver change needed.

Parity verified 189/189 (0 fail), every suite matching baseline.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 10:50:47 +00:00
bb85532cc6 conformance: migrate erlang onto shared driver (dict, 761/761 parity)
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 1m12s
Erlang's suites load into one session and each exposes a pass counter plus a
*count* (total) counter rather than a fail counter, so MODE=dict fits directly:
each suite's runner is a dict literal {:passed P :failed (- count P) :total count}.
No driver change needed (dict mode already supports arbitrary runner expressions).
conformance.conf + 3-line shim; historical scoreboard schema preserved.

Parity verified 761/761 (0 fail), every suite matching baseline.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 10:28:27 +00:00
2e7a08309c conformance: migrate common-lisp onto shared driver (counters, 487/487)
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 14m2s
Extend the shared driver's MODE=counters with a backward-compatible SUITES
format: name:file[:pass-var:fail-var[:extra-preload ...]]. Optional per-suite
counter symbols (override the global COUNTERS_PASS/COUNTERS_FAIL) and per-suite
preload chains (loaded after the global PRELOADS). Plain name:file entries are
unchanged — verified against haskell (fib/sieve/quicksort 2/2/5, matches
committed scoreboard).

common-lisp has 8 distinct per-suite counter pairs and a different preload
chain per suite, so it could not fit the single-counter/fixed-preload model;
the extended format expresses it directly. conformance.conf keeps the historical
scoreboard schema; conformance.sh becomes the 3-line shim.

Result 487/487 (0 fail) vs the old 305/0 baseline — higher and explained: the
old per-suite 'timeout 30' was too tight for the slow eval suite (~15-25s under
contention), silently recording it as 0; the driver's 180s budget recovers its
true 182. geometry/mop-trace stay 0/0 (pre-existing refl-class-chain-depth-with
load error; counter vars defined as 0 -> clean gc-result, no fail-fallback).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 09:55:44 +00:00
bfdd0fe65a conformance: record common-lisp blocker (per-suite counters + preloads)
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 1m6s
Classified migratable-in-kind (SX suites over epoch, not a foreign runner)
but blocked on driver feature gaps: 8 distinct per-suite counter variable
name pairs and per-suite preload chains, neither supported by MODE=counters
(single global counter + fixed preloads) nor MODE=dict (load-time counter
collisions across suites). Baseline 305/0 across 12 suites. Did not migrate;
conformance.sh left untouched. Driver unchanged (out of per-iteration scope).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 09:22:39 +00:00
e5686d2c31 conformance: A1 migration loop briefing (classify-then-migrate, parity-gated)
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 1m12s
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 09:16:38 +00:00