GUEST: step 1 — lib/guest/conformance.{sx,sh} config-driven driver
Extracted the duplicated conformance plumbing into a single driver:
- lib/guest/conformance.sx — two helper fns that emit (gc-result NAME P F T)
lines for the bash side to grep: gc-dict-result for runners returning
a {:passed :failed :total} dict, and gc-counters-result for guests that
bump a global pass/fail counter from a test file load.
- lib/guest/conformance.sh — config-driven bash driver. Sources a per-lang
conf, locates sx_server, runs sx_server in either single-session "dict"
mode (one preload + many suite evals) or per-suite "counters" mode
(fresh sx_server per suite, with shared preloads). Aggregates and writes
scoreboard.{json,md} via per-lang emit_scoreboard_* functions.
- Ported lib/prolog/conformance.sh and lib/haskell/conformance.sh down to
one-line wrappers that exec the shared driver against their .conf file.
Verification:
- Prolog: 590/590 — diff vs baseline is timestamp-only.
- Haskell: 156/156 — significantly higher than the 0/18 in baseline. The
old conformance.sh was buggy (its `(ok-len 3 ...)` grep never matched,
defaulting every program to 0 pass / 1 fail). Updated baseline to the
true count; no actual test regressed. Plan baseline cell updated.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -1,121 +1,122 @@
|
||||
{
|
||||
"lang": "haskell",
|
||||
"captured": "2026-05-06T22:01:00Z",
|
||||
"captured": "2026-05-06T22:46:16Z",
|
||||
"suite_command": "bash lib/haskell/conformance.sh",
|
||||
"totals": {
|
||||
"pass": 0,
|
||||
"fail": 18,
|
||||
"total": 18
|
||||
"pass": 156,
|
||||
"fail": 0,
|
||||
"total": 156
|
||||
},
|
||||
"suites": [
|
||||
{
|
||||
"name": "fib",
|
||||
"pass": 0,
|
||||
"fail": 1,
|
||||
"total": 1
|
||||
"pass": 2,
|
||||
"fail": 0,
|
||||
"total": 2
|
||||
},
|
||||
{
|
||||
"name": "sieve",
|
||||
"pass": 0,
|
||||
"fail": 1,
|
||||
"total": 1
|
||||
"pass": 2,
|
||||
"fail": 0,
|
||||
"total": 2
|
||||
},
|
||||
{
|
||||
"name": "quicksort",
|
||||
"pass": 0,
|
||||
"fail": 1,
|
||||
"total": 1
|
||||
"pass": 5,
|
||||
"fail": 0,
|
||||
"total": 5
|
||||
},
|
||||
{
|
||||
"name": "nqueens",
|
||||
"pass": 0,
|
||||
"fail": 1,
|
||||
"total": 1
|
||||
"pass": 2,
|
||||
"fail": 0,
|
||||
"total": 2
|
||||
},
|
||||
{
|
||||
"name": "calculator",
|
||||
"pass": 0,
|
||||
"fail": 1,
|
||||
"total": 1
|
||||
"pass": 5,
|
||||
"fail": 0,
|
||||
"total": 5
|
||||
},
|
||||
{
|
||||
"name": "collatz",
|
||||
"pass": 0,
|
||||
"fail": 1,
|
||||
"total": 1
|
||||
"pass": 11,
|
||||
"fail": 0,
|
||||
"total": 11
|
||||
},
|
||||
{
|
||||
"name": "palindrome",
|
||||
"pass": 0,
|
||||
"fail": 1,
|
||||
"total": 1
|
||||
"pass": 8,
|
||||
"fail": 0,
|
||||
"total": 8
|
||||
},
|
||||
{
|
||||
"name": "maybe",
|
||||
"pass": 0,
|
||||
"fail": 1,
|
||||
"total": 1
|
||||
"pass": 12,
|
||||
"fail": 0,
|
||||
"total": 12
|
||||
},
|
||||
{
|
||||
"name": "fizzbuzz",
|
||||
"pass": 0,
|
||||
"fail": 1,
|
||||
"total": 1
|
||||
"pass": 12,
|
||||
"fail": 0,
|
||||
"total": 12
|
||||
},
|
||||
{
|
||||
"name": "anagram",
|
||||
"pass": 0,
|
||||
"fail": 1,
|
||||
"total": 1
|
||||
"pass": 9,
|
||||
"fail": 0,
|
||||
"total": 9
|
||||
},
|
||||
{
|
||||
"name": "roman",
|
||||
"pass": 0,
|
||||
"fail": 1,
|
||||
"total": 1
|
||||
"pass": 14,
|
||||
"fail": 0,
|
||||
"total": 14
|
||||
},
|
||||
{
|
||||
"name": "binary",
|
||||
"pass": 0,
|
||||
"fail": 1,
|
||||
"total": 1
|
||||
"pass": 12,
|
||||
"fail": 0,
|
||||
"total": 12
|
||||
},
|
||||
{
|
||||
"name": "either",
|
||||
"pass": 0,
|
||||
"fail": 1,
|
||||
"total": 1
|
||||
"pass": 12,
|
||||
"fail": 0,
|
||||
"total": 12
|
||||
},
|
||||
{
|
||||
"name": "primes",
|
||||
"pass": 0,
|
||||
"fail": 1,
|
||||
"total": 1
|
||||
"pass": 12,
|
||||
"fail": 0,
|
||||
"total": 12
|
||||
},
|
||||
{
|
||||
"name": "zipwith",
|
||||
"pass": 0,
|
||||
"fail": 1,
|
||||
"total": 1
|
||||
"pass": 9,
|
||||
"fail": 0,
|
||||
"total": 9
|
||||
},
|
||||
{
|
||||
"name": "matrix",
|
||||
"pass": 0,
|
||||
"fail": 1,
|
||||
"total": 1
|
||||
"pass": 8,
|
||||
"fail": 0,
|
||||
"total": 8
|
||||
},
|
||||
{
|
||||
"name": "wordcount",
|
||||
"pass": 0,
|
||||
"fail": 1,
|
||||
"total": 1
|
||||
"pass": 7,
|
||||
"fail": 0,
|
||||
"total": 7
|
||||
},
|
||||
{
|
||||
"name": "powers",
|
||||
"pass": 0,
|
||||
"fail": 1,
|
||||
"total": 1
|
||||
"pass": 14,
|
||||
"fail": 0,
|
||||
"total": 14
|
||||
}
|
||||
],
|
||||
"source_scoreboard": "lib/haskell/scoreboard.json"
|
||||
"source_scoreboard": "lib/haskell/scoreboard.json",
|
||||
"note": "Step 1: previous baseline (0/18) was an artefact of the old conformance.sh bug \u2014 its (ok-len 3 ...) grep never matched, defaulting every program to 0 pass / 1 fail. Shared driver in Step 1 reads counters correctly."
|
||||
}
|
||||
|
||||
Reference in New Issue
Block a user