Three dc7aa709 fixes shipped without pinning tests:
- K09: R7RS longhand (unquote-splicing X) now splices (was silent zero-splice)
- K11: guard re-raise sentinel gensym'd — a user value shaped like
(list '__guard-reraise__ X) is data, not a forged re-raise
- K39: (do ((fn (x) x) 5) 99) -> 99, not a misparsed Scheme do-loop
Add suites gate-K09-longhand-unquote-splicing, gate-K11-guard-reraise-forgeable,
gate-K39-do-iife-head to spec/tests/test-gate-pins.sx with exact reprs from
plans/sx-review/core.md. 261 passed / 0 failed under OCaml run_tests.
Test-only: no semantics edits, no push.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
W14 — Test gate & conformance infrastructure loop
Forge agent ws-W14. Role: build out W14 from the SX review remediation plan
(plans/sx-review/PLAN.md, §"W14. Test gate & conformance infrastructure") —
the enabler that makes every other fix verifiable. One checklist item per fire.
You are on branch loops/sx-ws-w14, worktree /root/rose-ash-loops/sx-ws-w14.
Hard guardrails (read every fire)
- TEST-ONLY. No semantics edits. Do NOT touch spec/evaluator.sx, spec/primitives.sx, spec/parser.sx, spec/render.sx, the OCaml kernel, or any host runtime. W14 pins behavior with tests and productionizes the test/runner surface; the actual fixes are other workstreams (W1–W12). A pin that fails means the finding regressed — do NOT relax the assertion, record it as a blocker.
- NO PUSH. Commit locally on loops/sx-ws-w14 only. Never push; never touch main or architecture.
- .sx files: use sx-tree MCP tools only (a hook blocks Read/Write/Edit on .sx). sx_write_file takes params file and source (NOT content — a wrong key yields a yojson … got null error and no write).
- .md/.sh/.ml files: normal tools are fine.
- Never pkill/kill sx_server — sibling loops share the binary. Bound every run with timeout (e.g. timeout 300 …); if it hangs, let the timeout end it.
- One item per fire, then stop. No batching.
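The timeout guardrail can also be enforced from a driver script. The following is a minimal sketch, assuming a Python wrapper around the test command; `run_bounded` and its return shape are hypothetical and not part of the repository. It relies on `subprocess.run(..., timeout=...)`, which kills only the child process it started, so it never signals a shared sx_server.

```python
import subprocess
import sys

def run_bounded(cmd, seconds):
    """Run cmd, giving up after `seconds`. Returns (status, exit_code)."""
    try:
        done = subprocess.run(cmd, timeout=seconds, capture_output=True, text=True)
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed and reaped its own child here.
        return ("timeout", None)
    return ("ok" if done.returncode == 0 else "failed", done.returncode)

# Stand-in command; a real fire would pass the run_tests.exe invocation instead.
status = run_bounded([sys.executable, "-c", "pass"], 300)
```

A hang then surfaces as a `("timeout", None)` result to record, rather than a process that has to be killed by hand.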
Per-iteration procedure
- Pick the first unchecked [ ] in the checklist.
- Implement (test file or runner/harness change), lifting minimal repros from the review lane files (plans/sx-review/{core,hosts,conformance}.md) — they are a ready-made corpus of confirmed reprs.
- Build + run the affected tests: sx_build (target ocaml) then timeout 300 ./hosts/ocaml/_build/default/bin/run_tests.exe <test-name> to run a single file. New spec/tests/test-*.sx files are auto-discovered.
- Confirm green (a pin must PASS on current HEAD — the fix already landed).
- Commit locally: git add -A && git commit with a W14: prefix.
- Tick the box, prepend one dated line to the Progress log, stop.
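The first step of the procedure can be sketched as a small helper. This assumes the checklist is stored as markdown task items of the form `- [ ] text` (open) and `- [x] text` (done); that syntax and the name `first_unchecked` are assumptions for illustration, not something the briefing specifies.

```python
import re

# Assumed checklist syntax: "- [ ] text" is open, "- [x] text" is done.
ITEM = re.compile(r"^\s*-\s*\[( |x|X)\]\s*(.*)$")

def first_unchecked(markdown_text):
    """Return the text of the first open checklist item, or None if all are ticked."""
    for line in markdown_text.splitlines():
        m = ITEM.match(line)
        if m and m.group(1) == " ":
            return m.group(2)
    return None
```

Returning `None` when nothing is open gives the loop an explicit signal to stop instead of picking work at random.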
Checklist
A. Test-debt pins — dc7aa709's landed fixes shipped without regression tests
Pin each confirmed-and-fixed finding with a minimal repro. Add suites to
spec/tests/test-gate-pins.sx (one defsuite per finding).
- K18 [W7] — expt overflow now float-promotes (no 63-bit wrap)
- K20 [W7] — contains? now supports dict key membership
- K09/K11/K39 [W5] — longhand unquote-splicing, guard sentinel gensym, do IIFE-head
- K49 [W8] — render depth/cycle guard (infinite recursive component)
- crit-2 [W1] — signal-return frame key (verify the pin is non-vacuous)
- C1/C1b [W3] — HTTP-mode concurrency fixes, pin
- S4 [conformance] — housekeeping repro, pin
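To make the K18 item concrete, the sketch below models what a 63-bit wrap looks like and what float promotion does instead. It assumes the wrap in question is that of a 63-bit two's-complement integer (as with OCaml's native int on 64-bit platforms); `wrap63` and `expt_promoting` are hypothetical Python stand-ins, not the kernel's code.

```python
def wrap63(n):
    """Reduce n to a 63-bit two's-complement integer (range -2**62 .. 2**62 - 1)."""
    n &= (1 << 63) - 1
    return n - (1 << 63) if n >= (1 << 62) else n

def expt_promoting(base, exp):
    """Integer power that switches to float instead of wrapping (illustrative only)."""
    exact = base ** exp
    return exact if -(1 << 62) <= exact < (1 << 62) else float(exact)

wrapped_62 = wrap63(2 ** 62)    # negative: the sign bit is hit
wrapped_100 = wrap63(2 ** 100)  # zero: every surviving low bit is 0
promoted = expt_promoting(2, 100)
```

Under this model a wrapping implementation yields a negative value for 2^62 and zero for 2^100, which is why sign checks on those two exponents are enough to detect a regression.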
B. Runner/production env unification
- Audit runner-only bindings (values/call-with-values F7/K42, JS fake sha3/equal?/apply/env-set! shims JS5) — inventory + failing pin that a fresh sx_server reproduces the drift
C. Harness honesty
- K19 — MCP mcp_tree.ml harness primitive table drift vs sx_primitives (parity test)
- C22/K104 — harness logs IO before invoking the mock (throwing-mock pin)
- C21 — real perform/suspend mode in harness
- C23 — adapter-dom render-output tests
D. WASM corpus runner
- F2 — promote conformance's run_wasm.js prototype into CI
E. Epoch-loop protocol fuzz + skip-list
- C3/C4/C5/C6/C7 — epoch protocol fuzz suite
- F10 — hs-upstream skip-list so browser-only FAILs mean something
- C9 — empty suite label
F. Differential battery
- F8 — cross-host differential battery (same source, all hosts agree)
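The idea behind F8 (same source, all hosts agree) can be sketched as follows. Each host is modeled as a plain Python callable taking source text; the `hosts` mapping and the `differential` function are hypothetical and only illustrate the comparison, not how the real hosts are invoked.

```python
def differential(source, hosts):
    """Evaluate `source` on every host; return {} if all agree, else per-host results."""
    results = {}
    for name, evaluate in hosts.items():
        try:
            # Compare printed forms so unhashable values (lists, dicts) are handled too.
            results[name] = ("ok", repr(evaluate(source)))
        except Exception as exc:
            # An error is itself an observable outcome, so record it rather than abort.
            results[name] = ("error", type(exc).__name__)
    distinct = set(results.values())
    return {} if len(distinct) <= 1 else results
```

Treating an exception as a result means a host that throws where the others return a value is reported as a disagreement rather than crashing the battery.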
Progress log (newest first)
- 2026-07-03 — K09/K11/K39 W5 special-form pins (item A.3). Three suites added to spec/tests/test-gate-pins.sx: gate-K09-longhand-unquote-splicing (R7RS longhand (unquote-splicing X) now splices, incl. empty-list case; shorthand still works), gate-K11-guard-reraise-forgeable (a body/clause value shaped like (list '__guard-reraise__ X) is returned as data, not misread as a re-raise — sentinel is now gensym'd), gate-K39-do-iife-head ((do ((fn (x) x) 5) 99) → 99, not a misparsed do-loop — exact core.md repro). Gotchas hit and fixed: quasiquoted bare idents are symbols not strings, and assert= compares with = (not equal?, which returns false on these spliced lists). 261 passed / 0 failed under OCaml run_tests. Test-only.
- 2026-07-03 — K20 contains?-dict pin (item A.2). Mapped K-codes by core.md severity order (K17 append!, K18 expt, K19 harness-drift, K20 contains?-dict). Added suite gate-K20-contains-dict to spec/tests/test-gate-pins.sx (4 tests): present dict key → true, missing key → false, list membership unchanged, string substring unchanged. Repro from core.md ("(contains? {:a 1} :a) threw contains?: 2 args"). 8/8 green across both suites under OCaml run_tests. Test-only.
- 2026-07-03 — K18 expt-overflow pin (item A.1). Bootstrapped this briefing from PLAN.md §W14 (the referenced file did not exist yet). Added spec/tests/test-gate-pins.sx with suite gate-K18-expt-overflow (4 tests): small exponents stay exact (2^0=1, 2^10=1024), 2^62 > 0 (no negative 63-bit wrap), 2^100 > 0 (no wrap-to-zero), 2^100 is a number (float promotion). Verified 4/4 green under the OCaml run_tests kernel. Test-only.
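The K11 entry above turns on the difference between a sentinel anyone can spell and one that is unique by identity. The following is a Python analogy only, not the evaluator's code: a fresh `object()` stands in for the gensym, and all function names are hypothetical.

```python
FORGEABLE_TAG = "__guard-reraise__"

def is_reraise_forgeable(value):
    # Any user list whose head is the well-known tag passes this check.
    return isinstance(value, list) and len(value) == 2 and value[0] == FORGEABLE_TAG

# A fresh object plays the role of a gensym: user code cannot build an identical one.
_RERAISE = object()

def make_reraise(payload):
    return [_RERAISE, payload]

def is_reraise(value):
    # Identity comparison, so a look-alike built from the public name never matches.
    return isinstance(value, list) and len(value) == 2 and value[0] is _RERAISE

user_value = ["__guard-reraise__", 42]  # plain data that merely resembles the marker
```

With the name-based check, `user_value` is mistaken for a re-raise; with the identity-based check it stays ordinary data, which is the property the pin asserts.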
Blocked
- (none)