rose-ash/plans/agent-briefings/sx-gate-loop.md
giles f09368e1c2 W14: pin K18 expt-overflow float-promotion (test-only) + bootstrap gate briefing
The dc7aa709 quick-wins batch fixed `expt`'s silent 63-bit int wrap (now
promotes to float like +/*) but shipped no pinning test — a regression would
pass silently. Add spec/tests/test-gate-pins.sx suite gate-K18-expt-overflow
(4 tests, minimal reprs from plans/sx-review/core.md): small exponents exact,
2^62 and 2^100 do not wrap, 2^100 is a float. 4/4 green under OCaml run_tests.

Also bootstraps plans/agent-briefings/sx-gate-loop.md (the loop's own briefing,
absent until now) with the W14 checklist derived from PLAN.md §W14.

Test-only: no semantics edits, no push.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-03 22:33:22 +00:00

# W14 — Test gate & conformance infrastructure loop
Forge agent **ws-W14**. Role: build out **W14** from the SX review remediation plan
(`plans/sx-review/PLAN.md`, §"W14. Test gate & conformance infrastructure") —
*the enabler that makes every other fix verifiable*. One checklist item per fire.
You are on branch `loops/sx-ws-w14`, worktree `/root/rose-ash-loops/sx-ws-w14`.
## Hard guardrails (read every fire)
- **TEST-ONLY.** No semantics edits. Do NOT touch `spec/evaluator.sx`,
`spec/primitives.sx`, `spec/parser.sx`, `spec/render.sx`, the OCaml kernel,
or any host runtime. W14 pins behavior with tests and productionizes the
*test/runner* surface; the actual fixes belong to other workstreams (W1–W12).
A pin that *fails* means the finding regressed: do NOT relax the assertion;
record it as a blocker.
- **NO PUSH.** Commit locally on `loops/sx-ws-w14` only. Never push; never touch
`main` or `architecture`.
- **`.sx` files: use `sx-tree` MCP tools only** (a hook blocks Read/Write/Edit
on `.sx`). `sx_write_file` takes params **`file`** and **`source`** (NOT
`content` — a wrong key yields a `yojson … got null` error and no write).
`.md`/`.sh`/`.ml` files: normal tools are fine.
- **Never `pkill`/`kill` `sx_server`** — sibling loops share the binary. Bound
every run with `timeout` (e.g. `timeout 300 …`); if it hangs, let the timeout end it.
- **One item per fire, then stop.** No batching.
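The `timeout` guardrail above can also be applied from a script. The following is a minimal sketch, assuming a Python wrapper is acceptable in place of coreutils `timeout`; `run_bounded` is a hypothetical helper, not something that exists in the repo. On expiry it terminates only the child it launched, so a shared process such as `sx_server` is never signalled.

```python
import subprocess
import sys


def run_bounded(cmd, seconds):
    """Run cmd, giving up after `seconds`. Returns (exit_code, timed_out).

    Hypothetical helper: only the child started here is terminated on
    timeout; nothing else (e.g. a shared sx_server) is signalled.
    """
    try:
        done = subprocess.run(cmd, timeout=seconds)
        return done.returncode, False
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed and reaped the child.
        return None, True


if __name__ == "__main__":
    # Stand-in for a hung test run: sleeps far longer than the bound.
    hung = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_bounded(hung, 1))
```

A timed-out run reports `timed_out=True` with no exit code, which is the case to record as a blocker rather than retry by hand.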
## Per-iteration procedure
1. Pick the first unchecked `[ ]` in the checklist.
2. Implement (test file or runner/harness change), lifting minimal repros from
the review lane files (`plans/sx-review/{core,hosts,conformance}.md`) — they
are a ready-made corpus of confirmed repros.
3. Build + run the affected tests:
`sx_build` (target ocaml) then
`timeout 300 ./hosts/ocaml/_build/default/bin/run_tests.exe <test-name>`
to run a single file. New `spec/tests/test-*.sx` files are auto-discovered.
4. Confirm green (a pin must PASS on current HEAD — the fix already landed).
5. Commit locally: `git add -A && git commit` with a `W14:` prefix.
6. Tick the box, prepend one dated line to the Progress log, stop.
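Steps 1 and 6 of the procedure above (pick the first unchecked box, later tick it) can be sketched as follows. This is an illustration only: `first_unchecked` and `tick` are hypothetical names, and the checklist items are placeholders rather than entries from this file.

```python
import re

# Matches an unchecked Markdown task item such as "- [ ] some item".
UNCHECKED = re.compile(r"^\s*- \[ \] (.*)$")


def first_unchecked(lines):
    """Return (index, text) of the first unchecked item, or None."""
    for i, line in enumerate(lines):
        m = UNCHECKED.match(line)
        if m:
            return i, m.group(1)
    return None


def tick(lines, index):
    """Return a copy of lines with the item at `index` marked done."""
    out = list(lines)
    out[index] = out[index].replace("- [ ]", "- [x]", 1)
    return out


# Placeholder checklist, not the real one.
checklist = [
    "- [x] item one",
    "- [ ] item two",
    "- [ ] item three",
]
hit = first_unchecked(checklist)
print(hit)
print(tick(checklist, hit[0]))
```

Because `tick` changes a single line, one call corresponds to one fire, which matches the one-item-per-fire rule.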
## Checklist
### A. Test-debt pins — dc7aa709's landed fixes shipped without regression tests
Pin each confirmed-and-fixed finding with a minimal repro. Add suites to
`spec/tests/test-gate-pins.sx` (one `defsuite` per finding).
- [x] K18 [W7] — `expt` overflow now float-promotes (no 63-bit wrap)
- [ ] K20 [W7] — identify the landed W7 fix and pin it
- [ ] K09/K11/K39 [W5] — landed special-form fixes, pin each
- [ ] K49 [W8] — render depth/cycle guard (infinite recursive component)
- [ ] crit-2 [W1] — signal-return frame key (verify the pin is non-vacuous)
- [ ] C1/C1b [W3] — HTTP-mode concurrency fixes, pin
- [ ] S4 [conformance] — housekeeping repro, pin
### B. Runner/production env unification
- [ ] Audit runner-only bindings (`values`/`call-with-values` F7/K42; JS
fake `sha3`/`equal?`/`apply`/`env-set!` shims JS5): produce an inventory plus a
failing pin showing that a fresh `sx_server` reproduces the drift
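The inventory half of this item amounts to a set difference between the two environments. A minimal sketch, assuming each environment's binding names can be dumped to a plain collection of strings; `runner_only` is a hypothetical helper and the `shared-*` names are placeholders.

```python
def runner_only(runner_bindings, production_bindings):
    """Names defined by the test runner but missing from production."""
    return sorted(set(runner_bindings) - set(production_bindings))


# Placeholder inventories; real ones would be dumped from each environment.
runner = {"shared-a", "shared-b", "values", "call-with-values"}
production = {"shared-a", "shared-b"}

drift = runner_only(runner, production)
print(drift)
```

An empty result would mean the runner and production environments expose the same names; any other result is the list to pin.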
### C. Harness honesty
- [ ] K19 — MCP `mcp_tree.ml` harness primitive table drift vs `sx_primitives`
(parity test)
- [ ] C22/K104 — harness logs IO *before* invoking the mock (throwing-mock pin)
- [ ] C21 — real perform/suspend mode in harness
- [ ] C23 — adapter-dom render-output tests
### D. WASM corpus runner
- [ ] F2 — promote conformance's `run_wasm.js` prototype into CI
### E. Epoch-loop protocol fuzz + skip-list
- [ ] C3/C4/C5/C6/C7 — epoch protocol fuzz suite
- [ ] F10 — hs-upstream skip-list so browser-only FAILs mean something
- [ ] C9 — empty suite label
### F. Differential battery
- [ ] F8 — cross-host differential battery (same source, all hosts agree)
## Progress log (newest first)
- 2026-07-03 — **K18 expt-overflow pin (item A.1)**. Bootstrapped this briefing
from PLAN.md §W14 (the referenced file did not exist yet). Added
`spec/tests/test-gate-pins.sx` with suite `gate-K18-expt-overflow` (4 tests):
small exponents stay exact (`2^0=1`, `2^10=1024`), `2^62 > 0` (no negative
63-bit wrap), `2^100 > 0` (no wrap-to-zero), `2^100` is a number (float
promotion). Verified 4/4 green under the OCaml run_tests kernel. Test-only.
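The arithmetic behind the K18 assertions can be checked outside SX. The sketch below models a 63-bit signed integer wrap in Python. It assumes the wrap came from OCaml's native `int`, which is 63 bits wide on 64-bit platforms. `wrap63` is a hypothetical helper for illustration and is not the test code in `test-gate-pins.sx`.

```python
# OCaml native ints are 63-bit signed on 64-bit platforms:
# the representable range is [-2**62, 2**62 - 1].
INT_BITS = 63


def wrap63(n):
    """Reduce an exact integer to what 63-bit signed arithmetic would hold."""
    half = 1 << (INT_BITS - 1)  # 2**62
    return (n + half) % (1 << INT_BITS) - half


print(wrap63(2 ** 10))   # small exponent: value is unchanged
print(wrap63(2 ** 62))   # one past the maximum: wraps to a negative value
print(wrap63(2 ** 100))  # a multiple of 2**63: wraps to zero
print(float(2 ** 100))   # what float promotion yields instead
```

This is why the suite checks `2^62 > 0` and `2^100 > 0` separately: the two overflows fail in different ways (negative versus zero), so a single sign check would not cover both.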
## Blocked
- (none)