rose-ash/plans/agent-briefings/sx-gate-loop.md
giles 88e03daf4b W14: pin K49 void-elements spec fix; discover sx_render.ml regen drift (test-only)
K49: area/base/embed/param/track were in VOID_ELEMENTS but missing from
HTML_TAGS — render fell through to "Undefined symbol: base". dc7aa709 fixed
spec/render.sx; add suite gate-K49-void-elements-renderable (3 tests): the
spec registry contains all five, and render-to-html renders each as a
self-closing void. 264 passed / 0 failed under OCaml run_tests.

DISCOVERY (recorded in the briefing's Blocked section): the generated
hosts/ocaml/lib/sx_render.ml was never regenerated after the spec fix — its
stale html_tags_list still lacks the five tags, so the runner's native
render-html path STILL errors. Fix is a bootstrap_render.py regen (hosts
lane, out of scope for this test-only loop). Live evidence for F13
(regen-diff CI gate). Pin covers the spec side only for now.

Also corrects the checklist label: K49 = void elements; the depth/cycle
guard is K16 (OPEN, W8).

Test-only: no semantics edits, no push.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-03 23:43:34 +00:00

131 lines
7.4 KiB
Markdown

# W14 — Test gate & conformance infrastructure loop
Forge agent **ws-W14**. Role: build out **W14** from the SX review remediation plan
(`plans/sx-review/PLAN.md`, §"W14. Test gate & conformance infrastructure") —
*the enabler that makes every other fix verifiable*. One checklist item per fire.
You are on branch `loops/sx-ws-w14`, worktree `/root/rose-ash-loops/sx-ws-w14`.
## Hard guardrails (read every fire)
- **TEST-ONLY.** No semantics edits. Do NOT touch `spec/evaluator.sx`,
`spec/primitives.sx`, `spec/parser.sx`, `spec/render.sx`, the OCaml kernel,
or any host runtime. W14 pins behavior with tests and productionizes the
*test/runner* surface; the actual fixes are other workstreams (W1-W12).
A pin that *fails* means the finding regressed — do NOT relax the assertion,
record it as a blocker.
- **NO PUSH.** Commit locally on `loops/sx-ws-w14` only. Never push; never touch
`main` or `architecture`.
- **`.sx` files: use `sx-tree` MCP tools only** (a hook blocks Read/Write/Edit
on `.sx`). `sx_write_file` takes params **`file`** and **`source`** (NOT
`content` — a wrong key yields a `yojson … got null` error and no write).
`.md`/`.sh`/`.ml` files: normal tools are fine.
- **Never `pkill`/`kill` `sx_server`** — sibling loops share the binary. Bound
every run with `timeout` (e.g. `timeout 300 …`); if it hangs, let the timeout end it.
- **One item per fire, then stop.** No batching.
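
The TEST-ONLY guardrail can be checked mechanically before a commit. The following is a minimal sketch, assuming the changed paths come from something like `git diff --cached --name-only`. The helper name `forbidden_touches` is hypothetical and not part of the repo, and it covers only the four spec files named above, since the OCaml kernel and host runtimes are not listed by path in this briefing.

```python
# Hypothetical pre-commit helper; not part of the repo.
# Only the four spec files named in the guardrail are checked here.
FORBIDDEN_SPEC_FILES = frozenset({
    "spec/evaluator.sx",
    "spec/primitives.sx",
    "spec/parser.sx",
    "spec/render.sx",
})


def _normalize(path):
    """Strip surrounding whitespace and a leading './' from a path."""
    path = path.strip()
    return path[2:] if path.startswith("./") else path


def forbidden_touches(changed_paths):
    """Return the changed paths that the TEST-ONLY guardrail forbids, sorted."""
    normalized = (_normalize(p) for p in changed_paths)
    return sorted(p for p in normalized if p in FORBIDDEN_SPEC_FILES)


if __name__ == "__main__":
    staged = ["spec/tests/test-gate-pins.sx", "./spec/render.sx"]
    print(forbidden_touches(staged))  # prints ['spec/render.sx']
```

A non-empty result would mean the fire has drifted into semantics edits and should be recorded as a blocker rather than committed.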
## Per-iteration procedure
1. Pick the first unchecked `[ ]` in the checklist.
2. Implement (test file or runner/harness change), lifting minimal repros from
the review lane files (`plans/sx-review/{core,hosts,conformance}.md`) — they
are a ready-made corpus of confirmed repros.
3. Build + run the affected tests:
`sx_build` (target ocaml) then
`timeout 300 ./hosts/ocaml/_build/default/bin/run_tests.exe <test-name>`
to run a single file. New `spec/tests/test-*.sx` files are auto-discovered.
4. Confirm green (a pin must PASS on current HEAD — the fix already landed).
5. Commit locally: `git add -A && git commit` with a `W14:` prefix.
6. Tick the box, prepend one dated line to the Progress log, stop.
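
Steps 1 and 6 of the procedure above (pick the first unchecked item, then tick it) can be sketched as two pure functions over the briefing text. This is an illustration only: the names `first_unchecked` and `tick` are hypothetical, and it assumes checklist items use the `- [ ]` / `- [x]` task-list form seen in this file.

```python
import re

# Matches a markdown task-list item: "- [ ] text" or "- [x] text".
_ITEM = re.compile(r"^(\s*- \[)([ xX])(\] )(.*)$")


def first_unchecked(markdown):
    """Return (line_index, item_text) of the first '- [ ]' item, or None."""
    for index, line in enumerate(markdown.splitlines()):
        match = _ITEM.match(line)
        if match and match.group(2) == " ":
            return index, match.group(4)
    return None


def tick(markdown, line_index):
    """Return the text with the item on line_index marked '[x]'.

    Lines are re-joined with '\n'; a trailing newline is not preserved.
    """
    lines = markdown.splitlines()
    match = _ITEM.match(lines[line_index])
    if match is None:
        raise ValueError("line %d is not a checklist item" % line_index)
    lines[line_index] = match.group(1) + "x" + match.group(3) + match.group(4)
    return "\n".join(lines)


if __name__ == "__main__":
    checklist = "- [x] K18\n- [ ] crit-2\n- [ ] C1/C1b"
    index, text = first_unchecked(checklist)
    print(index, text)             # prints: 1 crit-2
    print(tick(checklist, index))
```

Ticking exactly one line per call mirrors the "one item per fire" rule: nothing in the sketch loops over the remaining items.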
## Checklist
### A. Test-debt pins — dc7aa709's landed fixes shipped without regression tests
Pin each confirmed-and-fixed finding with a minimal repro. Add suites to
`spec/tests/test-gate-pins.sx` (one `defsuite` per finding).
- [x] K18 [W7] — `expt` overflow now float-promotes (no 63-bit wrap)
- [x] K20 [W7] — `contains?` now supports dict key membership
- [x] K09/K11/K39 [W5] — longhand `unquote-splicing`, guard sentinel gensym, `do` IIFE-head
- [x] K49 [W8] — five void elements (area/base/embed/param/track) renderable
(spec side; native regen drift → see Blocked). NB: the depth/cycle guard
is K16 [W8], still OPEN — not a W14 pin target until its fix lands
- [ ] crit-2 [W1] — signal-return frame key (verify the pin is non-vacuous)
- [ ] C1/C1b [W3] — HTTP-mode concurrency fixes, pin
- [ ] S4 [conformance] — housekeeping repro, pin
### B. Runner/production env unification
- [ ] Audit runner-only bindings (`values`/`call-with-values` F7/K42, JS
fake sha3/equal?/apply/env-set! shims JS5) — inventory + failing pin
showing that a fresh `sx_server` reproduces the drift
### C. Harness honesty
- [ ] K19 — MCP `mcp_tree.ml` harness primitive table drift vs `sx_primitives`
(parity test)
- [ ] C22/K104 — harness logs IO *before* invoking the mock (throwing-mock pin)
- [ ] C21 — real perform/suspend mode in harness
- [ ] C23 — adapter-dom render-output tests
### D. WASM corpus runner
- [ ] F2 — promote conformance's `run_wasm.js` prototype into CI
### E. Epoch-loop protocol fuzz + skip-list
- [ ] C3/C4/C5/C6/C7 — epoch protocol fuzz suite
- [ ] F10 — hs-upstream skip-list so browser-only FAILs mean something
- [ ] C9 — empty suite label
### F. Differential battery
- [ ] F8 — cross-host differential battery (same source, all hosts agree)
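
The core check behind a differential battery (item F8) is small: run the same source on every host and flag any split in the outputs. The sketch below shows only that comparison step. The function name `disagreements` and the host labels in the example are illustrative assumptions, and how each host is actually invoked is left out.

```python
from collections import defaultdict


def disagreements(results):
    """Group hosts by output; return {} when every host produced the same output.

    `results` maps a host name to the output it produced for one source form.
    """
    by_output = defaultdict(list)
    for host, output in results.items():
        by_output[output].append(host)
    if len(by_output) <= 1:
        return {}
    return {output: sorted(hosts) for output, hosts in by_output.items()}


if __name__ == "__main__":
    # Illustrative host labels and outputs, not real run data.
    print(disagreements({"ocaml": "99", "js": "99"}))  # prints {}
    print(disagreements({"ocaml": "99", "js": "99", "wasm": "error"}))
```

Returning the hosts grouped by output, rather than a bare boolean, makes it easier to see which host is the odd one out when a pin fails.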
## Progress log (newest first)
- 2026-07-03 — **K49 void-elements pin (item A.4) + regen-drift DISCOVERY**.
Corrected the checklist label first: K49 is "five void elements
unrenderable" (core.md:335), not the depth guard (that's K16, OPEN). Added
suite `gate-K49-void-elements-renderable` (3 tests): spec `HTML_TAGS`
contains all five; `(render-to-html '(base :href "x") (make-env))`
→ `<base href="x" />`; all five render self-closing. Runner-env gotchas:
`current-env`/`symbol` are not bound in run_tests — use `(make-env)` and
literal quoted forms. **Discovery:** the first draft pinned via the
runner's native `render-html` and FAILED — `hosts/ocaml/lib/sx_render.ml`
(generated) was never regenerated after dc7aa709's spec fix, so the native
render path still errors on the five tags. Recorded under Blocked; live
evidence for F13 (regen-diff gate). 264 passed / 0 failed. Test-only.
- 2026-07-03 — **K09/K11/K39 W5 special-form pins (item A.3)**. Three suites
added to `spec/tests/test-gate-pins.sx`: `gate-K09-longhand-unquote-splicing`
(R7RS longhand `(unquote-splicing X)` now splices, incl. empty-list case;
shorthand still works), `gate-K11-guard-reraise-forgeable` (a body/clause
value shaped like `(list '__guard-reraise__ X)` is returned as data, not
misread as a re-raise — sentinel is now gensym'd), `gate-K39-do-iife-head`
(`(do ((fn (x) x) 5) 99)` → 99, not a misparsed do-loop — exact core.md
repro). Gotchas hit and fixed: quasiquoted bare idents are *symbols* not
strings, and `assert=` compares with `=` (not `equal?`, which returns false
on these spliced lists). 261 passed / 0 failed under OCaml run_tests. Test-only.
- 2026-07-03 — **K20 contains?-dict pin (item A.2)**. Mapped K-codes by
core.md severity order (K17 append!, K18 expt, K19 harness-drift, K20
contains?-dict). Added suite `gate-K20-contains-dict` to
`spec/tests/test-gate-pins.sx` (4 tests): present dict key → true, missing
key → false, list membership unchanged, string substring unchanged. Repro
from core.md ("(contains? {:a 1} :a) threw `contains?: 2 args`"). 8/8 green
across both suites under OCaml run_tests. Test-only.
- 2026-07-03 — **K18 expt-overflow pin (item A.1)**. Bootstrapped this briefing
from PLAN.md §W14 (the referenced file did not exist yet). Added
`spec/tests/test-gate-pins.sx` with suite `gate-K18-expt-overflow` (4 tests):
small exponents stay exact (`2^0=1`, `2^10=1024`), `2^62 > 0` (no negative
63-bit wrap), `2^100 > 0` (no wrap-to-zero), `2^100` is a number (float
promotion). Verified 4/4 green under the OCaml run_tests kernel. Test-only.
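
The two failure modes named in the K18 entry (a negative wrap at `2^62` and a wrap to zero at `2^100`) follow directly from signed 63-bit arithmetic. The sketch below simulates that wraparound in Python, whose integers are unbounded, and shows a float-promoting power next to it. Both function names are hypothetical, and `expt_promoting` illustrates the pinned behavior for non-negative integer exponents only; it is not the kernel's implementation.

```python
def wrap63(n):
    """Reduce n to a signed 63-bit integer using two's-complement wraparound."""
    half = 1 << 62  # 2^62: one past the largest positive 63-bit value
    return (n + half) % (1 << 63) - half


def expt_promoting(base, exponent):
    """Integer power that returns a float when the exact result leaves 63 bits.

    Assumes integer base and non-negative integer exponent.
    """
    exact = base ** exponent
    return exact if wrap63(exact) == exact else float(exact)


if __name__ == "__main__":
    print(wrap63(2 ** 10))             # 1024: small values are unchanged
    print(wrap63(2 ** 62) < 0)         # True: the negative wrap
    print(wrap63(2 ** 100))            # 0: the wrap-to-zero
    print(expt_promoting(2, 100) > 0)  # True: promoted to a positive float
```

This is why the suite asserts `> 0` for both large exponents: each assertion rules out one distinct wrap outcome.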
## Blocked
- **K49 native path — sx_render.ml regen drift** (found 2026-07-03 while
pinning A.4): dc7aa709 fixed HTML_TAGS in `spec/render.sx` but never re-ran
`hosts/ocaml/bootstrap_render.py`, so the generated
`hosts/ocaml/lib/sx_render.ml` still carries a stale `html_tags_list`
without area/base/embed/param/track. The runner's native `render-html`
convenience (and any native fast-path render) therefore STILL throws
`Undefined symbol: base` — dc7aa709's "verified on the native binary" claim
did not cover this path. Fix = regen (hosts lane, semantics-adjacent — out
of scope for this test-only loop). This is a live instance of **F13**
(regen-diff CI gate, section-B/D territory): a regen-diff check would have
caught it at commit time. The K49 pin covers the spec side only; when the
regen lands, extend the suite with `render-html`-path assertions.
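
A regen-diff gate of the kind F13 describes reduces to one comparison: regenerate the file and diff it against the committed copy. The sketch below shows only that comparison, using the standard-library `difflib`. The function name `regen_drift` is hypothetical, the sample file contents are invented for illustration, and how the regenerated text is produced is deliberately left open because the command line of `bootstrap_render.py` is not shown in this briefing.

```python
import difflib


def regen_drift(committed, regenerated, name="sx_render.ml"):
    """Return unified-diff lines between the committed and regenerated text.

    An empty list means the committed generated file is up to date.
    """
    return list(difflib.unified_diff(
        committed.splitlines(),
        regenerated.splitlines(),
        fromfile="committed/" + name,
        tofile="regenerated/" + name,
        lineterm="",
    ))


if __name__ == "__main__":
    # Invented sample contents, not the real generated file.
    committed = 'let html_tags_list = ["a"; "br"]'
    regenerated = 'let html_tags_list = ["a"; "area"; "base"; "br"]'
    drift = regen_drift(committed, regenerated)
    print("\n".join(drift) if drift else "no drift")
```

A CI job built on this idea would fail whenever the returned list is non-empty, which is the condition that went unnoticed after the spec fix.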