W14: pin K19 MCP-harness/runtime primitive parity (test-only)

mcp_tree.ml's parallel primitive table drifted from sx_primitives.ml —
the spec-mandated harness verification path silently produced false
findings ((get {:a 1} :a 99) -> nil vs 1, char-class vs substring split,
etc.). dc7aa709 aligned 8 entries as a stopgap; the real fix (linking
sx_primitives) is hosts-lane.

Add scripts/test-harness-parity.sh: drives mcp_tree.exe sx_eval via raw
JSON-RPC and a fresh sx_server.exe via the epoch protocol, runs the
finding's 12-probe battery through both, fails on any divergence (errors
compared by inner message). 12/12 parity today — the stopgap holds and
can no longer rot silently.

Test-only: no semantics edits, no push.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
This commit is contained in:
2026-07-04 01:41:07 +00:00
parent f6d584629e
commit 01e5f876bc
2 changed files with 121 additions and 2 deletions

View File

@@ -66,8 +66,9 @@ Pin each confirmed-and-fixed finding with a minimal repro. Add suites to
stubs → test CIDs ≠ production CIDs)
### C. Harness honesty
- [ ] K19 — MCP `mcp_tree.ml` harness primitive table drift vs `sx_primitives`
(parity test)
- [x] K19 — harness/runtime parity pinned (`scripts/test-harness-parity.sh`:
drives mcp_tree sx_eval over JSON-RPC vs fresh sx_server over epoch,
12-probe battery from the finding, errors compared by message)
- [ ] C22/K104 — harness logs IO *before* invoking the mock (throwing-mock pin)
- [ ] C21 — real perform/suspend mode in harness
- [ ] C23 — adapter-dom render-output tests
@@ -85,6 +86,17 @@ Pin each confirmed-and-fixed finding with a minimal repro. Add suites to
## Progress log (newest first)
- 2026-07-04 — **K19 harness-parity pin (item C.1)**. Authored
`scripts/test-harness-parity.sh`: drives `mcp_tree.exe` `sx_eval` with
raw JSON-RPC over stdio and a fresh `sx_server.exe` over the epoch
protocol, running the finding's exact 12-probe battery (empty?/get/
split/equal?/contains?/keyword-name/char-code/parse-number) through both
and failing on ANY divergence. Errors normalized to their inner message
so identical failures compare equal (`keyword-name :kw` errors the same
way on both — keywords evaluate to strings before the call). Result:
12/12 parity — dc7aa709's 8-entry stopgap alignment holds; this pin keeps
it honest until the real fix (mcp_tree links sx_primitives) lands in the
hosts lane. Test-only.
- 2026-07-04 — **Section B: env-parity audit + ledger**. Probed a fresh
`sx_server` over the epoch protocol (`deps-check` + live eval). Confirmed
runner-only drift: `values`/`call-with-values` (run_tests.ml:1131/1140),