forth: Hayes conformance runner + baseline scoreboard (165/590, 28%)
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Has been cancelled

This commit is contained in:
2026-04-24 19:13:45 +00:00
parent a47b3e5420
commit 0e509af0a2
5 changed files with 370 additions and 2 deletions

View File

@@ -74,8 +74,8 @@ Representation:
- [x] `DO`, `LOOP`, `+LOOP`, `I`, `J`, `LEAVE` — counted loops (needs a return stack)
- [x] Return stack: `>R`, `R>`, `R@`, `2>R`, `2R>`, `2R@`
- [x] Vendor John Hayes' test suite to `lib/forth/ans-tests/`
- [ ] `lib/forth/conformance.sh` + runner; `scoreboard.json` + `scoreboard.md`
- [ ] Baseline: probably 30-50% Core passing after phase 3
- [x] `lib/forth/conformance.sh` + runner; `scoreboard.json` + `scoreboard.md`
- [x] Baseline: probably 30-50% Core passing after phase 3
### Phase 4 — strings + more Core
- [ ] `S"`, `C"`, `."`, `TYPE`, `COUNT`, `CMOVE`, `FILL`, `BLANK`
@@ -99,6 +99,28 @@ Representation:
_Newest first._
- **Phase 3 — Hayes conformance runner + baseline scoreboard (165/590, 28%).**
`lib/forth/conformance.sh` preprocesses `ans-tests/core.fr` (strips `\`
and `( ... )` comments + `TESTING` lines), splits the source on every
`}T` so each Hayes test plus the small declaration blocks between
them are one safe-resume chunk, and emits an SX driver that feeds
the chunks through `lib/forth/hayes-runner.sx`. The runner registers
`T{`/`->`/`}T` as Forth primitives that snapshot the dstack depth on
`T{`, record actual on `->`, compare on `}T`, and install stub
`HEX`/`DECIMAL`/`TESTING` so metadata doesn't halt the stream. Errors
raised inside a chunk are caught by `guard` and the state is reset,
so one bad test does not break the rest. Outputs
`scoreboard.json` + `scoreboard.md`.
First-run baseline: 165 pass / 425 error / 0 fail on the first 590
chunks. The default cap sits at 590 because `core.fr` chunks beyond
that rely on unsigned-integer wrap-around (e.g. `COUNT-BITS` with
`BEGIN DUP WHILE … 2* REPEAT`) which never terminates on our
bignum-based Forth; raise `MAX_CHUNKS` once those tests unblock.
Majority of errors are missing Phase-4 words (`RSHIFT`, `LSHIFT`,
`CELLS`, `S"`, `CHAR`, `SOURCE`, etc.) — each one implemented should
convert a cluster of errors to passes.
- **Phase 3 — vendor Gerry Jackson's forth2012-test-suite.** Added
`lib/forth/ans-tests/{tester.fr, core.fr, coreexttest.fth}` from
https://github.com/gerryjackson/forth2012-test-suite (master, fetched