rose-ash/plans/forth-on-sx.md
giles 0f67021aa3 plans: briefings + roadmaps for lua, prolog, forth, erlang, haskell
Five new guest-language plans mirroring the js-on-sx / hs-loop pattern, each
with a phased roadmap (Progress log + Blockers), a self-contained agent
briefing for respawning a long-lived loop, and a shared restore-all.sh that
snapshots state across all seven language loops.

Briefings bake in the lessons from today's stall debugging: never call
sx_build (600s watchdog), only touch lib/<lang>/** + own plan file, commit
every feature, update Progress log on each commit, route shared-file
issues to Blockers rather than fixing them.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 15:16:45 +00:00


# Forth-on-SX: stack language on the VM
The smallest serious second language: Forth's stack-based semantics map directly onto the SX bytecode VM (OP_DUP, OP_SWAP and OP_DROP either already exist as primitives or can be added trivially). The compile-mode / interpret-mode split is the one genuinely new piece, and even that is a small, well-understood technique.
End-state goal: **passes John Hayes' ANS-Forth test suite** (the canonical Forth conformance harness — small, well-documented, targets the Core word set).
## Scope decisions (defaults — override)
- **Standard:** ANS-Forth 1994 Core word set + Core Extension. No ANS-Forth Optional word sets (File Access, Floating Point, Search Order, etc.) in the first run.
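- **Test suite:** John Hayes' "Test Suite for ANS Forth" (~250 tests, widely used; copyrighted but freely redistributable as long as the copyright notice is retained, so keep the header when vendoring).
- **Case-sensitivity:** case-insensitive. ANS-Forth leaves case sensitivity implementation-defined and spells the standard words in upper case, so a case-insensitive lookup accepts both spellings.
- **Number base:** support the `BASE` variable, defaulting to 10. Also accept the `$FF` (hex) and `%1010` (binary) literal prefixes; these were standardised in Forth-2012 and are not part of ANS-Forth 1994.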
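The base-aware number parsing implied by the last bullet can be sketched as follows. This is a Python model for illustration only, not SX; `parse_number` is a hypothetical name, and the real reader would read the current value of `BASE` instead of taking it as an argument.

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def parse_number(token, base=10):
    """Return the integer value of token, or None if it is not a number.

    `base` stands in for the current value of BASE (an assumption of
    this sketch). A leading $ forces hex and a leading % forces binary.
    """
    prefixes = {"$": 16, "%": 2}
    text = token
    if text and text[0] in prefixes:
        base = prefixes[text[0]]
        text = text[1:]
    negative = text.startswith("-")
    if negative:
        text = text[1:]
    if not text:
        return None  # empty token, or a bare "-", "$" or "%"
    value = 0
    for ch in text.upper():
        digit = DIGITS.find(ch)
        if digit < 0 or digit >= base:
            return None  # character is not a digit in this base
        value = value * base + digit
    return -value if negative else value
```

Because a token such as `ADD` is a valid number once `BASE` is 16, the interpreter should try the dictionary lookup first and fall back to number parsing only when the lookup fails.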
## Ground rules
- **Scope:** only touch `lib/forth/**` and `plans/forth-on-sx.md`. No edits to `spec/`, `hosts/`, `shared/`, or other language dirs.
- **SX files:** use `sx-tree` MCP tools only.
- **Architecture:** reader (not tokenizer — Forth is whitespace-delimited) → interpreter → dictionary-backed compiler. The compiler emits SX AST (not bytecode directly) so we inherit the VM.
- **Commits:** one feature per commit. Keep `## Progress log` updated.
## Architecture sketch
```
Forth source text
lib/forth/reader.sx — whitespace-split words (that's it — no real tokenizer)
lib/forth/interpreter.sx — interpret mode: look up word in dict, execute
lib/forth/compiler.sx — compile mode (`:` opens, `;` closes): emit SX AST
lib/forth/runtime.sx — stack ops, dictionary, BASE, I/O
existing CEK / VM — runs compiled definitions natively
```
Representation:
- **Stack** = SX list, push = cons, pop = uncons
- **Dictionary** = dict `word-name → {:kind :immediate? :body}` where kind is `:primitive` or `:colon-def`
- **A colon definition** compiles to a thunk `(lambda () <body-as-sx-sequence>)`
- **Compile-mode** is a flag on the interpreter state; `:` sets it, `;` clears and installs the new word
- **IMMEDIATE** words run at compile time
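To make the representation concrete, here is a minimal Python model of it: a cons-list stack, dictionary entries with kind / immediate / body, a compile-mode flag, and colon definitions compiled to thunks. It is a sketch under stated assumptions, not the SX implementation; the names `Word`, `State`, `run` and `new_state` are hypothetical, and number parsing is reduced to `int`.

```python
from dataclasses import dataclass, field

@dataclass
class Word:
    kind: str                 # "primitive" or "colon-def"
    body: object              # callable taking the interpreter state
    immediate: bool = False

@dataclass
class State:
    stack: object = None      # cons list: (top, rest), None when empty
    compiling: bool = False   # the compile-mode flag
    name: str = ""            # word currently being defined
    pending: list = field(default_factory=list)  # body compiled so far
    words: dict = field(default_factory=dict)
    tokens: object = None     # iterator over the remaining input

def push(st, x):
    st.stack = (x, st.stack)

def pop(st):
    if st.stack is None:
        raise IndexError("stack underflow")
    x, st.stack = st.stack
    return x

def dup(st):
    x = pop(st); push(st, x); push(st, x)

def mul(st):
    b = pop(st); a = pop(st); push(st, a * b)

def colon(st):
    # ":" is a parsing word: it consumes the next token as the new name.
    st.name = next(st.tokens).upper()
    st.pending = []
    st.compiling = True

def semicolon(st):
    steps = tuple(st.pending)
    def thunk(state):         # the compiled colon definition
        for step in steps:
            step(state)
    st.words[st.name] = Word("colon-def", thunk)
    st.compiling = False

def new_state():
    st = State()
    st.words = {
        "DUP": Word("primitive", dup),
        "*": Word("primitive", mul),
        ":": Word("primitive", colon),
        ";": Word("primitive", semicolon, immediate=True),
    }
    return st

def run(st, source):
    st.tokens = iter(source.split())
    for tok in st.tokens:
        word = st.words.get(tok.upper())   # case-insensitive lookup
        if word is not None:
            if st.compiling and not word.immediate:
                st.pending.append(word.body)   # compile a reference
            else:
                word.body(st)                  # execute now
        else:
            n = int(tok)                       # BASE handling omitted
            if st.compiling:
                st.pending.append(lambda state, n=n: push(state, n))
            else:
                push(st, n)
```

With this model, `run(st, ": SQUARE DUP * ; 7 SQUARE")` leaves 49 on top of the stack. The sketch binds each referenced word when it is compiled, so redefining `SQUARE` later does not change words that were already compiled against the old definition, which matches conventional Forth behaviour.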
## Roadmap
### Phase 1 — reader + interpret mode
- [ ] `lib/forth/reader.sx`: whitespace-split, number parsing (base-aware)
- [ ] `lib/forth/runtime.sx`: stack as SX list, push/pop/peek helpers
- [ ] Core stack words: `DUP`, `DROP`, `SWAP`, `OVER`, `ROT`, `NIP`, `TUCK`, `PICK`, `ROLL`, `?DUP`, `2DUP`, `2DROP`, `2SWAP`, `2OVER`
- [ ] Arithmetic: `+`, `-`, `*`, `/`, `MOD`, `/MOD`, `NEGATE`, `ABS`, `MIN`, `MAX`, `1+`, `1-`, `2*`, `2/`
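- [ ] Comparison: `=`, `<`, `>`, `<=`, `>=`, `0=`, `0<`, `0>` (`<=` and `>=` are common extensions rather than ANS words; Core also requires `U<`, and Core Extension adds `<>`, `0<>`, `U>`)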
- [ ] Logical: `AND`, `OR`, `XOR`, `INVERT`
- [ ] I/O: `.` (print), `.S` (show stack), `EMIT`, `CR`, `SPACE`, `SPACES`
- [ ] Interpreter loop: read word, look up, execute, repeat
- [ ] Unit tests in `lib/forth/tests/interp.sx`
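A few of the Phase 1 words, modelled in Python on an immutable cons-list stack, show the shape the helpers might take. This is only an illustration: the function names are hypothetical and the real helpers would live in `lib/forth/runtime.sx`. One point worth fixing early is that ANS-Forth comparison words return a flag with all bits set (-1) for true and 0 for false, which is why the bitwise `AND`, `OR` and `INVERT` also serve as logical operators.

```python
EMPTY = None  # the empty stack; a non-empty stack is (top, rest)

def from_list(items):
    """Build a stack from a list written bottom-first, like a stack comment."""
    s = EMPTY
    for x in items:
        s = (x, s)
    return s

def to_list(s):
    """Return the stack contents bottom-first."""
    out = []
    while s is not None:
        out.append(s[0])
        s = s[1]
    return out[::-1]

# Stack shuffling: each word maps an old stack to a new one.
def dup(s):  a, r = s; return (a, (a, r))               # ( a -- a a )
def drop(s): _, r = s; return r                         # ( a -- )
def swap(s): b, (a, r) = s; return (a, (b, r))          # ( a b -- b a )
def over(s): b, (a, r) = s; return (a, (b, (a, r)))     # ( a b -- a b a )
def rot(s):  c, (b, (a, r)) = s; return (a, (c, (b, r)))  # ( a b c -- b c a )

TRUE, FALSE = -1, 0

def flag(condition):
    return TRUE if condition else FALSE

def less(s):    b, (a, r) = s; return (flag(a < b), r)  # <
def zero_eq(s): a, r = s; return (flag(a == 0), r)      # 0=
def invert(s):  a, r = s; return (~a, r)                # INVERT (bitwise)
```

In this sketch a stack underflow surfaces as a Python `TypeError` from the failed unpacking; a real runtime would want an explicit check and a Forth-style error message.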
### Phase 2 — colon definitions + compile mode
- [ ] `:` opens compile mode and starts a definition
- [ ] `;` closes it and installs into the dictionary
- [ ] Compile mode: non-IMMEDIATE words get appended as SX references; numbers get compiled as literals; IMMEDIATE words (like `IF`) run now
- [ ] `VARIABLE`, `CONSTANT`, `VALUE`, `TO`
- [ ] `@` (fetch), `!` (store), `+!`
- [ ] Compile a colon def into an SX lambda that the CEK runs directly
- [ ] Tests: define words, call them, nest definitions
### Phase 3 — control flow + first Hayes tests green
- [ ] `IF`, `ELSE`, `THEN` — compile to SX `if`
- [ ] `BEGIN`, `UNTIL`, `WHILE`, `REPEAT`, `AGAIN` — compile to loops
- [ ] `DO`, `LOOP`, `+LOOP`, `I`, `J`, `LEAVE` — counted loops (needs a return stack)
- [ ] Return stack: `>R`, `R>`, `R@`, `2>R`, `2R>`, `2R@`
- [ ] Vendor John Hayes' test suite to `lib/forth/ans-tests/`
- [ ] `lib/forth/conformance.sh` + runner; `scoreboard.json` + `scoreboard.md`
- [ ] Baseline: probably 30-50% Core passing after phase 3
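The idea of compiling `IF` / `ELSE` / `THEN` to a nested conditional can be sketched in Python as below. Note the substitution: a traditional Forth implements these as IMMEDIATE words that manipulate the definition being compiled, whereas this sketch uses a small recursive-descent pass over the tokens of a colon body because it is shorter to show. The names `compile_body` and `execute` are hypothetical, and a plain Python list stands in for the data stack.

```python
def compile_body(tokens, stop=()):
    """Consume tokens until one listed in `stop`.

    Returns (nodes, terminator). A node is either a word or literal
    (a string) or a tuple ("if", then_nodes, else_nodes).
    """
    nodes = []
    while tokens:
        tok = tokens.pop(0).upper()
        if tok in stop:
            return nodes, tok
        if tok == "IF":
            then_nodes, end = compile_body(tokens, ("ELSE", "THEN"))
            else_nodes = []
            if end == "ELSE":
                else_nodes, end = compile_body(tokens, ("THEN",))
            nodes.append(("if", then_nodes, else_nodes))
        else:
            nodes.append(tok)
    if stop:
        raise SyntaxError("IF without matching THEN")
    return nodes, None

def execute(nodes, stack, words):
    for node in nodes:
        if isinstance(node, tuple):
            _, then_nodes, else_nodes = node
            # Any non-zero value counts as true, as in Forth.
            branch = then_nodes if stack.pop() != 0 else else_nodes
            execute(branch, stack, words)
        elif node in words:
            words[node](stack)
        else:
            stack.append(int(node))  # literal; BASE handling omitted

# A tiny word set for the demonstration (an assumption of this sketch).
WORDS = {
    "DUP": lambda s: s.append(s[-1]),
    "DROP": lambda s: s.pop(),
    "NEGATE": lambda s: s.append(-s.pop()),
    "0<": lambda s: s.append(-1 if s.pop() < 0 else 0),
    "0=": lambda s: s.append(-1 if s.pop() == 0 else 0),
}

def run_body(source, stack):
    nodes, _ = compile_body(source.split())
    execute(nodes, stack, WORDS)
    return stack
```

For example, `run_body("DUP 0< IF NEGATE THEN", [-5])` behaves like the body of an `ABS` definition and returns `[5]`. The loop words would need further node types, and `DO` ... `LOOP` additionally needs the return stack listed above.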
### Phase 4 — strings + more Core
- [ ] `S"`, `C"`, `."`, `TYPE`, `COUNT`, `CMOVE`, `FILL`, `BLANK`
- [ ] `CHAR`, `[CHAR]`, `KEY`, `ACCEPT`
- [ ] `BASE` manipulation: `DECIMAL`, `HEX`
- [ ] `DEPTH`, `SP@`, `SP!`
- [ ] Drive Hayes Core pass-rate up
### Phase 5 — Core Extension + optional word sets
- [ ] Full Core + Core Extension
- [ ] File Access word set (via SX IO)
- [ ] String word set (`SLITERAL`, `COMPARE`, `SEARCH`)
- [ ] Target: 100% Hayes Core
### Phase 6 — speed
- [ ] Inline primitive calls during compile (skip dict lookup)
- [ ] Tail-call optimise colon-def endings
- [ ] JIT cooperation: mark compiled colon-defs as VM-eligible
## Progress log
_Newest first._
- _(not started)_
## Blockers
- _(none yet)_