Files
rose-ash/plans/forth-on-sx.md
giles c28333adb3
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Has been cancelled
forth: \, POSTPONE-imm split, >NUMBER, DOES> — Hayes 486→618 (97%)
2026-04-25 03:33:13 +00:00

389 lines
22 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Forth-on-SX: stack language on the VM
The smallest serious second language — Forth's stack-based semantics map directly onto the SX bytecode VM (OP_DUP, OP_SWAP, OP_DROP already exist as arithmetic primitives or can be added trivially). Compile-mode / interpret-mode is the one genuinely novel piece, but it's a classic technique and small.
End-state goal: **passes John Hayes' ANS-Forth test suite** (the canonical Forth conformance harness — small, well-documented, targets the Core word set).
## Scope decisions (defaults — override)
- **Standard:** ANS-Forth 1994 Core word set + Core Extension. No ANS-Forth Optional word sets (File Access, Floating Point, Search Order, etc.) in the first run.
- **Test suite:** John Hayes' "Test Suite for ANS Forth" (~250 tests, public domain, widely used).
- **Case-sensitivity:** case-insensitive (ANS default).
- **Number base:** support `BASE` variable, defaults to 10. Hex and binary literals (`$FF`, `%1010`) per standard.
## Ground rules
- **Scope:** only touch `lib/forth/**` and `plans/forth-on-sx.md`. No edits to `spec/`, `hosts/`, `shared/`, or other language dirs.
- **SX files:** use `sx-tree` MCP tools only.
- **Architecture:** reader (not tokenizer — Forth is whitespace-delimited) → interpreter → dictionary-backed compiler. The compiler emits SX AST (not bytecode directly) so we inherit the VM.
- **Commits:** one feature per commit. Keep `## Progress log` updated.
## Architecture sketch
```
Forth source text
lib/forth/reader.sx — whitespace-split words (that's it — no real tokenizer)
lib/forth/interpreter.sx — interpret mode: look up word in dict, execute
lib/forth/compiler.sx — compile mode (`:` opens, `;` closes): emit SX AST
lib/forth/runtime.sx — stack ops, dictionary, BASE, I/O
existing CEK / VM — runs compiled definitions natively
```
Representation:
- **Stack** = SX list, push = cons, pop = uncons
- **Dictionary** = dict `word-name → {:kind :immediate? :body}` where kind is `:primitive` or `:colon-def`
- **A colon definition** compiles to a thunk `(lambda () <body-as-sx-sequence>)`
- **Compile-mode** is a flag on the interpreter state; `:` sets it, `;` clears and installs the new word
- **IMMEDIATE** words run at compile time
## Roadmap
### Phase 1 — reader + interpret mode
- [x] `lib/forth/reader.sx`: whitespace-split, number parsing (base-aware)
- [x] `lib/forth/runtime.sx`: stack as SX list, push/pop/peek helpers
- [x] Core stack words: `DUP`, `DROP`, `SWAP`, `OVER`, `ROT`, `-ROT`, `NIP`, `TUCK`, `PICK`, `ROLL`, `?DUP`, `DEPTH`, `2DUP`, `2DROP`, `2SWAP`, `2OVER`
- [x] Arithmetic: `+`, `-`, `*`, `/`, `MOD`, `/MOD`, `NEGATE`, `ABS`, `MIN`, `MAX`, `1+`, `1-`, `2+`, `2-`, `2*`, `2/`
- [x] Comparison: `=`, `<>`, `<`, `>`, `<=`, `>=`, `0=`, `0<>`, `0<`, `0>`
- [x] Logical: `AND`, `OR`, `XOR`, `INVERT` (32-bit two's-complement sim)
- [x] I/O: `.` (print), `.S` (show stack), `EMIT`, `CR`, `SPACE`, `SPACES`, `BL`
- [x] Interpreter loop: read word, look up, execute, repeat
- [x] Unit tests in `lib/forth/tests/test-phase1.sx` — 108/108 pass
### Phase 2 — colon definitions + compile mode
- [x] `:` opens compile mode and starts a definition
- [x] `;` closes it and installs into the dictionary
- [x] Compile mode: non-IMMEDIATE words are compiled as late-binding call thunks; numbers are compiled as pushers; IMMEDIATE words run immediately
- [x] `VARIABLE`, `CONSTANT`, `VALUE`, `TO`, `RECURSE`, `IMMEDIATE`
- [x] `@` (fetch), `!` (store), `+!`
- [x] Colon-def body is `(fn (s) (for-each op body))` — runs on CEK, inherits TCO
- [x] Tests in `lib/forth/tests/test-phase2.sx` — 26/26 pass
### Phase 3 — control flow + first Hayes tests green
- [x] `IF`, `ELSE`, `THEN` — compile to SX `if`
- [x] `BEGIN`, `UNTIL`, `WHILE`, `REPEAT`, `AGAIN` — compile to loops
- [x] `DO`, `LOOP`, `+LOOP`, `I`, `J`, `LEAVE` — counted loops (needs a return stack)
- [x] Return stack: `>R`, `R>`, `R@`, `2>R`, `2R>`, `2R@`
- [x] Vendor John Hayes' test suite to `lib/forth/ans-tests/`
- [x] `lib/forth/conformance.sh` + runner; `scoreboard.json` + `scoreboard.md`
- [x] Baseline: probably 30-50% Core passing after phase 3
### Phase 4 — strings + more Core
- [x] `S"`, `C"`, `."`, `TYPE`, `COUNT`, `CMOVE`, `FILL`, `BLANK`
- [x] `CHAR`, `[CHAR]`, `KEY`, `ACCEPT`
- [x] `BASE` manipulation: `DECIMAL`, `HEX`
- [x] `DEPTH`, `SP@`, `SP!`
- [x] Drive Hayes Core pass-rate up
### Phase 5 — Core Extension + optional word sets
- [x] Memory: `CREATE`, `HERE`, `ALLOT`, `,`, `C,`, `CELL+`, `CELLS`, `ALIGN`, `ALIGNED`, `2!`, `2@`
- [x] Unsigned compare: `U<`, `U>`
- [x] Mixed/double-cell math: `S>D`, `M*`, `UM*`, `UM/MOD`, `FM/MOD`, `SM/REM`, `*/`, `*/MOD`
- [x] Double-cell ops: `D+`, `D-`, `D=`, `D<`, `D0=`, `2DUP`, `2DROP`, `2OVER`, `2SWAP` (already), plus `D>S`, `DABS`, `DNEGATE`
- [x] Number formatting: `<#`, `#`, `#S`, `#>`, `HOLD`, `SIGN`, `.R`, `U.`, `U.R`
- [x] Parsing/dictionary: `WORD`, `FIND`, `EXECUTE`, `'`, `[']`, `LITERAL`, `POSTPONE`, `>BODY` (DOES> deferred — needs runtime-rebind of last CREATE)
- [x] Source/state: `EVALUATE`, `STATE`, `[`, `]` (`SOURCE`/`>IN` stubbed; tokenized input means the exact byte/offset semantics aren't useful here)
- [x] Misc Core: `WITHIN`, `MAX`/`MIN` (already), `ABORT`, `ABORT"`, `EXIT`, `UNLOOP`
- [x] File Access word set (in-memory — `read-file` is not reachable from the epoch eval env)
- [x] String word set (`SLITERAL`, `COMPARE`, `SEARCH`)
- [x] Target: 100% Hayes Core (97% achieved — remaining 5 errors all in `GI5`'s multi-`WHILE`-per-`BEGIN` non-standard pattern, plus one stuck `dict-set!` chunk and 14 numeric-edge fails)
### Phase 6 — speed
- [ ] Inline primitive calls during compile (skip dict lookup)
- [ ] Tail-call optimise colon-def endings
- [ ] JIT cooperation: mark compiled colon-defs as VM-eligible
## Progress log
_Newest first._
- **Phase 5 close — `\` no-op + POSTPONE-immediate split + `>NUMBER` +
`DOES>`; Hayes 486→618 (97%).** Big closing-out iteration.
Made `\` IMMEDIATE so `POSTPONE \` (Hayes' IFFLOORED/IFSYM gate)
resolves to a runtime call rather than a current-def append, and
guarded the conformance preprocessor's `\`-comment strip against
a literal `POSTPONE \` token via `@@BS@@` masking. Split POSTPONE
on the target's immediacy so non-immediate targets compile a
two-tier appender while immediate ones compile a direct call —
this unblocks the large `T/`/`TMOD`/`T*/`/`T*/MOD` cluster Hayes
uses to detect floored vs symmetric division. `>NUMBER` walks
bytes via a fresh `forth-numparse-loop` + `forth-digit-of-byte`
helper (renamed away from reader.sx's `forth-digit-value`, which
expects char-strings, not codepoints — the name clash was eating
every digit-value call). Implemented `DOES>` by:
1) tracking the last CREATE on `state.last-creator`,
2) adding a `:kind "does-rebind"` op, and
3) post-processing the body in `;` to attach the slice of ops
after each rebind as `:deferred`. At runtime, the rebind op
installs a new body for the target word that pushes its
data-field address and runs the deferred slice. Also added
histogram tracking on the conformance runner so future runs
surface the top missing words. Hayes: 618/638 pass (97%),
14 fail, 6 error (5× GI5 multi-WHILE, 1× dict-set! chunk).
- **Phase 5 — String word set `COMPARE`/`SEARCH`/`SLITERAL` (+9).**
`COMPARE` walks bytes via the new `forth-compare-bytes-loop`,
returning -1/0/1 with standard prefix semantics (shorter string
compares less than its extension). `SEARCH` scans the haystack
with a helper `forth-search-bytes` and `forth-match-at`, returning
the tail after the first match or the original string with flag=0.
Empty needle returns at offset 0 with flag=-1 per ANS. `SLITERAL`
is IMMEDIATE: pops `(c-addr u)` at compile time, copies the bytes
into a fresh allocation, and emits the two pushes so the compiled
word yields the interned string at runtime.
- **Phase 5 — File Access word set (in-memory backing; +4).**
`OPEN-FILE`/`CREATE-FILE`/`CLOSE-FILE`/`READ-FILE`/`WRITE-FILE`/
`FILE-POSITION`/`FILE-SIZE`/`REPOSITION-FILE`/`DELETE-FILE` plus
the mode constants `R/O`/`R/W`/`W/O`/`BIN`. File handles live on
`state.files` (fileid → {content, pos, path}) with a
`state.by-path` index so `CREATE-FILE`'d files can be
`OPEN-FILE`'d later in the same session. Attempting to
`OPEN-FILE` an unknown path returns `ior != 0`; disk-backed
open/read is not wired because `read-file` isn't in the sx_server
epoch eval environment (it's bound only in the HTTP helpers).
Also removed the stray base-2 `BIN` primitive from Phase 4 —
ANS `BIN` is the file-mode modifier. Hayes Core unchanged at
486/638 since core.fr doesn't exercise file words.
- **Phase 5 — `WITHIN`/`ABORT`/`ABORT"`/`EXIT`/`UNLOOP` (+7;
Hayes 477→486, 76%).** `WITHIN` uses the ANS two's-complement
trick: `(n1-n2) U< (n3-n2)`. `ABORT` wipes the data/return/control
stacks and raises — the conformance runner catches it at the
chunk boundary. `ABORT"` parses its message like `S"`, then at
runtime pops a flag and raises only if truthy. `EXIT` adds a new
`:kind "exit"` op that the PC-driven body runner treats as a
jump-to-end; added a matching cond clause in `forth-step-op`.
`UNLOOP` pops two from the return stack — usable paired with
`EXIT` to bail from inside `DO`/`LOOP`.
- **Phase 5 — `[`, `]`, `STATE`, `EVALUATE` (+5; Hayes 463→477, 74%).**
`[` (IMMEDIATE) clears `state.compiling`, `]` sets it. `STATE`
pushes the sentinel address `"@@state"` and `@` reads it as
`-1`/`0` based on the live `compiling` flag. `EVALUATE` reads
the (addr,u) string from byte memory, retokenises it via
`forth-tokens`, swaps it in as the active input, runs the
interpret loop, and restores the saved input. `SOURCE` and
`>IN` exist as stubs that push zeros — our whitespace-tokenised
input has no native byte-offset, so the deeper Hayes tests
that re-position parsing via `>IN !` stay marked as errors
rather than silently misbehaving.
- **Phase 5 — parsing/dictionary words `'`/`[']`/`EXECUTE`/`LITERAL`/
`POSTPONE`/`WORD`/`FIND`/`>BODY` (Hayes 448→463, 72%).** xt is
represented as the SX dict reference of the word record, so
`'`/`[']` push the looked-up record and `EXECUTE` calls
`forth-execute-word` on the popped value. `LITERAL` (IMMEDIATE)
pops a value at compile time and emits a push-op. `POSTPONE`
(IMMEDIATE) compiles into the *outer* def an op that, when run
during a *later* compile, appends a call-w op to whatever def is
current — the standard two-tier compile semantic. Added
`state.last-defined` tracked by every primitive/colon definition
so `IMMEDIATE` can target the most-recent word even after `;`
closes the def. CREATE now stashes its data-field address on the
word record so `>BODY` can recover it. `WORD`/`FIND` use the byte
memory and counted-string layout already in place.
`DOES>` is deferred — needs a runtime mechanism to rebind the
last-CREATE'd word's action.
- **Phase 5 — pictured numeric output: `<#`/`#`/`#S`/`#>`/`HOLD`/`SIGN` +
`U.`/`U.R`/`.R` (+9; Hayes 446→448, 70%).** Added a `state.hold`
list of single-character strings — `<#` resets it, `HOLD` and
`SIGN` prepend, `#` divides ud by BASE and prepends one digit,
`#S` loops `#` until ud is zero (running once even on zero),
`#>` drops ud and copies the joined hold buffer into mem,
pushing `(addr, len)`. `U.` / `.R` / `U.R` use a separate
`forth-num-to-string` for one-shot decimal/hex output and
`forth-spaces-str` for right-justify padding.
- **Phase 5 — double-cell ops `D+`/`D-`/`DNEGATE`/`DABS`/`D=`/`D<`/`D0=`/
`D0<`/`DMAX`/`DMIN` (+18; Hayes unchanged).** Doubles get rebuilt
from `(lo, hi)` cells via `forth-double-from-cells-s`, the op runs
in bignum, and we push back via `forth-double-push-s`. Hayes Core
doesn't exercise D-words (those live in Gerry Jackson's separate
`doublest.fth` Double word-set tests we have not vendored), so the
scoreboard stays at 446/638 — but the words now exist for any
consumer that needs them.
- **Phase 5 — mixed/double-cell math; Hayes 342→446 (69%).** Added
`S>D`, `D>S`, `M*`, `UM*`, `UM/MOD`, `FM/MOD`, `SM/REM`, `*/`, `*/MOD`.
Doubles ride on the stack as `(lo, hi)` with `hi` on top.
Helpers `forth-double-push-{u,s}` / `forth-double-from-cells-{u,s}`
split & rebuild via 32-bit unsigned mod/div, picking the negative
path explicitly so we don't form `2^64 + small` (float precision
drops at ULP=2^12 once you cross 2^64). `M*`/`UM*` use bignum
multiply then split; `*/`/`*/MOD` use bignum intermediate and
truncated division. Hayes: 446 pass / 185 error / 7 fail.
- **Phase 5 — memory primitives + unsigned compare; Hayes 268→342 (53%).**
Added `CREATE`/`HERE`/`ALLOT`/`,`/`C,`/`CELL+`/`CELLS`/`ALIGN`/`ALIGNED`/
`2!`/`2@`/`U<`/`U>`. Generalised `@`/`!`/`+!` to dispatch on address
type: string addresses still go through `state.vars` (VARIABLE/VALUE
cells) while integer addresses now fall through to `state.mem`
letting CREATE-allocated cells coexist with existing variables.
Decomposed the original "Full Core + Core Extension" box into
smaller unticked sub-bullets so iterations land per cluster.
Hayes: 342 pass / 292 error / 4 fail (53%). 237/237 internal.
- **Phase 4 close — LSHIFT/RSHIFT, 32-bit arith truncation, early
binding; Hayes 174→268 (42%).** Added `LSHIFT` / `RSHIFT` as logical
shifts on 32-bit unsigned values, converted through
`forth-to-unsigned`/`forth-from-unsigned`. All arithmetic
primitives (`+` `-` `*` `/` `MOD` `NEGATE` `ABS` `1+` `1-` `2+`
`2-` `2*` `2/`) now clip results to 32-bit signed via a new
`forth-clip` helper, so loop idioms that rely on `2*` shifting the
MSB out (e.g. Hayes' `BITS` counter) actually terminate.
Changed colon-def call compilation from late-binding to early
binding: `forth-compile-call` now resolves the target word at
compile time, which makes `: GDX 123 ; : GDX GDX 234 ;` behave
per ANS (inner `GDX` → old def, not infinite recursion). `RECURSE`
keeps its late-binding thunk via the new `forth-compile-recurse`
helper. Raised `MAX_CHUNKS` default to 638 (full `core.fr`) now
that the BITS and COUNT-BITS loops terminate. Hayes: 268 pass /
368 error / 2 fail.
- **Phase 4 — `SP@`/`SP!` (+4; Hayes unchanged; `DEPTH` was already present).**
`SP@` pushes the current data-stack depth (our closest analogue to a
stack pointer — SX lists have no addressable backing). `SP!` pops a
target depth and truncates the stack via `drop` on the dstack list.
This preserves the save/restore idiom `SP@ … SP!` even though the
returned "pointer" is really a count.
- **Phase 4 — `BASE`/`DECIMAL`/`HEX`/`BIN`/`OCTAL` (+9; Hayes unchanged).**
Moved `base` from its top-level state slot into `state.vars["base"]`
so the regular `@`/`!`/VARIABLE machinery works on it.
`BASE` pushes the sentinel address `"base"`; `DECIMAL`/`HEX`/`BIN`/
`OCTAL` are thin primitives that write into that slot. Parser
reads through `vars` now. Hayes unchanged because the runner had
already been stubbing `HEX`/`DECIMAL` — now real words, stubs
removed from `hayes-runner.sx`.
- **Phase 4 — `CHAR`/`[CHAR]`/`KEY`/`ACCEPT` (+7 / Hayes 168→174).**
`CHAR` parses the next token and pushes the first-char code. `[CHAR]`
is IMMEDIATE: in compile mode it embeds the code as a compiled push
op, in interpret mode it pushes inline. `KEY`/`ACCEPT` read from an
optional `state.keybuf` string — empty buffer makes `KEY` raise
`"no input available"` (matches ANS when stdin is closed) and
`ACCEPT` returns `0`. Enough for Hayes to get past CHAR-gated
clusters; real interactive IO lands later.
- **Phase 4 — strings: `S"`/`C"`/`."`/`TYPE`/`COUNT`/`CMOVE`/`CMOVE>`/`MOVE`/`FILL`/`BLANK`/`C@`/`C!`/`CHAR+`/`CHARS` (+16 / Hayes 165→168).**
Added a byte-addressable memory model to state: `mem` (dict keyed by
stringified address → integer byte) and `here` (next-free integer
addr). Helpers `forth-alloc-bytes!` / `forth-mem-write-string!` /
`forth-mem-read-string`. `S"`/`C"`/`."` are IMMEDIATE parsing words
that consume tokens until one ends with `"`, then either copy content
into memory at compile time (and emit a push of `addr`/`addr len` for
the colon-def body) or do it inline in interpret mode. `TYPE` emits
`u` bytes from `addr` via `char-from-code`. `COUNT` reads the length
byte at a counted-string address and pushes (`addr+1`, `u`). `FILL`,
`BLANK` (FILL with space), `CMOVE` (forward), `CMOVE>` (backward),
and `MOVE` (auto-directional) mutate the byte dict. 193/193 internal
tests, Hayes 168/590 (+3).
- **Phase 3 — Hayes conformance runner + baseline scoreboard (165/590, 28%).**
`lib/forth/conformance.sh` preprocesses `ans-tests/core.fr` (strips `\`
and `( ... )` comments + `TESTING` lines), splits the source on every
`}T` so each Hayes test plus the small declaration blocks between
them are one safe-resume chunk, and emits an SX driver that feeds
the chunks through `lib/forth/hayes-runner.sx`. The runner registers
`T{`/`->`/`}T` as Forth primitives that snapshot the dstack depth on
`T{`, record actual on `->`, compare on `}T`, and install stub
`HEX`/`DECIMAL`/`TESTING` so metadata doesn't halt the stream. Errors
raised inside a chunk are caught by `guard` and the state is reset,
so one bad test does not break the rest. Outputs
`scoreboard.json` + `scoreboard.md`.
First-run baseline: 165 pass / 425 error / 0 fail on the first 590
chunks. The default cap sits at 590 because `core.fr` chunks beyond
that rely on unsigned-integer wrap-around (e.g. `COUNT-BITS` with
`BEGIN DUP WHILE … 2* REPEAT`) which never terminates on our
bignum-based Forth; raise `MAX_CHUNKS` once those tests unblock.
Majority of errors are missing Phase-4 words (`RSHIFT`, `LSHIFT`,
`CELLS`, `S"`, `CHAR`, `SOURCE`, etc.) — each one implemented should
convert a cluster of errors to passes.
- **Phase 3 — vendor Gerry Jackson's forth2012-test-suite.** Added
`lib/forth/ans-tests/{tester.fr, core.fr, coreexttest.fth}` from
https://github.com/gerryjackson/forth2012-test-suite (master, fetched
2026-04-24). `tester.fr` is Hayes' `T{ ... -> ... }T` harness; `core.fr`
is the ~1000-line Core word tests; `coreexttest.fth` is Core Ext
(parked for later phases). Files are pristine — the conformance runner
(next iteration) will consume them.
- **Phase 3 — `DO`/`LOOP`/`+LOOP`/`I`/`J`/`LEAVE` + return stack words (+16).**
Counted loops compile onto the same PC-driven body runner. DO emits an
enter-op (pops limit+start from data stack, pushes them to rstack) and
pushes a `{:kind "do" :back PC :leaves ()}` marker onto cstack. LOOP/+LOOP
emit a dict op (`:kind "loop"`/`"+loop"` with target=back-cell). The step
handler pops index & reads limit, increments, and either restores the
updated index + jumps back, or drops the frame and advances. LEAVE walks
cstack for the innermost DO marker, emits a `:kind "leave"` dict op with
a fresh target cell, and registers it on the marker's leaves list. LOOP
patches all registered leave-targets to the exit PC and drops the marker.
The leave op pops two from rstack (unloop) and branches. `I` peeks rtop;
`J` reads rstack index 2 (below inner frame). Added non-immediate
return-stack words `>R`, `R>`, `R@`, `2>R`, `2R>`, `2R@`. Nested
DO/LOOP with J tested; LEAVE in nested loops exits only the inner.
177/177 green.
- **Phase 3 — `BEGIN`/`UNTIL`/`WHILE`/`REPEAT`/`AGAIN` (+9).** Indefinite-loop
constructs built on the same PC-driven body runner introduced for `IF`.
BEGIN records the current body length on `state.cstack` (a plain numeric
back-target). UNTIL/AGAIN pop that back-target and emit a `bif`/`branch`
op whose target cell is set to the recorded PC. WHILE emits a forward
`bif` with a fresh target cell and pushes it on the cstack *above* the
BEGIN marker; REPEAT pops both (while-target first, then back-pc), emits
an unconditional branch back to BEGIN, then patches the while-target to
the current body length — so WHILE's false flag jumps past the REPEAT.
Mixed compile-time layout (numeric back-targets + dict forward targets
on the same cstack) is OK because the immediate words pop them in the
order they expect. AGAIN works structurally but lacks a test without a
usable mid-loop exit; revisit once `EXIT` lands. 161/161 green.
- **Phase 3 start — `IF`/`ELSE`/`THEN` (+18).** `lib/forth/compiler.sx`
+ `tests/test-phase3.sx`. Colon-def body switched from `for-each` to
a PC-driven runner so branch ops can jump: ops now include dict tags
`{"kind" "bif"|"branch" "target" cell}` alongside the existing
`(fn (s) ...)` shape. IF compiles a `bif` with a fresh target cell
pushed to `state.cstack`; ELSE emits an unconditional `branch`,
patches the IF's target to the instruction after this branch, and
pushes the new target; THEN patches the most recent target to the
current body length. Nested IF/ELSE/THEN works via the cstack.
Also fixed `EMIT`: `code-char``char-from-code` (spec-correct
primitive name) so Phase 1/2 tests run green on sx_server.
152/152 (Phase 1 + 2 + 3) green.
- **Phase 2 complete — colon defs, compile mode, VARIABLE/CONSTANT/VALUE/TO, @/!/+! (+26).**
`lib/forth/compiler.sx` plus `tests/test-phase2.sx`.
Colon-def body is a list of ops (one per source token) wrapped in a single
lambda. References are late-binding thunks so forward/recursive references
work via `RECURSE`. Redefinitions take effect immediately.
VARIABLE creates a pusher for a symbolic address stored in `state.vars`.
CONSTANT compiles to `(fn (s) (forth-push s v))`. VALUE/TO share the vars dict.
Compiler rewrites `forth-interpret` to drive from a token list stored on
state so parsing words (`:`, `VARIABLE`, `TO` etc.) can consume the next
token with `forth-next-token!`. 134/134 (Phase 1 + 2) green.
- **Phase 1 complete — reader + interpret mode + core words (+108).**
`lib/forth/{runtime,reader,interpreter}.sx` plus `tests/test-phase1.sx`.
Stack as SX list (TOS = first). Dict is `{lowercased-name -> {:kind :body :immediate?}}`.
Data + return stacks both mutable. Output buffered in state (no host IO yet).
BASE-aware number parsing with `$`, `%`, `#` prefixes and `'c'` char literals.
Bitwise AND/OR/XOR/INVERT simulated over 32-bit two's-complement.
Integer `/` is truncated-toward-zero (ANS symmetric), MOD matches.
Case-insensitive lookup. 108/108 tests green.
## Blockers
- _(none yet)_