Files
rose-ash/plans/smalltalk-on-sx.md
giles 6a00df2609 smalltalk: plan + briefing + sx-loops 8th slot
Showcase: blocks with non-local return on captured method-return
continuation. ANSI-ish Smalltalk-80 subset, SUnit + Pharo Kernel-Tests
slice, 7 phases. Worktree: /root/rose-ash-loops/smalltalk on
branch loops/smalltalk.
2026-04-25 00:05:31 +00:00

117 lines
7.4 KiB
Markdown

# Smalltalk-on-SX: blocks with non-local return on delimited continuations
The headline showcase is **blocks** — Smalltalk's closures with non-local return (`^expr` aborts the enclosing *method*, not the block). Every other Smalltalk on top of a host VM (RSqueak on PyPy, GemStone on C, Maxine on Java) reinvents non-local return on whatever stack discipline the host gives them. On SX it's a one-liner: a block holds a captured continuation; `^` just invokes it. Message-passing OO falls out cheaply on top of the existing component / dispatch machinery.
End-state goal: ANSI-ish Smalltalk-80 subset, SUnit working, ~200 hand-written tests + a vendored slice of the Pharo kernel tests, classic corpus (eight queens, quicksort, mandelbrot, Conway's Life).
## Scope decisions (defaults — override by editing before we spawn)
- **Syntax:** Pharo / Squeak chunk format (`!` separators, `Object subclass: #Foo …`). No fileIn/fileOut images — text source only.
- **Conformance:** ANSI X3J20 *as a target*, not bug-for-bug Squeak. "Reads like Smalltalk, runs like Smalltalk."
- **Test corpus:** SUnit ported to SX-Smalltalk + custom programs + a curated slice of Pharo `Kernel-Tests` / `Collections-Tests`.
- **Image:** out of scope. Source-only. No `become:` between sessions, no snapshotting.
- **Reflection:** `class`, `respondsTo:`, `perform:`, `doesNotUnderstand:` in. `become:` (object-identity swap) **in** — it's a good CEK exercise. Method modification at runtime in.
- **GUI / Morphic / threads:** out entirely.
## Ground rules
- **Scope:** only touch `lib/smalltalk/**` and `plans/smalltalk-on-sx.md`. Don't edit `spec/`, `hosts/`, `shared/`, or any other `lib/<lang>/**`. Smalltalk primitives go in `lib/smalltalk/runtime.sx`.
- **SX files:** use `sx-tree` MCP tools only.
- **Commits:** one feature per commit. Keep `## Progress log` updated and tick roadmap boxes.
## Architecture sketch
```
Smalltalk source
lib/smalltalk/tokenizer.sx — selectors, keywords, literals, $c, #sym, #(…), $'…'
lib/smalltalk/parser.sx — AST: classes, methods, blocks, cascades, sends
lib/smalltalk/transpile.sx — AST → SX AST (entry: smalltalk-eval-ast)
lib/smalltalk/runtime.sx — class table, MOP, dispatch, primitives
```
Core mapping:
- **Class** = SX dict `{:name :superclass :ivars :methods :class-methods :metaclass}`. Class table is a flat dict keyed by class name.
- **Object** = SX dict `{:class :ivars}``ivars` keyed by symbol. Tagged ints / floats / strings / symbols are not boxed; their class is looked up by SX type.
- **Method** = SX lambda closing over a `self` binding + temps. Body wrapped in a delimited continuation so `^` can escape.
- **Message send** = `(st-send receiver selector args)` — does class-table lookup, walks superclass chain, falls back to `doesNotUnderstand:` with a `Message` object.
- **Block** `[:x | … ^v … ]` = lambda + captured `^k` (the method-return continuation). Invoking `^` calls `k`; outer block invocation past method return raises `BlockContext>>cannotReturn:`.
- **Cascade** `r m1; m2; m3` = `(let ((tmp r)) (st-send tmp 'm1 ()) (st-send tmp 'm2 ()) (st-send tmp 'm3 ()))`.
- **`ifTrue:ifFalse:` / `whileTrue:`** = ordinary block sends; the runtime intrinsifies them in the JIT path so they compile to native branches (Tier 1 of bytecode expansion already covers this pattern).
- **`become:`** = swap two object identities everywhere — in SX this is a heap walk, but we restrict to `oneWayBecome:` (cheap: rewrite class field) by default.
## Roadmap
### Phase 1 — tokenizer + parser
- [ ] Tokenizer: identifiers, keywords (`foo:`), binary selectors (`+`, `==`, `,`, `->`, `~=` etc.), numbers (radix `16r1F`, scaled `1.5s2`), strings `'…''…'`, characters `$c`, symbols `#foo` `#'foo bar'` `#+`, byte arrays `#[1 2 3]`, literal arrays `#(1 #foo 'x')`, comments `"…"`
- [ ] Parser: chunk format (`! !` separators), class definitions (`Object subclass: #X instanceVariableNames: '…' classVariableNames: '…' …`), method definitions (`extend: #Foo with: 'bar ^self'`), pragmas `<primitive: 1>`, blocks `[:a :b | | t1 t2 | …]`, cascades, message precedence (unary > binary > keyword)
- [ ] Unit tests in `lib/smalltalk/tests/parse.sx`
### Phase 2 — object model + sequential eval
- [ ] Class table + bootstrap: `Object`, `Behavior`, `Class`, `Metaclass`, `UndefinedObject`, `Boolean`/`True`/`False`, `Number`/`Integer`/`Float`, `String`, `Symbol`, `Array`, `Block`
- [ ] `smalltalk-eval-ast`: literals, variable reference, assignment, message send, cascade, sequence, return
- [ ] Method lookup: walk class → superclass; cache hit-class on `(class, selector)`
- [ ] `doesNotUnderstand:` fallback constructing `Message` object
- [ ] `super` send (lookup starts at superclass of *defining* class, not receiver class)
- [ ] 30+ tests in `lib/smalltalk/tests/eval.sx`
### Phase 3 — blocks + non-local return (THE SHOWCASE)
- [ ] Method invocation captures a `^k` (the return continuation) and binds it as the block's escape
- [ ] `^expr` from inside a block invokes that captured `^k`
- [ ] `BlockContext>>value`, `value:`, `value:value:`, …, `valueWithArguments:`
- [ ] `whileTrue:` / `whileTrue` / `whileFalse:` / `whileFalse` as ordinary block sends — runtime intrinsifies the loop in the bytecode JIT
- [ ] `ifTrue:` / `ifFalse:` / `ifTrue:ifFalse:` as block sends, similarly intrinsified
- [ ] Escape past returned-from method raises `BlockContext>>cannotReturn:`
- [ ] Classic programs in `lib/smalltalk/tests/programs/`:
- [ ] `eight-queens.st`
- [ ] `quicksort.st`
- [ ] `mandelbrot.st`
- [ ] `life.st` (Conway's Life, glider gun)
- [ ] `fibonacci.st` (recursive + memoised)
- [ ] `lib/smalltalk/conformance.sh` + runner, `scoreboard.json` + `scoreboard.md`
### Phase 4 — reflection + MOP
- [ ] `Object>>class`, `class>>name`, `class>>superclass`, `class>>methodDict`, `class>>selectors`
- [ ] `Object>>perform:` / `perform:with:` / `perform:withArguments:`
- [ ] `Object>>respondsTo:`, `Object>>isKindOf:`, `Object>>isMemberOf:`
- [ ] `Behavior>>compile:` — runtime method addition
- [ ] `Object>>becomeForward:` (one-way become; rewrites the class field of `aReceiver`)
- [ ] Exceptions: `Exception`, `Error`, `signal`, `signal:`, `on:do:`, `ensure:`, `ifCurtailed:` — built on top of SX `handler-bind`/`raise`
### Phase 5 — collections + numeric tower
- [ ] `SequenceableCollection`/`OrderedCollection`/`Array`/`String`/`Symbol`
- [ ] `HashedCollection`/`Set`/`Dictionary`/`IdentityDictionary`
- [ ] `Stream` hierarchy: `ReadStream`/`WriteStream`/`ReadWriteStream`
- [ ] `Number` tower: `SmallInteger`/`LargePositiveInteger`/`Float`/`Fraction`
- [ ] `String>>format:`, `printOn:` for everything
### Phase 6 — SUnit + corpus to 200+
- [ ] Port SUnit (TestCase, TestSuite, TestResult) — written in SX-Smalltalk, runs in itself
- [ ] Vendor a slice of Pharo `Kernel-Tests` and `Collections-Tests`
- [ ] Drive the scoreboard up: aim for 200+ green tests
- [ ] Stretch: ANSI Smalltalk validator subset
### Phase 7 — speed (optional)
- [ ] Method-dictionary inline caching (already in CEK as a primitive; just wire selector cache)
- [ ] Block intrinsification beyond `whileTrue:` / `ifTrue:`
- [ ] Compare against GNU Smalltalk on the corpus
## Progress log
_Newest first. Agent appends on every commit._
- _(none yet)_
## Blockers
_Shared-file issues that need someone else to fix. Minimal repro only._
- _(none yet)_