# Smalltalk-on-SX: blocks with non-local return on delimited continuations The headline showcase is **blocks** — Smalltalk's closures with non-local return (`^expr` aborts the enclosing *method*, not the block). Every other Smalltalk on top of a host VM (RSqueak on PyPy, GemStone on C, Maxine on Java) reinvents non-local return on whatever stack discipline the host gives them. On SX it's a one-liner: a block holds a captured continuation; `^` just invokes it. Message-passing OO falls out cheaply on top of the existing component / dispatch machinery. End-state goal: ANSI-ish Smalltalk-80 subset, SUnit working, ~200 hand-written tests + a vendored slice of the Pharo kernel tests, classic corpus (eight queens, quicksort, mandelbrot, Conway's Life). ## Scope decisions (defaults — override by editing before we spawn) - **Syntax:** Pharo / Squeak chunk format (`!` separators, `Object subclass: #Foo …`). No fileIn/fileOut images — text source only. - **Conformance:** ANSI X3J20 *as a target*, not bug-for-bug Squeak. "Reads like Smalltalk, runs like Smalltalk." - **Test corpus:** SUnit ported to SX-Smalltalk + custom programs + a curated slice of Pharo `Kernel-Tests` / `Collections-Tests`. - **Image:** out of scope. Source-only. No `become:` between sessions, no snapshotting. - **Reflection:** `class`, `respondsTo:`, `perform:`, `doesNotUnderstand:` in. `become:` (object-identity swap) **in** — it's a good CEK exercise. Method modification at runtime in. - **GUI / Morphic / threads:** out entirely. ## Ground rules - **Scope:** only touch `lib/smalltalk/**` and `plans/smalltalk-on-sx.md`. Don't edit `spec/`, `hosts/`, `shared/`, or any other `lib//**`. Smalltalk primitives go in `lib/smalltalk/runtime.sx`. - **SX files:** use `sx-tree` MCP tools only. - **Commits:** one feature per commit. Keep `## Progress log` updated and tick roadmap boxes. ## Architecture sketch ``` Smalltalk source │ ▼ lib/smalltalk/tokenizer.sx — selectors, keywords, literals, $c, #sym, #(…), $'…' │ ▼ lib/smalltalk/parser.sx — AST: classes, methods, blocks, cascades, sends │ ▼ lib/smalltalk/transpile.sx — AST → SX AST (entry: smalltalk-eval-ast) │ ▼ lib/smalltalk/runtime.sx — class table, MOP, dispatch, primitives ``` Core mapping: - **Class** = SX dict `{:name :superclass :ivars :methods :class-methods :metaclass}`. Class table is a flat dict keyed by class name. - **Object** = SX dict `{:class :ivars}` — `ivars` keyed by symbol. Tagged ints / floats / strings / symbols are not boxed; their class is looked up by SX type. - **Method** = SX lambda closing over a `self` binding + temps. Body wrapped in a delimited continuation so `^` can escape. - **Message send** = `(st-send receiver selector args)` — does class-table lookup, walks superclass chain, falls back to `doesNotUnderstand:` with a `Message` object. - **Block** `[:x | … ^v … ]` = lambda + captured `^k` (the method-return continuation). Invoking `^` calls `k`; outer block invocation past method return raises `BlockContext>>cannotReturn:`. - **Cascade** `r m1; m2; m3` = `(let ((tmp r)) (st-send tmp 'm1 ()) (st-send tmp 'm2 ()) (st-send tmp 'm3 ()))`. - **`ifTrue:ifFalse:` / `whileTrue:`** = ordinary block sends; the runtime intrinsifies them in the JIT path so they compile to native branches (Tier 1 of bytecode expansion already covers this pattern). - **`become:`** = swap two object identities everywhere — in SX this is a heap walk, but we restrict to `oneWayBecome:` (cheap: rewrite class field) by default. ## Roadmap ### Phase 1 — tokenizer + parser - [x] Tokenizer: identifiers, keywords (`foo:`), binary selectors (`+`, `==`, `,`, `->`, `~=` etc.), numbers (radix `16r1F`; **scaled `1.5s2` deferred**), strings `'…''…'`, characters `$c`, symbols `#foo` `#'foo bar'` `#+`, byte arrays `#[1 2 3]` (open token), literal arrays `#(1 #foo 'x')` (open token), comments `"…"` - [x] Parser (expression level): blocks `[:a :b | | t1 t2 | …]`, cascades, message precedence (unary > binary > keyword), assignment, return, statement sequences, literal arrays, byte arrays, paren grouping, method headers (`+ other`, `at:put:`, unary, with temps and body). Class-definition keyword messages parse as ordinary keyword sends — no special-case needed. - [x] Parser (chunk-stream level): `st-read-chunks` splits source on `!` (with `!!` doubling) and `st-parse-chunks` runs the Pharo file-in state machine — `methodsFor:` / `class methodsFor:` opens a method batch, an empty chunk closes it. Pragmas `` (incl. multiple keyword pairs, before or after temps, multiple per method) parsed into the method AST. - [x] Unit tests in `lib/smalltalk/tests/parse.sx` ### Phase 2 — object model + sequential eval - [x] Class table + bootstrap (`lib/smalltalk/runtime.sx`): canonical hierarchy installed (`Object`, `Behavior`, `ClassDescription`, `Class`, `Metaclass`, `UndefinedObject`, `Boolean`/`True`/`False`, `Magnitude`/`Number`/`Integer`/`SmallInteger`/`Float`/`Character`, `Collection`/`SequenceableCollection`/`ArrayedCollection`/`Array`/`String`/`Symbol`/`OrderedCollection`/`Dictionary`, `BlockClosure`). User class definition via `st-class-define!`, methods via `st-class-add-method!` (stamps `:defining-class` for super), method lookup walks chain, ivars accumulated through superclass chain, native SX value types map to Smalltalk classes via `st-class-of`. - [x] `smalltalk-eval-ast` (`lib/smalltalk/eval.sx`): all literal kinds, ident resolution (locals → ivars → class refs), self/super/thisContext, assignment (locals or ivars, mutating), message send, cascade, sequence, and ^return via a sentinel marker (proper continuation-based escape is the Phase 3 showcase). Frames carry a parent chain so blocks close over outer locals. Primitive method tables for SmallInteger/Float, String/Symbol, Boolean, UndefinedObject, Array, BlockClosure (value/value:/whileTrue:/etc.), and class-side `new`/`name`/etc. Also satisfies "30+ tests" — 60 eval tests. - [x] Method lookup: walk class → superclass already in `st-method-lookup-walk`; new cached wrapper `st-method-lookup` keys on `(class, selector, side)` and stores `:not-found` for negative results so DNU paths don't re-walk. Cache invalidates on `st-class-define!`, `st-class-add-method!`, `st-class-add-class-method!`, `st-class-remove-method!`, and full bootstrap. Stats helpers `st-method-cache-stats` / `st-method-cache-reset-stats!` for tests + later debugging. - [x] `doesNotUnderstand:` fallback. `Message` class added at bootstrap with `selector`/`arguments` ivars and accessor methods. Primitive senders (Number/String/Boolean/Nil/Array/BlockClosure/class-side) now return the `:unhandled` sentinel for unknown selectors; `st-send` builds a `Message` via `st-make-message` and routes through `st-dnu`, which looks up `doesNotUnderstand:` on the receiver's class chain (instance- or class-side as appropriate). User overrides intercept unknowns and see the symbol selector + arguments array in the Message. - [x] `super` send. Method invocation captures the defining class on the frame; `st-super-send` walks from `(st-class-superclass defining-class)` (instance- or class-side as appropriate). Falls through primitives → DNU when no method is found. Receiver is preserved as `self`, so ivar mutations stick. Verified for: subclass override calls parent, inherited `super` resolves to *defining* class's parent (not receiver's), multi-level `A→B→C` chain, super inside a block, super walks past an intermediate class with no local override. - [x] 30+ tests in `lib/smalltalk/tests/eval.sx` (60 tests, covering literals through user-class method dispatch with cascades and closures) ### Phase 3 — blocks + non-local return (THE SHOWCASE) - [ ] Method invocation captures a `^k` (the return continuation) and binds it as the block's escape - [ ] `^expr` from inside a block invokes that captured `^k` - [ ] `BlockContext>>value`, `value:`, `value:value:`, …, `valueWithArguments:` - [ ] `whileTrue:` / `whileTrue` / `whileFalse:` / `whileFalse` as ordinary block sends — runtime intrinsifies the loop in the bytecode JIT - [ ] `ifTrue:` / `ifFalse:` / `ifTrue:ifFalse:` as block sends, similarly intrinsified - [ ] Escape past returned-from method raises `BlockContext>>cannotReturn:` - [ ] Classic programs in `lib/smalltalk/tests/programs/`: - [ ] `eight-queens.st` - [ ] `quicksort.st` - [ ] `mandelbrot.st` - [ ] `life.st` (Conway's Life, glider gun) - [ ] `fibonacci.st` (recursive + memoised) - [ ] `lib/smalltalk/conformance.sh` + runner, `scoreboard.json` + `scoreboard.md` ### Phase 4 — reflection + MOP - [ ] `Object>>class`, `class>>name`, `class>>superclass`, `class>>methodDict`, `class>>selectors` - [ ] `Object>>perform:` / `perform:with:` / `perform:withArguments:` - [ ] `Object>>respondsTo:`, `Object>>isKindOf:`, `Object>>isMemberOf:` - [ ] `Behavior>>compile:` — runtime method addition - [ ] `Object>>becomeForward:` (one-way become; rewrites the class field of `aReceiver`) - [ ] Exceptions: `Exception`, `Error`, `signal`, `signal:`, `on:do:`, `ensure:`, `ifCurtailed:` — built on top of SX `handler-bind`/`raise` ### Phase 5 — collections + numeric tower - [ ] `SequenceableCollection`/`OrderedCollection`/`Array`/`String`/`Symbol` - [ ] `HashedCollection`/`Set`/`Dictionary`/`IdentityDictionary` - [ ] `Stream` hierarchy: `ReadStream`/`WriteStream`/`ReadWriteStream` - [ ] `Number` tower: `SmallInteger`/`LargePositiveInteger`/`Float`/`Fraction` - [ ] `String>>format:`, `printOn:` for everything ### Phase 6 — SUnit + corpus to 200+ - [ ] Port SUnit (TestCase, TestSuite, TestResult) — written in SX-Smalltalk, runs in itself - [ ] Vendor a slice of Pharo `Kernel-Tests` and `Collections-Tests` - [ ] Drive the scoreboard up: aim for 200+ green tests - [ ] Stretch: ANSI Smalltalk validator subset ### Phase 7 — speed (optional) - [ ] Method-dictionary inline caching (already in CEK as a primitive; just wire selector cache) - [ ] Block intrinsification beyond `whileTrue:` / `ifTrue:` - [ ] Compare against GNU Smalltalk on the corpus ## Progress log _Newest first. Agent appends on every commit._ - 2026-04-25: `super` send + 9 tests (`lib/smalltalk/tests/super.sx`). `st-super-send` walks from defining-class's superclass; class-side aware; primitives → DNU fallback. Also fixed top-level `| temps |` parsing in `st-parse` (the absence of which was silently aborting earlier eval/dnu tests — counts go from 274 → 287, with previously-skipped tests now actually running). - 2026-04-25: `doesNotUnderstand:` + 12 DNU tests (`lib/smalltalk/tests/dnu.sx`). Bootstrap installs `Message` (with selector/arguments accessors). Primitives signal `:unhandled` instead of erroring; `st-dnu` builds a Message and walks `doesNotUnderstand:` lookup. User Object DNU intercepts unknown sends to native receivers (Number, String, Block) too. 267/267 total. - 2026-04-25: method-lookup cache (`st-method-cache` keyed by `class|selector|side`, stores `:not-found` for misses). Invalidation on define/add/remove + bootstrap. `st-class-remove-method!` added. Stats helpers + 10 cache tests; 255/255 total. - 2026-04-25: `smalltalk-eval-ast` + 60 eval tests (`lib/smalltalk/eval.sx`, `lib/smalltalk/tests/eval.sx`). Frame chain with mutable locals/ivars (via `dict-set!`), full literal eval, send dispatch (user methods + native primitive tables for Number/String/Boolean/Nil/Array/Block/Class), block closures, while/to:do:, cascades returning last, sentinel-based `^return`. User Point class round-trip works including `+` returning a fresh point. 245/245 total. - 2026-04-25: class table + bootstrap (`lib/smalltalk/runtime.sx`, `lib/smalltalk/tests/runtime.sx`). Canonical hierarchy, type→class mapping for native SX values, instance construction, ivar inheritance, method install with `:defining-class` stamp, instance- and class-side method lookup walking the superclass chain. 54 new tests, 185/185 total. - 2026-04-25: chunk-stream parser + pragmas + 21 chunk/pragma tests (`lib/smalltalk/tests/parse_chunks.sx`). `st-read-chunks` (with `!!` doubling), `st-parse-chunks` state machine for `methodsFor:` batches incl. class-side. Pragmas with multiple keyword pairs, signed numeric / string / symbol args, in either pragma-then-temps or temps-then-pragma order. 131/131 tests pass. - 2026-04-25: expression-level parser + 47 parse tests (`lib/smalltalk/parser.sx`, `lib/smalltalk/tests/parse.sx`). Full message precedence (unary > binary > keyword), cascades, blocks with params/temps, literal/byte arrays, assignment chain, method headers (unary/binary/keyword). Chunk-format `! !` driver deferred to a follow-up box. 110/110 tests pass. - 2026-04-25: tokenizer + 63 tests (`lib/smalltalk/tokenizer.sx`, `lib/smalltalk/tests/tokenize.sx`, `lib/smalltalk/test.sh`). All token types covered except scaled decimals `1.5s2` (deferred). `#(` and `#[` emit open tokens; literal-array contents lexed as ordinary tokens for the parser to interpret. ## Blockers _Shared-file issues that need someone else to fix. Minimal repro only._ - _(none yet)_