Smalltalk-on-SX: blocks with non-local return on delimited continuations

The headline showcase is blocks: Smalltalk's closures with non-local return (^expr returns from the enclosing method, not just from the block). Every other Smalltalk hosted on another VM (RSqueak on PyPy, GemStone on C, Maxine on Java) reinvents non-local return on whatever stack discipline the host provides. On SX it's a one-liner: a block holds a captured continuation; ^ just invokes it. Message-passing OO falls out cheaply on top of the existing component / dispatch machinery.

End-state goal: ANSI-ish Smalltalk-80 subset, SUnit working, ~200 hand-written tests + a vendored slice of the Pharo kernel tests, classic corpus (eight queens, quicksort, mandelbrot, Conway's Life).

Scope decisions (defaults — override by editing before we spawn)

Syntax: Pharo / Squeak chunk format (! separators, Object subclass: #Foo …). No fileIn/fileOut images — text source only.
Conformance: ANSI X3J20 as a target, not bug-for-bug Squeak. "Reads like Smalltalk, runs like Smalltalk."
Test corpus: SUnit ported to SX-Smalltalk + custom programs + a curated slice of Pharo Kernel-Tests / Collections-Tests.
Image: out of scope. Source-only. No become: between sessions, no snapshotting.
Reflection: class, respondsTo:, perform:, doesNotUnderstand: in. become: (object-identity swap) in — it's a good CEK exercise. Method modification at runtime in.
GUI / Morphic / threads: out entirely.

Ground rules

Scope: only touch lib/smalltalk/** and plans/smalltalk-on-sx.md. Don't edit spec/, hosts/, shared/, or any other lib/<lang>/**. Smalltalk primitives go in lib/smalltalk/runtime.sx.
SX files: use sx-tree MCP tools only.
Commits: one feature per commit. Keep ## Progress log updated and tick roadmap boxes.

Architecture sketch

Smalltalk source
    │
    ▼
lib/smalltalk/tokenizer.sx  — selectors, keywords, literals, $c, #sym, #(…), $'…'
    │
    ▼
lib/smalltalk/parser.sx     — AST: classes, methods, blocks, cascades, sends
    │
    ▼
lib/smalltalk/transpile.sx  — AST → SX AST (entry: smalltalk-eval-ast)
    │
    ▼
lib/smalltalk/runtime.sx    — class table, MOP, dispatch, primitives

Core mapping:

Class = SX dict {:name :superclass :ivars :methods :class-methods :metaclass}. Class table is a flat dict keyed by class name.
Object = SX dict {:class :ivars} — ivars keyed by symbol. Tagged ints / floats / strings / symbols are not boxed; their class is looked up by SX type.
Method = SX lambda closing over a self binding + temps. Body wrapped in a delimited continuation so ^ can escape.
Message send = (st-send receiver selector args) — does class-table lookup, walks superclass chain, falls back to doesNotUnderstand: with a Message object.
Block [:x | … ^v … ] = lambda + captured ^k (the method-return continuation). Invoking ^ calls k; invoking a block's ^ after its home method has already returned raises BlockContext>>cannotReturn:.
Cascade r m1; m2; m3 = (let ((tmp r)) (st-send tmp 'm1 ()) (st-send tmp 'm2 ()) (st-send tmp 'm3 ())).
ifTrue:ifFalse: / whileTrue: = ordinary block sends. Intrinsifying them in the JIT path so they compile to native branches is planned as a later optimization (Tier 1 of bytecode expansion already covers this pattern).
become: = swap two object identities everywhere — in SX this is a heap walk, but we restrict to oneWayBecome: (cheap: rewrite class field) by default.
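
The class, object, message-send and cascade mappings above can be modelled in a few lines. The following is a minimal Python sketch, not the SX implementation: the helper names (define_class, new_object, lookup, cascade) and the plain-dict Message are assumptions made for illustration, and only the dict shapes and the lookup order (class, then superclass chain, then doesNotUnderstand:) come from the mapping above.

```python
# Hypothetical Python model of the class table and st-send.
# Names mirror the plan but are illustrative only.

CLASS_TABLE = {}  # flat dict keyed by class name


def define_class(name, superclass=None, ivars=()):
    CLASS_TABLE[name] = {"name": name, "superclass": superclass,
                         "ivars": list(ivars), "methods": {}}


def new_object(class_name):
    # Accumulate ivar names through the superclass chain, root first.
    chain, cls = [], CLASS_TABLE[class_name]
    while cls is not None:
        chain.append(cls)
        cls = CLASS_TABLE.get(cls["superclass"])
    names = [iv for c in reversed(chain) for iv in c["ivars"]]
    return {"class": class_name, "ivars": {iv: None for iv in names}}


def lookup(class_name, selector):
    # Walk class -> superclass until a method is found.
    cls = CLASS_TABLE.get(class_name)
    while cls is not None:
        if selector in cls["methods"]:
            return cls["methods"][selector]
        cls = CLASS_TABLE.get(cls["superclass"])
    return None


def st_send(receiver, selector, args=()):
    method = lookup(receiver["class"], selector)
    if method is not None:
        return method(receiver, *args)
    # Fall back to doesNotUnderstand: with a Message-like dict.
    dnu = lookup(receiver["class"], "doesNotUnderstand:")
    if dnu is None:
        raise RuntimeError(f"{receiver['class']} doesNotUnderstand: #{selector}")
    return dnu(receiver, {"selector": selector, "arguments": list(args)})


def cascade(receiver, *sends):
    # r m1; m2; m3 : every send goes to the same receiver,
    # and the value of the last send is the result.
    result = None
    for selector, args in sends:
        result = st_send(receiver, selector, args)
    return result


def set_name(self, new_name):
    self["ivars"]["name"] = new_name
    return self


def get_name(self):
    return self["ivars"]["name"]


def object_dnu(self, message):
    return ("unhandled", message["selector"], message["arguments"])


define_class("Object")
define_class("Animal", "Object", ["name"])
define_class("Dog", "Animal", ["tricks"])
CLASS_TABLE["Animal"]["methods"]["name:"] = set_name
CLASS_TABLE["Animal"]["methods"]["name"] = get_name
CLASS_TABLE["Object"]["methods"]["doesNotUnderstand:"] = object_dnu
```

With this setup, a Dog instance inherits name: and name from Animal, and an unknown selector such as bark: lands in the Object-level doesNotUnderstand: handler with the selector and arguments packaged together.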

Roadmap

Phase 1 — tokenizer + parser

Tokenizer: identifiers, keywords (foo:), binary selectors (+, ==, ,, ->, ~= etc.), numbers (radix 16r1F; scaled 1.5s2 deferred), strings '…''…', characters $c, symbols #foo #'foo bar' #+, byte arrays #[1 2 3] (open token), literal arrays #(1 #foo 'x') (open token), comments "…"
Parser (expression level): blocks [:a :b | | t1 t2 | …], cascades, message precedence (unary > binary > keyword), assignment, return, statement sequences, literal arrays, byte arrays, paren grouping, method headers (+ other, at:put:, unary, with temps and body). Class-definition keyword messages parse as ordinary keyword sends — no special-case needed.
Parser (chunk-stream level): st-read-chunks splits source on ! (with !! doubling) and st-parse-chunks runs the Pharo file-in state machine — methodsFor: / class methodsFor: opens a method batch, an empty chunk closes it. Pragmas <primitive: …> (incl. multiple keyword pairs, before or after temps, multiple per method) parsed into the method AST.
Unit tests in lib/smalltalk/tests/parse.sx
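
The chunk-splitting rule used by st-read-chunks (split on !, with !! standing for a literal !) is small enough to sketch. This is a hypothetical Python model, not the SX source: whitespace trimming and dropping a trailing blank tail are assumptions of the sketch. Empty chunks between separators are kept, since the file-in state machine relies on them to close a method batch.

```python
def st_read_chunks(source):
    """Split chunk-format text on '!'; a doubled '!!' is a literal '!'."""
    chunks, current, i = [], [], 0
    while i < len(source):
        ch = source[i]
        if ch == "!":
            if i + 1 < len(source) and source[i + 1] == "!":
                current.append("!")      # doubled bang: keep one literal '!'
                i += 2
                continue
            chunks.append("".join(current).strip())  # single bang ends the chunk
            current = []
        else:
            current.append(ch)
        i += 1
    tail = "".join(current).strip()
    if tail:                              # unterminated final chunk
        chunks.append(tail)
    return chunks
```

For example, the input "x := 1! Transcript show: 'hi!!'!" yields two chunks, and the second one contains a single ! inside the string literal.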

Phase 2 — object model + sequential eval

Class table + bootstrap (lib/smalltalk/runtime.sx): canonical hierarchy installed (Object, Behavior, ClassDescription, Class, Metaclass, UndefinedObject, Boolean/True/False, Magnitude/Number/Integer/SmallInteger/Float/Character, Collection/SequenceableCollection/ArrayedCollection/Array/String/Symbol/OrderedCollection/Dictionary, BlockClosure). User class definition via st-class-define!, methods via st-class-add-method! (stamps :defining-class for super), method lookup walks chain, ivars accumulated through superclass chain, native SX value types map to Smalltalk classes via st-class-of.
smalltalk-eval-ast (lib/smalltalk/eval.sx): all literal kinds, ident resolution (locals → ivars → class refs), self/super/thisContext, assignment (locals or ivars, mutating), message send, cascade, sequence, and ^return via a sentinel marker (proper continuation-based escape is the Phase 3 showcase). Frames carry a parent chain so blocks close over outer locals. Primitive method tables for SmallInteger/Float, String/Symbol, Boolean, UndefinedObject, Array, BlockClosure (value/value:/whileTrue:/etc.), and class-side new/name/etc. Also satisfies "30+ tests" — 60 eval tests.
Method lookup: walk class → superclass already in st-method-lookup-walk; new cached wrapper st-method-lookup keys on (class, selector, side) and stores :not-found for negative results so DNU paths don't re-walk. Cache invalidates on st-class-define!, st-class-add-method!, st-class-add-class-method!, st-class-remove-method!, and full bootstrap. Stats helpers st-method-cache-stats / st-method-cache-reset-stats! for tests + later debugging.
doesNotUnderstand: fallback. Message class added at bootstrap with selector/arguments ivars and accessor methods. Primitive senders (Number/String/Boolean/Nil/Array/BlockClosure/class-side) now return the :unhandled sentinel for unknown selectors; st-send builds a Message via st-make-message and routes through st-dnu, which looks up doesNotUnderstand: on the receiver's class chain (instance- or class-side as appropriate). User overrides intercept unknowns and see the symbol selector + arguments array in the Message.
super send. Method invocation captures the defining class on the frame; st-super-send walks from (st-class-superclass defining-class) (instance- or class-side as appropriate). Falls through primitives → DNU when no method is found. Receiver is preserved as self, so ivar mutations stick. Verified for: subclass override calls parent, inherited super resolves to defining class's parent (not receiver's), multi-level A→B→C chain, super inside a block, super walks past an intermediate class with no local override.
30+ tests in lib/smalltalk/tests/eval.sx (60 tests, covering literals through user-class method dispatch with cascades and closures)
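
The cached lookup described in this phase (key on class, selector and side; store a not-found marker for misses; invalidate on any class or method change) might look like the following Python model. The MethodTable class, its counters, and the choice to flush the whole cache on every change are assumptions of this sketch. The plan only states that the cache invalidates on those operations, not how.

```python
NOT_FOUND = object()  # stands in for the plan's :not-found marker


class MethodTable:
    def __init__(self):
        self.classes = {}   # name -> {"superclass", "methods", "class-methods"}
        self.cache = {}     # (class, selector, side) -> method or NOT_FOUND
        self.hits = 0
        self.misses = 0

    def define_class(self, name, superclass=None):
        self.classes[name] = {"superclass": superclass,
                              "methods": {}, "class-methods": {}}
        self.cache.clear()  # assumption: full flush on any change

    def add_method(self, name, selector, fn, side="instance"):
        key = "methods" if side == "instance" else "class-methods"
        self.classes[name][key][selector] = fn
        self.cache.clear()

    def _walk(self, name, selector, side):
        # Uncached walk: class first, then each superclass in turn.
        key = "methods" if side == "instance" else "class-methods"
        while name is not None:
            cls = self.classes[name]
            if selector in cls[key]:
                return cls[key][selector]
            name = cls["superclass"]
        return NOT_FOUND

    def lookup(self, name, selector, side="instance"):
        cache_key = (name, selector, side)
        if cache_key in self.cache:
            self.hits += 1
        else:
            self.misses += 1
            # Negative results are cached too, so repeated
            # doesNotUnderstand: paths do not re-walk the chain.
            self.cache[cache_key] = self._walk(name, selector, side)
        found = self.cache[cache_key]
        return None if found is NOT_FOUND else found
```

Caching the negative result matters because a missing selector is the most expensive case for the uncached walk: it always traverses the full superclass chain.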

Phase 3 — blocks + non-local return (THE SHOWCASE)

Method invocation captures a ^k (the return continuation) and binds it as the block's escape. st-invoke wraps body in (call/cc (fn (k) ...)); the frame's :return-k is set to k. Block creation copies (get frame :return-k) onto the block. Block invocation sets the new frame's :return-k to the block's saved one — so non-local return reaches back through any number of intermediate block invocations.
^expr from inside a block invokes that captured ^k. The "return" AST type evaluates the expression then calls (k v) on the frame's :return-k. Verified: detect:in: style early-exit, multi-level nested blocks, ^ from inside to:do:/whileTrue:, ^ from a block passed to a different method (Caller→Helper) returns from Caller.
BlockContext>>value, value:, value:value:, value:value:value:, value:value:value:value:, valueWithArguments:. Implemented in st-block-dispatch + st-block-apply (eval iteration); pinned by 19 dedicated tests in lib/smalltalk/tests/blocks.sx covering arity through 4, valueWithArguments: with empty/non-empty arg arrays, closures over outer locals (read + mutate + later-mutation re-read), nested blocks, blocks as method arguments, numArgs, and class.
whileTrue: / whileTrue / whileFalse: / whileFalse as ordinary block sends. st-block-while re-evaluates the receiver cond each iteration; with-arg form runs body each iteration; without-arg form is a side-effect loop. Now returns nil per ANSI/Pharo. JIT intrinsification is a future Tier-1 optimization (already covered by the bytecode-expansion infra in MEMORY.md). 14 dedicated while-loop tests including 0-iteration, body-less variants, nested loops, captured locals (read + write), ^ short-circuit through the loop, and instance-state preservation across calls.
ifTrue: / ifFalse: / ifTrue:ifFalse: / ifFalse:ifTrue: as block sends, plus and:/or: short-circuit, eager &/|, not. Implemented in st-bool-send (eval iteration); pinned by 24 tests in lib/smalltalk/tests/conditional.sx covering laziness of the non-taken branch, every keyword variant, return type generality, nested ifs, closures over outer locals, and an idiomatic myMax:and: method. Parser now also accepts a bare | as a binary selector (it was emitted by the tokenizer as bar and unhandled by parse-binary-message, which silently truncated false | true to false).
Escaping past a method that has already returned raises an error (the SX-level analogue of BlockContext>>cannotReturn:). Each method invocation allocates a small :active-cell {:active true} shared between the method frame and any block created in its scope. st-invoke flips :active to false after call/cc returns; ^expr checks the captured frame's cell before invoking k and raises with a "BlockContext>>cannotReturn:" message if the cell is dead. Verified by lib/smalltalk/tests/cannot_return.sx (5 tests using SX guard to catch the raise). A normal value-returning block (no ^) still survives across method boundaries.
Classic programs in lib/smalltalk/tests/programs/:
- eight-queens.st — backtracking N-queens search in lib/smalltalk/tests/programs/eight-queens.st. The .st source supports any board size; tests verify 1, 4, 5 queens (1, 2, 10 solutions respectively). 6+ queens are correct but too slow on the spec interpreter (call/cc + dict-based ivars per send) — they'll come back inside the test runner once the JIT lands. The 8-queens canonical case will run in production.
- quicksort.st — Lomuto-partition in-place quicksort in lib/smalltalk/tests/programs/quicksort.st. Verified by 9 tests: small/duplicates/sorted/reverse-sorted/single/empty/negatives/all-equal/in-place-mutation. Exercises Array at:/at:put: mutation, recursion, to:do: over varying ranges.
- mandelbrot.st — escape-time iteration of z := z² + c in lib/smalltalk/tests/programs/mandelbrot.st. Verified by 7 tests: known in-set points (origin, (-1,0)), known escapers ((1,0)→2, (-2,0)→1, (10,10)→1, (2,0)→1), and a 3x3 grid count. Caught a real bug along the way: literal #(...) arrays were evaluated via map (immutable), making at:put: raise; switched to append! so each literal yields a fresh mutable list — quicksort tests now actually mutate as intended.
- life.st (Conway's Life). lib/smalltalk/tests/programs/life.st carries the canonical rules with edge handling. Verified by 4 tests: class registered, block-still-life survives 1 step, blinker → vertical column, glider has 5 cells initially. Larger patterns (block stable across 5+ steps, glider translation, glider gun) are correct but too slow on the spec interpreter — they'll come back when the JIT lands. Also added Pharo-style dynamic array literal {e1. e2. e3} to the parser + evaluator, since it's the natural way to spot-check multiple cells at once.
- fibonacci.st (recursive + Array-memoised) — lib/smalltalk/tests/programs/fibonacci.st. Loaded from chunk-format source by new smalltalk-load helper; verified by 13 tests in lib/smalltalk/tests/programs.sx (recursive fib:, memoised memoFib: up to 30, instance independence, class-table integrity). Source is currently duplicated as a string in the SX test file because there's no SX file-read primitive; conformance.sh will dedupe by piping the .st file directly.
lib/smalltalk/conformance.sh + runner, scoreboard.json + scoreboard.md. The runner runs bash lib/smalltalk/test.sh -v once, parses per-file counts, and emits both files. JSON has date / program names / corpus-test count / all-test pass/total / exit code. Markdown has a totals table, the program list, the verbatim per-file test counts block, and notes about JIT-deferred work. Both are checked into the tree as the latest baseline; the runner overwrites them.
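
The core of this phase, non-local return plus the cannotReturn: check, can be illustrated with a small model. Python has no call/cc, so this sketch swaps in an exception as a one-shot escape; that is a different mechanism from the captured continuation used in SX, chosen only because it shows the same control flow. The names st_invoke and st_return follow the plan loosely, and the frame dict doubling as both return target and active cell is an assumption of the sketch.

```python
class _Return(Exception):
    """Carries a value back to one specific method invocation."""

    def __init__(self, frame, value):
        super().__init__("non-local return")
        self.frame = frame
        self.value = value


class CannotReturn(Exception):
    pass


def st_invoke(body):
    """Run body(frame) as a method; the frame is the ^ target and active cell."""
    frame = {"active": True}
    try:
        return body(frame)
    except _Return as escape:
        if escape.frame is not frame:
            raise               # aimed at an outer method: keep unwinding
        return escape.value
    finally:
        frame["active"] = False  # the method has returned or been unwound


def st_return(frame, value):
    """The ^expr form: escape to the method invocation that owns frame."""
    if not frame["active"]:
        raise CannotReturn("BlockContext>>cannotReturn:")
    raise _Return(frame, value)


def do_each(items, block):
    # A helper "method" with its own frame that just calls the block.
    def body(frame):
        for item in items:
            block(item)
        return None
    return st_invoke(body)


def first_even(items):
    def body(frame):
        def block(item):
            if item % 2 == 0:
                st_return(frame, item)   # ^item returns from first_even
        do_each(items, block)
        return None
    return st_invoke(body)


def leak_block():
    # Returns a block containing ^42 that outlives its home method.
    def body(frame):
        return lambda: st_return(frame, 42)
    return st_invoke(body)
```

Here first_even([1, 3, 4, 5]) returns 4 even though the ^ executes two invocations deep inside do_each, which is the Caller→Helper case from the tests above. Calling the block produced by leak_block() raises CannotReturn because its home frame's cell was flipped to inactive on exit.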

Phase 4 — reflection + MOP

Object>>class, class>>name, class>>superclass, class>>methodDict, class>>selectors. class is universal in st-primitive-send (returns Metaclass for class-refs, the receiver's class otherwise). Class-side dispatch gains methodDict/classMethodDict (raw dict), selectors/classSelectors (Array of symbols), instanceVariableNames (own), allInstVarNames (inherited + own). 26 tests in lib/smalltalk/tests/reflection.sx.
Object>>perform: / perform:with: / perform:with:with: / perform:with:with:with: / perform:with:with:with:with: / perform:withArguments:. Universal in st-primitive-send; routes back through st-send so user methods, primitives, super, and DNU all still apply. Selector arg can be a symbol or string (we str it). 10 new tests in lib/smalltalk/tests/reflection.sx.
Object>>respondsTo:, Object>>isKindOf:, Object>>isMemberOf:. Universal in st-primitive-send. respondsTo: searches user method dicts (instance- or class-side based on receiver kind); native primitive selectors aren't enumerated, documented limitation. isKindOf: walks st-class-inherits-from?; isMemberOf: is exact class equality. 26 new tests in reflection.sx.
Behavior>>compile: — runtime method addition. Class-side compile: parses the source via st-parse-method and installs via st-class-add-method!. Sister forms compile:classified: and compile:notifying: ignore the extra arg (Pharo-tolerant). Returns the selector as a symbol. Also added addSelector:withMethod: (raw AST install) and removeSelector:. 9 new tests in reflection.sx.
Object>>becomeForward: — one-way become at the universal st-primitive-send layer. Mutates the receiver's :class and :ivars to match the target via dict-set!; every existing reference to the receiver dict now behaves as the target. Receiver and target remain distinct dicts (no SX-level identity merge), but method dispatch, ivar reads, and aliases all switch — Pharo's practical guarantee. 6 tests in reflection.sx, including the alias case (a and alias := a both see the new identity).
Exceptions: Exception, Error, ZeroDivide, MessageNotUnderstood in bootstrap. signal raises the receiver via SX raise; signal: sets messageText first. on:do: / ensure: / ifCurtailed: on BlockClosure use SX guard. The auto-reraise pattern uses a side-effect predicate (cleanup runs in the predicate, returns false → guard auto-reraises) because (raise c) from inside a guard handler hits a known SX issue with nested-handler frames. 15 tests in lib/smalltalk/tests/exceptions.sx. Phase 4 complete.
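
The becomeForward: behaviour from this phase (mutate the receiver in place so every alias sees the target's class and ivars) can be shown with plain dicts. This is a hypothetical Python model; whether the SX version shares the target's ivar dict or copies it is not stated in the plan, so sharing is an assumption here.

```python
def become_forward(receiver, target):
    """One-way become: make receiver behave as target, in place."""
    receiver["class"] = target["class"]
    receiver["ivars"] = target["ivars"]   # assumption: shared, not copied


point = {"class": "Point", "ivars": {"x": 1, "y": 2}}
alias = point                              # a second reference to the same object
circle = {"class": "Circle", "ivars": {"radius": 5}}

become_forward(point, circle)
```

After the call, alias reports class Circle and reads radius, while point and circle are still two distinct dicts. That matches the guarantee described above: dispatch and ivar reads switch for all existing references, without an identity merge at the host level.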

Phase 5 — collections + numeric tower

SequenceableCollection/OrderedCollection/Array/String/Symbol. Bootstrap installs shared methods on SequenceableCollection: inject:into:, detect:/detect:ifNone:, count:, allSatisfy:/anySatisfy:, includes:, do:separatedBy:, indexOf:/indexOf:ifAbsent:, reject:, isEmpty/notEmpty, asString. They each call self do:, which dispatches to the receiver's primitive do: — so Array, String, and Symbol inherit them uniformly. String/Symbol primitives gained at: (1-indexed), copyFrom:to:, first/last, do:. OrderedCollection class is in the bootstrap hierarchy; its instance shape will fill out alongside Set/Dictionary in the next box. 28 tests in lib/smalltalk/tests/collections.sx.
HashedCollection/Set/Dictionary/IdentityDictionary
Stream hierarchy: ReadStream/WriteStream/ReadWriteStream
Number tower: SmallInteger/LargePositiveInteger/Float/Fraction
String>>format:, printOn: for everything
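
The design used for the shared SequenceableCollection methods in this phase, where each one is written once in terms of self do: and concrete classes supply only do:, can be sketched as follows. This is a Python model with assumed names (StArray, StString, snake_case selectors). Note that detect_if_none here scans the whole collection, because plain Python has no non-local return to exit do early.

```python
class SequenceableCollection:
    def do(self, block):
        raise NotImplementedError("subclasses provide the primitive do:")

    def inject_into(self, initial, block):
        acc = initial

        def step(each):
            nonlocal acc
            acc = block(acc, each)

        self.do(step)
        return acc

    def detect_if_none(self, predicate, none_block):
        found = []

        def step(each):
            if not found and predicate(each):
                found.append(each)       # remember only the first match

        self.do(step)
        return found[0] if found else none_block()

    def count(self, predicate):
        return self.inject_into(
            0, lambda total, each: total + (1 if predicate(each) else 0))

    def includes(self, item):
        return self.count(lambda each: each == item) > 0


class StArray(SequenceableCollection):
    def __init__(self, items):
        self.items = list(items)

    def do(self, block):
        for item in self.items:
            block(item)


class StString(SequenceableCollection):
    def __init__(self, text):
        self.text = text

    def do(self, block):
        for char in self.text:
            block(char)
```

Because every shared method bottoms out in do, adding a new collection kind only requires one primitive, which is why Array, String and Symbol pick up the whole protocol uniformly.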

Phase 6 — SUnit + corpus to 200+

Port SUnit (TestCase, TestSuite, TestResult) — written in SX-Smalltalk, runs in itself
Vendor a slice of Pharo Kernel-Tests and Collections-Tests
Drive the scoreboard up: aim for 200+ green tests
Stretch: ANSI Smalltalk validator subset

Phase 7 — speed (optional)

Method-dictionary inline caching (already in CEK as a primitive; just wire selector cache)
Block intrinsification beyond whileTrue: / ifTrue:
Compare against GNU Smalltalk on the corpus

Progress log

Newest first. Agent appends on every commit.

2026-04-25: Phase 5 sequenceable-collection methods + 28 tests (lib/smalltalk/tests/collections.sx). 13 shared methods on SequenceableCollection (inject:into:, detect:, count:, …), inherited by Array/String/Symbol via self do:. String primitives at:/copyFrom:to:/first/last/do:. 523/523 total.
2026-04-25: Exception system + 15 tests (lib/smalltalk/tests/exceptions.sx). Exception/Error/ZeroDivide/MessageNotUnderstood in bootstrap; signal/signal: raise via SX raise; on:do:/ensure:/ifCurtailed: on BlockClosure via SX guard. Phase 4 complete. 495/495 total.
2026-04-25: Object>>becomeForward: + 6 tests. In-place mutation of :class and :ivars via dict-set!; aliases see the new identity. 480/480 total.
2026-04-25: Behavior>>compile: + sisters + 9 tests. Parses source via st-parse-method, installs via runtime helpers; also added addSelector:withMethod: and removeSelector:. 474/474 total.
2026-04-25: respondsTo: / isKindOf: / isMemberOf: + 26 tests. Universal at st-primitive-send. 465/465 total.
2026-04-25: Object>>perform: family + 10 tests. Universal dispatch via st-send after (str (nth args 0)) for the selector. 439/439 total.
2026-04-25: Phase 4 reflection accessors (lib/smalltalk/tests/reflection.sx, 26 tests). Universal Object>>class, plus methodDict/selectors/instanceVariableNames/allInstVarNames/classMethodDict/classSelectors on class-refs. 429/429 total.
2026-04-25: conformance.sh + scoreboard.{json,md} (lib/smalltalk/conformance.sh, lib/smalltalk/scoreboard.json, lib/smalltalk/scoreboard.md). Single-pass runner over test.sh -v; baseline at 5 programs / 39 corpus tests / 403 total. Phase 3 complete.
2026-04-25: classic-corpus #5 Life (tests/programs/life.st, 4 tests). Spec-interpreter Conway's Life with edge handling. Block + blinker + glider initial setup verified; larger step counts pending JIT (each spec-interpreter step is ~5-8s on a 5x5 grid). Added {e1. e2. e3} dynamic array literal to parser + evaluator. 403/403 total.
2026-04-25: classic-corpus #4 mandelbrot (tests/programs/mandelbrot.st, 7 tests). Escape-time iterator + grid counter. Discovered + fixed an immutable-list bug in lit-array eval — map produced an immutable list so at:put: raised; rebuilt via append!. Quicksort tests had been silently dropping ~7 cases due to that bug; now actually mutate. 399/399 total.
2026-04-25: classic-corpus #3 quicksort (tests/programs/quicksort.st, 9 tests). Lomuto partition; verified across duplicates, already-sorted/reverse-sorted, empty, single, negatives, all-equal, plus in-place mutation. 385/385 total.
2026-04-25: classic-corpus #2 eight-queens (tests/programs/eight-queens.st, 5 tests). Backtracking search; verified for boards of size 1, 4, 5. Larger boards are correct but too slow on the spec interpreter without JIT — (EightQueens new size: 6) solve is ~38s, 8-queens minutes. 382/382 total.
2026-04-25: classic-corpus #1 fibonacci (tests/programs/fibonacci.st + tests/programs.sx, 13 tests). Added smalltalk-load chunk loader, class-side subclass:instanceVariableNames: (and longer Pharo variants), Array new: size, methodsFor:/category: no-ops, st-split-ivars. 377/377 total.
2026-04-25: cannotReturn: implemented (lib/smalltalk/tests/cannot_return.sx, 5 tests). Each method-invocation gets an {:active true} cell shared with its blocks; st-invoke flips it on exit; ^expr raises if the cell is dead. Tests use SX guard to catch the raise. Non-^ blocks unaffected. 364/364 total.
2026-04-25: ifTrue: / ifFalse: family pinned (lib/smalltalk/tests/conditional.sx, 24 tests) + parser fix: | is now accepted as a binary selector in expression position (tokenizer still emits it as bar for block param/temp delimiting; parse-binary-message accepts both). Caught by false | true truncating silently to false. 359/359 total.
2026-04-25: whileTrue: / whileFalse: / no-arg variants pinned (lib/smalltalk/tests/while.sx, 14 tests). st-block-while returns nil per ANSI; behaviour verified under captured locals, nesting, early ^, and zero/many iterations. 334/334 total.
2026-04-25: BlockContext value family pinned (lib/smalltalk/tests/blocks.sx, 19 tests). Each value/valueN/valueWithArguments: variant verified plus closure semantics (read, write, later-mutation re-read), nested blocks, and block-as-arg. 320/320 total.
2026-04-25: THE SHOWCASE — non-local return via captured method-return continuations + 14 NLR tests (lib/smalltalk/tests/nlr.sx). st-invoke wraps body in call/cc; blocks copy creating method's ^k; ^expr invokes that k. Verified across nested blocks, to:do: / whileTrue:, blocks passed to different methods (Caller→Helper escapes back to Caller), inner-vs-outer method nesting. Sentinel-based return removed. 301/301 total.
2026-04-25: super send + 9 tests (lib/smalltalk/tests/super.sx). st-super-send walks from defining-class's superclass; class-side aware; primitives → DNU fallback. Also fixed top-level | temps | parsing in st-parse (the absence of which was silently aborting earlier eval/dnu tests — counts go from 274 → 287, with previously-skipped tests now actually running).
2026-04-25: doesNotUnderstand: + 12 DNU tests (lib/smalltalk/tests/dnu.sx). Bootstrap installs Message (with selector/arguments accessors). Primitives signal :unhandled instead of erroring; st-dnu builds a Message and walks doesNotUnderstand: lookup. User Object DNU intercepts unknown sends to native receivers (Number, String, Block) too. 267/267 total.
2026-04-25: method-lookup cache (st-method-cache keyed by class|selector|side, stores :not-found for misses). Invalidation on define/add/remove + bootstrap. st-class-remove-method! added. Stats helpers + 10 cache tests; 255/255 total.
2026-04-25: smalltalk-eval-ast + 60 eval tests (lib/smalltalk/eval.sx, lib/smalltalk/tests/eval.sx). Frame chain with mutable locals/ivars (via dict-set!), full literal eval, send dispatch (user methods + native primitive tables for Number/String/Boolean/Nil/Array/Block/Class), block closures, while/to:do:, cascades returning last, sentinel-based ^return. User Point class round-trip works including + returning a fresh point. 245/245 total.
2026-04-25: class table + bootstrap (lib/smalltalk/runtime.sx, lib/smalltalk/tests/runtime.sx). Canonical hierarchy, type→class mapping for native SX values, instance construction, ivar inheritance, method install with :defining-class stamp, instance- and class-side method lookup walking the superclass chain. 54 new tests, 185/185 total.
2026-04-25: chunk-stream parser + pragmas + 21 chunk/pragma tests (lib/smalltalk/tests/parse_chunks.sx). st-read-chunks (with !! doubling), st-parse-chunks state machine for methodsFor: batches incl. class-side. Pragmas with multiple keyword pairs, signed numeric / string / symbol args, in either pragma-then-temps or temps-then-pragma order. 131/131 tests pass.
2026-04-25: expression-level parser + 47 parse tests (lib/smalltalk/parser.sx, lib/smalltalk/tests/parse.sx). Full message precedence (unary > binary > keyword), cascades, blocks with params/temps, literal/byte arrays, assignment chain, method headers (unary/binary/keyword). Chunk-format ! ! driver deferred to a follow-up box. 110/110 tests pass.
2026-04-25: tokenizer + 63 tests (lib/smalltalk/tokenizer.sx, lib/smalltalk/tests/tokenize.sx, lib/smalltalk/test.sh). All token types covered except scaled decimals 1.5s2 (deferred). #( and #[ emit open tokens; literal-array contents lexed as ordinary tokens for the parser to interpret.

Blockers

Shared-file issues that need someone else to fix. Minimal repro only.

(none yet)
