Smalltalk-on-SX: blocks with non-local return on delimited continuations
The headline showcase is blocks: Smalltalk's closures with non-local return (^expr returns from the enclosing method, not just from the block). Every other Smalltalk on top of a host VM (RSqueak on PyPy, GemStone on C, Maxine on Java) reinvents non-local return on whatever stack discipline the host gives it. On SX it's a one-liner: a block holds a captured continuation; ^ just invokes it. Message-passing OO falls out cheaply on top of the existing component / dispatch machinery.
End-state goal: ANSI-ish Smalltalk-80 subset, SUnit working, ~200 hand-written tests + a vendored slice of the Pharo kernel tests, classic corpus (eight queens, quicksort, mandelbrot, Conway's Life).
Scope decisions (defaults — override by editing before we spawn)
- Syntax: Pharo / Squeak chunk format (! separators, Object subclass: #Foo …). No fileIn/fileOut images; text source only.
- Conformance: ANSI X3J20 as a target, not bug-for-bug Squeak. "Reads like Smalltalk, runs like Smalltalk."
- Test corpus: SUnit ported to SX-Smalltalk + custom programs + a curated slice of Pharo Kernel-Tests / Collections-Tests.
- Image: out of scope. Source-only. No become: between sessions, no snapshotting.
- Reflection: class, respondsTo:, perform:, doesNotUnderstand: in. become: (object-identity swap) in; it's a good CEK exercise. Method modification at runtime in.
- GUI / Morphic / threads: out entirely.
Ground rules
- Scope: only touch lib/smalltalk/** and plans/smalltalk-on-sx.md. Don't edit spec/, hosts/, shared/, or any other lib/<lang>/**. Smalltalk primitives go in lib/smalltalk/runtime.sx.
- SX files: use sx-tree MCP tools only.
- Commits: one feature per commit. Keep ## Progress log updated and tick roadmap boxes.
Architecture sketch
Smalltalk source
│
▼
lib/smalltalk/tokenizer.sx — selectors, keywords, literals, $c, #sym, #(…), $'…'
│
▼
lib/smalltalk/parser.sx — AST: classes, methods, blocks, cascades, sends
│
▼
lib/smalltalk/transpile.sx — AST → SX AST (entry: smalltalk-eval-ast)
│
▼
lib/smalltalk/runtime.sx — class table, MOP, dispatch, primitives
Core mapping:
- Class = SX dict {:name :superclass :ivars :methods :class-methods :metaclass}. Class table is a flat dict keyed by class name.
- Object = SX dict {:class :ivars}, with ivars keyed by symbol. Tagged ints / floats / strings / symbols are not boxed; their class is looked up by SX type.
- Method = SX lambda closing over a self binding + temps. Body wrapped in a delimited continuation so ^ can escape.
- Message send = (st-send receiver selector args): does class-table lookup, walks superclass chain, falls back to doesNotUnderstand: with a Message object.
- Block [:x | … ^v … ] = lambda + captured ^k (the method-return continuation). Invoking ^ calls k; outer block invocation past method return raises BlockContext>>cannotReturn:.
- Cascade r m1; m2; m3 = (let ((tmp r)) (st-send tmp 'm1 ()) (st-send tmp 'm2 ()) (st-send tmp 'm3 ())).
- ifTrue:ifFalse: / whileTrue: = ordinary block sends; the runtime intrinsifies them in the JIT path so they compile to native branches (Tier 1 of bytecode expansion already covers this pattern).
- become: = swap two object identities everywhere. In SX this is a heap walk, but we restrict to oneWayBecome: (cheap: rewrite class field) by default.
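The class / object / send mapping above can be modelled in a few lines. The following is only an illustrative Python sketch, not the SX runtime: every name in it (CLASS_TABLE, define_class, new_instance, lookup, st_send) is hypothetical, class-side methods and metaclasses are left out, and the Message object is reduced to a plain dict.

```python
# Illustrative Python model of the class-table / st-send mapping; not the SX code.
# All names here are hypothetical stand-ins for the runtime described above.

CLASS_TABLE = {}  # flat dict keyed by class name


def define_class(name, superclass, ivars=(), methods=None):
    CLASS_TABLE[name] = {
        "name": name,
        "superclass": superclass,  # name of the parent class, or None
        "ivars": list(ivars),
        "methods": dict(methods or {}),
    }


def new_instance(class_name):
    # Object = {class, ivars}; ivars accumulate through the superclass chain.
    ivars, cls = {}, CLASS_TABLE[class_name]
    while cls is not None:
        for iv in cls["ivars"]:
            ivars[iv] = None
        cls = CLASS_TABLE.get(cls["superclass"])
    return {"class": class_name, "ivars": ivars}


def lookup(class_name, selector):
    # Walk class -> superclass until the selector is found.
    cls = CLASS_TABLE.get(class_name)
    while cls is not None:
        if selector in cls["methods"]:
            return cls["methods"][selector]
        cls = CLASS_TABLE.get(cls["superclass"])
    return None


def st_send(receiver, selector, args=()):
    method = lookup(receiver["class"], selector)
    if method is not None:
        return method(receiver, *args)
    # Fall back to doesNotUnderstand: with a Message-like object.
    message = {"selector": selector, "arguments": list(args)}
    dnu = lookup(receiver["class"], "doesNotUnderstand:")
    if dnu is None:
        raise AttributeError(f"{receiver['class']} does not understand {selector}")
    return dnu(receiver, message)


def _get_x(self):
    return self["ivars"]["x"]


def _set_x(self, value):
    self["ivars"]["x"] = value
    return self  # setters answer the receiver, as in Smalltalk


def _dnu(self, message):
    return ("dnu", message["selector"], message["arguments"])


define_class("Object", None, methods={"doesNotUnderstand:": _dnu})
define_class("Point", "Object", ivars=("x", "y"), methods={"x": _get_x, "x:": _set_x})
```

In this model a cascade such as `p x: 3; x` is just two `st_send` calls against the same receiver, with the second result kept, which mirrors the `let`-bound `tmp` in the cascade mapping.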
Roadmap
Phase 1 — tokenizer + parser
- Tokenizer: identifiers, keywords (foo:), binary selectors (+, ==, ",", ->, ~= etc.), numbers (radix 16r1F; scaled 1.5s2 deferred), strings '…''…', characters $c, symbols #foo #'foo bar' #+, byte arrays #[1 2 3] (open token), literal arrays #(1 #foo 'x') (open token), comments "…"
- Parser (expression level): blocks [:a :b | | t1 t2 | …], cascades, message precedence (unary > binary > keyword), assignment, return, statement sequences, literal arrays, byte arrays, paren grouping, method headers (+ other, at:put:, unary, with temps and body). Class-definition keyword messages parse as ordinary keyword sends; no special case needed.
- Parser (chunk-stream level): st-read-chunks splits source on ! (with !! doubling) and st-parse-chunks runs the Pharo file-in state machine: methodsFor: / class methodsFor: opens a method batch, an empty chunk closes it. Pragmas <primitive: …> (incl. multiple keyword pairs, before or after temps, multiple per method) parsed into the method AST.
- Unit tests in lib/smalltalk/tests/parse.sx
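The chunk-stream rules (split on !, treat !! as a literal !, let methodsFor: open a batch that an empty chunk closes) can be sketched as below. This is a hedged Python model, not st-read-chunks / st-parse-chunks themselves: the function names, the tuple shapes, and the crude substring test for methodsFor: are all assumptions made for illustration.

```python
def read_chunks(source):
    """Split chunk-format source on '!', where '!!' is an escaped literal '!'."""
    chunks, current, i = [], [], 0
    while i < len(source):
        ch = source[i]
        if ch == "!":
            if i + 1 < len(source) and source[i + 1] == "!":
                current.append("!")  # doubled bang: keep one literal '!'
                i += 2
                continue
            chunks.append("".join(current).strip())  # single bang ends the chunk
            current = []
            i += 1
            continue
        current.append(ch)
        i += 1
    tail = "".join(current).strip()
    if tail:
        chunks.append(tail)  # unterminated trailing text still counts as a chunk
    return chunks


def parse_chunks(source):
    """Group chunks into ('expr', text) and ('methods', header, [bodies]) entries."""
    entries, batch = [], None
    for chunk in read_chunks(source):
        if batch is not None:
            if chunk == "":
                entries.append(batch)  # an empty chunk closes the method batch
                batch = None
            else:
                batch[2].append(chunk)
        elif "methodsFor:" in chunk:  # assumption: substring test stands in for a real parse
            batch = ("methods", chunk, [])
        elif chunk:
            entries.append(("expr", chunk))
    if batch is not None:
        entries.append(batch)
    return entries


SOURCE = (
    "Object subclass: #Foo!\n"
    "!Foo methodsFor: 'demo'!\n"
    "shout\n    ^'hi!!'!\n"
    "whisper\n    ^'psst'! !\n"
    "Foo new shout!"
)
ENTRIES = parse_chunks(SOURCE)
```

Here `ENTRIES` holds one expression entry for the class definition, one method batch with two method bodies (the doubled bang inside the string literal collapses to a single `!`), and one trailing expression entry.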
Phase 2 — object model + sequential eval
- Class table + bootstrap (lib/smalltalk/runtime.sx): canonical hierarchy installed (Object, Behavior, ClassDescription, Class, Metaclass, UndefinedObject, Boolean/True/False, Magnitude/Number/Integer/SmallInteger/Float/Character, Collection/SequenceableCollection/ArrayedCollection/Array/String/Symbol/OrderedCollection/Dictionary, BlockClosure). User class definition via st-class-define!, methods via st-class-add-method! (stamps :defining-class for super), method lookup walks chain, ivars accumulated through superclass chain, native SX value types map to Smalltalk classes via st-class-of.
- smalltalk-eval-ast (lib/smalltalk/eval.sx): all literal kinds, ident resolution (locals → ivars → class refs), self/super/thisContext, assignment (locals or ivars, mutating), message send, cascade, sequence, and ^ return via a sentinel marker (proper continuation-based escape is the Phase 3 showcase). Frames carry a parent chain so blocks close over outer locals. Primitive method tables for SmallInteger/Float, String/Symbol, Boolean, UndefinedObject, Array, BlockClosure (value/value:/whileTrue:/etc.), and class-side new/name/etc. Also satisfies "30+ tests": 60 eval tests.
- Method lookup: walk class → superclass already in st-method-lookup-walk; new cached wrapper st-method-lookup keys on (class, selector, side) and stores :not-found for negative results so DNU paths don't re-walk. Cache invalidates on st-class-define!, st-class-add-method!, st-class-add-class-method!, st-class-remove-method!, and full bootstrap. Stats helpers st-method-cache-stats / st-method-cache-reset-stats! for tests + later debugging.
- doesNotUnderstand: fallback. Message class added at bootstrap with selector/arguments ivars and accessor methods. Primitive senders (Number/String/Boolean/Nil/Array/BlockClosure/class-side) now return the :unhandled sentinel for unknown selectors; st-send builds a Message via st-make-message and routes through st-dnu, which looks up doesNotUnderstand: on the receiver's class chain (instance- or class-side as appropriate). User overrides intercept unknowns and see the symbol selector + arguments array in the Message.
- super send. Method invocation captures the defining class on the frame; st-super-send walks from (st-class-superclass defining-class) (instance- or class-side as appropriate). Falls through primitives → DNU when no method is found. Receiver is preserved as self, so ivar mutations stick. Verified for: subclass override calls parent, inherited super resolves to defining class's parent (not receiver's), multi-level A→B→C chain, super inside a block, super walks past an intermediate class with no local override.
- 30+ tests in lib/smalltalk/tests/eval.sx (60 tests, covering literals through user-class method dispatch with cascades and closures)
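The cached lookup described above (negative results stored so DNU paths do not re-walk, whole-cache invalidation on any class or method change) might look like the following. This is a simplified Python sketch under stated assumptions: MethodTable and its methods are hypothetical names, the cache key drops the instance/class side component, and invalidation simply clears everything.

```python
NOT_FOUND = object()  # stands in for the :not-found sentinel


class MethodTable:
    """Hypothetical model of a class table with a negative-caching method lookup."""

    def __init__(self):
        self.classes = {}  # name -> {"superclass": name or None, "methods": {...}}
        self.cache = {}    # (class name, selector) -> method or NOT_FOUND
        self.hits = 0
        self.misses = 0

    def define_class(self, name, superclass=None):
        self.classes[name] = {"superclass": superclass, "methods": {}}
        self.cache.clear()  # any structural change invalidates the cache

    def add_method(self, name, selector, fn):
        self.classes[name]["methods"][selector] = fn
        self.cache.clear()

    def remove_method(self, name, selector):
        self.classes[name]["methods"].pop(selector, None)
        self.cache.clear()

    def _walk(self, name, selector):
        # Uncached walk: class -> superclass -> ... -> None.
        while name is not None:
            cls = self.classes[name]
            if selector in cls["methods"]:
                return cls["methods"][selector]
            name = cls["superclass"]
        return None

    def lookup(self, name, selector):
        key = (name, selector)
        if key in self.cache:
            self.hits += 1
            cached = self.cache[key]
            return None if cached is NOT_FOUND else cached
        self.misses += 1
        found = self._walk(name, selector)
        # Store misses too, so repeated unknown sends skip the chain walk.
        self.cache[key] = NOT_FOUND if found is None else found
        return found
```

Clearing the whole cache on every change is the bluntest correct policy; it trades some re-walking after a redefinition for never having to reason about which subclasses a change affects.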
Phase 3 — blocks + non-local return (THE SHOWCASE)
- Method invocation captures a ^k (the return continuation) and binds it as the block's escape. st-invoke wraps body in (call/cc (fn (k) ...)); the frame's :return-k is set to k. Block creation copies (get frame :return-k) onto the block. Block invocation sets the new frame's :return-k to the block's saved one, so non-local return reaches back through any number of intermediate block invocations.
- ^expr from inside a block invokes that captured ^k. The "return" AST type evaluates the expression then calls (k v) on the frame's :return-k. Verified: detect:in:-style early exit, multi-level nested blocks, ^ from inside to:do: / whileTrue:, ^ from a block passed to a different method (Caller→Helper) returns from Caller.
- BlockContext>>value, value:, value:value:, …, valueWithArguments:
- whileTrue: / whileTrue / whileFalse: / whileFalse as ordinary block sends; runtime intrinsifies the loop in the bytecode JIT
- ifTrue: / ifFalse: / ifTrue:ifFalse: as block sends, similarly intrinsified
- Escape past returned-from method raises BlockContext>>cannotReturn:
- Classic programs in lib/smalltalk/tests/programs/:
  - eight-queens.st
  - quicksort.st
  - mandelbrot.st
  - life.st (Conway's Life, glider gun)
  - fibonacci.st (recursive + memoised)
- lib/smalltalk/conformance.sh + runner, scoreboard.json + scoreboard.md
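The non-local return behaviour listed above can be illustrated without call/cc. The sketch below is a Python model that swaps in exceptions as a one-shot escape continuation; it is not how st-invoke is written, and every name in it (Frame, invoke, nlr, CannotReturn) is hypothetical. It shows the two properties the roadmap calls out: ^ inside a block returns from the block's home method even when the block runs inside another method, and ^ after the home method has returned fails.

```python
class _Return(Exception):
    """Carries a ^value up to the method activation that owns `frame`."""

    def __init__(self, frame, value):
        super().__init__()
        self.frame, self.value = frame, value


class CannotReturn(Exception):
    """Raised when ^ targets a method activation that has already returned."""


class Frame:
    def __init__(self):
        self.live = True  # False once the owning method has returned


def invoke(method, *args):
    # Each method activation gets a fresh frame, which plays the role of ^k.
    frame = Frame()
    try:
        return method(frame, *args)
    except _Return as escape:
        if escape.frame is not frame:
            raise  # aimed at an outer activation: keep unwinding
        return escape.value
    finally:
        frame.live = False


def nlr(frame, value):
    """Model of ^value evaluated inside a block whose home frame is `frame`."""
    if not frame.live:
        raise CannotReturn("home method already returned")
    raise _Return(frame, value)


def helper_do(frame, items, block):
    # Helper method: calls the block for each item, knows nothing about ^.
    for item in items:
        block(item)
    return None


def first_even(frame, items):
    def block(x):
        if x % 2 == 0:
            nlr(frame, x)  # ^x returns from first_even, not from block or helper_do

    invoke(helper_do, items, block)
    return "none"


def make_escaping_block(frame):
    # Answers a block that still holds this (soon dead) frame.
    return lambda: nlr(frame, 42)
```

With these definitions, `invoke(first_even, [1, 3, 4, 5])` answers 4 by unwinding through the `helper_do` activation, `invoke(first_even, [1, 3])` falls through to "none", and calling the block answered by `invoke(make_escaping_block)` raises `CannotReturn`.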
Phase 4 — reflection + MOP
- Object>>class, class>>name, class>>superclass, class>>methodDict, class>>selectors
- Object>>perform: / perform:with: / perform:withArguments:
- Object>>respondsTo:, Object>>isKindOf:, Object>>isMemberOf:
- Behavior>>compile: (runtime method addition)
- Object>>becomeForward: (one-way become; rewrites the class field of aReceiver)
- Exceptions: Exception, Error, signal, signal:, on:do:, ensure:, ifCurtailed:, built on top of SX handler-bind / raise
Phase 5 — collections + numeric tower
- SequenceableCollection / OrderedCollection / Array / String / Symbol
- HashedCollection / Set / Dictionary / IdentityDictionary
- Stream hierarchy: ReadStream / WriteStream / ReadWriteStream
- Number tower: SmallInteger / LargePositiveInteger / Float / Fraction
- String>>format:, printOn: for everything
Phase 6 — SUnit + corpus to 200+
- Port SUnit (TestCase, TestSuite, TestResult) — written in SX-Smalltalk, runs in itself
- Vendor a slice of Pharo Kernel-Tests and Collections-Tests
- Drive the scoreboard up: aim for 200+ green tests
- Stretch: ANSI Smalltalk validator subset
Phase 7 — speed (optional)
- Method-dictionary inline caching (already in CEK as a primitive; just wire selector cache)
- Block intrinsification beyond whileTrue: / ifTrue:
- Compare against GNU Smalltalk on the corpus
Progress log
Newest first. Agent appends on every commit.
- 2026-04-25: THE SHOWCASE: non-local return via captured method-return continuations + 14 NLR tests (lib/smalltalk/tests/nlr.sx). st-invoke wraps body in call/cc; blocks copy creating method's ^k; ^expr invokes that k. Verified across nested blocks, to:do: / whileTrue:, blocks passed to different methods (Caller→Helper escapes back to Caller), inner-vs-outer method nesting. Sentinel-based return removed. 301/301 total.
- 2026-04-25: super send + 9 tests (lib/smalltalk/tests/super.sx). st-super-send walks from defining-class's superclass; class-side aware; primitives → DNU fallback. Also fixed top-level | temps | parsing in st-parse (the absence of which was silently aborting earlier eval/dnu tests; counts go from 274 → 287, with previously-skipped tests now actually running).
- 2026-04-25: doesNotUnderstand: + 12 DNU tests (lib/smalltalk/tests/dnu.sx). Bootstrap installs Message (with selector/arguments accessors). Primitives signal :unhandled instead of erroring; st-dnu builds a Message and walks doesNotUnderstand: lookup. User Object DNU intercepts unknown sends to native receivers (Number, String, Block) too. 267/267 total.
- 2026-04-25: method-lookup cache (st-method-cache keyed by class|selector|side, stores :not-found for misses). Invalidation on define/add/remove + bootstrap. st-class-remove-method! added. Stats helpers + 10 cache tests; 255/255 total.
- 2026-04-25: smalltalk-eval-ast + 60 eval tests (lib/smalltalk/eval.sx, lib/smalltalk/tests/eval.sx). Frame chain with mutable locals/ivars (via dict-set!), full literal eval, send dispatch (user methods + native primitive tables for Number/String/Boolean/Nil/Array/Block/Class), block closures, while/to:do:, cascades returning last, sentinel-based ^ return. User Point class round-trip works including + returning a fresh point. 245/245 total.
- 2026-04-25: class table + bootstrap (lib/smalltalk/runtime.sx, lib/smalltalk/tests/runtime.sx). Canonical hierarchy, type→class mapping for native SX values, instance construction, ivar inheritance, method install with :defining-class stamp, instance- and class-side method lookup walking the superclass chain. 54 new tests, 185/185 total.
- 2026-04-25: chunk-stream parser + pragmas + 21 chunk/pragma tests (lib/smalltalk/tests/parse_chunks.sx). st-read-chunks (with !! doubling), st-parse-chunks state machine for methodsFor: batches incl. class-side. Pragmas with multiple keyword pairs, signed numeric / string / symbol args, in either pragma-then-temps or temps-then-pragma order. 131/131 tests pass.
- 2026-04-25: expression-level parser + 47 parse tests (lib/smalltalk/parser.sx, lib/smalltalk/tests/parse.sx). Full message precedence (unary > binary > keyword), cascades, blocks with params/temps, literal/byte arrays, assignment chain, method headers (unary/binary/keyword). Chunk-format ! ! driver deferred to a follow-up box. 110/110 tests pass.
- 2026-04-25: tokenizer + 63 tests (lib/smalltalk/tokenizer.sx, lib/smalltalk/tests/tokenize.sx, lib/smalltalk/test.sh). All token types covered except scaled decimals 1.5s2 (deferred). #( and #[ emit open tokens; literal-array contents lexed as ordinary tokens for the parser to interpret.
Blockers
Shared-file issues that need someone else to fix. Minimal repro only.
- (none yet)