Smalltalk-on-SX: blocks with non-local return on delimited continuations
The headline showcase is blocks: Smalltalk's closures with non-local return (^expr returns from the enclosing method, not just from the block). Every other Smalltalk on top of a host VM (RSqueak on PyPy, GemStone on C, Maxine on Java) reinvents non-local return on whatever stack discipline the host gives it. On SX it's a one-liner: a block holds a captured continuation, and ^ just invokes it. Message-passing OO falls out cheaply on top of the existing component / dispatch machinery.
End-state goal: ANSI-ish Smalltalk-80 subset, SUnit working, ~200 hand-written tests + a vendored slice of the Pharo kernel tests, classic corpus (eight queens, quicksort, mandelbrot, Conway's Life).
Scope decisions (defaults — override by editing before we spawn)
- Syntax: Pharo / Squeak chunk format (! separators, Object subclass: #Foo …). No fileIn/fileOut images; text source only.
- Conformance: ANSI X3J20 as a target, not bug-for-bug Squeak. "Reads like Smalltalk, runs like Smalltalk."
- Test corpus: SUnit ported to SX-Smalltalk + custom programs + a curated slice of Pharo Kernel-Tests / Collections-Tests.
- Image: out of scope. Source-only. No become: between sessions, no snapshotting.
- Reflection: class, respondsTo:, perform:, doesNotUnderstand: in. become: (object-identity swap) in; it's a good CEK exercise. Method modification at runtime in.
- GUI / Morphic / threads: out entirely.
Ground rules
- Scope: only touch lib/smalltalk/** and plans/smalltalk-on-sx.md. Don't edit spec/, hosts/, shared/, or any other lib/<lang>/**. Smalltalk primitives go in lib/smalltalk/runtime.sx.
- SX files: use sx-tree MCP tools only.
- Commits: one feature per commit. Keep ## Progress log updated and tick roadmap boxes.
Architecture sketch
Smalltalk source
│
▼
lib/smalltalk/tokenizer.sx — selectors, keywords, literals, $c, #sym, #(…), $'…'
│
▼
lib/smalltalk/parser.sx — AST: classes, methods, blocks, cascades, sends
│
▼
lib/smalltalk/transpile.sx — AST → SX AST (entry: smalltalk-eval-ast)
│
▼
lib/smalltalk/runtime.sx — class table, MOP, dispatch, primitives
Core mapping:
- Class = SX dict {:name :superclass :ivars :methods :class-methods :metaclass}. Class table is a flat dict keyed by class name.
- Object = SX dict {:class :ivars}, with ivars keyed by symbol. Tagged ints / floats / strings / symbols are not boxed; their class is looked up by SX type.
- Method = SX lambda closing over a self binding + temps. Body wrapped in a delimited continuation so ^ can escape.
- Message send = (st-send receiver selector args): does class-table lookup, walks the superclass chain, falls back to doesNotUnderstand: with a Message object.
- Block [:x | … ^v … ] = lambda + captured ^k (the method-return continuation). Invoking ^ calls k; invoking ^ from a block whose home method has already returned raises BlockContext>>cannotReturn:.
- Cascade r m1; m2; m3 = (let ((tmp r)) (st-send tmp 'm1 ()) (st-send tmp 'm2 ()) (st-send tmp 'm3 ())).
- ifTrue:ifFalse: / whileTrue: = ordinary block sends; the runtime intrinsifies them in the JIT path so they compile to native branches (Tier 1 of bytecode expansion already covers this pattern).
- become: = swap two object identities everywhere. In SX this is a heap walk, but we restrict to oneWayBecome: (cheap: rewrite class field) by default.
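The send and cascade rules above can be modelled in a few lines of Python. This is only a sketch of the intended semantics, not the SX runtime: the names CLASS_TABLE, define_class, class_of, lookup, st_send and cascade are hypothetical, and the classification of unboxed values by host type is an assumption standing in for "looked up by SX type".

```python
# Hypothetical Python model of the core mapping; all names are illustrative.
CLASS_TABLE = {}  # flat table keyed by class name, as in the plan


def define_class(name, superclass, methods):
    CLASS_TABLE[name] = {"name": name, "superclass": superclass, "methods": methods}


def class_of(receiver):
    # Boxed objects carry a class field; unboxed ints are classified by host type.
    if isinstance(receiver, dict):
        return receiver["class"]
    if isinstance(receiver, int):
        return "Integer"
    return "Object"


def lookup(class_name, selector):
    # Walk the superclass chain until some method dictionary has the selector.
    while class_name is not None:
        cls = CLASS_TABLE[class_name]
        if selector in cls["methods"]:
            return cls["methods"][selector]
        class_name = cls["superclass"]
    return None


def st_send(receiver, selector, args=()):
    method = lookup(class_of(receiver), selector)
    if method is not None:
        return method(receiver, *args)
    # Fallback: reify the failed send as a Message and send doesNotUnderstand:.
    message = {"class": "Message",
               "ivars": {"selector": selector, "arguments": list(args)}}
    handler = lookup(class_of(receiver), "doesNotUnderstand:")
    if handler is None:
        raise LookupError(selector)
    return handler(receiver, message)


def cascade(receiver, sends):
    # r m1; m2; m3: the receiver is evaluated once; the last result is the value.
    result = None
    for selector, args in sends:
        result = st_send(receiver, selector, args)
    return result


def _increment(self):
    self["ivars"]["n"] += 1
    return self


define_class("Object", None, {
    "yourself": lambda self: self,
    "doesNotUnderstand:": lambda self, msg: ("dnu", msg["ivars"]["selector"]),
})
define_class("Integer", "Object", {"+": lambda self, other: self + other})
define_class("Counter", "Object", {
    "increment": _increment,
    "n": lambda self: self["ivars"]["n"],
})
```

With these definitions, st_send(3, "+", (4,)) dispatches on the unboxed integer, and a send of an unknown selector to a Counter reaches the doesNotUnderstand: method inherited from Object.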
Roadmap
Phase 1 — tokenizer + parser
- Tokenizer: identifiers, keywords (foo:), binary selectors (+ == , -> ~= etc.), numbers (radix 16r1F; scaled 1.5s2 deferred), strings '…''…', characters $c, symbols #foo #'foo bar' #+, byte arrays #[1 2 3] (open token), literal arrays #(1 #foo 'x') (open token), comments "…"
- Parser (expression level): blocks [:a :b | | t1 t2 | …], cascades, message precedence (unary > binary > keyword), assignment, return, statement sequences, literal arrays, byte arrays, paren grouping, method headers (+ other, at:put:, unary, with temps and body). Class-definition keyword messages parse as ordinary keyword sends; no special case needed.
- Parser (chunk-stream level): st-read-chunks splits source on ! (with !! doubling) and st-parse-chunks runs the Pharo file-in state machine: methodsFor: / class methodsFor: opens a method batch, an empty chunk closes it. Pragmas <primitive: …> (incl. multiple keyword pairs, before or after temps, multiple per method) parsed into the method AST.
- Unit tests in lib/smalltalk/tests/parse.sx
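To make the chunk-stream rules concrete, here is a rough Python sketch of the two steps: splitting on ! with !! treated as an escaped bang, then grouping chunks into method batches. The functions read_chunks and group_chunks are hypothetical stand-ins for st-read-chunks and st-parse-chunks, and recognising a batch header by the substring methodsFor: is a simplifying assumption (the real parser works on the parsed expression).

```python
def read_chunks(source):
    """Split chunk-format source on '!'; a doubled '!!' is a literal bang."""
    chunks, current, i = [], [], 0
    while i < len(source):
        ch = source[i]
        if ch == "!":
            if i + 1 < len(source) and source[i + 1] == "!":
                current.append("!")  # escaped bang stays inside the chunk
                i += 2
                continue
            chunks.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
        i += 1
    tail = "".join(current).strip()
    if tail:
        chunks.append(tail)  # unterminated trailing chunk
    return chunks


def group_chunks(chunks):
    """Tiny file-in state machine: methodsFor: opens a batch, '' closes it."""
    entries, batch = [], None
    for chunk in chunks:
        if batch is not None:
            if chunk == "":
                batch = None  # empty chunk closes the method batch
            else:
                entries.append(("method", batch, chunk))
        elif "methodsFor:" in chunk:  # assumption: substring test only
            batch = chunk
        elif chunk:
            entries.append(("expr", chunk))
    return entries


EXAMPLE = (
    "Object subclass: #Foo!\n\n"
    "!Foo methodsFor: 'x'!\n"
    "bar\n    ^ 'hi!!'! !\n"
    "Foo new bar!"
)
```

Running group_chunks(read_chunks(EXAMPLE)) yields one expression entry for the class definition, one method entry attached to the Foo methodsFor: 'x' batch, and one expression entry for the final send.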
Phase 2 — object model + sequential eval
- Class table + bootstrap: Object, Behavior, Class, Metaclass, UndefinedObject, Boolean / True / False, Number / Integer / Float, String, Symbol, Array, Block
- smalltalk-eval-ast: literals, variable reference, assignment, message send, cascade, sequence, return
- Method lookup: walk class → superclass; cache hit-class on (class, selector)
- doesNotUnderstand: fallback constructing Message object
- super send (lookup starts at superclass of defining class, not receiver class)
- 30+ tests in lib/smalltalk/tests/eval.sx
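The difference between starting a super lookup at the defining class and at the receiver class is easiest to see on a three-level hierarchy. The Python sketch below is a model under assumptions, not the SX code: CLASSES, CACHE, define, lookup, send and super_send are hypothetical names, and passing the defining class to each method as an explicit home argument is just one way to make it available.

```python
CLASSES = {}
CACHE = {}  # (start class, selector) -> (class where found, method)


def define(name, superclass, methods):
    CLASSES[name] = {"superclass": superclass, "methods": methods}
    CACHE.clear()  # any change to a method table invalidates cached lookups


def lookup(start_class, selector):
    key = (start_class, selector)
    if key in CACHE:
        return CACHE[key]
    cls = start_class
    while cls is not None:
        method = CLASSES[cls]["methods"].get(selector)
        if method is not None:
            CACHE[key] = (cls, method)  # remember the hit class
            return CACHE[key]
        cls = CLASSES[cls]["superclass"]
    return None


def send(receiver, selector, *args):
    found = lookup(receiver["class"], selector)
    if found is None:
        raise LookupError(selector)
    home, method = found
    return method(receiver, home, *args)


def super_send(receiver, defining_class, selector, *args):
    # Start above the class that DEFINES the running method, not the receiver's.
    found = lookup(CLASSES[defining_class]["superclass"], selector)
    if found is None:
        raise LookupError(selector)
    home, method = found
    return method(receiver, home, *args)


define("A", None, {"name": lambda self, home: "A"})
define("B", "A", {"name": lambda self, home: "B>" + super_send(self, home, "name")})
define("C", "B", {})  # inherits name from B without overriding it
```

For a receiver of class C, the method found is B's, so super_send starts at A. Starting at the receiver's superclass instead would find B's method again and recurse forever.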
Phase 3 — blocks + non-local return (THE SHOWCASE)
- Method invocation captures a ^k (the return continuation) and binds it as the block's escape
- ^expr from inside a block invokes that captured ^k
- BlockContext>>value, value:, value:value:, …, valueWithArguments:
- whileTrue: / whileTrue / whileFalse: / whileFalse as ordinary block sends; the runtime intrinsifies the loop in the bytecode JIT
- ifTrue: / ifFalse: / ifTrue:ifFalse: as block sends, similarly intrinsified
- Escape past returned-from method raises BlockContext>>cannotReturn:
- Classic programs in lib/smalltalk/tests/programs/:
  - eight-queens.st
  - quicksort.st
  - mandelbrot.st
  - life.st (Conway's Life, glider gun)
  - fibonacci.st (recursive + memoised)
- lib/smalltalk/conformance.sh + runner, scoreboard.json + scoreboard.md
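Python has no delimited continuations, so the sketch below emulates only the escape use of ^k with an exception that unwinds to the home method. It is meant to illustrate the three behaviours listed above (capture at invocation, ^ from inside a block, and cannotReturn: after the home method has finished). MethodContext, invoke_method, CannotReturn and the sample methods are all hypothetical names.

```python
class _NonLocalReturn(Exception):
    """Carries a value back to the method invocation that owns `home`."""
    def __init__(self, home, value):
        super().__init__()
        self.home, self.value = home, value


class CannotReturn(Exception):
    """Models BlockContext>>cannotReturn:."""


class MethodContext:
    def __init__(self):
        self.active = True

    def ret(self, value):
        # Models ^value executed inside a block.
        if not self.active:
            raise CannotReturn("home method has already returned")
        raise _NonLocalReturn(self, value)


def invoke_method(body, *args):
    home = MethodContext()  # stands in for capturing ^k
    try:
        return body(home, *args)
    except _NonLocalReturn as escape:
        if escape.home is not home:
            raise  # the escape targets an outer method; keep unwinding
        return escape.value
    finally:
        home.active = False  # later ^ through this context cannot return


def first_even(home, items):
    def block(x):  # [:x | x even ifTrue: [^x]]
        if x % 2 == 0:
            home.ret(x)
    for item in items:  # stands in for items do: block
        block(item)
    return None


def outer(home):
    block = lambda: home.ret("from block")
    invoke_method(lambda inner_home: block())  # escape crosses the inner method
    return "not reached"


def leak_block(home):
    return lambda: home.ret(42)  # block outlives its home method
```

Here invoke_method(first_even, [1, 3, 4, 5, 6]) stops at the first even element, invoke_method(outer) returns the block's value through an intervening method invocation, and calling the block returned by invoke_method(leak_block) raises CannotReturn.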
Phase 4 — reflection + MOP
- Object>>class, class>>name, class>>superclass, class>>methodDict, class>>selectors
- Object>>perform: / perform:with: / perform:withArguments:
- Object>>respondsTo:, Object>>isKindOf:, Object>>isMemberOf:
- Behavior>>compile: (runtime method addition)
- Object>>becomeForward: (one-way become; rewrites the class field of aReceiver)
- Exceptions: Exception, Error, signal, signal:, on:do:, ensure:, ifCurtailed:, built on top of SX handler-bind / raise
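As a rough guide to what the three exception-related block messages should do, here is a Python approximation using try / except / finally. It is a sketch under a significant assumption: Python unwinds the stack before the handler runs, whereas a handler-bind based design can run the handler first, so resumption is not modelled. The function names on_do, ensure and if_curtailed are hypothetical.

```python
def on_do(block, exception_class, handler):
    # [block] on: exception_class do: [:e | handler]
    try:
        return block()
    except exception_class as exc:
        return handler(exc)


def ensure(block, cleanup):
    # [block] ensure: [cleanup] runs cleanup on normal and abnormal exit.
    try:
        return block()
    finally:
        cleanup()


def if_curtailed(block, cleanup):
    # [block] ifCurtailed: [cleanup] runs cleanup only on abnormal exit.
    try:
        result = block()
    except BaseException:
        cleanup()
        raise
    return result


def divide(a, b):
    return a / b
```

For example, on_do(lambda: divide(1, 0), ZeroDivisionError, lambda e: 0) yields the handler's value, while if_curtailed leaves its cleanup untouched when the block completes normally.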
Phase 5 — collections + numeric tower
- SequenceableCollection / OrderedCollection / Array / String / Symbol
- HashedCollection / Set / Dictionary / IdentityDictionary
- Stream hierarchy: ReadStream / WriteStream / ReadWriteStream
- Number tower: SmallInteger / LargePositiveInteger / Float / Fraction
- String>>format:, printOn: for everything
Phase 6 — SUnit + corpus to 200+
- Port SUnit (TestCase, TestSuite, TestResult) — written in SX-Smalltalk, runs in itself
- Vendor a slice of Pharo Kernel-Tests and Collections-Tests
- Drive the scoreboard up: aim for 200+ green tests
- Stretch: ANSI Smalltalk validator subset
Phase 7 — speed (optional)
- Method-dictionary inline caching (already in CEK as a primitive; just wire selector cache)
- Block intrinsification beyond whileTrue: / ifTrue:
- Compare against GNU Smalltalk on the corpus
Progress log
Newest first. Agent appends on every commit.
- 2026-04-25: chunk-stream parser + pragmas + 21 chunk/pragma tests (lib/smalltalk/tests/parse_chunks.sx). st-read-chunks (with !! doubling), st-parse-chunks state machine for methodsFor: batches incl. class-side. Pragmas with multiple keyword pairs, signed numeric / string / symbol args, in either pragma-then-temps or temps-then-pragma order. 131/131 tests pass.
- 2026-04-25: expression-level parser + 47 parse tests (lib/smalltalk/parser.sx, lib/smalltalk/tests/parse.sx). Full message precedence (unary > binary > keyword), cascades, blocks with params/temps, literal/byte arrays, assignment chain, method headers (unary/binary/keyword). Chunk-format ! ! driver deferred to a follow-up box. 110/110 tests pass.
- 2026-04-25: tokenizer + 63 tests (lib/smalltalk/tokenizer.sx, lib/smalltalk/tests/tokenize.sx, lib/smalltalk/test.sh). All token types covered except scaled decimals 1.5s2 (deferred). #( and #[ emit open tokens; literal-array contents lexed as ordinary tokens for the parser to interpret.
Blockers
Shared-file issues that need someone else to fix. Minimal repro only.
- (none yet)