rose-ash/plans/smalltalk-on-sx.md
giles c33d03d2a2
smalltalk: non-local return via captured ^k + 14 nlr tests
2026-04-25 03:40:01 +00:00

Smalltalk-on-SX: blocks with non-local return on delimited continuations

The headline showcase is blocks: Smalltalk's closures with non-local return (^expr returns from the enclosing method, not just from the block). Every other Smalltalk on top of a host VM (RSqueak on PyPy, GemStone on C, Maxine on Java) reinvents non-local return on whatever stack discipline the host gives it. On SX it's a one-liner: a block holds a captured continuation; ^ just invokes it. Message-passing OO falls out cheaply on top of the existing component / dispatch machinery.

End-state goal: ANSI-ish Smalltalk-80 subset, SUnit working, ~200 hand-written tests + a vendored slice of the Pharo kernel tests, classic corpus (eight queens, quicksort, mandelbrot, Conway's Life).

Scope decisions (defaults — override by editing before we spawn)

  • Syntax: Pharo / Squeak chunk format (! separators, Object subclass: #Foo …). No fileIn/fileOut images — text source only.
  • Conformance: ANSI X3J20 as a target, not bug-for-bug Squeak. "Reads like Smalltalk, runs like Smalltalk."
  • Test corpus: SUnit ported to SX-Smalltalk + custom programs + a curated slice of Pharo Kernel-Tests / Collections-Tests.
  • Image: out of scope. Source-only. No become: between sessions, no snapshotting.
  • Reflection: class, respondsTo:, perform:, doesNotUnderstand: in. become: (object-identity swap) in — it's a good CEK exercise. Method modification at runtime in.
  • GUI / Morphic / threads: out entirely.

Ground rules

  • Scope: only touch lib/smalltalk/** and plans/smalltalk-on-sx.md. Don't edit spec/, hosts/, shared/, or any other lib/<lang>/**. Smalltalk primitives go in lib/smalltalk/runtime.sx.
  • SX files: use sx-tree MCP tools only.
  • Commits: one feature per commit. Keep ## Progress log updated and tick roadmap boxes.

Architecture sketch

Smalltalk source
    │
    ▼
lib/smalltalk/tokenizer.sx  — selectors, keywords, literals, $c, #sym, #(…), $'…'
    │
    ▼
lib/smalltalk/parser.sx     — AST: classes, methods, blocks, cascades, sends
    │
    ▼
lib/smalltalk/transpile.sx  — AST → SX AST (entry: smalltalk-eval-ast)
    │
    ▼
lib/smalltalk/runtime.sx    — class table, MOP, dispatch, primitives

Core mapping:

  • Class = SX dict {:name :superclass :ivars :methods :class-methods :metaclass}. Class table is a flat dict keyed by class name.
  • Object = SX dict {:class :ivars}, with ivars keyed by symbol. Tagged ints / floats / strings / symbols are not boxed; their class is looked up by SX type.
  • Method = SX lambda closing over a self binding + temps. Body wrapped in a delimited continuation so ^ can escape.
  • Message send = (st-send receiver selector args) — does class-table lookup, walks superclass chain, falls back to doesNotUnderstand: with a Message object.
  • Block [:x | … ^v … ] = lambda + captured ^k (the method-return continuation). Invoking ^ calls k; outer block invocation past method return raises BlockContext>>cannotReturn:.
  • Cascade r m1; m2; m3 = (let ((tmp r)) (st-send tmp 'm1 ()) (st-send tmp 'm2 ()) (st-send tmp 'm3 ())).
  • ifTrue:ifFalse: / whileTrue: = ordinary block sends; the runtime intrinsifies them in the JIT path so they compile to native branches (Tier 1 of bytecode expansion already covers this pattern).
  • become: = swap two object identities everywhere. In SX a full swap is a heap walk, so by default we restrict to the one-way form (becomeForward: in Phase 4), which is cheap: it only rewrites the class field.
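
The send and cascade mappings above can be modelled outside SX. The following is a minimal Python sketch, not the runtime itself: `class_table`, `define_class`, `lookup`, `st_send` and `cascade` are illustrative names that only loosely mirror the SX functions, and Python dicts stand in for SX dicts.

```python
# Hypothetical Python model of the class table and st-send.
# All names here are illustrative assumptions, not the SX runtime API.

class_table = {}  # flat dict keyed by class name


def define_class(name, superclass, methods=None):
    class_table[name] = {
        "name": name,
        "superclass": superclass,
        "methods": dict(methods or {}),
    }


def lookup(class_name, selector):
    """Walk the superclass chain; return the method or None."""
    while class_name is not None:
        cls = class_table[class_name]
        if selector in cls["methods"]:
            return cls["methods"][selector]
        class_name = cls["superclass"]
    return None


def st_send(receiver, selector, args=()):
    method = lookup(receiver["class"], selector)
    if method is not None:
        return method(receiver, *args)
    # Fall back to doesNotUnderstand: with a Message-like object.
    dnu = lookup(receiver["class"], "doesNotUnderstand:")
    if dnu is None:
        raise AttributeError(f"{receiver['class']} does not understand {selector}")
    message = {"selector": selector, "arguments": list(args)}
    return dnu(receiver, message)


def cascade(receiver, sends):
    """r m1; m2; m3: the receiver is evaluated once, the last result is returned."""
    result = None
    for selector, args in sends:
        result = st_send(receiver, selector, args)
    return result


define_class("Object", None)
define_class("Animal", "Object", {"speak": lambda self: "..."})
define_class("Dog", "Animal", {
    "speak": lambda self: "Woof",
    "doesNotUnderstand:": lambda self, msg: ("dnu", msg["selector"], msg["arguments"]),
})
define_class("Cat", "Animal")

dog = {"class": "Dog", "ivars": {}}
cat = {"class": "Cat", "ivars": {}}
```

In this model `Cat` inherits `speak` from `Animal`, while `Dog` both overrides it and intercepts unknown selectors through its own `doesNotUnderstand:`.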

Roadmap

Phase 1 — tokenizer + parser

  • Tokenizer: identifiers, keywords (foo:), binary selectors (+, ==, ,, ->, ~= etc.), numbers (radix 16r1F; scaled 1.5s2 deferred), strings '…''…', characters $c, symbols #foo #'foo bar' #+, byte arrays #[1 2 3] (open token), literal arrays #(1 #foo 'x') (open token), comments "…"
  • Parser (expression level): blocks [:a :b | | t1 t2 | …], cascades, message precedence (unary > binary > keyword), assignment, return, statement sequences, literal arrays, byte arrays, paren grouping, method headers (+ other, at:put:, unary, with temps and body). Class-definition keyword messages parse as ordinary keyword sends — no special-case needed.
  • Parser (chunk-stream level): st-read-chunks splits source on ! (with !! doubling) and st-parse-chunks runs the Pharo file-in state machine — methodsFor: / class methodsFor: opens a method batch, an empty chunk closes it. Pragmas <primitive: …> (incl. multiple keyword pairs, before or after temps, multiple per method) parsed into the method AST.
  • Unit tests in lib/smalltalk/tests/parse.sx
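
The chunk splitting described above (split on !, with !! standing for a literal !) can be sketched as follows. This is a Python illustration only; `read_chunks` is a hypothetical name modelled on st-read-chunks, and the real reader may treat whitespace and leading empty chunks differently.

```python
def read_chunks(source):
    """Split chunk-format source on '!'; a doubled '!!' is a literal '!'.

    Empty chunks are kept, because an empty chunk is what closes a
    methodsFor: batch in the file-in state machine.
    """
    chunks, current, i = [], [], 0
    while i < len(source):
        ch = source[i]
        if ch == "!":
            if i + 1 < len(source) and source[i + 1] == "!":
                current.append("!")  # escaped bang stays in the chunk
                i += 2
                continue
            chunks.append("".join(current).strip())
            current = []
            i += 1
            continue
        current.append(ch)
        i += 1
    tail = "".join(current).strip()
    if tail:  # unterminated trailing text still counts as a chunk
        chunks.append(tail)
    return chunks
```

For example, `read_chunks("Object subclass: #Foo! 'hi!!' printNl! !")` yields three chunks: the class definition, the expression with a single `!` restored inside the string, and a final empty chunk.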

Phase 2 — object model + sequential eval

  • Class table + bootstrap (lib/smalltalk/runtime.sx): canonical hierarchy installed (Object, Behavior, ClassDescription, Class, Metaclass, UndefinedObject, Boolean/True/False, Magnitude/Number/Integer/SmallInteger/Float/Character, Collection/SequenceableCollection/ArrayedCollection/Array/String/Symbol/OrderedCollection/Dictionary, BlockClosure). User class definition via st-class-define!, methods via st-class-add-method! (stamps :defining-class for super), method lookup walks chain, ivars accumulated through superclass chain, native SX value types map to Smalltalk classes via st-class-of.
  • smalltalk-eval-ast (lib/smalltalk/eval.sx): all literal kinds, ident resolution (locals → ivars → class refs), self/super/thisContext, assignment (locals or ivars, mutating), message send, cascade, sequence, and ^return via a sentinel marker (proper continuation-based escape is the Phase 3 showcase). Frames carry a parent chain so blocks close over outer locals. Primitive method tables for SmallInteger/Float, String/Symbol, Boolean, UndefinedObject, Array, BlockClosure (value/value:/whileTrue:/etc.), and class-side new/name/etc. Also satisfies "30+ tests" — 60 eval tests.
  • Method lookup: walk class → superclass already in st-method-lookup-walk; new cached wrapper st-method-lookup keys on (class, selector, side) and stores :not-found for negative results so DNU paths don't re-walk. Cache invalidates on st-class-define!, st-class-add-method!, st-class-add-class-method!, st-class-remove-method!, and full bootstrap. Stats helpers st-method-cache-stats / st-method-cache-reset-stats! for tests + later debugging.
  • doesNotUnderstand: fallback. Message class added at bootstrap with selector/arguments ivars and accessor methods. Primitive senders (Number/String/Boolean/Nil/Array/BlockClosure/class-side) now return the :unhandled sentinel for unknown selectors; st-send builds a Message via st-make-message and routes through st-dnu, which looks up doesNotUnderstand: on the receiver's class chain (instance- or class-side as appropriate). User overrides intercept unknowns and see the symbol selector + arguments array in the Message.
  • super send. Method invocation captures the defining class on the frame; st-super-send walks from (st-class-superclass defining-class) (instance- or class-side as appropriate). Falls through primitives → DNU when no method is found. Receiver is preserved as self, so ivar mutations stick. Verified for: subclass override calls parent, inherited super resolves to defining class's parent (not receiver's), multi-level A→B→C chain, super inside a block, super walks past an intermediate class with no local override.
  • 30+ tests in lib/smalltalk/tests/eval.sx (60 tests, covering literals through user-class method dispatch with cascades and closures)
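
The cached lookup with negative entries can be illustrated with a short Python sketch. It assumes an uncached `walk(cls, selector, side)` function that returns `None` on a miss; `MethodCache`, `walk` and the sample `methods` table are hypothetical names, and the real st-method-lookup may differ in detail.

```python
class MethodCache:
    """Cache keyed on (class, selector, side) that also remembers misses."""

    NOT_FOUND = object()  # sentinel playing the role of :not-found

    def __init__(self, walk):
        self._walk = walk      # uncached superclass-chain lookup (assumed)
        self._entries = {}
        self.hits = 0
        self.misses = 0

    def lookup(self, cls, selector, side="instance"):
        key = (cls, selector, side)
        if key in self._entries:
            self.hits += 1
            cached = self._entries[key]
            return None if cached is MethodCache.NOT_FOUND else cached
        self.misses += 1
        found = self._walk(cls, selector, side)
        # Store a sentinel for misses so DNU paths do not re-walk the chain.
        self._entries[key] = MethodCache.NOT_FOUND if found is None else found
        return found

    def invalidate(self):
        """Call whenever a class or method is defined, added or removed."""
        self._entries.clear()


# Toy backing store and walk function, for demonstration only.
methods = {("Point", "x", "instance"): "method-x"}
walk_calls = []


def walk(cls, selector, side):
    walk_calls.append((cls, selector, side))
    return methods.get((cls, selector, side))


cache = MethodCache(walk)
```

Flushing the whole cache on any class or method change is the simplest policy that stays correct; finer-grained invalidation would be an optimisation on top.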

Phase 3 — blocks + non-local return (THE SHOWCASE)

  • Method invocation captures a ^k (the return continuation) and binds it as the block's escape. st-invoke wraps body in (call/cc (fn (k) ...)); the frame's :return-k is set to k. Block creation copies (get frame :return-k) onto the block. Block invocation sets the new frame's :return-k to the block's saved one — so non-local return reaches back through any number of intermediate block invocations.
  • ^expr from inside a block invokes that captured ^k. The "return" AST type evaluates the expression then calls (k v) on the frame's :return-k. Verified: detect:in: style early-exit, multi-level nested blocks, ^ from inside to:do:/whileTrue:, ^ from a block passed to a different method (Caller→Helper) returns from Caller.
  • BlockContext>>value, value:, value:value:, …, valueWithArguments:
  • whileTrue: / whileTrue / whileFalse: / whileFalse as ordinary block sends — runtime intrinsifies the loop in the bytecode JIT
  • ifTrue: / ifFalse: / ifTrue:ifFalse: as block sends, similarly intrinsified
  • Escape past returned-from method raises BlockContext>>cannotReturn:
  • Classic programs in lib/smalltalk/tests/programs/:
    • eight-queens.st
    • quicksort.st
    • mandelbrot.st
    • life.st (Conway's Life, glider gun)
    • fibonacci.st (recursive + memoised)
  • lib/smalltalk/conformance.sh + runner, scoreboard.json + scoreboard.md
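
The control flow of non-local return can be approximated in a host without call/cc. The sketch below is a Python stand-in that uses an exception as the escape: `invoke`, `Frame.return_k`, `NonLocalReturn` and `CannotReturn` are hypothetical names, and the SX implementation uses a real captured continuation rather than exceptions.

```python
class NonLocalReturn(Exception):
    """Carries a value back to the frame of the method that created the block."""

    def __init__(self, frame, value):
        super().__init__()
        self.frame = frame
        self.value = value


class CannotReturn(Exception):
    """Raised when ^ runs after the home method has already returned."""


class Frame:
    def __init__(self):
        self.active = True

    def return_k(self, value):
        if not self.active:
            raise CannotReturn("home method already returned")
        raise NonLocalReturn(self, value)


def invoke(method_body, *args):
    """Run a method body with a fresh frame whose return_k escapes to here."""
    frame = Frame()
    try:
        return method_body(frame, *args)
    except NonLocalReturn as nlr:
        if nlr.frame is not frame:
            raise  # belongs to an outer method; keep unwinding
        return nlr.value
    finally:
        frame.active = False


def do(items, block):
    """Helper method that only calls the block (like Collection>>do:)."""
    def body(frame):
        for item in items:
            block(item)
        return None
    return invoke(body)


def detect(items, predicate):
    def body(frame):
        def block(item):
            if predicate(item):
                frame.return_k(item)  # ^item: returns from detect, not from block
        do(items, block)
        return "none"
    return invoke(body)


def leak_block():
    """Returns a block whose home method is dead by the time it runs."""
    def body(frame):
        return lambda: frame.return_k(42)
    return invoke(body)
```

Here the block created in `detect` runs inside a different method (`do`), yet its return unwinds through `do` and lands in `detect`. Calling the block returned by `leak_block` raises `CannotReturn`, mirroring the cannotReturn: case.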

Phase 4 — reflection + MOP

  • Object>>class, class>>name, class>>superclass, class>>methodDict, class>>selectors
  • Object>>perform: / perform:with: / perform:withArguments:
  • Object>>respondsTo:, Object>>isKindOf:, Object>>isMemberOf:
  • Behavior>>compile: — runtime method addition
  • Object>>becomeForward: (one-way become; rewrites the class field of aReceiver)
  • Exceptions: Exception, Error, signal, signal:, on:do:, ensure:, ifCurtailed: — built on top of SX handler-bind/raise
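
For the unwinding selectors in the exceptions item, the intended difference between ensure: and ifCurtailed: can be sketched in Python. This assumes the usual Smalltalk semantics (ensure: always runs its cleanup block; ifCurtailed: runs it only on abnormal exit); `ensure` and `if_curtailed` are hypothetical helper names, not part of the SX runtime.

```python
def ensure(block, cleanup):
    """Run block; always run cleanup afterwards, even on an exception."""
    try:
        return block()
    finally:
        cleanup()


def if_curtailed(block, cleanup):
    """Run block; run cleanup only if block did not complete normally."""
    completed = False
    try:
        result = block()
        completed = True
        return result
    finally:
        if not completed:
            cleanup()
```

Because a non-local return also unwinds through these frames, whatever mechanism implements ^ needs to trigger the same cleanup paths.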

Phase 5 — collections + numeric tower

  • SequenceableCollection/OrderedCollection/Array/String/Symbol
  • HashedCollection/Set/Dictionary/IdentityDictionary
  • Stream hierarchy: ReadStream/WriteStream/ReadWriteStream
  • Number tower: SmallInteger/LargePositiveInteger/Float/Fraction
  • String>>format:, printOn: for everything

Phase 6 — SUnit + corpus to 200+

  • Port SUnit (TestCase, TestSuite, TestResult) — written in SX-Smalltalk, runs in itself
  • Vendor a slice of Pharo Kernel-Tests and Collections-Tests
  • Drive the scoreboard up: aim for 200+ green tests
  • Stretch: ANSI Smalltalk validator subset

Phase 7 — speed (optional)

  • Method-dictionary inline caching (already in CEK as a primitive; just wire selector cache)
  • Block intrinsification beyond whileTrue: / ifTrue:
  • Compare against GNU Smalltalk on the corpus

Progress log

Newest first. Agent appends on every commit.

  • 2026-04-25: THE SHOWCASE — non-local return via captured method-return continuations + 14 NLR tests (lib/smalltalk/tests/nlr.sx). st-invoke wraps body in call/cc; blocks copy creating method's ^k; ^expr invokes that k. Verified across nested blocks, to:do: / whileTrue:, blocks passed to different methods (Caller→Helper escapes back to Caller), inner-vs-outer method nesting. Sentinel-based return removed. 301/301 total.
  • 2026-04-25: super send + 9 tests (lib/smalltalk/tests/super.sx). st-super-send walks from defining-class's superclass; class-side aware; primitives → DNU fallback. Also fixed top-level | temps | parsing in st-parse (the absence of which was silently aborting earlier eval/dnu tests — counts go from 274 → 287, with previously-skipped tests now actually running).
  • 2026-04-25: doesNotUnderstand: + 12 DNU tests (lib/smalltalk/tests/dnu.sx). Bootstrap installs Message (with selector/arguments accessors). Primitives signal :unhandled instead of erroring; st-dnu builds a Message and walks doesNotUnderstand: lookup. User Object DNU intercepts unknown sends to native receivers (Number, String, Block) too. 267/267 total.
  • 2026-04-25: method-lookup cache (st-method-cache keyed by class|selector|side, stores :not-found for misses). Invalidation on define/add/remove + bootstrap. st-class-remove-method! added. Stats helpers + 10 cache tests; 255/255 total.
  • 2026-04-25: smalltalk-eval-ast + 60 eval tests (lib/smalltalk/eval.sx, lib/smalltalk/tests/eval.sx). Frame chain with mutable locals/ivars (via dict-set!), full literal eval, send dispatch (user methods + native primitive tables for Number/String/Boolean/Nil/Array/Block/Class), block closures, while/to:do:, cascades returning last, sentinel-based ^return. User Point class round-trip works including + returning a fresh point. 245/245 total.
  • 2026-04-25: class table + bootstrap (lib/smalltalk/runtime.sx, lib/smalltalk/tests/runtime.sx). Canonical hierarchy, type→class mapping for native SX values, instance construction, ivar inheritance, method install with :defining-class stamp, instance- and class-side method lookup walking the superclass chain. 54 new tests, 185/185 total.
  • 2026-04-25: chunk-stream parser + pragmas + 21 chunk/pragma tests (lib/smalltalk/tests/parse_chunks.sx). st-read-chunks (with !! doubling), st-parse-chunks state machine for methodsFor: batches incl. class-side. Pragmas with multiple keyword pairs, signed numeric / string / symbol args, in either pragma-then-temps or temps-then-pragma order. 131/131 tests pass.
  • 2026-04-25: expression-level parser + 47 parse tests (lib/smalltalk/parser.sx, lib/smalltalk/tests/parse.sx). Full message precedence (unary > binary > keyword), cascades, blocks with params/temps, literal/byte arrays, assignment chain, method headers (unary/binary/keyword). Chunk-format ! ! driver deferred to a follow-up box. 110/110 tests pass.
  • 2026-04-25: tokenizer + 63 tests (lib/smalltalk/tokenizer.sx, lib/smalltalk/tests/tokenize.sx, lib/smalltalk/test.sh). All token types covered except scaled decimals 1.5s2 (deferred). #( and #[ emit open tokens; literal-array contents lexed as ordinary tokens for the parser to interpret.

Blockers

Shared-file issues that need someone else to fix. Minimal repro only.

  • (none yet)