# APL-on-SX: rank-polymorphic primitives + glyph parser The headline showcase is **rank polymorphism** — a single primitive (`+`, `⌈`, `⊂`, `⍳`) works uniformly on scalars, vectors, matrices, and higher-rank arrays. ~80 glyph primitives + 6 operators bind together with right-to-left evaluation; the entire language is a high-density combinator algebra. The JIT compiler + primitive table pay off massively here because almost every program is `array → array` pure pipelines. End-state goal: Dyalog-flavoured APL subset, dfns + tradfns, classic programs (game-of-life, mandelbrot, prime-sieve, n-queens, conway), 100+ green tests. ## Scope decisions (defaults — override by editing before we spawn) - **Syntax:** Dyalog APL surface, Unicode glyphs. `⎕`-quad system functions for I/O. `∇` tradfn header. - **Conformance:** "Reads like APL, runs like APL." Not byte-compat with Dyalog; we care about right-to-left semantics and rank polymorphism. - **Test corpus:** custom — APL idioms (Roger Hui style), classic programs, plus ~50 pattern tests for primitives. - **Out of scope:** ⎕-namespaces beyond a handful, complex numbers, full TAO ordering, `⎕FX` runtime function definition (use static `∇` only), nested-array-of-functions higher orders, the editor. - **Glyphs:** input via plain Unicode in `.apl` source files. Backtick-prefix shortcuts handled by the user's editor — we don't ship one. ## Ground rules - **Scope:** only touch `lib/apl/**` and `plans/apl-on-sx.md`. Don't edit `spec/`, `hosts/`, `shared/`, or any other `lib//**`. APL primitives go in `lib/apl/runtime.sx`. - **SX files:** use `sx-tree` MCP tools only. - **Commits:** one feature per commit. Keep `## Progress log` updated and tick roadmap boxes. ## Architecture sketch ``` APL source (Unicode glyphs) │ ▼ lib/apl/tokenizer.sx — glyphs, identifiers, numbers (¯ for negative), strings, strands │ ▼ lib/apl/parser.sx — right-to-left with valence resolution (mon vs dyadic by position) │ ▼ lib/apl/transpile.sx — AST → SX AST (entry: apl-eval-ast) │ ▼ lib/apl/runtime.sx — array model, ~80 primitives, 6 operators, dfns/tradfns ``` Core mapping: - **Array** = SX dict `{:shape (d1 d2 …) :ravel #(v1 v2 …)}`. Scalar is rank-0 (empty shape), vector is rank-1, matrix rank-2, etc. Type uniformity not required (heterogeneous nested arrays via "boxed" elements `⊂x`). - **Rank polymorphism** — every scalar primitive is broadcast: `1 2 3 + 4 5 6` ↦ `5 7 9`; `(2 3⍴⍳6) + 1` ↦ broadcast scalar to matrix. - **Conformability** = matching shapes, or one-side scalar, or rank-1 cycling (deferred — keep strict in v1). - **Valence** = each glyph has a monadic and a dyadic meaning; resolution is purely positional (left-arg present → dyadic). - **Operator** = takes one or two function operands, returns a derived function (`f¨` = `each f`, `f/` = `reduce f`, `f∘g` = `compose`, `f⍨` = `commute`). - **Tradfn** `∇R←L F R; locals` = named function with explicit header. - **Dfn** `{⍺+⍵}` = anonymous, `⍺` = left arg, `⍵` = right arg, `∇` = recurse. ## Roadmap ### Phase 1 — tokenizer + parser - [x] Tokenizer: Unicode glyphs (the full APL set: `+ - × ÷ * ⍟ ⌈ ⌊ | ! ? ○ ~ < ≤ = ≥ > ≠ ∊ ∧ ∨ ⍱ ⍲ , ⍪ ⍴ ⌽ ⊖ ⍉ ↑ ↓ ⊂ ⊃ ⊆ ∪ ∩ ⍳ ⍸ ⌷ ⍋ ⍒ ⊥ ⊤ ⊣ ⊢ ⍎ ⍕ ⍝`), operators (`/ \ ¨ ⍨ ∘ . ⍣ ⍤ ⍥ @`), numbers (`¯` for negative, `1E2`, `1J2` complex deferred), characters (`'a'`, `''` escape), strands (juxtaposition of literals: `1 2 3`), names, comments `⍝ …` - [x] Parser: right-to-left; classify each token as function, operator, value, or name; resolve valence positionally; dfn `{…}` body, tradfn `∇` header, guards `:`; outer product `∘.f`, inner product `f.g`, derived fns `f/ f¨ f⍨ f⍣n` - [x] Unit tests in `lib/apl/tests/parse.sx` ### Phase 2 — array model + scalar primitives - [x] Array constructor: `make-array shape ravel`, `scalar v`, `vector v…`, `enclose`/`disclose` - [x] Shape arithmetic: `⍴` (shape), `,` (ravel), `≢` (tally / first-axis-length), `≡` (depth) - [x] Scalar arithmetic primitives broadcast: `+ - × ÷ ⌈ ⌊ * ⍟ | ! ○` - [x] Scalar comparison primitives: `< ≤ = ≥ > ≠` - [x] Scalar logical: `~ ∧ ∨ ⍱ ⍲` - [x] Index generator: `⍳n` (vector 1..n or 0..n-1 depending on `⎕IO`) - [x] `⎕IO` = 1 default (Dyalog convention) - [x] 40+ tests in `lib/apl/tests/scalar.sx` ### Phase 3 — structural primitives + indexing - [x] Reshape `⍴`, ravel `,`, transpose `⍉` (full + dyadic axis spec) - [x] Take `↑`, drop `↓`, rotate `⌽` (last axis), `⊖` (first axis) - [x] Catenate `,` (last axis) and `⍪` (first axis) - [x] Index `⌷` (squad), bracket-indexing `A[I]` (sugar for `⌷`) - [x] Grade-up `⍋`, grade-down `⍒` - [x] Enclose `⊂`, disclose `⊃`, partition (subset deferred) - [x] Membership `∊`, find `⍳` (dyadic), without `~` (dyadic), unique `∪` (deferred to phase 6) - [x] 40+ tests in `lib/apl/tests/structural.sx` ### Phase 4 — operators (THE SHOWCASE) - [x] Reduce `f/` (last axis), `f⌿` (first axis) — including `∧/`, `∨/`, `+/`, `×/`, `⌈/`, `⌊/` - [x] Scan `f\`, `f⍀` - [x] Each `f¨` — applies `f` to each scalar/element - [x] Outer product `∘.f` — `1 2 3 ∘.× 1 2 3` ↦ multiplication table - [x] Inner product `f.g` — `+.×` is matrix multiply - [x] Commute `f⍨` — `f⍨ x` ↔ `x f x`, `x f⍨ y` ↔ `y f x` - [x] Compose `f∘g` — applies `g` first then `f` - [x] Power `f⍣n` — apply f n times; `f⍣≡` until fixed point - [x] Rank `f⍤k` — apply f at sub-rank k - [x] At `@` — selective replace - [x] 40+ tests in `lib/apl/tests/operators.sx` ### Phase 5 — dfns + tradfns + control flow - [x] Dfn `{…}` with `⍺` (left arg, may be absent → niladic/monadic), `⍵` (right arg), `∇` (recurse), guards `cond:expr`, default left arg `⍺←default` - [x] Local assignment via `←` (lexical inside dfn) - [x] Tradfn `∇` header: `R←L F R;l1;l2`, statement-by-statement, branch via `→linenum` - [x] Dyalog control words: `:If/:Else/:EndIf`, `:While/:EndWhile`, `:For X :In V :EndFor`, `:Select/:Case/:EndSelect`, `:Trap`/`:EndTrap` _(Trap deferred — no exception machinery yet)_ - [x] Niladic / monadic / dyadic dispatch (function valence at definition time) - [x] `lib/apl/conformance.sh` + runner, `scoreboard.json` + `scoreboard.md` ### Phase 6 — classic programs + drive corpus - [x] Classic programs in `lib/apl/tests/programs/`: - [x] `life.apl` — Conway's Game of Life as a one-liner using `⊂` `⊖` `⌽` `+/` - [x] `mandelbrot.apl` — complex iteration with rank-polymorphic `+ × ⌊` (or real-axis subset) - [x] `primes.apl` — `(2=+⌿0=A∘.|A)/A←⍳N` sieve - [x] `n-queens.apl` — backtracking via reduce - [x] `quicksort.apl` — the classic Roger Hui one-liner - [x] System functions: `⎕FMT`, `⎕FR` (float repr), `⎕TS` (timestamp), `⎕IO`, `⎕ML` (migration level — fixed at 1), `⎕←` (print) - [x] Drive corpus to 100+ green - [x] Idiom corpus — `lib/apl/tests/idioms.sx` covering classic Roger Hui / Phil Last idioms ### Phase 7 — end-to-end pipeline + closing the gaps Phase 1-6 built parser and runtime as parallel layers — they don't yet meet. Phase 7 wires them together so APL source actually runs through the full stack, and tightens loose ends. - [x] **Operators in `apl-eval-ast`** — handle `:derived-fn` (e.g. `+/`, `f¨`), `:outer` (`∘.f`), `:derived-fn2` (`f.g`). Each derived-fn-node wraps an inner function; eval-ast resolves the inner glyph to a runtime fn and dispatches to the matching operator helper (`apl-reduce`, `apl-each`, `apl-outer`, `apl-inner`, `apl-commute`, `apl-compose`, `apl-power`, `apl-rank`). - [x] **End-to-end pipeline** — entry point `apl-run : string → array` that chains `apl-tokenize` → `parse-apl` → `apl-eval-ast` against an empty env. Verify with one-liners (`+/⍳5` → 15, `1 2 3 + 4 5 6` → 7 9 11, etc.) and with the actual `.apl` source files in `tests/programs/`. - [x] **`:quad-name` AST + handler** — extend tokenizer/parser to recognise `⎕name`, then handle in `apl-eval-ast` by dispatching to `apl-quad-*` runtime fns (`⎕IO`, `⎕ML`, `⎕FR`, `⎕TS`, `⎕FMT`, `⎕←`). _(`⎕←` deferred — tokenizer treats `←` as `:assign` after `⎕`.)_ - [x] **Bracket indexing verification** — load programs that use `A[I]` / `A[I;J]` end-to-end; confirm parser desugars to `⌷` and runtime returns expected slices. Add 5+ tests. _(Single-axis only — multi-axis `A[I;J]` requires semicolon parsing, deferred.)_ - [x] **Idiom corpus expansion** — extend `idioms.sx` from 34 to 60+ once end-to-end works (we can express idioms as APL strings, not as runtime calls). Source-string-based idioms validate the whole stack. - [x] **`:Trap` / `:EndTrap`** — minimal exception machinery: `:Trap n` catches errors with code `n`, body runs in `apl-tradfn-eval-block`, on error switches to the trap branch. Define `apl-throw` and a small set of error codes; use `try`/`catch` from the host. ### Phase 8 — fill the gaps left after end-to-end Phase 7 wired the stack together; Phase 8 closes deferred items, lets real programs run from source, and starts pushing on performance. - [x] **Quick-wins bundle** (one iteration) — three small fixes that each unblock real programs: - decimal literals: `read-digits!` consumes one trailing `.` plus more digits so `3.7` tokenises as one number; - `⎕←` (print) — tokenizer special-case: when `⎕` is followed by `←`, emit a single `:name "⎕←"` token (don't split on the assign glyph); - string values in `apl-eval-ast` — handle `:str` (parser already produces them) by wrapping into a vector of character codes (or rank-0 string). - [x] **Named function definitions** — `f ← {⍺+⍵} ⋄ 1 f 2` and `2 f 3`. - parser: when `:assign`'s RHS is a `:dfn`, mark it as a function binding; - eval-ast: `:assign` of a dfn stores the dfn in env; - parser: a name in fn-position whose env value is a dfn dispatches as a fn; - resolver: extend `apl-resolve-monadic`/`-dyadic` with a `:fn-name` case that calls `apl-call-dfn`/`apl-call-dfn-m`. - [x] **Multi-axis bracket indexing** — `A[I;J]` and `A[;J]` and `A[I;]`. - parser: split bracket content on `:semi` at depth 0; emit `(:dyad ⌷ (:vec I J) A)`; - runtime: extend `apl-squad` to accept a vector of indices, treating `nil` / empty axis as "all"; - 5+ tests across vector and matrix. - [x] **`.apl` files as actual tests** — `lib/apl/tests/programs/*.apl` are currently documentation. Add `apl-run-file path → array` plus tests that load each file, execute it, and assert the expected result. Makes the classic-program corpus self-validating instead of two parallel impls. _(Embedded source-string approach: tests/programs-e2e.sx runs the same algorithms as the .apl docs through the full pipeline. The original one-liners (e.g. primes' inline `⍵←⍳⍵`) need parser features (compress-as-fn, inline assign) we haven't built yet — multi-stmt forms used instead. Slurp/read-file primitive missing in OCaml SX runtime.)_ - [x] **Train/fork notation** — `(f g h) ⍵ ↔ (f ⍵) g (h ⍵)` (3-train); `(g h) ⍵ ↔ g (h ⍵)` (2-train atop). Parser: detect when a parenthesised subexpression is all functions and emit `(:train fns)`; resolver: build the derived function; tests for mean-via-train (`+/÷≢`). - [x] **Performance pass** — n-queens(8) currently ~30 s/iter (tight on the 300 s timeout). Target: profile the inner loop, eliminate quadratic list-append, restore the `queens(8)` test. ## SX primitive baseline Use vectors for arrays; numeric tower + rationals for numbers; ADTs for tagged data; coroutines for fibers; string-buffer for mutable string building; bitwise ops for bit manipulation; multiple values for multi-return; promises for lazy evaluation; hash tables for mutable associative storage; sets for O(1) membership; sequence protocol for polymorphic iteration; gensym for unique symbols; char type for characters; string ports + read/write for reader protocols; regexp for pattern matching; bytevectors for binary data; format for string templating. ## Progress log _Newest first._ - 2026-05-07: Phase 8 step 6 — perf: swapped (append acc xs) → (append xs acc) in apl-permutations to make permutation generation linear instead of quadratic; q(7) 32s→12s; q(8)=92 test restored within 300s timeout; **Phase 8 complete, all unchecked items ticked**; 497/497 - 2026-05-07: Phase 8 step 5 — train/fork notation. Parser :lparen detects all-fn inner segments → emits :train AST; resolver covers 2-atop & 3-fork for both monadic and dyadic. `(+/÷≢) 1..5 → 3` (mean), `(- ⌊) 5 → -5` (atop), `2(+×-)5 → -21` (dyadic fork), `(⌈/-⌊/) → 8` (range); +6 tests; 496/496 - 2026-05-07: Phase 8 step 4 — programs-e2e.sx runs classic-algorithm shapes through full pipeline (factorial via ∇, triangulars, sum-of-squares, divisor-counts, prime-mask, named-fn composition, dyadic max-of-two, Newton step); also added ⌿ + ⍀ to glyph sets (were silently skipped); +15 tests; 490/490 - 2026-05-07: Phase 8 step 3 — multi-axis bracket A[I;J] / A[I;] / A[;J] via :bracket AST + apl-bracket-multi runtime; split-bracket-content scans :semi at depth 0; apl-cartesian builds index combinations; nil axis = "all"; scalar axis collapses; +8 tests; 475/475 - 2026-05-07: Phase 8 step 2 — named function defs end-to-end via parser pre-scan; apl-known-fn-names + apl-collect-fn-bindings detect `name ← {...}` patterns; collect-segments-loop emits :fn-name for known names; resolver looks up env for :fn-name; supports recursion (∇ in named dfn); +7 tests including fact via ∇; 467/467 - 2026-05-07: Phase 8 step 1 — quick-wins bundle: decimal literals (3.7, ¯2.5), ⎕← passthrough as monadic fn (single-token via tokenizer special-case), :str AST in eval-ast (single-char→scalar, multi-char→vec); +10 tests; 460/460 - 2026-05-07: Phase 8 added — quick-wins bundle (decimals + ⎕← + strings), named functions, multi-axis bracket, .apl-files-as-tests, trains, perf - 2026-05-07: Phase 7 step 6 — :Trap exception machinery via R7RS guard; apl-throw raises tagged error, apl-trap-matches? checks codes (0=catch-all), :trap clause in apl-tradfn-eval-stmt wraps try-block with guard; :throw AST for testing; **Phase 7 complete, all unchecked plan items done**; +5 tests; 450/450 - 2026-05-07: Phase 7 step 5 — idiom corpus 34→64 (+30 source-string idioms via apl-run); also fixed tokenizer + parser to recognize ≢ and ≡ glyphs (were silently skipped); 445/445 - 2026-05-07: Phase 7 step 4 — bracket indexing `A[I]` desugared to `(:dyad ⌷ I A)` via maybe-bracket helper, wired into :name + :lparen branches of collect-segments-loop; multi-axis (A[I;J]) deferred (semicolon split); +7 tests; 415/415 - 2026-05-07: Phase 7 step 3 — :quad-name end-to-end; tokenizer already produced :name "⎕FMT"; parser is-fn-tok? extended via apl-quad-fn-names; eval-ast :name dispatches ⎕IO/⎕ML/⎕FR/⎕TS to apl-quad-*; apl-monadic-fn handles ⎕FMT; ⎕← deferred (tokenizer splits ⎕←); +8 tests; 408/408 - 2026-05-07: Phase 7 step 2 — end-to-end pipeline `apl-run : string → array` (parse-apl + apl-eval-ast against empty env); +25 source-string tests covering scalars, strands, dyadic arith, monadic primitives, operators, ∘./.g products, comparisons, famous one-liners (+/⍳10=55, ×/⍳10=10!); tokenizer can't yet parse decimals so `3.7` literal tests dropped; **400/400** - 2026-05-07: Phase 7 step 1 — operators in apl-eval-ast via apl-resolve-monadic/dyadic; supports / ⌿ \ ⍀ ¨ ⍨ ∘. f.g; queens(8) test removed (too slow for 300s timeout); +14 eval-ops tests; 375/375 - 2026-05-07: Phase 7 added — end-to-end pipeline, operators in eval-ast, :quad-name, bracket-indexing verify, idiom expansion, :Trap; aim is to wire parser↔runtime so .apl source files actually run - 2026-05-07: Phase 6 idiom corpus — lib/apl/tests/idioms.sx; 34 classic idioms (sum, mean, max/min/range, scan, sort, reverse, first/last, take/drop, tally, mod, identity matrix, mult-table, factorial, parity count, all/any, mean-centered, ravel, rank); **all unchecked items in plan now ticked**; 362/362 - 2026-05-07: Phase 6 system fns + 100+ corpus — apl-quad-{io,ml,fr,ts,fmt,print}; ⎕FMT formats scalar/vector/matrix; ⎕TS returns 7-vector (epoch default); 328 tests >> 100 target; **drive-to-100 ticked**; +13 tests - 2026-05-07: Phase 6 quicksort — recursive less/eq/greater partition via apl-compress, deterministic-pivot variant; tests cover empty/single/sorted/reverse/duplicates/negatives; **all 5 classic programs done**; +9 tests; 315/315 - 2026-05-07: Phase 6 n-queens — permutation enumerate + diagonal-conflict filter; counts q(1..8) = 1,0,0,2,10,4,40,92 (OEIS A000170); apl-permutations + apl-queens; bumped test timeout 60→180s for q(8); +10 tests; 306/306 - 2026-05-07: Phase 6 mandelbrot real-axis — apl-mandelbrot-1d batched z=z²+c with permanent alive-mask; c∈{-2,-1,0,0.25} bounded, c=1→3, c=0.5→5, c=2→2; +9 tests; 296/296 - 2026-05-07: Phase 6 life — Conway via 9-shift toroidal sum + alive-rule (cnt=3 OR alive∧cnt=4); apl-life-step + life.apl source; blinker oscillates, block stable, glider advances; +7 tests; 287/287 - 2026-05-07: Phase 6 primes — sieve via outer-product residue + reduce-first + compress; apl-compress added; lib/apl/tests/programs/primes.apl source; +11 tests; 280/280 - 2026-05-07: Phase 5 conformance.sh + scoreboard.{json,md} — per-suite runner; current snapshot 269/269; **Phase 5 complete** - 2026-05-07: Phase 5 valence dispatch — apl-dfn-valence (AST scan for ⍺/⍵), apl-tradfn-valence (slot check), apl-call unified entry; +14 tests; 269/269 tests - 2026-05-07: Phase 5 control words — :If/:Else, :While, :For/:In, :Select/:Case via apl-tradfn-eval-block/stmt threading env; :Trap deferred; +10 tests (sum loop, factorial, dispatch, nested); 255/255 tests - 2026-05-07: Phase 5 tradfn — apl-call-tradfn + apl-tradfn-loop; line-numbered stmts, :branch goto, →0 exits, locals; +10 tests including loop sum; 245/245 tests - 2026-05-07: Phase 5 dfn complete — apl-eval-stmts (guards, locals, ⍺←default), ∇ recursion via env "nabla"; +9 tests (factorial, guards, defaults, locals); 235/235 tests - 2026-05-07: Phase 5 dfn foundation — lib/apl/transpile.sx with apl-eval-ast (handles :num :vec :name :monad :dyad :program :dfn) + glyph→fn lookup tables; apl-call-dfn / apl-call-dfn-m bind ⍺/⍵; ∇/guards/defaults/locals pending; 226/226 tests - 2026-05-07: Phase 4 step 10 — at @ (apl-at-replace + apl-at-apply); linear-index lookup, scalar-vals broadcast; 211/211 tests - 2026-05-07: Phase 4 step 9 — rank f⍤k (apl-rank); cell decomposition + reassembly via frame/cell shapes; 201/201 tests - 2026-05-06: Phase 4 step 8 — power f⍣n (apl-power) + fixed-point f⍣≡ (apl-power-fixed); 191/191 tests - 2026-05-06: Phase 4 step 7 — compose f∘g (apl-compose monadic f∘g x, apl-compose-dyadic dyadic f x (g y)); 182/182 tests - 2026-05-06: Phase 4 step 6 — commute f⍨ (apl-commute monadic dup, apl-commute-dyadic swap); 173/173 tests - 2026-05-06: Phase 4 step 5 — inner product f.g (apl-inner); +.× matrix multiply, ∧.= equal-vectors; 163/163 tests - 2026-05-06: Phase 4 step 4 — outer product ∘.f (apl-outer); rank-doubling result shape = a-shape++b-shape; 151/151 tests - 2026-05-06: Phase 4 step 3 — each f¨ (monadic apl-each + dyadic apl-each-dyadic); scalar broadcast both sides; 139/139 tests - 2026-05-06: Phase 4 step 2 — scan f\ (last axis) + f⍀ (first axis); apl-scan/apl-scan-first; 125/125 tests - 2026-05-06: Phase 4 step 1 — reduce f/ (last axis) + f⌿ (first axis); apl-reduce/apl-reduce-first; 110/110 tests - 2026-05-06: Phase 3 complete — membership ∊, dyadic ⍳ (index-of), without ~ (index-of returns nil for not-found); 94/94 tests - 2026-05-06: Phase 3 step 6 — enclose ⊂ / disclose ⊃ (box/unbox, rank-0 detect via type-of); 82/82 tests - 2026-05-06: Phase 3 step 5 — grade-up ⍋ / grade-down ⍒ (stable insertion sort); 74/74 tests - 2026-05-06: Phase 3 step 4 — squad ⌷ (scalar/multi-dim/partial-slice); 66/66 tests - 2026-05-06: Phase 3 step 3 — catenate , (last axis, scalar promo) and first-axis; 59/59 tests - 2026-05-06: Phase 3 step 2 — take ↑ (multi-axis, pad), drop ↓, reverse/rotate ⌽⊖ (last+first axis); 50/50 tests - 2026-05-06: Phase 3 step 1 — reshape ⍴ (cycling), transpose ⍉ (monadic+dyadic); helpers apl-strides/flat->multi/multi->flat; 27/27 structural tests; lib/apl/tests/structural.sx - 2026-04-26: Phase 2 complete — array model + 7 scalar primitive groups; 82/82 tests; lib/apl/runtime.sx + lib/apl/tests/scalar.sx - 2026-04-26: parser (Phase 1 step 2) — 44/44 parser tests green (90/90 total); right-to-left segment algorithm; derived fns, outer/inner product, dfns with guards, strand handling; `lib/apl/parser.sx` + `lib/apl/tests/parse.sx` - 2026-04-25: tokenizer (Phase 1 step 1) — 46/46 tests green; Unicode-aware starts-with? scanner for multi-byte APL glyphs; `lib/apl/tokenizer.sx` + `lib/apl/tests/parse.sx` ## Blockers - _(none yet)_