Files
rose-ash/plans/lua-on-sx.md
giles fb18629916
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Has been cancelled
lua: parenthesized expressions truncate multi-return via new lua-paren AST node +2 tests
2026-04-24 22:48:33 +00:00

19 KiB
Raw Blame History

Lua-on-SX: Lua 5.1 on the CEK/VM

Compile Lua 5.1 AST to SX AST; the existing CEK evaluator runs it. Same architecture as plans/js-on-sx.md — reuse SX semantics wherever they fit, only shim the Lua-specific parts (tables/metatables, nil/false-only-falsy, multi-return, coroutines via perform/resume).

End-state goal: 100% of PUC-Rio Lua 5.1.5 test suite. Running as a long-lived background agent that drives the scoreboard up one failure-mode at a time, like lib/js/.

Ground rules

  • Scope: only touch lib/lua/** and plans/lua-on-sx.md. Do not edit spec/, hosts/, shared/, lib/js/**, lib/hyperscript/**, lib/prolog/**, lib/stdlib.sx, or anything in lib/ root. Lua-specific primitives go in lib/lua/runtime.sx.
  • Shared-file issues go under "Blockers" below with a minimal repro; do not fix from this loop.
  • SX files: use sx-tree MCP tools only (never Edit/Read/Write on .sx files). sx_write_file for new files, path/pattern edits for changes.
  • Architecture: Lua source → Lua AST → SX AST → CEK. No standalone Lua evaluator.
  • Commits: one feature per commit. Keep ## Progress log updated (dated entries, newest first) and tick boxes in the roadmap.

Architecture sketch

Lua source text
    │
    ▼
lib/lua/tokenizer.sx     — numbers, strings (short + long [[…]]), idents, ops, comments
    │
    ▼
lib/lua/parser.sx        — Lua AST as SX trees, e.g. (lua-for-num i a b c body)
    │
    ▼
lib/lua/transpile.sx     — Lua AST → SX AST (entry: lua-eval-ast)
    │
    ▼
existing CEK / VM

Runtime shims in lib/lua/runtime.sx: lua-truthy?, string coercion for ../arithmetic, table ops (array + hash part), metatable dispatch, pcall/error bridge, string/math/table libs.

Roadmap

Each item: implement → tests → tick box → update progress log.

Phase 1 — tokenizer + parser

  • Tokenizer: numbers (int, float, hex), strings (short + long [[…]]), idents, keywords, operators, comments (--, --[[…]])
  • Parser: blocks, local, if/elseif/else/end, while, numeric for, function, return, expressions, table constructors, indexing (., []), calls (f(…), f:m(…))
  • Skip for phase 1: generic for … in …, goto/labels, nested varargs ...
  • Unit tests in lib/lua/tests/parse.sx: source → expected AST

Phase 2 — transpile: control flow + arithmetic

  • lua-eval-ast entry
  • Arithmetic (Lua 5.1 semantics — / is float)
  • Comparison + logical (short-circuit, Lua truthy)
  • .. concat with string/number coercion
  • if, while, numeric for, local, assignment, blocks
  • 30+ eval tests in lib/lua/tests/eval.sx

Phase 3 — tables + functions + first PUC-Rio slice

  • function (anon, local, top-level), closures
  • Multi-return: return as list, unpack at call sites
  • Table constructors (array + hash + computed keys)
  • Raw table access t.k / t[k] (no metatables yet)
  • Vendor PUC-Rio 5.1.5 suite to lib/lua/lua-tests/ (just .lua files)
  • lib/lua/conformance.sh + Python runner (model on lib/js/test262-runner.py)
  • scoreboard.json + scoreboard.md baseline

Phase 4 — metatables + error handling (next run)

  • Metatable dispatch: __index, __newindex, __add/__sub/…, __eq, __lt, __call, __tostring, __len
  • pcall/xpcall/error via handler-bind
  • Generic for … in …

Phase 5 — coroutines (the showcase)

  • coroutine.create/.resume/.yield/.status/.wrap via perform/cek-resume

Phase 6 — standard library

  • stringformat, sub, find, match, gmatch, gsub, len, rep, upper, lower, byte, char
  • math — full surface
  • tableinsert, remove, concat, sort, unpack
  • io — minimal stub (read/write to SX IO surface)
  • os — time/date subset

Phase 7 — modules + full conformance

  • require / package via SX define-library/import
  • Drive PUC-Rio scoreboard to 100%

Progress log

Newest first. Agent appends on every commit.

  • 2026-04-24: lua: scoreboard iteration — parenthesized expressions truncate multi-return (Lua spec: (f()) forces single value even if f returns multi). Parser wraps (expr) in a new lua-paren AST node; transpile emits (lua-first inner). Fixes constructs.lua@30 (a,b,c = (f()) expects a=1, b=nil, c=nil) and math.lua@13. 375/375 green (+2 paren tests). Scoreboard: 8× asserts (was 10).
  • 2026-04-24: lua: scoreboard iteration — stripped (else (raise e)) from lua-tx-loop-guard. SX guard with (else (raise e)) hangs in a loop (re-enters the same guard). Since unmatched sentinels fall through to the enclosing guard naturally, the else is unnecessary. Diagnosed calls.lua undefined-fat: function fat(x) defined at Lua top-level is scoped inside the SX top-level guard's scope; loadstring-captured closures don't see it via lexical env. Fix would require either dropping the top-level guard (breaking top-level return) or dynamic env access — deferred.
  • 2026-04-24: lua: scoreboard iteration — method-call double-evaluation bug. lua-tx-method-call emitted (lua-call (lua-get OBJ name) OBJ args…) which evaluated OBJ TWICE, so a:add(10):add(20):add(30).x computed 110 instead of 60 (side effects applied twice). Fixed by (let ((__obj OBJ)) (lua-call (lua-get __obj name) __obj args…)). 373/373 green (+1 chaining test).
  • 2026-04-24: lua: 🎉 FIRST PASSING PUC-Rio TEST — 1/16 runnable (6.2%). verybig.lua now passes: needed io.output/io.input/io.stdout/io.stderr stubs, made os.remove return true (test asserts on it), and added dofile/loadfile stubs. All cumulative fixes (returns/break/scoping/escapes/precedence/vararg/tonumber-trim) combined make this test's full happy path work end-to-end. 372 unit tests. Failure mix: 10× assertion / 4× timeout / 1× call-non-fn.
  • 2026-04-24: lua: scoreboard iteration — proper break via guard+raise sentinel (lua-brk) + auto-first multi-values in arith/concat. Loop break dispatch was previously a no-op (emitted bare 'lua-break-marker symbol that nothing caught); converted to raise+catch pattern, wrapping the OUTER invocation of _while_loop/_for_loop/_repeat_loop/__for_loop in a break-guard (wrapping body doesn't work — break would just be caught and loop keeps recursing). Also lua-arith/lua-concat/lua-concat-coerce now lua-first their operands so multi-returns auto-truncate at scalar boundaries. 372/372 green (+4 break tests). Scoreboard: 10×assert / 4×timeout / 2×call-non-fn (no more undef-symbol or compare-incompat).
  • 2026-04-24: lua: scoreboard iteration — proper early-return via guard+raise sentinel. Fixes long-logged limitation: if cond then return X end ...rest now exits the enclosing function; rest is skipped. lua-tx-return raises (list 'lua-ret value); every function body and the top-level chunk + loadstring'd chunks wrap in a guard that catches the sentinel and returns its value. Eliminates "compare incompatible types" from constructs.lua (past line 40). 368/368 green (+3 early-return tests).
  • 2026-04-24: lua: scoreboard iteration — unary-minus / ^ precedence fix. Per Lua spec, ^ binds tighter than unary -, so -2^2 should parse as -(2^2) = -4, not (-2)^2 = 4. My parser recursed into parse-unary and then let ^ bind to the already-negated operand. Added parse-pow-chain helper and changed the else branch of parse-unary to parse a primary + ^-chain before returning; unary operators now wrap the full ^-chain. Fixed constructs.lua past assert #3 (moved to compare-incompatible). 365/365 green (+3 precedence tests).
  • 2026-04-24: lua: scoreboard iteration — lua-byte-to-char regression fix. My previous change returned 2-char strings ("\a" etc.) for bytes that SX string literals can't express (0, 7, 8, 11, 12, 1431, 127+), breaking 'a\0a' length from 3 → 4. Now only 9/10/13 and printable 32-126 produce real bytes; others use a single "?" placeholder so string.len stays correct. literals.lua back to failing at assert #4 (was regressed to #2).
  • 2026-04-24: lua: scoreboard iteration — decimal string escapes \ddd (1-3 digits). Tokenizer read-string previously fell through to literal for digits, so "\65" came out as "65" not "A". Added read-decimal-escape! consuming up to 3 digits while keeping value ≤255, plus \a/\b/\f/\v control escapes and lua-byte-to-char ASCII lookup. 362 tests (+2 escape tests).
  • 2026-04-24: lua: scoreboard iteration — loadstring error propagation. When loadstring(s)() was implemented as eval-expr ( (let () compiled)), SX's eval-expr wrapped any propagated raise as "Unhandled exception: X" — so error('hi') inside a loadstring'd chunk came out as that wrapped string instead of the clean "hi" Lua expects. Fix: transpile source once into a lambda AST, eval-expr it ONCE to get a callable fn value, return that — subsequent calls propagate raises cleanly. Guarded parse-failure path returns (nil, err) per Lua convention. vararg.lua now runs past assert #18; errors.lua past parse stage.
  • 2026-04-24: lua: scoreboard iteration — table.sort O(n²) insertion-sort → quicksort (Lomuto partition). 1000-element sorts finish in ms; but sort.lua uses 30k elements and still times out even at 90s (metamethod-heavy interpreter overhead). Correctness verified on 1000/5000 element random arrays.
  • 2026-04-24: lua: scoreboard iteration — dostring(s) alias for loadstring(s)() (Lua 5.0 compat used by literals.lua). Diagnosed locals.lua call-non-fn at call #18 → getfenv/setfenv stub-return pattern fails assert(getfenv(foo("")) == a) (need real env tracking, deferred). Tokenizer long-string-leading-NL rule verified correct.
  • 2026-04-24: lua: scoreboard iteration — Lua 5.0-style arg auto-binding inside vararg functions (some PUC-Rio tests still rely on it). lua-varargs-arg-table builds {1=v1, 2=v2, …, n=count}; transpile adds arg binding alongside __varargs when is-vararg. Diagnosis done with assert-counter instrumentation — literals.lua fails at #4 (long-string NL rule), vararg.lua was at #2 (arg table — FIXED), attrib.lua at #9, locals.lua now past asserts into call-non-fn. 360 tests.
  • 2026-04-24: lua: scoreboard iteration — loadstring scoping. Temporarily instrumented lua-assert with a counter, found locals.lua fails at assertion #5: loadstring('local a = {}')() → assert(type(a) ~= 'table'). The loadstring'd code's local a was leaking to outer scope because lua-eval-ast ran at top-level. Fixed by transpiling once and wrapping the AST in (let () …) before eval-expr.
  • 2026-04-24: lua: scoreboard iteration — if/else/elseif body scoping (latent bug). else local x = 99 was leaking to enclosing scope. Wrap all three branches in (let () …) via lua-tx-if-body. 358 tests.
  • 2026-04-24: lua: scoreboard iteration — do-block proper scoping. Was transpiling do ... end to a raw lua-tx pass-through, so defines inside leaked to the enclosing scope (do local i = 100 end overwrote outer i). Now wraps in (let () body) for proper lexical isolation. 355 tests, +2 scoping tests.
  • 2026-04-24: lua: scoreboard iteration — lua-to-number trims whitespace before parse-number (Lua coerces " 3e0 " in arithmetic). math.lua moved past the arith-type error to deeper assertion-land. 12× asserts / 3× timeouts / 1× call-non-fn.
  • 2026-04-24: lua: scoreboard iteration — table.getn/setn/foreach/foreachi (Lua 5.0-era), string.reverse. sort.lua unblocked past getn-undef; now times out on the 30k-element sort body (insertion sort too slow). 13 fail / 3 timeout / 0 pass.
  • 2026-04-24: lua: scoreboard iteration — parser consumes trailing ; after return; added collectgarbage/setfenv/getfenv/T stubs. All parse errors and undefined-symbol failures eliminated — every runnable test now executes deep into the script. Failure mix: 11× assertion failed, 2× timeout, 2× call-non-fn, 1× arith. Still 0/16 pass but the remaining work is substantive (stdlib fidelity vs the exact PUC-Rio assertions).
  • 2026-04-24: lua: scoreboard iteration — trailing-dot number literals (5.), preload stdlibs in package.loaded (string/math/table/io/os/coroutine/package/_G), arg stub, debug module stub. Assertion-failure count 4→8, parse errors 3→1, call-non-fn stable, module-not-found gone.
  • 2026-04-24: lua: scoreboard iteration — vararg ... transpile. Parser already emitted (lua-vararg); transpile now: (a) binds __varargs in function body when is-vararg, (b) emits __varargs for ... uses; lua-varargs/lua-spread-last-multi runtime helpers spread multi in last call-arg and last table-pos positions. Eliminated all 6× "transpile: unsupported" failures; top-5 now all real asserts. 353 unit tests.
  • 2026-04-24: lua: scoreboard iteration — added rawget/rawset/rawequal/rawlen, loadstring/load, select, assert, _G, _VERSION. Failure mix now 6×vararg-transpile / 4×real-assertion / 3×parse / 2×call-non-fn / 1×timeout (was 14 parse + 1 print undef at baseline); tests now reach deep into real assertions. Still 0/16 runnable — next targets: vararg transpile, goto, loadstring-compile depth. 347 unit tests.
  • 2026-04-24: lua: require/package via preload-only (no filesystem search). package.loaded caching, nil-returning modules cache as true, unknown modules error. 347 tests.
  • 2026-04-24: lua: os stub — time/clock monotonic counter, difftime, date (default string / *t dict), getenv/remove/rename/tmpname/execute/exit stubs. Phase 6 complete. 342 tests.
  • 2026-04-24: lua: io stub + print/tostring/tonumber globals. io buffers to internal __io-buffer (tests drain it via io.__buffer()). print: tab-sep + NL. tostring respects __tostring metamethod. 334 tests.
  • 2026-04-24: lua: table lib — insert (append / at pos, shifts up), remove (last / at pos, shifts down), concat (sep, i, j), sort (insertion sort, optional cmp), unpack + table.unpack, maxn. Caught trap: local helper named shift collides with SX's shift special form → renamed to tbl-shift-up/tbl-shift-down. 322 tests.
  • 2026-04-24: lua: math lib — pi/huge + abs/ceil/floor/sqrt/exp/log/log10/pow/trig (sin/cos/tan/asin/acos/atan/atan2)/deg/rad/min/max (&rest)/fmod/modf/random (0/1/2 arg)/randomseed. Most ops delegate to SX primitives; log w/ base via change-of-base. 309 tests.
  • 2026-04-24: lua: string lib — len/upper/lower/rep/sub (1-idx + neg)/byte/char/find/match/gmatch/gsub/format. Patterns are literal-only (no %d/etc.); format is %s/%d/%f/%% only. string.char uses printable-ASCII lookup + tab/nl/cr. 292 tests.
  • 2026-04-24: lua: phase 5 — coroutines (create/resume/yield/status/wrap) via call/cc (perform/cek-resume not exposed to SX userland). Handles multi-yield + final return + arg passthrough. Fix: body's final return must jump via caller-k to the current resume's caller, not unwind through the stale first-call continuation. 273 tests.
  • 2026-04-24: lua: generic for … in … — parser split (= → num, else in), new lua-for-in node, transpile to let-bound f,s,var + recursive __for_loop. Added ipairs/pairs/next/lua-arg globals. Lua fns now arity-tolerant (&rest __args + indexed bind) — needed because generic for always calls iter with 2 args. Noted early-return-in-nested-block as pre-existing limitation. 265 tests.
  • 2026-04-24: lua: pcall/xpcall/error via SX guard + raise. Added lua-apply (arity-dispatch 0-8, apply fallback) because SX apply re-wraps raises as "Unhandled exception". Table payloads preserved (error({code = 42})). 256 total tests.
  • 2026-04-24: lua: phase 4 — metatable dispatch (__index/__newindex/arith/compare/__call/__len), setmetatable/getmetatable/type globals, OO self:method pattern. Transpile routes all calls through lua-call (stashed sx-apply-ref to dodge user-shadowing of SX apply). Skipped __tostring (needs tostring() builtin). 247 total tests.
  • 2026-04-24: lua: PUC-Rio scoreboard baseline — 0/16 runnable pass (0.0%). Top modes: 14× parse error, 1× print undef, 1× vararg transpile. Phase 3 complete.
  • 2026-04-24: lua: conformance runner — conformance.sh shim + conformance.py (long-lived sx_server, epoch protocol, classify_error, writes scoreboard.{json,md}). 24 files classified in full run: 8 skip / 16 fail / 0 timeout.
  • 2026-04-24: lua: vendored PUC-Rio 5.1 test suite (lua5.1-tests.tar.gz from lua.org) to lib/lua/lua-tests/ — 22 .lua files, 6304 lines; README kept for context.
  • 2026-04-24: lua: raw table access — fix lua-set! to use dict-set! (mutating), fix lua-len has?has-key?, #t works, mutation/chained/computed-key writes + reference semantics. 224 total tests.
  • 2026-04-24: lua: phase 3 — table constructors verified (array, hash, computed keys, mixed, nested, dynamic values, fn values, sep variants). 205 total tests.
  • 2026-04-24: lua: multi-return — lua-multi tagged value, lua-first/lua-nth-ret/lua-pack-return runtime, tail-position spread in return/local/assign. 185 total tests.
  • 2026-04-24: lua: phase 3 — functions (anon/local/top-level) + closures verified (lexical capture, mutation-through-closure, recursion, HOFs). 175 total tests.
  • 2026-04-24: lua: phase 2 transpile — arithmetic, comparison, short-circuit logical, .. concat, if/while/repeat/for-num/local/assign. 157 total tests green.
  • 2026-04-24: lua: parser (exprs with precedence, all phase-1 statements, funcbody, table ctors, method/chained calls) — 112 total tokenizer+parser tests
  • 2026-04-24: lua: tokenizer (numbers/strings/long-brackets/keywords/ops/comments) + 56 tests

Blockers

Shared-file issues that need someone else to fix. Minimal repro only.

  • (none yet)

Known limitations (own code, not shared)

  • require supports package.preload only — no filesystem search (we don't have Lua-file resolution inside sx_server). Users register a loader in package.preload.name and require("name") calls it with name as arg. Results cached in package.loaded; nil return caches as true per Lua convention.
  • os library is a stubos.time() returns a monotonic counter (not Unix epoch), os.clock() = counter/1000, os.date() returns hardcoded "1970-01-01 00:00:00" or a *t table with fixed fields; os.getenv returns nil; os.remove/rename return nil+error. No real clock/filesystem access.
  • io library is a stubio.write/print append to an internal __io-buffer (accessible via io.__buffer() which returns + clears it) instead of real stdout. io.read/open/lines return nil. Suitable for tests that inspect output; no actual stdio.
  • string.find/match/gmatch/gsub patterns are LITERAL only — no %d/%a/./*/+/etc. Implementing Lua patterns is a separate work item; literal search covers the common case.
  • string.format supports only %s, %d, %f, %%. No width/precision flags (%.2f, %5d).
  • string.char supports printable ASCII 32126 plus \t/\n/\r; other codes error.
  • Early return inside nested blockFIXED 2026-04-24 via guard+raise sentinel (lua-ret). All function bodies and the top-level chunk wrap in a guard that catches the return-sentinel; return statements raise it.