Files
rose-ash/plans/lua-on-sx.md
giles 3ec52d4556
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Has been cancelled
lua: package.cpath/config/loaders/searchers stubs (attrib.lua past #9)
2026-04-24 22:56:57 +00:00

144 lines
20 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Lua-on-SX: Lua 5.1 on the CEK/VM
Compile Lua 5.1 AST to SX AST; the existing CEK evaluator runs it. Same architecture as `plans/js-on-sx.md` — reuse SX semantics wherever they fit, only shim the Lua-specific parts (tables/metatables, `nil`/`false`-only-falsy, multi-return, coroutines via `perform`/resume).
End-state goal: **100% of PUC-Rio Lua 5.1.5 test suite.** Running as a long-lived background agent that drives the scoreboard up one failure-mode at a time, like `lib/js/`.
## Ground rules
- **Scope:** only touch `lib/lua/**` and `plans/lua-on-sx.md`. Do **not** edit `spec/`, `hosts/`, `shared/`, `lib/js/**`, `lib/hyperscript/**`, `lib/prolog/**`, `lib/stdlib.sx`, or anything in `lib/` root. Lua-specific primitives go in `lib/lua/runtime.sx`.
- **Shared-file issues** go under "Blockers" below with a minimal repro; do not fix from this loop.
- **SX files:** use `sx-tree` MCP tools only (never `Edit`/`Read`/`Write` on `.sx` files). `sx_write_file` for new files, path/pattern edits for changes.
- **Architecture:** Lua source → Lua AST → SX AST → CEK. No standalone Lua evaluator.
- **Commits:** one feature per commit. Keep `## Progress log` updated (dated entries, newest first) and tick boxes in the roadmap.
## Architecture sketch
```
Lua source text
lib/lua/tokenizer.sx — numbers, strings (short + long [[…]]), idents, ops, comments
lib/lua/parser.sx — Lua AST as SX trees, e.g. (lua-for-num i a b c body)
lib/lua/transpile.sx — Lua AST → SX AST (entry: lua-eval-ast)
existing CEK / VM
```
Runtime shims in `lib/lua/runtime.sx`: `lua-truthy?`, string coercion for `..`/arithmetic, table ops (array + hash part), metatable dispatch, `pcall`/`error` bridge, `string`/`math`/`table` libs.
## Roadmap
Each item: implement → tests → tick box → update progress log.
### Phase 1 — tokenizer + parser
- [x] Tokenizer: numbers (int, float, hex), strings (short + long `[[…]]`), idents, keywords, operators, comments (`--`, `--[[…]]`)
- [x] Parser: blocks, `local`, `if/elseif/else/end`, `while`, numeric `for`, `function`, `return`, expressions, table constructors, indexing (`.`, `[]`), calls (`f(…)`, `f:m(…)`)
- [x] Skip for phase 1: generic `for … in …`, goto/labels, nested varargs `...`
- [x] Unit tests in `lib/lua/tests/parse.sx`: source → expected AST
### Phase 2 — transpile: control flow + arithmetic
- [x] `lua-eval-ast` entry
- [x] Arithmetic (Lua 5.1 semantics — `/` is float)
- [x] Comparison + logical (short-circuit, Lua truthy)
- [x] `..` concat with string/number coercion
- [x] `if`, `while`, numeric `for`, `local`, assignment, blocks
- [x] 30+ eval tests in `lib/lua/tests/eval.sx`
### Phase 3 — tables + functions + first PUC-Rio slice
- [x] `function` (anon, local, top-level), closures
- [x] Multi-return: return as list, unpack at call sites
- [x] Table constructors (array + hash + computed keys)
- [x] Raw table access `t.k` / `t[k]` (no metatables yet)
- [x] Vendor PUC-Rio 5.1.5 suite to `lib/lua/lua-tests/` (just `.lua` files)
- [x] `lib/lua/conformance.sh` + Python runner (model on `lib/js/test262-runner.py`)
- [x] `scoreboard.json` + `scoreboard.md` baseline
### Phase 4 — metatables + error handling (next run)
- [x] Metatable dispatch: `__index`, `__newindex`, `__add`/`__sub`/…, `__eq`, `__lt`, `__call`, `__tostring`, `__len`
- [x] `pcall`/`xpcall`/`error` via handler-bind
- [x] Generic `for … in …`
### Phase 5 — coroutines (the showcase)
- [x] `coroutine.create`/`.resume`/`.yield`/`.status`/`.wrap` via `perform`/`cek-resume`
### Phase 6 — standard library
- [x] `string``format`, `sub`, `find`, `match`, `gmatch`, `gsub`, `len`, `rep`, `upper`, `lower`, `byte`, `char`
- [x] `math` — full surface
- [x] `table``insert`, `remove`, `concat`, `sort`, `unpack`
- [x] `io` — minimal stub (read/write to SX IO surface)
- [x] `os` — time/date subset
### Phase 7 — modules + full conformance
- [x] `require` / `package` via SX `define-library`/`import`
- [ ] Drive PUC-Rio scoreboard to 100%
## Progress log
_Newest first. Agent appends on every commit._
- 2026-04-24: lua: scoreboard iteration — `package.cpath`/`config`/`loaders`/`searchers`/`searchpath` stubs. attrib.lua moves from #9 (checking `package.cpath` is a string) to "module 'C' not found" — test requires filesystem-based module loading, not tractable. Most remaining failures need Lua pattern matching (pm.lua/strings.lua), env tracking (locals.lua/events.lua), or filesystem (attrib.lua).
- 2026-04-24: lua: scoreboard iteration — **parenthesized expressions truncate multi-return** (Lua spec: `(f())` forces single value even if `f` returns multi). Parser wraps `(expr)` in a new `lua-paren` AST node; transpile emits `(lua-first inner)`. Fixes `constructs.lua`@30 (`a,b,c = (f())` expects `a=1, b=nil, c=nil`) and `math.lua`@13. 375/375 green (+2 paren tests). Scoreboard: 8× asserts (was 10).
- 2026-04-24: lua: scoreboard iteration — stripped `(else (raise e))` from `lua-tx-loop-guard`. SX `guard` with `(else (raise e))` hangs in a loop (re-enters the same guard). Since unmatched sentinels fall through to the enclosing guard naturally, the else is unnecessary. Diagnosed `calls.lua` undefined-`fat`: `function fat(x)` defined at Lua top-level is scoped inside the SX top-level guard's scope; loadstring-captured closures don't see it via lexical env. Fix would require either dropping the top-level guard (breaking top-level `return`) or dynamic env access — deferred.
- 2026-04-24: lua: scoreboard iteration — **method-call double-evaluation bug**. `lua-tx-method-call` emitted `(lua-call (lua-get OBJ name) OBJ args…)` which evaluated OBJ TWICE, so `a:add(10):add(20):add(30).x` computed `110` instead of `60` (side effects applied twice). Fixed by `(let ((__obj OBJ)) (lua-call (lua-get __obj name) __obj args…))`. 373/373 green (+1 chaining test).
- 2026-04-24: lua: **🎉 FIRST PASSING PUC-Rio TEST — 1/16 runnable (6.2%)**. `verybig.lua` now passes: needed `io.output`/`io.input`/`io.stdout`/`io.stderr` stubs, made `os.remove` return `true` (test asserts on it), and added `dofile`/`loadfile` stubs. All cumulative fixes (returns/break/scoping/escapes/precedence/vararg/tonumber-trim) combined make this test's full happy path work end-to-end. 372 unit tests. Failure mix: 10× assertion / 4× timeout / 1× call-non-fn.
- 2026-04-24: lua: scoreboard iteration — **proper `break` via guard+raise sentinel** (`lua-brk`) + auto-first multi-values in arith/concat. Loop break dispatch was previously a no-op (emitted bare `'lua-break-marker` symbol that nothing caught); converted to raise+catch pattern, wrapping the OUTER invocation of `_while_loop`/`_for_loop`/`_repeat_loop`/`__for_loop` in a break-guard (wrapping body doesn't work — break would just be caught and loop keeps recursing). Also `lua-arith`/`lua-concat`/`lua-concat-coerce` now `lua-first` their operands so multi-returns auto-truncate at scalar boundaries. 372/372 green (+4 break tests). Scoreboard: 10×assert / 4×timeout / 2×call-non-fn (no more undef-symbol or compare-incompat).
- 2026-04-24: lua: scoreboard iteration — **proper early-return via guard+raise sentinel**. Fixes long-logged limitation: `if cond then return X end ...rest` now exits the enclosing function; `rest` is skipped. `lua-tx-return` raises `(list 'lua-ret value)`; every function body and the top-level chunk + loadstring'd chunks wrap in a guard that catches the sentinel and returns its value. Eliminates "compare incompatible types" from constructs.lua (past line 40). 368/368 green (+3 early-return tests).
- 2026-04-24: lua: scoreboard iteration — **unary-minus / `^` precedence fix**. Per Lua spec, `^` binds tighter than unary `-`, so `-2^2` should parse as `-(2^2) = -4`, not `(-2)^2 = 4`. My parser recursed into `parse-unary` and then let `^` bind to the already-negated operand. Added `parse-pow-chain` helper and changed the `else` branch of `parse-unary` to parse a primary + `^`-chain before returning; unary operators now wrap the full `^`-chain. Fixed `constructs.lua` past assert #3 (moved to compare-incompatible). 365/365 green (+3 precedence tests).
- 2026-04-24: lua: scoreboard iteration — `lua-byte-to-char` regression fix. My previous change returned 2-char strings (`"\a"` etc.) for bytes that SX string literals can't express (0, 7, 8, 11, 12, 1431, 127+), breaking `'a\0a'` length from 3 → 4. Now only 9/10/13 and printable 32-126 produce real bytes; others use a single `"?"` placeholder so `string.len` stays correct. literals.lua back to failing at assert #4 (was regressed to #2).
- 2026-04-24: lua: scoreboard iteration — **decimal string escapes** `\ddd` (1-3 digits). Tokenizer `read-string` previously fell through to literal for digits, so `"\65"` came out as `"65"` not `"A"`. Added `read-decimal-escape!` consuming up to 3 digits while keeping value ≤255, plus `\a`/`\b`/`\f`/`\v` control escapes and `lua-byte-to-char` ASCII lookup. 362 tests (+2 escape tests).
- 2026-04-24: lua: scoreboard iteration — **`loadstring` error propagation**. When `loadstring(s)()` was implemented as `eval-expr ( (let () compiled))`, SX's `eval-expr` wrapped any propagated `raise` as "Unhandled exception: X" — so `error('hi')` inside a loadstring'd chunk came out as that wrapped string instead of the clean `"hi"` Lua expects. Fix: transpile source once into a lambda AST, `eval-expr` it ONCE to get a callable fn value, return that — subsequent calls propagate raises cleanly. Guarded parse-failure path returns `(nil, err)` per Lua convention. vararg.lua now runs past assert #18; errors.lua past parse stage.
- 2026-04-24: lua: scoreboard iteration — `table.sort` O(n²) insertion-sort → **quicksort** (Lomuto partition). 1000-element sorts finish in ms; but `sort.lua` uses 30k elements and still times out even at 90s (metamethod-heavy interpreter overhead). Correctness verified on 1000/5000 element random arrays.
- 2026-04-24: lua: scoreboard iteration — `dostring(s)` alias for `loadstring(s)()` (Lua 5.0 compat used by literals.lua). Diagnosed `locals.lua` call-non-fn at call #18`getfenv/setfenv` stub-return pattern fails `assert(getfenv(foo("")) == a)` (need real env tracking, deferred). Tokenizer long-string-leading-NL rule verified correct.
- 2026-04-24: lua: scoreboard iteration — Lua 5.0-style `arg` auto-binding inside vararg functions (some PUC-Rio tests still rely on it). `lua-varargs-arg-table` builds `{1=v1, 2=v2, …, n=count}`; transpile adds `arg` binding alongside `__varargs` when `is-vararg`. Diagnosis done with assert-counter instrumentation — literals.lua fails at #4 (long-string NL rule), vararg.lua was at #2 (arg table — FIXED), attrib.lua at #9, locals.lua now past asserts into call-non-fn. 360 tests.
- 2026-04-24: lua: scoreboard iteration — **`loadstring` scoping**. Temporarily instrumented `lua-assert` with a counter, found `locals.lua` fails at assertion #5: `loadstring('local a = {}')() → assert(type(a) ~= 'table')`. The loadstring'd code's `local a` was leaking to outer scope because `lua-eval-ast` ran at top-level. Fixed by transpiling once and wrapping the AST in `(let () …)` before `eval-expr`.
- 2026-04-24: lua: scoreboard iteration — **`if`/`else`/`elseif` body scoping** (latent bug). `else local x = 99` was leaking to enclosing scope. Wrap all three branches in `(let () …)` via `lua-tx-if-body`. 358 tests.
- 2026-04-24: lua: scoreboard iteration — **`do`-block proper scoping**. Was transpiling `do ... end` to a raw `lua-tx` pass-through, so `define`s inside leaked to the enclosing scope (`do local i = 100 end` overwrote outer `i`). Now wraps in `(let () body)` for proper lexical isolation. 355 tests, +2 scoping tests.
- 2026-04-24: lua: scoreboard iteration — `lua-to-number` trims whitespace before `parse-number` (Lua coerces `" 3e0 "` in arithmetic). math.lua moved past the arith-type error to deeper assertion-land. 12× asserts / 3× timeouts / 1× call-non-fn.
- 2026-04-24: lua: scoreboard iteration — `table.getn`/`setn`/`foreach`/`foreachi` (Lua 5.0-era), `string.reverse`. `sort.lua` unblocked past `getn`-undef; now times out on the 30k-element sort body (insertion sort too slow). 13 fail / 3 timeout / 0 pass.
- 2026-04-24: lua: scoreboard iteration — parser consumes trailing `;` after `return`; added `collectgarbage`/`setfenv`/`getfenv`/`T` stubs. All parse errors and undefined-symbol failures eliminated — every runnable test now executes deep into the script. Failure mix: **11× assertion failed**, 2× timeout, 2× call-non-fn, 1× arith. Still 0/16 pass but the remaining work is substantive (stdlib fidelity vs the exact PUC-Rio assertions).
- 2026-04-24: lua: scoreboard iteration — trailing-dot number literals (`5.`), preload stdlibs in `package.loaded` (`string`/`math`/`table`/`io`/`os`/`coroutine`/`package`/`_G`), `arg` stub, `debug` module stub. Assertion-failure count 4→**8**, parse errors 3→**1**, call-non-fn stable, module-not-found gone.
- 2026-04-24: lua: scoreboard iteration — **vararg `...` transpile**. Parser already emitted `(lua-vararg)`; transpile now: (a) binds `__varargs` in function body when `is-vararg`, (b) emits `__varargs` for `...` uses; `lua-varargs`/`lua-spread-last-multi` runtime helpers spread multi in last call-arg and last table-pos positions. Eliminated all 6× "transpile: unsupported" failures; top-5 now all real asserts. 353 unit tests.
- 2026-04-24: lua: scoreboard iteration — added `rawget`/`rawset`/`rawequal`/`rawlen`, `loadstring`/`load`, `select`, `assert`, `_G`, `_VERSION`. Failure mix now 6×vararg-transpile / 4×real-assertion / 3×parse / 2×call-non-fn / 1×timeout (was 14 parse + 1 print undef at baseline); tests now reach deep into real assertions. Still 0/16 runnable — next targets: vararg transpile, goto, loadstring-compile depth. 347 unit tests.
- 2026-04-24: lua: `require`/`package` via preload-only (no filesystem search). `package.loaded` caching, nil-returning modules cache as `true`, unknown modules error. 347 tests.
- 2026-04-24: lua: `os` stub — time/clock monotonic counter, difftime, date (default string / `*t` dict), getenv/remove/rename/tmpname/execute/exit stubs. Phase 6 complete. 342 tests.
- 2026-04-24: lua: `io` stub + `print`/`tostring`/`tonumber` globals. io buffers to internal `__io-buffer` (tests drain it via `io.__buffer()`). print: tab-sep + NL. tostring respects `__tostring` metamethod. 334 tests.
- 2026-04-24: lua: `table` lib — insert (append / at pos, shifts up), remove (last / at pos, shifts down), concat (sep, i, j), sort (insertion sort, optional cmp), unpack + table.unpack, maxn. Caught trap: local helper named `shift` collides with SX's `shift` special form → renamed to `tbl-shift-up`/`tbl-shift-down`. 322 tests.
- 2026-04-24: lua: `math` lib — pi/huge + abs/ceil/floor/sqrt/exp/log/log10/pow/trig (sin/cos/tan/asin/acos/atan/atan2)/deg/rad/min/max (&rest)/fmod/modf/random (0/1/2 arg)/randomseed. Most ops delegate to SX primitives; log w/ base via change-of-base. 309 tests.
- 2026-04-24: lua: `string` lib — len/upper/lower/rep/sub (1-idx + neg)/byte/char/find/match/gmatch/gsub/format. Patterns are literal-only (no `%d`/etc.); format is `%s`/`%d`/`%f`/`%%` only. `string.char` uses printable-ASCII lookup + tab/nl/cr. 292 tests.
- 2026-04-24: lua: phase 5 — coroutines (create/resume/yield/status/wrap) via `call/cc` (perform/cek-resume not exposed to SX userland). Handles multi-yield + final return + arg passthrough. Fix: body's final return must jump via `caller-k` to the **current** resume's caller, not unwind through the stale first-call continuation. 273 tests.
- 2026-04-24: lua: generic `for … in …` — parser split (`=` → num, else `in`), new `lua-for-in` node, transpile to `let`-bound `f,s,var` + recursive `__for_loop`. Added `ipairs`/`pairs`/`next`/`lua-arg` globals. Lua fns now arity-tolerant (`&rest __args` + indexed bind) — needed because generic for always calls iter with 2 args. Noted early-return-in-nested-block as pre-existing limitation. 265 tests.
- 2026-04-24: lua: `pcall`/`xpcall`/`error` via SX `guard` + `raise`. Added `lua-apply` (arity-dispatch 0-8, apply fallback) because SX `apply` re-wraps raises as "Unhandled exception". Table payloads preserved (`error({code = 42})`). 256 total tests.
- 2026-04-24: lua: phase 4 — metatable dispatch (`__index`/`__newindex`/arith/compare/`__call`/`__len`), `setmetatable`/`getmetatable`/`type` globals, OO `self:method` pattern. Transpile routes all calls through `lua-call` (stashed `sx-apply-ref` to dodge user-shadowing of SX `apply`). Skipped `__tostring` (needs `tostring()` builtin). 247 total tests.
- 2026-04-24: lua: PUC-Rio scoreboard baseline — 0/16 runnable pass (0.0%). Top modes: 14× parse error, 1× `print` undef, 1× vararg transpile. Phase 3 complete.
- 2026-04-24: lua: conformance runner — `conformance.sh` shim + `conformance.py` (long-lived sx_server, epoch protocol, classify_error, writes scoreboard.{json,md}). 24 files classified in full run: 8 skip / 16 fail / 0 timeout.
- 2026-04-24: lua: vendored PUC-Rio 5.1 test suite (lua5.1-tests.tar.gz from lua.org) to `lib/lua/lua-tests/` — 22 .lua files, 6304 lines; README kept for context.
- 2026-04-24: lua: raw table access — fix `lua-set!` to use `dict-set!` (mutating), fix `lua-len` `has?``has-key?`, `#t` works, mutation/chained/computed-key writes + reference semantics. 224 total tests.
- 2026-04-24: lua: phase 3 — table constructors verified (array, hash, computed keys, mixed, nested, dynamic values, fn values, sep variants). 205 total tests.
- 2026-04-24: lua: multi-return — `lua-multi` tagged value, `lua-first`/`lua-nth-ret`/`lua-pack-return` runtime, tail-position spread in return/local/assign. 185 total tests.
- 2026-04-24: lua: phase 3 — functions (anon/local/top-level) + closures verified (lexical capture, mutation-through-closure, recursion, HOFs). 175 total tests.
- 2026-04-24: lua: phase 2 transpile — arithmetic, comparison, short-circuit logical, `..` concat, if/while/repeat/for-num/local/assign. 157 total tests green.
- 2026-04-24: lua: parser (exprs with precedence, all phase-1 statements, funcbody, table ctors, method/chained calls) — 112 total tokenizer+parser tests
- 2026-04-24: lua: tokenizer (numbers/strings/long-brackets/keywords/ops/comments) + 56 tests
## Blockers
_Shared-file issues that need someone else to fix. Minimal repro only._
- _(none yet)_
## Known limitations (own code, not shared)
- **`require` supports `package.preload` only** — no filesystem search (we don't have Lua-file resolution inside sx_server). Users register a loader in `package.preload.name` and `require("name")` calls it with name as arg. Results cached in `package.loaded`; nil return caches as `true` per Lua convention.
- **`os` library is a stub** — `os.time()` returns a monotonic counter (not Unix epoch), `os.clock()` = counter/1000, `os.date()` returns hardcoded "1970-01-01 00:00:00" or a `*t` table with fixed fields; `os.getenv` returns nil; `os.remove`/`rename` return nil+error. No real clock/filesystem access.
- **`io` library is a stub** — `io.write`/`print` append to an internal `__io-buffer` (accessible via `io.__buffer()` which returns + clears it) instead of real stdout. `io.read`/`open`/`lines` return nil. Suitable for tests that inspect output; no actual stdio.
- **`string.find`/`match`/`gmatch`/`gsub` patterns are LITERAL only** — no `%d`/`%a`/`.`/`*`/`+`/etc. Implementing Lua patterns is a separate work item; literal search covers the common case.
- **`string.format`** supports only `%s`, `%d`, `%f`, `%%`. No width/precision flags (`%.2f`, `%5d`).
- **`string.char`** supports printable ASCII 32126 plus `\t`/`\n`/`\r`; other codes error.
- ~~Early `return` inside nested block~~ — **FIXED 2026-04-24** via guard+raise sentinel (`lua-ret`). All function bodies and the top-level chunk wrap in a guard that catches the return-sentinel; `return` statements raise it.