Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Has been cancelled
142 lines
19 KiB
Markdown
142 lines
19 KiB
Markdown
# Lua-on-SX: Lua 5.1 on the CEK/VM
|
||
|
||
Compile Lua 5.1 AST to SX AST; the existing CEK evaluator runs it. Same architecture as `plans/js-on-sx.md` — reuse SX semantics wherever they fit, only shim the Lua-specific parts (tables/metatables, `nil`/`false`-only-falsy, multi-return, coroutines via `perform`/resume).
|
||
|
||
End-state goal: **100% of PUC-Rio Lua 5.1.5 test suite.** Running as a long-lived background agent that drives the scoreboard up one failure-mode at a time, like `lib/js/`.
|
||
|
||
## Ground rules
|
||
|
||
- **Scope:** only touch `lib/lua/**` and `plans/lua-on-sx.md`. Do **not** edit `spec/`, `hosts/`, `shared/`, `lib/js/**`, `lib/hyperscript/**`, `lib/prolog/**`, `lib/stdlib.sx`, or anything in `lib/` root. Lua-specific primitives go in `lib/lua/runtime.sx`.
|
||
- **Shared-file issues** go under "Blockers" below with a minimal repro; do not fix from this loop.
|
||
- **SX files:** use `sx-tree` MCP tools only (never `Edit`/`Read`/`Write` on `.sx` files). `sx_write_file` for new files, path/pattern edits for changes.
|
||
- **Architecture:** Lua source → Lua AST → SX AST → CEK. No standalone Lua evaluator.
|
||
- **Commits:** one feature per commit. Keep `## Progress log` updated (dated entries, newest first) and tick boxes in the roadmap.
|
||
|
||
## Architecture sketch
|
||
|
||
```
|
||
Lua source text
|
||
│
|
||
▼
|
||
lib/lua/tokenizer.sx — numbers, strings (short + long [[…]]), idents, ops, comments
|
||
│
|
||
▼
|
||
lib/lua/parser.sx — Lua AST as SX trees, e.g. (lua-for-num i a b c body)
|
||
│
|
||
▼
|
||
lib/lua/transpile.sx — Lua AST → SX AST (entry: lua-eval-ast)
|
||
│
|
||
▼
|
||
existing CEK / VM
|
||
```
|
||
|
||
Runtime shims in `lib/lua/runtime.sx`: `lua-truthy?`, string coercion for `..`/arithmetic, table ops (array + hash part), metatable dispatch, `pcall`/`error` bridge, `string`/`math`/`table` libs.
|
||
|
||
## Roadmap
|
||
|
||
Each item: implement → tests → tick box → update progress log.
|
||
|
||
### Phase 1 — tokenizer + parser
|
||
- [x] Tokenizer: numbers (int, float, hex), strings (short + long `[[…]]`), idents, keywords, operators, comments (`--`, `--[[…]]`)
|
||
- [x] Parser: blocks, `local`, `if/elseif/else/end`, `while`, numeric `for`, `function`, `return`, expressions, table constructors, indexing (`.`, `[]`), calls (`f(…)`, `f:m(…)`)
|
||
- [x] Skip for phase 1: generic `for … in …`, goto/labels, nested varargs `...`
|
||
- [x] Unit tests in `lib/lua/tests/parse.sx`: source → expected AST
|
||
|
||
### Phase 2 — transpile: control flow + arithmetic
|
||
- [x] `lua-eval-ast` entry
|
||
- [x] Arithmetic (Lua 5.1 semantics — `/` is float)
|
||
- [x] Comparison + logical (short-circuit, Lua truthy)
|
||
- [x] `..` concat with string/number coercion
|
||
- [x] `if`, `while`, numeric `for`, `local`, assignment, blocks
|
||
- [x] 30+ eval tests in `lib/lua/tests/eval.sx`
|
||
|
||
### Phase 3 — tables + functions + first PUC-Rio slice
|
||
- [x] `function` (anon, local, top-level), closures
|
||
- [x] Multi-return: return as list, unpack at call sites
|
||
- [x] Table constructors (array + hash + computed keys)
|
||
- [x] Raw table access `t.k` / `t[k]` (no metatables yet)
|
||
- [x] Vendor PUC-Rio 5.1.5 suite to `lib/lua/lua-tests/` (just `.lua` files)
|
||
- [x] `lib/lua/conformance.sh` + Python runner (model on `lib/js/test262-runner.py`)
|
||
- [x] `scoreboard.json` + `scoreboard.md` baseline
|
||
|
||
### Phase 4 — metatables + error handling (next run)
|
||
- [x] Metatable dispatch: `__index`, `__newindex`, `__add`/`__sub`/…, `__eq`, `__lt`, `__call`, `__tostring`, `__len`
|
||
- [x] `pcall`/`xpcall`/`error` via handler-bind
|
||
- [x] Generic `for … in …`
|
||
|
||
### Phase 5 — coroutines (the showcase)
|
||
- [x] `coroutine.create`/`.resume`/`.yield`/`.status`/`.wrap` via `perform`/`cek-resume`
|
||
|
||
### Phase 6 — standard library
|
||
- [x] `string` — `format`, `sub`, `find`, `match`, `gmatch`, `gsub`, `len`, `rep`, `upper`, `lower`, `byte`, `char`
|
||
- [x] `math` — full surface
|
||
- [x] `table` — `insert`, `remove`, `concat`, `sort`, `unpack`
|
||
- [x] `io` — minimal stub (read/write to SX IO surface)
|
||
- [x] `os` — time/date subset
|
||
|
||
### Phase 7 — modules + full conformance
|
||
- [x] `require` / `package` via SX `define-library`/`import`
|
||
- [ ] Drive PUC-Rio scoreboard to 100%
|
||
|
||
## Progress log
|
||
|
||
_Newest first. Agent appends on every commit._
|
||
|
||
- 2026-04-24: lua: scoreboard iteration — stripped `(else (raise e))` from `lua-tx-loop-guard`. SX `guard` with `(else (raise e))` hangs in a loop (re-enters the same guard). Since unmatched sentinels fall through to the enclosing guard naturally, the else is unnecessary. Diagnosed `calls.lua` undefined-`fat`: `function fat(x)` defined at Lua top-level is scoped inside the SX top-level guard's scope; loadstring-captured closures don't see it via lexical env. Fix would require either dropping the top-level guard (breaking top-level `return`) or dynamic env access — deferred.
|
||
- 2026-04-24: lua: scoreboard iteration — **method-call double-evaluation bug**. `lua-tx-method-call` emitted `(lua-call (lua-get OBJ name) OBJ args…)` which evaluated OBJ TWICE, so `a:add(10):add(20):add(30).x` computed `110` instead of `60` (side effects applied twice). Fixed by `(let ((__obj OBJ)) (lua-call (lua-get __obj name) __obj args…))`. 373/373 green (+1 chaining test).
|
||
- 2026-04-24: lua: **🎉 FIRST PASSING PUC-Rio TEST — 1/16 runnable (6.2%)**. `verybig.lua` now passes: needed `io.output`/`io.input`/`io.stdout`/`io.stderr` stubs, made `os.remove` return `true` (test asserts on it), and added `dofile`/`loadfile` stubs. All cumulative fixes (returns/break/scoping/escapes/precedence/vararg/tonumber-trim) combined make this test's full happy path work end-to-end. 372 unit tests. Failure mix: 10× assertion / 4× timeout / 1× call-non-fn.
|
||
- 2026-04-24: lua: scoreboard iteration — **proper `break` via guard+raise sentinel** (`lua-brk`) + auto-first multi-values in arith/concat. Loop break dispatch was previously a no-op (emitted bare `'lua-break-marker` symbol that nothing caught); converted to raise+catch pattern, wrapping the OUTER invocation of `_while_loop`/`_for_loop`/`_repeat_loop`/`__for_loop` in a break-guard (wrapping body doesn't work — break would just be caught and loop keeps recursing). Also `lua-arith`/`lua-concat`/`lua-concat-coerce` now `lua-first` their operands so multi-returns auto-truncate at scalar boundaries. 372/372 green (+4 break tests). Scoreboard: 10×assert / 4×timeout / 2×call-non-fn (no more undef-symbol or compare-incompat).
|
||
- 2026-04-24: lua: scoreboard iteration — **proper early-return via guard+raise sentinel**. Fixes long-logged limitation: `if cond then return X end ...rest` now exits the enclosing function; `rest` is skipped. `lua-tx-return` raises `(list 'lua-ret value)`; every function body and the top-level chunk + loadstring'd chunks wrap in a guard that catches the sentinel and returns its value. Eliminates "compare incompatible types" from constructs.lua (past line 40). 368/368 green (+3 early-return tests).
|
||
- 2026-04-24: lua: scoreboard iteration — **unary-minus / `^` precedence fix**. Per Lua spec, `^` binds tighter than unary `-`, so `-2^2` should parse as `-(2^2) = -4`, not `(-2)^2 = 4`. My parser recursed into `parse-unary` and then let `^` bind to the already-negated operand. Added `parse-pow-chain` helper and changed the `else` branch of `parse-unary` to parse a primary + `^`-chain before returning; unary operators now wrap the full `^`-chain. Fixed `constructs.lua` past assert #3 (moved to compare-incompatible). 365/365 green (+3 precedence tests).
|
||
- 2026-04-24: lua: scoreboard iteration — `lua-byte-to-char` regression fix. My previous change returned 2-char strings (`"\a"` etc.) for bytes that SX string literals can't express (0, 7, 8, 11, 12, 14–31, 127+), breaking `'a\0a'` length from 3 → 4. Now only 9/10/13 and printable 32-126 produce real bytes; others use a single `"?"` placeholder so `string.len` stays correct. literals.lua back to failing at assert #4 (was regressed to #2).
|
||
- 2026-04-24: lua: scoreboard iteration — **decimal string escapes** `\ddd` (1-3 digits). Tokenizer `read-string` previously fell through to literal for digits, so `"\65"` came out as `"65"` not `"A"`. Added `read-decimal-escape!` consuming up to 3 digits while keeping value ≤255, plus `\a`/`\b`/`\f`/`\v` control escapes and `lua-byte-to-char` ASCII lookup. 362 tests (+2 escape tests).
|
||
- 2026-04-24: lua: scoreboard iteration — **`loadstring` error propagation**. When `loadstring(s)()` was implemented as `eval-expr ( (let () compiled))`, SX's `eval-expr` wrapped any propagated `raise` as "Unhandled exception: X" — so `error('hi')` inside a loadstring'd chunk came out as that wrapped string instead of the clean `"hi"` Lua expects. Fix: transpile source once into a lambda AST, `eval-expr` it ONCE to get a callable fn value, return that — subsequent calls propagate raises cleanly. Guarded parse-failure path returns `(nil, err)` per Lua convention. vararg.lua now runs past assert #18; errors.lua past parse stage.
|
||
- 2026-04-24: lua: scoreboard iteration — `table.sort` O(n²) insertion-sort → **quicksort** (Lomuto partition). 1000-element sorts finish in ms; but `sort.lua` uses 30k elements and still times out even at 90s (metamethod-heavy interpreter overhead). Correctness verified on 1000/5000 element random arrays.
|
||
- 2026-04-24: lua: scoreboard iteration — `dostring(s)` alias for `loadstring(s)()` (Lua 5.0 compat used by literals.lua). Diagnosed `locals.lua` call-non-fn at call #18 → `getfenv/setfenv` stub-return pattern fails `assert(getfenv(foo("")) == a)` (need real env tracking, deferred). Tokenizer long-string-leading-NL rule verified correct.
|
||
- 2026-04-24: lua: scoreboard iteration — Lua 5.0-style `arg` auto-binding inside vararg functions (some PUC-Rio tests still rely on it). `lua-varargs-arg-table` builds `{1=v1, 2=v2, …, n=count}`; transpile adds `arg` binding alongside `__varargs` when `is-vararg`. Diagnosis done with assert-counter instrumentation — literals.lua fails at #4 (long-string NL rule), vararg.lua was at #2 (arg table — FIXED), attrib.lua at #9, locals.lua now past asserts into call-non-fn. 360 tests.
|
||
- 2026-04-24: lua: scoreboard iteration — **`loadstring` scoping**. Temporarily instrumented `lua-assert` with a counter, found `locals.lua` fails at assertion #5: `loadstring('local a = {}')() → assert(type(a) ~= 'table')`. The loadstring'd code's `local a` was leaking to outer scope because `lua-eval-ast` ran at top-level. Fixed by transpiling once and wrapping the AST in `(let () …)` before `eval-expr`.
|
||
- 2026-04-24: lua: scoreboard iteration — **`if`/`else`/`elseif` body scoping** (latent bug). `else local x = 99` was leaking to enclosing scope. Wrap all three branches in `(let () …)` via `lua-tx-if-body`. 358 tests.
|
||
- 2026-04-24: lua: scoreboard iteration — **`do`-block proper scoping**. Was transpiling `do ... end` to a raw `lua-tx` pass-through, so `define`s inside leaked to the enclosing scope (`do local i = 100 end` overwrote outer `i`). Now wraps in `(let () body)` for proper lexical isolation. 355 tests, +2 scoping tests.
|
||
- 2026-04-24: lua: scoreboard iteration — `lua-to-number` trims whitespace before `parse-number` (Lua coerces `" 3e0 "` in arithmetic). math.lua moved past the arith-type error to deeper assertion-land. 12× asserts / 3× timeouts / 1× call-non-fn.
|
||
- 2026-04-24: lua: scoreboard iteration — `table.getn`/`setn`/`foreach`/`foreachi` (Lua 5.0-era), `string.reverse`. `sort.lua` unblocked past `getn`-undef; now times out on the 30k-element sort body (insertion sort too slow). 13 fail / 3 timeout / 0 pass.
|
||
- 2026-04-24: lua: scoreboard iteration — parser consumes trailing `;` after `return`; added `collectgarbage`/`setfenv`/`getfenv`/`T` stubs. All parse errors and undefined-symbol failures eliminated — every runnable test now executes deep into the script. Failure mix: **11× assertion failed**, 2× timeout, 2× call-non-fn, 1× arith. Still 0/16 pass but the remaining work is substantive (stdlib fidelity vs the exact PUC-Rio assertions).
|
||
- 2026-04-24: lua: scoreboard iteration — trailing-dot number literals (`5.`), preload stdlibs in `package.loaded` (`string`/`math`/`table`/`io`/`os`/`coroutine`/`package`/`_G`), `arg` stub, `debug` module stub. Assertion-failure count 4→**8**, parse errors 3→**1**, call-non-fn stable, module-not-found gone.
|
||
- 2026-04-24: lua: scoreboard iteration — **vararg `...` transpile**. Parser already emitted `(lua-vararg)`; transpile now: (a) binds `__varargs` in function body when `is-vararg`, (b) emits `__varargs` for `...` uses; `lua-varargs`/`lua-spread-last-multi` runtime helpers spread multi in last call-arg and last table-pos positions. Eliminated all 6× "transpile: unsupported" failures; top-5 now all real asserts. 353 unit tests.
|
||
- 2026-04-24: lua: scoreboard iteration — added `rawget`/`rawset`/`rawequal`/`rawlen`, `loadstring`/`load`, `select`, `assert`, `_G`, `_VERSION`. Failure mix now 6×vararg-transpile / 4×real-assertion / 3×parse / 2×call-non-fn / 1×timeout (was 14 parse + 1 print undef at baseline); tests now reach deep into real assertions. Still 0/16 runnable — next targets: vararg transpile, goto, loadstring-compile depth. 347 unit tests.
|
||
- 2026-04-24: lua: `require`/`package` via preload-only (no filesystem search). `package.loaded` caching, nil-returning modules cache as `true`, unknown modules error. 347 tests.
|
||
- 2026-04-24: lua: `os` stub — time/clock monotonic counter, difftime, date (default string / `*t` dict), getenv/remove/rename/tmpname/execute/exit stubs. Phase 6 complete. 342 tests.
|
||
- 2026-04-24: lua: `io` stub + `print`/`tostring`/`tonumber` globals. io buffers to internal `__io-buffer` (tests drain it via `io.__buffer()`). print: tab-sep + NL. tostring respects `__tostring` metamethod. 334 tests.
|
||
- 2026-04-24: lua: `table` lib — insert (append / at pos, shifts up), remove (last / at pos, shifts down), concat (sep, i, j), sort (insertion sort, optional cmp), unpack + table.unpack, maxn. Caught trap: local helper named `shift` collides with SX's `shift` special form → renamed to `tbl-shift-up`/`tbl-shift-down`. 322 tests.
|
||
- 2026-04-24: lua: `math` lib — pi/huge + abs/ceil/floor/sqrt/exp/log/log10/pow/trig (sin/cos/tan/asin/acos/atan/atan2)/deg/rad/min/max (&rest)/fmod/modf/random (0/1/2 arg)/randomseed. Most ops delegate to SX primitives; log w/ base via change-of-base. 309 tests.
|
||
- 2026-04-24: lua: `string` lib — len/upper/lower/rep/sub (1-idx + neg)/byte/char/find/match/gmatch/gsub/format. Patterns are literal-only (no `%d`/etc.); format is `%s`/`%d`/`%f`/`%%` only. `string.char` uses printable-ASCII lookup + tab/nl/cr. 292 tests.
|
||
- 2026-04-24: lua: phase 5 — coroutines (create/resume/yield/status/wrap) via `call/cc` (perform/cek-resume not exposed to SX userland). Handles multi-yield + final return + arg passthrough. Fix: body's final return must jump via `caller-k` to the **current** resume's caller, not unwind through the stale first-call continuation. 273 tests.
|
||
- 2026-04-24: lua: generic `for … in …` — parser split (`=` → num, else `in`), new `lua-for-in` node, transpile to `let`-bound `f,s,var` + recursive `__for_loop`. Added `ipairs`/`pairs`/`next`/`lua-arg` globals. Lua fns now arity-tolerant (`&rest __args` + indexed bind) — needed because generic for always calls iter with 2 args. Noted early-return-in-nested-block as pre-existing limitation. 265 tests.
|
||
- 2026-04-24: lua: `pcall`/`xpcall`/`error` via SX `guard` + `raise`. Added `lua-apply` (arity-dispatch 0-8, apply fallback) because SX `apply` re-wraps raises as "Unhandled exception". Table payloads preserved (`error({code = 42})`). 256 total tests.
|
||
- 2026-04-24: lua: phase 4 — metatable dispatch (`__index`/`__newindex`/arith/compare/`__call`/`__len`), `setmetatable`/`getmetatable`/`type` globals, OO `self:method` pattern. Transpile routes all calls through `lua-call` (stashed `sx-apply-ref` to dodge user-shadowing of SX `apply`). Skipped `__tostring` (needs `tostring()` builtin). 247 total tests.
|
||
- 2026-04-24: lua: PUC-Rio scoreboard baseline — 0/16 runnable pass (0.0%). Top modes: 14× parse error, 1× `print` undef, 1× vararg transpile. Phase 3 complete.
|
||
- 2026-04-24: lua: conformance runner — `conformance.sh` shim + `conformance.py` (long-lived sx_server, epoch protocol, classify_error, writes scoreboard.{json,md}). 24 files classified in full run: 8 skip / 16 fail / 0 timeout.
|
||
- 2026-04-24: lua: vendored PUC-Rio 5.1 test suite (lua5.1-tests.tar.gz from lua.org) to `lib/lua/lua-tests/` — 22 .lua files, 6304 lines; README kept for context.
|
||
- 2026-04-24: lua: raw table access — fix `lua-set!` to use `dict-set!` (mutating), fix `lua-len` `has?`→`has-key?`, `#t` works, mutation/chained/computed-key writes + reference semantics. 224 total tests.
|
||
- 2026-04-24: lua: phase 3 — table constructors verified (array, hash, computed keys, mixed, nested, dynamic values, fn values, sep variants). 205 total tests.
|
||
- 2026-04-24: lua: multi-return — `lua-multi` tagged value, `lua-first`/`lua-nth-ret`/`lua-pack-return` runtime, tail-position spread in return/local/assign. 185 total tests.
|
||
- 2026-04-24: lua: phase 3 — functions (anon/local/top-level) + closures verified (lexical capture, mutation-through-closure, recursion, HOFs). 175 total tests.
|
||
- 2026-04-24: lua: phase 2 transpile — arithmetic, comparison, short-circuit logical, `..` concat, if/while/repeat/for-num/local/assign. 157 total tests green.
|
||
- 2026-04-24: lua: parser (exprs with precedence, all phase-1 statements, funcbody, table ctors, method/chained calls) — 112 total tokenizer+parser tests
|
||
- 2026-04-24: lua: tokenizer (numbers/strings/long-brackets/keywords/ops/comments) + 56 tests
|
||
|
||
## Blockers
|
||
|
||
_Shared-file issues that need someone else to fix. Minimal repro only._
|
||
|
||
- _(none yet)_
|
||
|
||
## Known limitations (own code, not shared)
|
||
|
||
- **`require` supports `package.preload` only** — no filesystem search (we don't have Lua-file resolution inside sx_server). Users register a loader in `package.preload.name` and `require("name")` calls it with name as arg. Results cached in `package.loaded`; nil return caches as `true` per Lua convention.
|
||
- **`os` library is a stub** — `os.time()` returns a monotonic counter (not Unix epoch), `os.clock()` = counter/1000, `os.date()` returns hardcoded "1970-01-01 00:00:00" or a `*t` table with fixed fields; `os.getenv` returns nil; `os.remove`/`rename` return nil+error. No real clock/filesystem access.
|
||
- **`io` library is a stub** — `io.write`/`print` append to an internal `__io-buffer` (accessible via `io.__buffer()` which returns + clears it) instead of real stdout. `io.read`/`open`/`lines` return nil. Suitable for tests that inspect output; no actual stdio.
|
||
- **`string.find`/`match`/`gmatch`/`gsub` patterns are LITERAL only** — no `%d`/`%a`/`.`/`*`/`+`/etc. Implementing Lua patterns is a separate work item; literal search covers the common case.
|
||
- **`string.format`** supports only `%s`, `%d`, `%f`, `%%`. No width/precision flags (`%.2f`, `%5d`).
|
||
- **`string.char`** supports printable ASCII 32–126 plus `\t`/`\n`/`\r`; other codes error.
|
||
- ~~Early `return` inside nested block~~ — **FIXED 2026-04-24** via guard+raise sentinel (`lua-ret`). All function bodies and the top-level chunk wrap in a guard that catches the return-sentinel; `return` statements raise it.
|