Files
rose-ash/plans/lua-on-sx.md
giles 8c25527205
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Has been cancelled
lua: string library (len/upper/lower/rep/sub/byte/char/find/match/gmatch/gsub/format) +19 tests
2026-04-24 18:45:03 +00:00

113 lines
8.1 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Lua-on-SX: Lua 5.1 on the CEK/VM
Compile Lua 5.1 AST to SX AST; the existing CEK evaluator runs it. Same architecture as `plans/js-on-sx.md` — reuse SX semantics wherever they fit, only shim the Lua-specific parts (tables/metatables, `nil`/`false`-only-falsy, multi-return, coroutines via `perform`/resume).
End-state goal: **100% of PUC-Rio Lua 5.1.5 test suite.** Running as a long-lived background agent that drives the scoreboard up one failure-mode at a time, like `lib/js/`.
## Ground rules
- **Scope:** only touch `lib/lua/**` and `plans/lua-on-sx.md`. Do **not** edit `spec/`, `hosts/`, `shared/`, `lib/js/**`, `lib/hyperscript/**`, `lib/prolog/**`, `lib/stdlib.sx`, or anything in `lib/` root. Lua-specific primitives go in `lib/lua/runtime.sx`.
- **Shared-file issues** go under "Blockers" below with a minimal repro; do not fix from this loop.
- **SX files:** use `sx-tree` MCP tools only (never `Edit`/`Read`/`Write` on `.sx` files). `sx_write_file` for new files, path/pattern edits for changes.
- **Architecture:** Lua source → Lua AST → SX AST → CEK. No standalone Lua evaluator.
- **Commits:** one feature per commit. Keep `## Progress log` updated (dated entries, newest first) and tick boxes in the roadmap.
## Architecture sketch
```
Lua source text
lib/lua/tokenizer.sx — numbers, strings (short + long [[…]]), idents, ops, comments
lib/lua/parser.sx — Lua AST as SX trees, e.g. (lua-for-num i a b c body)
lib/lua/transpile.sx — Lua AST → SX AST (entry: lua-eval-ast)
existing CEK / VM
```
Runtime shims in `lib/lua/runtime.sx`: `lua-truthy?`, string coercion for `..`/arithmetic, table ops (array + hash part), metatable dispatch, `pcall`/`error` bridge, `string`/`math`/`table` libs.
## Roadmap
Each item: implement → tests → tick box → update progress log.
### Phase 1 — tokenizer + parser
- [x] Tokenizer: numbers (int, float, hex), strings (short + long `[[…]]`), idents, keywords, operators, comments (`--`, `--[[…]]`)
- [x] Parser: blocks, `local`, `if/elseif/else/end`, `while`, numeric `for`, `function`, `return`, expressions, table constructors, indexing (`.`, `[]`), calls (`f(…)`, `f:m(…)`)
- [x] Skip for phase 1: generic `for … in …`, goto/labels, nested varargs `...`
- [x] Unit tests in `lib/lua/tests/parse.sx`: source → expected AST
### Phase 2 — transpile: control flow + arithmetic
- [x] `lua-eval-ast` entry
- [x] Arithmetic (Lua 5.1 semantics — `/` is float)
- [x] Comparison + logical (short-circuit, Lua truthy)
- [x] `..` concat with string/number coercion
- [x] `if`, `while`, numeric `for`, `local`, assignment, blocks
- [x] 30+ eval tests in `lib/lua/tests/eval.sx`
### Phase 3 — tables + functions + first PUC-Rio slice
- [x] `function` (anon, local, top-level), closures
- [x] Multi-return: return as list, unpack at call sites
- [x] Table constructors (array + hash + computed keys)
- [x] Raw table access `t.k` / `t[k]` (no metatables yet)
- [x] Vendor PUC-Rio 5.1.5 suite to `lib/lua/lua-tests/` (just `.lua` files)
- [x] `lib/lua/conformance.sh` + Python runner (model on `lib/js/test262-runner.py`)
- [x] `scoreboard.json` + `scoreboard.md` baseline
### Phase 4 — metatables + error handling (next run)
- [x] Metatable dispatch: `__index`, `__newindex`, `__add`/`__sub`/…, `__eq`, `__lt`, `__call`, `__tostring`, `__len`
- [x] `pcall`/`xpcall`/`error` via handler-bind
- [x] Generic `for … in …`
### Phase 5 — coroutines (the showcase)
- [x] `coroutine.create`/`.resume`/`.yield`/`.status`/`.wrap` via `perform`/`cek-resume`
### Phase 6 — standard library
- [x] `string``format`, `sub`, `find`, `match`, `gmatch`, `gsub`, `len`, `rep`, `upper`, `lower`, `byte`, `char`
- [ ] `math` — full surface
- [ ] `table``insert`, `remove`, `concat`, `sort`, `unpack`
- [ ] `io` — minimal stub (read/write to SX IO surface)
- [ ] `os` — time/date subset
### Phase 7 — modules + full conformance
- [ ] `require` / `package` via SX `define-library`/`import`
- [ ] Drive PUC-Rio scoreboard to 100%
## Progress log
_Newest first. Agent appends on every commit._
- 2026-04-24: lua: `string` lib — len/upper/lower/rep/sub (1-idx + neg)/byte/char/find/match/gmatch/gsub/format. Patterns are literal-only (no `%d`/etc.); format is `%s`/`%d`/`%f`/`%%` only. `string.char` uses printable-ASCII lookup + tab/nl/cr. 292 tests.
- 2026-04-24: lua: phase 5 — coroutines (create/resume/yield/status/wrap) via `call/cc` (perform/cek-resume not exposed to SX userland). Handles multi-yield + final return + arg passthrough. Fix: body's final return must jump via `caller-k` to the **current** resume's caller, not unwind through the stale first-call continuation. 273 tests.
- 2026-04-24: lua: generic `for … in …` — parser split (`=` → num, else `in`), new `lua-for-in` node, transpile to `let`-bound `f,s,var` + recursive `__for_loop`. Added `ipairs`/`pairs`/`next`/`lua-arg` globals. Lua fns now arity-tolerant (`&rest __args` + indexed bind) — needed because generic for always calls iter with 2 args. Noted early-return-in-nested-block as pre-existing limitation. 265 tests.
- 2026-04-24: lua: `pcall`/`xpcall`/`error` via SX `guard` + `raise`. Added `lua-apply` (arity-dispatch 0-8, apply fallback) because SX `apply` re-wraps raises as "Unhandled exception". Table payloads preserved (`error({code = 42})`). 256 total tests.
- 2026-04-24: lua: phase 4 — metatable dispatch (`__index`/`__newindex`/arith/compare/`__call`/`__len`), `setmetatable`/`getmetatable`/`type` globals, OO `self:method` pattern. Transpile routes all calls through `lua-call` (stashed `sx-apply-ref` to dodge user-shadowing of SX `apply`). Skipped `__tostring` (needs `tostring()` builtin). 247 total tests.
- 2026-04-24: lua: PUC-Rio scoreboard baseline — 0/16 runnable pass (0.0%). Top modes: 14× parse error, 1× `print` undef, 1× vararg transpile. Phase 3 complete.
- 2026-04-24: lua: conformance runner — `conformance.sh` shim + `conformance.py` (long-lived sx_server, epoch protocol, classify_error, writes scoreboard.{json,md}). 24 files classified in full run: 8 skip / 16 fail / 0 timeout.
- 2026-04-24: lua: vendored PUC-Rio 5.1 test suite (lua5.1-tests.tar.gz from lua.org) to `lib/lua/lua-tests/` — 22 .lua files, 6304 lines; README kept for context.
- 2026-04-24: lua: raw table access — fix `lua-set!` to use `dict-set!` (mutating), fix `lua-len` `has?``has-key?`, `#t` works, mutation/chained/computed-key writes + reference semantics. 224 total tests.
- 2026-04-24: lua: phase 3 — table constructors verified (array, hash, computed keys, mixed, nested, dynamic values, fn values, sep variants). 205 total tests.
- 2026-04-24: lua: multi-return — `lua-multi` tagged value, `lua-first`/`lua-nth-ret`/`lua-pack-return` runtime, tail-position spread in return/local/assign. 185 total tests.
- 2026-04-24: lua: phase 3 — functions (anon/local/top-level) + closures verified (lexical capture, mutation-through-closure, recursion, HOFs). 175 total tests.
- 2026-04-24: lua: phase 2 transpile — arithmetic, comparison, short-circuit logical, `..` concat, if/while/repeat/for-num/local/assign. 157 total tests green.
- 2026-04-24: lua: parser (exprs with precedence, all phase-1 statements, funcbody, table ctors, method/chained calls) — 112 total tokenizer+parser tests
- 2026-04-24: lua: tokenizer (numbers/strings/long-brackets/keywords/ops/comments) + 56 tests
## Blockers
_Shared-file issues that need someone else to fix. Minimal repro only._
- _(none yet)_
## Known limitations (own code, not shared)
- **`string.find`/`match`/`gmatch`/`gsub` patterns are LITERAL only** — no `%d`/`%a`/`.`/`*`/`+`/etc. Implementing Lua patterns is a separate work item; literal search covers the common case.
- **`string.format`** supports only `%s`, `%d`, `%f`, `%%`. No width/precision flags (`%.2f`, `%5d`).
- **`string.char`** supports printable ASCII 32126 plus `\t`/`\n`/`\r`; other codes error.
- **Early `return` inside nested block** — `if cond then return nil end ...rest` doesn't exit the enclosing function; `rest` runs anyway. Use `if cond then return X else return Y end` instead. Likely needs guard+raise sentinel for proper fix.