go: lex.sx — keywords, ident/int/string/rune lits, comments, ops, ASI + 78 tests [consumes-lex]
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 23s

First Go-on-SX iteration. Tokenizer consumes lib/guest/lex.sx character-class
predicates. Automatic semicolon insertion per Go spec § Semicolons fires on
newline, EOF, and block comments containing a newline, after
ident/int/string/rune/{break,continue,fallthrough,return}/{++,--,),],}}.
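
The insertion rule above can be sketched in a few lines. This is a minimal Python illustration, not the code in lib/go/lex.sx: the `(kind, text)` token shape and the names `triggers_semicolon` and `insert_semicolons` are assumptions made for the example.

```python
# Hypothetical token shape: (kind, text) tuples. These names are illustrative
# and are not taken from lib/go/lex.sx.
TRIGGER_KINDS = {"ident", "int", "string", "rune"}
TRIGGER_KEYWORDS = {"break", "continue", "fallthrough", "return"}
TRIGGER_OPS = {"++", "--", ")", "]", "}"}


def triggers_semicolon(tok):
    """True if a line ending right after `tok` should receive a ';'."""
    if tok is None:
        return False
    kind, text = tok
    if kind in TRIGGER_KINDS:
        return True
    if kind == "keyword":
        return text in TRIGGER_KEYWORDS
    if kind == "op":
        return text in TRIGGER_OPS
    return False


def insert_semicolons(tokens):
    """Return tokens with ';' inserted; newlines and comments are dropped."""
    out = []
    last = None  # last significant token seen on the current line
    for kind, text in tokens:
        line_break = kind == "newline" or (kind == "block_comment" and "\n" in text)
        if line_break:
            if triggers_semicolon(last):
                out.append(("op", ";"))
            last = None  # avoid a second ';' on consecutive blank lines
        elif kind in ("line_comment", "block_comment"):
            continue  # a comment without a newline does not end the line
        else:
            out.append((kind, text))
            last = (kind, text)
    if triggers_semicolon(last):  # EOF behaves like a final newline
        out.append(("op", ";"))
    return out


demo = [
    ("ident", "x"), ("op", ":="), ("int", "1"), ("newline", "\n"),
    ("keyword", "if"), ("ident", "x"), ("op", "{"), ("newline", "\n"),
    ("keyword", "return"), ("block_comment", "/* a\n b */"), ("op", "}"),
]
print(" ".join(text for _, text in insert_semicolons(demo)))
# prints: x := 1 ; if x { return ; } ;
```

In the demo, the newline after `{` inserts nothing because `{` is not a trigger token, the block comment spanning two lines acts as a newline after `return`, and EOF supplies the final `;` after `}`.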

Scoreboard + conformance.sh wired; lex 78/78. Plan Phase 1 sub-items
checked; floats/raw-strings/hex-ints still open.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 21:13:06 +00:00
parent 0f7444e0d5
commit 4fc73a97f4
6 changed files with 757 additions and 10 deletions


@@ -131,16 +131,21 @@ Loop-style. Each phase: implement → test → commit → tick `[ ]` → append
Progress-log line → push `origin/loops/go`.
### Phase 1 — Tokenizer (`lib/go/lex.sx`) ⬜
- Consume `lib/guest/core/lex.sx`. Tag the chisel note `consumes-lex`.
- Keywords (25), operators + punctuation (47 distinct), identifiers,
literals (int / float / imaginary / rune / string with raw + interpreted
variants), comments.
- **Automatic semicolon insertion** — the one tricky bit. Newline becomes
`;` after identifier/literal/`)`/`]`/`}` per Go spec § Semicolons. Build
it into the tokenizer, not the parser.
- Tests: golden-token streams for every keyword/operator/literal kind +
ASI edge cases.
- **Acceptance:** lex/ suite at 50+ tests.
- [x] Scaffold + scoreboard + conformance runner (consumes lib/guest/lex.sx)
- [x] Identifiers + 25 keywords
- [x] Decimal integer literals
- [x] Interpreted string literals `"..."` with `\n \t \r \\ \" \'` escapes
- [x] Rune literals `'x'` (single char + simple escapes)
- [x] Line + block comments (block w/ newline triggers ASI)
- [x] Common operator/punct set incl. `:= <- ++ -- == != <= >= && || ...`
- [x] **Automatic semicolon insertion** (Go spec § Semicolons) — newline,
EOF, and block-comment-with-newline trigger `;` after
ident/int/string/rune/{break,continue,fallthrough,return}/{++,--,),],}}.
- [ ] Float / imaginary literals
- [ ] Raw string literals `` `...` ``
- [ ] Hex/octal/binary integer literals (0x… 0o… 0b…) + underscores
- [ ] Full operator set audit (47 distinct per Go spec)
- **Acceptance:** lex/ suite at 50+ tests. Current: 78/78.
### Phase 2 — Parser (`lib/go/parse.sx`) ⬜
- Consume `lib/guest/core/pratt.sx` + `lib/guest/core/ast.sx`. Chisel notes
@@ -399,6 +404,11 @@ _(none yet)_
_Newest first. Append one dated entry per commit._
- 2026-05-26 — Phase 1 first slice: `lib/go/lex.sx` tokenizer consuming
`lib/guest/lex.sx` predicates. 25 keywords, ident/int/string/rune lits,
line+block comments, common operators, automatic semicolon insertion per
Go spec § Semicolons (newline / EOF / block-comment-with-newline triggers).
Scoreboard + conformance.sh wired. 78/78 tests. `[consumes-lex]`.
- 2026-05-26 — Plan rewritten to integrate the lib/guest framework
(chiselling discipline, sister plans for scheduler + bidirectional
types, type-checker phase added, conformance scoreboard model adopted).