go: lex.sx — operator-set audit + tilde; PHASE 1 COMPLETE + 6 tests [proposes-lex]

Adds the missing tilde operator '~' (Go 1.18+ generics type-set
constraint, e.g. 'interface { ~int | ~float64 }') to the longest-match
operator table. Adds an exhaustive 'op-audit:' test block covering
every Go operator/punctuation token by category — arithmetic +
assignment, bitwise + assignment, comparison + logical, decls /
arrows / variadic / inc-dec, punctuation, and tilde.

Phase 1 (tokenizer) is now complete. Two kit gaps surfaced; both are
logged under Blockers in plans/go-on-sx.md for the substrate maintainer /
next statically-typed guest loop:

  * lib/guest/lex.sx lacks lex-oct-digit? / lex-bin-digit?
    (we rolled local gl-* equivalents for 0o.. and 0b.. literals).
  * lib/guest/lex.sx lacks a table-driven longest-prefix operator
    matcher; our gl-match-op is a 25-clause cond ladder. Rust/Swift/TS
    will each hit the same shape with 50+ ops apiece.

lex 129/129. Phase 2 (parser) next.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 07:28:50 +00:00
parent 65467c232b
commit c1baca2e4e
5 changed files with 68 additions and 12 deletions

@@ -130,7 +130,7 @@ Suites planned:
Loop-style. Each phase: implement → test → commit → tick `[ ]` → append
Progress-log line → push `origin/loops/go`.
### Phase 1 — Tokenizer (`lib/go/lex.sx`)
### Phase 1 — Tokenizer (`lib/go/lex.sx`)
- [x] Scaffold + scoreboard + conformance runner (consumes lib/guest/lex.sx)
- [x] Identifiers + 25 keywords
- [x] Decimal integer literals
@@ -148,8 +148,10 @@ Progress-log line → push `origin/loops/go`.
as interpreted strings)
- [x] Hex/octal/binary integer literals (0x… 0o… 0b…) + underscores
(legacy 0123 octal also accepted; consumes lex-hex-digit?)
- [ ] Full operator set audit (47 distinct per Go spec)
- **Acceptance:** lex/ suite at 50+ tests. Current: 123/123.
- [x] Full operator set audit (47 distinct per Go spec, plus `~` for
generics type-sets). Exhaustive coverage tests in `op-audit:` block.
- **Acceptance:** lex/ suite at 50+ tests. Current: 129/129. **Phase 1
done** — hex floats deferred (rare). Move to Phase 2 next.
### Phase 2 — Parser (`lib/go/parse.sx`) ⬜
- Consume `lib/guest/core/pratt.sx` + `lib/guest/core/ast.sx`. Chisel notes
@@ -402,12 +404,35 @@ Every commit ends its message with a chisel note in brackets:
## Blockers
_(none yet)_
### Kit-gap proposals against `lib/guest/lex.sx`
Observed from building the Go tokenizer. Not blocking Phase 2; surfaced
here for the substrate-maintainer / next statically-typed-guest loop:
1. **No `lex-oct-digit?` / `lex-bin-digit?`.** Go's prefixed integer forms
`0o17` and `0b1010` need digit-class predicates that the kit doesn't
provide. We rolled local `gl-oct-digit?` and `gl-bin-digit?`. The Rust and
Swift lexers will need the same predicates. Cheap to promote into the kit.
2. **No table-driven longest-prefix matcher.** Go has 47+ operator
sequences with longest-match semantics. Our `gl-match-op` is a
25-clause `cond` ladder; Rust/Swift/TS will each need 50 or more
clauses. A kit helper like `(lex-match-longest TABLE SOURCE POS)` that
takes a prefix table sorted longest-first would collapse the ladder into
a single table scan. Worth proposing once a second statically-typed
guest hits the same pattern.
Minimal repro: see `lib/go/lex.sx#gl-oct-digit?` and `#gl-match-op`.
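The two proposals above can be sketched as follows. This is written in Go purely for illustration and is not the kit's SX implementation: `isOctDigit`, `isBinDigit`, `newOpTable`, and `matchLongest` are hypothetical names standing in for the proposed `lex-oct-digit?`, `lex-bin-digit?`, and `lex-match-longest`, and `goOps` is a deliberately partial operator table.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// isOctDigit and isBinDigit stand in for the proposed digit-class predicates.
func isOctDigit(c byte) bool { return c >= '0' && c <= '7' }
func isBinDigit(c byte) bool { return c == '0' || c == '1' }

// goOps is a partial operator table, enough to exercise shared prefixes.
var goOps = []string{
	"<", "<<", "<<=", "<=", "<-",
	"&", "&&", "&^", "&^=", "&=",
	"+", "++", "+=", "~",
}

// newOpTable returns a copy of ops sorted longest-first, so the first
// prefix hit during a linear scan is also the longest match.
func newOpTable(ops []string) []string {
	t := append([]string(nil), ops...)
	sort.SliceStable(t, func(i, j int) bool { return len(t[i]) > len(t[j]) })
	return t
}

// matchLongest returns the longest operator in table that is a prefix of
// src[pos:], or "" when no operator starts at pos.
func matchLongest(table []string, src string, pos int) string {
	if pos < 0 || pos > len(src) {
		return ""
	}
	rest := src[pos:]
	for _, op := range table {
		if strings.HasPrefix(rest, op) {
			return op
		}
	}
	return ""
}

func main() {
	table := newOpTable(goOps)
	fmt.Println(matchLongest(table, "x <<= 1", 2)) // <<=
	fmt.Println(matchLongest(table, "a &^ b", 2))  // &^
	fmt.Println(isOctDigit('7'), isOctDigit('8'))  // true false
	fmt.Println(isBinDigit('1'), isBinDigit('2'))  // true false
}
```

Sorting the table once up front keeps the matcher itself free of per-operator logic, so adding an operator such as `~` becomes a one-line table change rather than a new `cond` clause.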
## Progress log
_Newest first. Append one dated entry per commit._
- 2026-05-27 — **Phase 1 complete.** Operator-set audit: added missing
`~` (Go 1.18+ generics type-set), exhaustive op coverage tests grouped
by category. Two kit gaps observed and logged in Blockers:
`lex-oct-digit?`/`lex-bin-digit?` predicates + `lex-match-longest`
table-driven prefix matcher — both useful for future statically-typed
guests. +6 tests, lex 129/129. `[proposes-lex]`. Phase 2 (parser) next.
- 2026-05-27 — Phase 1 cont.: raw string literals (backtick-delimited).
Multi-line, no escape processing, `\r` stripped per Go spec § String
literals. Same `"string"` token type as interpreted strings — parsers