ocaml: phase 1 tokenizer (+58 tests) — consumes lib/guest/lex.sx
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 52s
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 52s
Idents, ctors, 51 keywords, numbers (int/float/hex/exp/underscored), strings + chars with escapes, type variables, 26 op/punct tokens, and nested (* ... *) block comments. Tests via epoch protocol against sx_server.exe.
This commit is contained in:
@@ -116,14 +116,15 @@ SX CEK evaluator (both JS and OCaml hosts)
|
||||
|
||||
### Phase 1 — Tokenizer + parser
|
||||
|
||||
- [ ] **Tokenizer:** keywords (`let`, `rec`, `in`, `fun`, `function`, `match`, `with`,
|
||||
- [x] **Tokenizer:** keywords (`let`, `rec`, `in`, `fun`, `function`, `match`, `with`,
|
||||
`type`, `of`, `module`, `struct`, `end`, `functor`, `sig`, `open`, `include`,
|
||||
`if`, `then`, `else`, `begin`, `try`, `exception`, `raise`, `mutable`,
|
||||
`for`, `while`, `do`, `done`, `and`, `as`, `when`), operators (`->`, `|>`,
|
||||
`<|`, `@@`, `@`, `:=`, `!`, `::`, `**`, `:`, `;`, `;;`), identifiers (lower,
|
||||
upper/ctor, labels `~label:`, optional `?label:`), char literals `'c'`,
|
||||
string literals (escaped + heredoc `{|...|}`), int/float literals,
|
||||
line comments `(*` nested block comments `*)`.
|
||||
upper/ctor), char literals `'c'`, string literals (escaped),
|
||||
int/float literals (incl. hex, exponent, underscores), nested block
|
||||
comments `(* ... *)`. _(labels `~label:` / `?label:` and heredoc `{|...|}`
|
||||
deferred — surface tokens already work via `~`/`?` punct + `{`/`|` punct.)_
|
||||
- [ ] **Parser:** top-level `let`/`let rec`/`type`/`module`/`exception`/`open`/`include`
|
||||
declarations; expressions: literals, identifiers, constructor application,
|
||||
lambda, application (left-assoc), binary ops with precedence table,
|
||||
@@ -308,7 +309,12 @@ the "mother tongue" closure: OCaml → SX → OCaml. This means:
|
||||
|
||||
_Newest first._
|
||||
|
||||
_(awaiting phase 1)_
|
||||
- 2026-05-07 Phase 1 — `lib/ocaml/tokenizer.sx` consuming `lib/guest/lex.sx`
|
||||
via `prefix-rename`. Covers idents, ctors, 51 keywords, numbers (int / float
|
||||
/ hex / exponent / underscored), strings (with escapes), chars (with escapes),
|
||||
type variables (`'a`), nested block comments, and 26 operator/punct tokens
|
||||
(incl. `->` `|>` `<-` `:=` `::` `;;` `@@` `<>` `&&` `||` `**` etc.). 58/58
|
||||
tokenizer tests pass via `lib/ocaml/test.sh` driving `sx_server.exe`.
|
||||
|
||||
## Blockers
|
||||
|
||||
|
||||
Reference in New Issue
Block a user