Commit Graph

5 Commits

Author SHA1 Message Date
7a94a47e26 datalog: quoted 'atoms' tokenize as strings
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 23s
Quoted atoms with uppercase- or underscore-leading names were
misclassified as variables. `p('Hello World').` flowed through the
tokenizer's "atom" branch and through the parser's string->symbol,
producing a symbol named "Hello World". dl-var? inspects the first
character — "H" is uppercase, so the fact was rejected as non-ground
("expected ground literal").

Tokenizer now emits "string" for any '...' quoted form. Quoted atoms
become opaque string constants — matching how Datalog idiomatically
treats them, and avoiding a per-symbol "quoted" marker that would
have rippled through unification and dl-var?. The trade-off is that
'a' and a are no longer the same value (string vs symbol); for
Datalog this is the safer default.

Updated the existing "quoted atom" tokenize test, added a regression
case for an uppercase-named quoted atom, and a parse-level test that
verifies the AST. Conformance 269/269.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 08:39:24 +00:00
3a1ecaa362 datalog: tokenizer raises on unexpected characters (256/256)
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 45s
Bug: characters not recognised by any branch of `scan!` (`?`,
`!`, `#`, `@`, `&`, `|`, `\\`, `^`, etc.) were silently consumed
via `(else (advance! 1) (scan!))`. Programs with typos would
parse to a stripped version of themselves with no warning —
`?(X).` became `(X).` and produced confusing downstream errors.

Fix: the else branch now raises a clear "unexpected character"
error with the offending char and its position.

1 new tokenize test.
2026-05-10 21:17:07 +00:00
5bcda5c88c datalog: tokenizer raises on unterminated string + quoted atom (251/251)
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 35s
Bug: read-quoted ran to EOF silently when the closing quote was
missing. The token's value was whatever ran-to-end string had been
accumulated; the parser later saw an unexpected EOF, but the error
message blamed the wrong location ("expected `)` got eof") and
hid the real problem.

Fix: read-quoted now raises with a message that distinguishes
strings from quoted atoms, including the position where the
opening quote was lost. The escape-sequence handling and proper
closing are unaffected.

2 new tokenize tests.
2026-05-10 21:05:28 +00:00
4b5e75dc3e datalog: tokenizer raises on unterminated block comment (249/249)
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 43s
Bug: `/* unclosed` was silently consumed to EOF, swallowing any
Datalog code that followed inside the (never-closing) comment.
Programs would produce empty parses with no error.

Fix: skip-block-comment! now raises when it hits EOF without
finding `*/`. Error message includes the position where the
problem was first detected. Line comments (`%`) and properly
closed block comments (`/* ... */`) are unaffected.

1 new tokenize test verifies the error path.
2026-05-10 20:59:33 +00:00
9bd6bbb7e7 datalog: tokenizer + 26 tests (Phase 1)
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 54s
Tokens {:type :value :pos} for atoms, vars, numbers, strings, punct, ops.
Operators :- ?- <= >= != < > = + - * /. Comments % and /* */.
2026-05-07 23:05:59 +00:00