datalog: quoted 'atoms' tokenize as strings
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 23s
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 23s
Quoted atoms with uppercase- or underscore-leading names were
misclassified as variables. `p('Hello World').` flowed through the
tokenizer's "atom" branch and through the parser's string->symbol,
producing a symbol named "Hello World". dl-var? inspects the first
character — "H" is uppercase, so the fact was rejected as non-ground
("expected ground literal").
Tokenizer now emits "string" for any '...' quoted form. Quoted atoms
become opaque string constants — matching how Datalog idiomatically
treats them, and avoiding a per-symbol "quoted" marker that would
have rippled through unification and dl-var?. The trade-off is that
'a' and a are no longer the same value (string vs symbol); for
Datalog this is the safer default.
Updated the existing "quoted atom" tokenize test, added a regression
case for an uppercase-named quoted atom, and a parse-level test that
verifies the AST. Conformance 269/269.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -106,6 +106,13 @@
|
||||
"string arg"
|
||||
(dl-parse "label(x, \"hi\").")
|
||||
(list {:body (list) :head (list (quote label) (quote x) "hi")}))
|
||||
;; Quoted 'atoms' parse as strings — a uppercase-starting name
|
||||
;; in quotes used to misclassify as a variable and reject the
|
||||
;; fact as non-ground.
|
||||
(dl-pt-test!
|
||||
"quoted atom arg parses as string"
|
||||
(dl-parse "p('Hello World').")
|
||||
(list {:body (list) :head (list (quote p) "Hello World")}))
|
||||
(dl-pt-test!
|
||||
"comparison literal"
|
||||
(dl-parse "p(X) :- <(X, 5).")
|
||||
|
||||
@@ -58,14 +58,23 @@
|
||||
"string"
|
||||
(dl-tk-values (dl-tokenize "\"hello\""))
|
||||
(list "hello" nil))
|
||||
;; Quoted 'atoms' tokenize as strings — see the type-table
|
||||
;; comment in lib/datalog/tokenizer.sx for the rationale.
|
||||
(dl-tk-test!
|
||||
"quoted atom"
|
||||
"quoted atom as string"
|
||||
(dl-tk-types (dl-tokenize "'two words'"))
|
||||
(list "atom" "eof"))
|
||||
(list "string" "eof"))
|
||||
(dl-tk-test!
|
||||
"quoted atom value"
|
||||
(dl-tk-values (dl-tokenize "'two words'"))
|
||||
(list "two words" nil))
|
||||
;; A quoted atom whose name would otherwise be a variable
|
||||
;; (uppercase / leading underscore) is now safely a string —
|
||||
;; this was the bug that motivated the type change.
|
||||
(dl-tk-test!
|
||||
"quoted Uppercase as string"
|
||||
(dl-tk-types (dl-tokenize "'Hello'"))
|
||||
(list "string" "eof"))
|
||||
(dl-tk-test! ":-" (dl-tk-values (dl-tokenize ":-")) (list ":-" nil))
|
||||
(dl-tk-test! "?-" (dl-tk-values (dl-tokenize "?-")) (list "?-" nil))
|
||||
(dl-tk-test! "<=" (dl-tk-values (dl-tokenize "<=")) (list "<=" nil))
|
||||
|
||||
Reference in New Issue
Block a user