datalog: tokenizer raises on unexpected characters (256/256)
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 45s
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 45s
Bug: characters not recognised by any branch of `scan!` (`?`, `!`, `#`, `@`, `&`, `|`, `\\`, `^`, etc.) were silently consumed via `(else (advance! 1) (scan!))`. Programs with typos would parse to a stripped version of themselves with no warning — `?(X).` became `(X).` and produced confusing downstream errors. Fix: the else branch now raises a clear "unexpected character" error with the offending char and its position. 1 new tokenize test.
This commit is contained in:
@@ -118,6 +118,18 @@
|
||||
"block comment"
|
||||
(dl-tk-types (dl-tokenize "/* a\nb */ x."))
|
||||
(list "atom" "punct" "eof"))
|
||||
;; Unexpected characters surface at tokenize time rather
|
||||
;; than being silently consumed (previously `?(X)` parsed as
|
||||
;; if the leading `?` weren't there).
|
||||
(dl-tk-test!
|
||||
"unexpected char raises"
|
||||
(let ((threw false))
|
||||
(do
|
||||
(guard (e (#t (set! threw true)))
|
||||
(dl-tokenize "?(X)"))
|
||||
threw))
|
||||
true)
|
||||
|
||||
;; Unterminated string / quoted-atom must raise.
|
||||
(dl-tk-test!
|
||||
"unterminated string raises"
|
||||
|
||||
Reference in New Issue
Block a user