Files
rose-ash/plans/haskell-completeness.md
giles 4a35998469
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 49s
haskell: Phase 7 string=[Char] — O(1) string-view head/tail + chr/ord/toUpper/toLower/++ (+35 tests, 810/810)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-06 19:44:19 +00:00

305 lines
16 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Haskell-on-SX: completeness roadmap (Phases 716)
Continuation of `plans/haskell-on-sx.md`. Phases 16 are complete (156/156
conformance tests, 18 programs, 775 total hk-on-sx tests). This document covers
the next ten features toward a more complete Haskell 98 subset.
## Scope decisions (unchanged from haskell-on-sx.md)
- Haskell 98 subset only. No GHC extensions.
- All work lives in `lib/haskell/**` and this file. Nothing else.
- SX files: `sx-tree` MCP tools only.
- One feature per commit. Keep `## Progress log` updated.
## String-view design note
Haskell defines `type String = [Char]`. Representing that naively as a linked
cons-spine makes `length`, `++`, and `take` O(n) in allocation — unacceptable
for string-processing programs. The design uses **string views** implemented as
pure-SX dicts, requiring no OCaml changes.
### Representation
A string view is a dict `{:hk-str buf :hk-off n}` where `buf` is a native SX
string and `n` is the current offset (zero-based code-unit index). Native SX
strings also satisfy the predicate (offset = 0 implicitly).
- `hk-str?` returns true for both native strings and string-view dicts.
- `hk-str-head v` extracts the character at offset `n` as an integer (ord value).
- `hk-str-tail v` returns a new view with offset `n+1`; O(1).
- `hk-str-null? v` is true when offset equals the string's length.
### Char = integer
`Char` is represented as a plain integer (its Unicode code point / ord value).
`chr n` converts back to a single-character string for display and `++`. `ord c`
is the identity (the integer itself). `toUpper`/`toLower` operate on the integer,
looking up ASCII ranges. This is already consistent with the existing `ord 'A' =
65` tests.
### Pattern matching
In `match.sx`, the cons-pattern branch (`":"` constructor) checks `hk-str?` on
the scrutinee **before** the normal tagged-list path. When the scrutinee is a
string view (or native string), decompose as:
- head → `hk-str-head` (an integer char-code)
- tail → `hk-str-tail` (a new string view, or `(list "[]")` if exhausted)
The nil-pattern `"[]"` matches when `hk-str-null?` is true.
### Complexity
- `head s` / `tail s` — O(1) via view shift
- `s !! n` — O(n) (n tail calls)
- `(c:s)` construction — O(n) for full `[Char]` construction (same as real Haskell)
- `++` on two strings — native `str` concat, O(length left)
- `length` — O(n); `words`/`lines` — O(n)
No OCaml changes are needed. The view type is fully representable as an SX dict.
## Ground rules
- **Scope:** only `lib/haskell/**` and `plans/haskell-completeness.md`. No edits
to `spec/`, `hosts/`, `shared/`, other `lib/<lang>/` dirs, or `lib/` root.
- **SX files:** `sx-tree` MCP tools only. `sx_validate` after every edit.
- **Commits:** one feature per commit. Keep `## Progress log` updated.
- **Tests:** `bash lib/haskell/test.sh` must be green before any commit. After
adding new programs, run `bash lib/haskell/conformance.sh` and commit the
updated `scoreboard.md`.
- **Conformance programs:** WebFetch from 99 Haskell Problems or Rosetta Code.
Adapt minimally (no GHC extensions). Cite the source URL in the file header.
Add to `conformance.sh` PROGRAMS array.
- **NEVER call `sx_build`.** If sx_server binary broken → Blockers entry, stop.
## Roadmap
### Phase 7 — String = [Char] (performant string views)
- [x] Add `hk-str?` predicate to `runtime.sx` covering both native SX strings
and `{:hk-str buf :hk-off n}` view dicts.
- [x] Implement `hk-str-head`, `hk-str-tail`, `hk-str-null?` helpers in
`runtime.sx`.
- [x] In `match.sx`, intercept cons-pattern `":"` when scrutinee satisfies
`hk-str?`; decompose to (char-int, view) instead of the tagged-list path.
Nil-pattern `"[]"` matches `hk-str-null?`.
- [x] Add builtins: `chr` (int → single-char string), verify `ord` returns int,
`toUpper`, `toLower` (ASCII range arithmetic on ints).
- [x] Ensure `++` between two strings concatenates natively via `str` rather
than building a cons spine.
- [x] Tests in `lib/haskell/tests/string-char.sx` (≥ 15 tests: head/tail on
string literal, map over string, filter chars, chr/ord roundtrip, toUpper,
toLower, null/empty string view).
- [ ] Conformance programs (WebFetch + adapt):
- `caesar.hs` — Caesar cipher. Exercises `map`, `chr`, `ord`, `toUpper`,
`toLower` on characters.
- `runlength-str.hs` — run-length encoding on a String. Exercises string
pattern matching, `span`, character comparison.
### Phase 8 — `show` for arbitrary types
- [ ] Audit `hk-show-val` in `runtime.sx` — ensure output format matches
Haskell 98: `"Just 3"`, `"[1,2,3]"`, `"(True,False)"`, `"'a'"` (Char shows
with single-quotes), `"\"hello\""` (String shows with escaped double-quotes).
- [ ] `show` Prelude binding calls `hk-show-val`; `print x = putStrLn (show x)`.
- [ ] `deriving Show` auto-generates proper show for record-style and
multi-constructor ADTs. Nested application arguments wrapped in parens:
if `show arg` contains a space, emit `"(" ++ show arg ++ ")"`.
- [ ] `showsPrec` / `showParen` stubs so hand-written Show instances compile.
- [ ] `Read` class stub — just enough for `reads :: String -> [(a,String)]` to
type-check; no real parser needed yet.
- [ ] Tests in `lib/haskell/tests/show.sx` (≥ 12 tests: show Int, show Bool,
show Char, show String, show list, show tuple, show Maybe, show custom ADT,
deriving Show on multi-constructor type, nested constructor parens).
- [ ] Conformance programs:
- `showadt.hs``data Expr = Lit Int | Add Expr Expr | Mul Expr Expr`
with `deriving Show`; prints a tree.
- `showio.hs``print` on various types in a `do` block.
### Phase 9 — `error` / `undefined`
- [ ] `error :: String -> a` — raises `(raise (list "hk-error" msg))` in SX.
- [ ] `undefined :: a` = `error "Prelude.undefined"`.
- [ ] Partial functions emit proper error messages: `head []`
`"Prelude.head: empty list"`, `tail []``"Prelude.tail: empty list"`,
`fromJust Nothing``"Maybe.fromJust: Nothing"`.
- [ ] Top-level `hk-run-io` catches `hk-error` tag and returns it as a tagged
error result so test suites can inspect it without crashing.
- [ ] `hk-test-error` helper in `testlib.sx`:
`(hk-test-error "desc" thunk expected-substring)` — asserts the thunk raises
an `hk-error` whose message contains the given substring.
- [ ] Tests in `lib/haskell/tests/errors.sx` (≥ 10 tests: error message
content, undefined, head/tail/fromJust on bad input, `hk-test-error` helper).
- [ ] Conformance programs:
- `partial.hs` — exercises `head []`, `tail []`, `fromJust Nothing` caught
at the top level; shows error messages.
### Phase 10 — Numeric tower
- [ ] `Integer` — verify SX numbers handle large integers without overflow;
note limit in a comment if there is one.
- [ ] `fromIntegral :: (Integral a, Num b) => a -> b` — identity in our runtime
(all numbers share one SX type); register as a builtin no-op with the correct
typeclass signature.
- [ ] `toInteger`, `fromInteger` — same treatment.
- [ ] Float/Double literals round-trip through `hk-show-val`:
`show 3.14 = "3.14"`, `show 1.0e10 = "1.0e10"`.
- [ ] Math builtins: `sqrt`, `floor`, `ceiling`, `round`, `truncate` — call
the corresponding SX numeric primitives.
- [ ] `Fractional` typeclass stub: `(/)`, `recip`, `fromRational`.
- [ ] `Floating` typeclass stub: `pi`, `exp`, `log`, `sin`, `cos`, `(**)`
(power operator, maps to SX exponentiation).
- [ ] Tests in `lib/haskell/tests/numeric.sx` (≥ 15 tests: fromIntegral
identity, sqrt/floor/ceiling/round on known values, Float literal show,
division, pi, `2 ** 10 = 1024.0`).
- [ ] Conformance programs:
- `statistics.hs` — mean, variance, std-dev on a `[Double]`. Exercises
`fromIntegral`, `sqrt`, `/`.
- `newton.hs` — Newton's method for square root. Exercises `Float`, `abs`,
iteration.
### Phase 11 — Data.Map
- [ ] Implement a weight-balanced BST in pure SX in `lib/haskell/map.sx`.
Internal node representation: `("Map-Node" key val left right size)`.
Leaf: `("Map-Empty")`.
- [ ] Core operations: `empty`, `singleton`, `insert`, `lookup`, `delete`,
`member`, `size`, `null`.
- [ ] Bulk operations: `fromList`, `toList`, `toAscList`, `keys`, `elems`.
- [ ] Combining: `unionWith`, `intersectionWith`, `difference`.
- [ ] Transforming: `foldlWithKey`, `foldrWithKey`, `mapWithKey`, `filterWithKey`.
- [ ] Updating: `adjust`, `insertWith`, `insertWithKey`, `alter`.
- [ ] Module wiring: `import Data.Map` and `import qualified Data.Map as Map`
resolve to the `map.sx` namespace dict in the eval import handler.
- [ ] Unit tests in `lib/haskell/tests/map.sx` (≥ 20 tests: empty, singleton,
insert + lookup hit/miss, delete root, fromList with duplicates,
toAscList ordering, unionWith, foldlWithKey).
- [ ] Conformance programs:
- `wordfreq.hs` — word-frequency histogram using `Data.Map`. Source from
Rosetta Code "Word frequency" Haskell entry.
- `mapgraph.hs` — adjacency-list BFS using `Data.Map`.
### Phase 12 — Data.Set
- [ ] Implement `Data.Set` in `lib/haskell/set.sx`. Use a standalone
weight-balanced BST (same structure as Map but no value field) or wrap
`Data.Map` with unit values.
- [ ] API: `empty`, `singleton`, `insert`, `delete`, `member`, `fromList`,
`toList`, `toAscList`, `size`, `null`, `union`, `intersection`, `difference`,
`isSubsetOf`, `filter`, `map`, `foldr`, `foldl'`.
- [ ] Module wiring: `import Data.Set` / `import qualified Data.Set as Set`.
- [ ] Unit tests in `lib/haskell/tests/set.sx` (≥ 15 tests: empty, insert,
member hit/miss, delete, fromList deduplication, union, intersection,
difference, isSubsetOf).
- [ ] Conformance programs:
- `uniquewords.hs` — unique words in a string using `Data.Set`.
- `setops.hs` — set union/intersection/difference on integer sets;
exercises all three combining operations.
### Phase 13 — `where` in typeclass instances + default methods
- [ ] Verify `where`-clauses in `instance` bodies desugar correctly. The
`hk-bind-decls!` instance arm must call the same where-lifting logic as
top-level function clauses. Write a targeted test to confirm.
- [ ] Class declarations may include default method implementations. Parser:
`hk-parse-class` collects method decls; eval registers defaults under
`"__default__ClassName_method"` in the class dict.
- [ ] Instance method lookup: when the instance dict lacks a method, fall back
to the default. Wire this into the dictionary-passing dispatch.
- [ ] `Eq` default: `(/=) x y = not (x == y)`. Verify it works without an
explicit `/=` in every Eq instance.
- [ ] `Ord` defaults: `max a b = if a >= b then a else b`, `min a b = if a <=
b then a else b`. Verify.
- [ ] `Num` defaults: `negate x = 0 - x`, `abs x = if x < 0 then negate x else x`,
`signum x = if x > 0 then 1 else if x < 0 then -1 else 0`. Verify.
- [ ] Tests in `lib/haskell/tests/class-defaults.sx` (≥ 10 tests).
- [ ] Conformance programs:
- `shapes.hs` — `class Area a` with a default `perimeter`; two instances
using `where`-local helpers.
### Phase 14 — Record syntax
- [ ] Parser: extend `hk-parse-data` to recognise `{ field :: Type, … }`
constructor bodies. AST node: `(:con-rec CNAME [(FNAME TYPE) …])`.
- [ ] Desugar: `:con-rec` → positional `:con-def` plus generated accessor
functions `(\rec -> case rec of …)` for each field name.
- [ ] Record creation `Foo { bar = 1, baz = "x" }` parsed as
`(:rec-create CON [(FNAME EXPR) …])`. Eval builds the same tagged list as
positional construction (field order from the data decl).
- [ ] Record update `r { field = v }` parsed as `(:rec-update EXPR [(FNAME EXPR)])`.
Eval forces the record, replaces the relevant positional slot, returns a new
tagged list. Field → index mapping stored in `hk-constructors` at registration.
- [ ] Exhaustive record patterns: `Foo { bar = b }` in case binds `b`,
wildcards remaining fields.
- [ ] Tests in `lib/haskell/tests/records.sx` (≥ 12 tests: creation, accessor,
update one field, update two fields, record pattern, `deriving Show` on
record type).
- [ ] Conformance programs:
- `person.hs` — `data Person = Person { name :: String, age :: Int }` with
accessors, update, `deriving Show`.
- `config.hs` — multi-field config record; partial update; defaultConfig
constant.
### Phase 15 — IORef
- [ ] `IORef a` representation: a dict `{:hk-ioref true :hk-value v}`.
Allocation creates a new dict in the IO monad. Mutation via `dict-set!`.
- [ ] `newIORef :: a -> IO (IORef a)` — wraps a new dict in `IO`.
- [ ] `readIORef :: IORef a -> IO a` — returns `(IO (get ref ":hk-value"))`.
- [ ] `writeIORef :: IORef a -> a -> IO ()` — `(dict-set! ref ":hk-value" v)`,
returns `(IO ("Tuple"))`.
- [ ] `modifyIORef :: IORef a -> (a -> a) -> IO ()` — read + apply + write.
- [ ] `modifyIORef' :: IORef a -> (a -> a) -> IO ()` — strict variant (force
new value before write).
- [ ] `Data.IORef` module wiring.
- [ ] Tests in `lib/haskell/tests/ioref.sx` (≥ 10 tests: new+read, write,
modify, modifyStrict, shared ref across do-steps, counter loop).
- [ ] Conformance programs:
- `counter.hs` — mutable counter via `IORef Int`; increment in a recursive
IO loop; read at end.
- `accumulate.hs` — accumulate results into `IORef [Int]` inside a mapped
IO action, read at the end.
### Phase 16 — Exception handling
- [ ] `SomeException` type: `data SomeException = SomeException String`.
`IOException = SomeException`.
- [ ] `throwIO :: Exception e => e -> IO a` — raises `("hk-exception" e)`.
- [ ] `evaluate :: a -> IO a` — forces arg strictly; any embedded `hk-error`
surfaces as a catchable `SomeException`.
- [ ] `catch :: Exception e => IO a -> (e -> IO a) -> IO a` — wraps action in
SX `guard`; on `hk-error` or `hk-exception`, calls the handler with a
`SomeException` value.
- [ ] `try :: Exception e => IO a -> IO (Either e a)` — returns `Right v` on
success, `Left e` on any exception.
- [ ] `handle = flip catch`.
- [ ] Tests in `lib/haskell/tests/exceptions.sx` (≥ 10 tests: catch success,
catch error, try Right, try Left, nested catch, evaluate surfaces error,
throwIO propagates, handle alias).
- [ ] Conformance programs:
- `safediv.hs` — safe division using `catch`; divide-by-zero raises,
handler returns 0.
- `trycatch.hs` — `try` pattern: run an action, branch on Left/Right.
## Progress log
_Newest first._
**2026-05-06** — Phase 7 complete (string-view O(1) head/tail + `++` native concat):
- `runtime.sx`: added `hk-str?`, `hk-str-head`, `hk-str-tail`, `hk-str-null?`.
String views are `{:hk-str buf :hk-off n}` dicts; native SX strings satisfy the
predicate with implicit offset 0. All helpers are O(1) via `char-at` / `string-length`.
- `eval.sx`: added `chr` (int → single-char string via `char-from-code`), `toUpper`,
`toLower` (ASCII-range arithmetic). Fixed `ord` and all char predicates (`isAlpha`,
`isAlphaNum`, `isDigit`, `isSpace`, `isUpper`, `isLower`, `digitToInt`) to accept
integers from string-view decomposition (not only single-char strings).
- `match.sx`: cons-pattern `":"` now checks `hk-str?` before the tagged-list path,
decomposing to `(hk-str-head, hk-str-tail)`. Empty-list pattern (`p-list []`) also
accepts `hk-str-null?` values. `hk-match-list-pat` updated to traverse string views
element-by-element.
- `runtime.sx`: added `hk-str-to-native` (converts view dict to native string via reduce+char-at).
- `eval.sx`: `hk-list-append` now checks `hk-str?` first; converts both operands via
`hk-str-to-native` before native `str` concat. String `++` String no longer builds
a cons spine.
- 35 new tests in `lib/haskell/tests/string-char.sx` (35/35 passing).
- Full suite: 810/810 tests, 0 regressions (was 775).