plan: Phase 6 ADT design doc — define-type/match syntax, CEK dispatch, exhaustiveness
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
@@ -171,9 +171,12 @@ Fix O(n²) string concatenation in loops across Lua, Ruby, Common Lisp, Tcl.
|
|||||||
The deepest structural gap. Every language uses `{:tag "..." :field ...}` tagged dicts to
|
The deepest structural gap. Every language uses `{:tag "..." :field ...}` tagged dicts to
|
||||||
simulate sum types. A native `define-type` + `match` form eliminates this everywhere.
|
simulate sum types. A native `define-type` + `match` form eliminates this everywhere.
|
||||||
|
|
||||||
- [ ] Design: write `plans/designs/sx-adt.md` covering syntax, CEK dispatch, interaction with
|
- [x] Design: write `plans/designs/sx-adt.md` covering syntax, CEK dispatch, interaction with
|
||||||
existing `cond`/`case`, exhaustiveness checking, recursive types, pattern variables.
|
existing `cond`/`case`, exhaustiveness checking, recursive types, pattern variables.
|
||||||
Draft, then stop — next fire reviews design before implementing.
|
Draft, then stop — next fire reviews design before implementing.
|
||||||
|
Written: define-type/match syntax, AdtValue runtime rep, stepSfDefineType + MatchFrame
|
||||||
|
CEK dispatch, exhaustiveness warnings via _adt_registry, recursive types, nested patterns,
|
||||||
|
wildcard _, 3-phase impl plan (basic/nested/exhaustiveness), open questions on accessors/singletons/inspect.
|
||||||
|
|
||||||
- [ ] Spec: implement `define-type` special form in `spec/evaluator.sx`:
|
- [ ] Spec: implement `define-type` special form in `spec/evaluator.sx`:
|
||||||
`(define-type Name (Ctor1 field...) (Ctor2 field...) ...)`
|
`(define-type Name (Ctor1 field...) (Ctor2 field...) ...)`
|
||||||
@@ -683,6 +686,7 @@ Brief each language's loop agent (or do inline) after rebasing their branch onto
|
|||||||
|
|
||||||
_Newest first._
|
_Newest first._
|
||||||
|
|
||||||
|
- 2026-04-26: Phase 6 Design done — plans/designs/sx-adt.md written. Covers define-type/match syntax, AdtValue CEK runtime, stepSfDefineType+MatchFrame dispatch, exhaustiveness warnings, recursive types, nested patterns, wildcard _. 3-phase impl plan. Next fire: Spec implement define-type.
|
||||||
- 2026-04-26: Phase 5 complete — string buffer fully landed (d98b5fa2). 17 tests, 17/17 OCaml+JS. Phase 6 (ADTs) next.
|
- 2026-04-26: Phase 5 complete — string buffer fully landed (d98b5fa2). 17 tests, 17/17 OCaml+JS. Phase 6 (ADTs) next.
|
||||||
- 2026-04-26: Phase 5 Spec+OCaml+JS step done — StringBuffer of Buffer.t in sx_types.ml; make-string-buffer/append!/->string/length/string-buffer? in sx_primitives.ml; SxStringBuffer with _string_buffer marker + typeOf/dict? fixes in platform.py; JS rebuilt. 17/17 tests OCaml+JS.
|
- 2026-04-26: Phase 5 Spec+OCaml+JS step done — StringBuffer of Buffer.t in sx_types.ml; make-string-buffer/append!/->string/length/string-buffer? in sx_primitives.ml; SxStringBuffer with _string_buffer marker + typeOf/dict? fixes in platform.py; JS rebuilt. 17/17 tests OCaml+JS.
|
||||||
- 2026-04-26: Phase 4 complete — coroutine primitive fully landed (4 commits: spec library + OCaml verified + JS pre-load + 27 tests). Phase 5 (string buffer) next.
|
- 2026-04-26: Phase 4 complete — coroutine primitive fully landed (4 commits: spec library + OCaml verified + JS pre-load + 27 tests). Phase 5 (string buffer) next.
|
||||||
|
|||||||
257
plans/designs/sx-adt.md
Normal file
257
plans/designs/sx-adt.md
Normal file
@@ -0,0 +1,257 @@
|
|||||||
|
# SX Algebraic Data Types — Design
|
||||||
|
|
||||||
|
## Motivation
|
||||||
|
|
||||||
|
Every language implementation currently uses `{:tag "..." :field ...}` tagged dicts to
|
||||||
|
simulate sum types. This is verbose, error-prone (typos in tag strings go undetected), and
|
||||||
|
produces no exhaustiveness warnings. Native ADTs eliminate the pattern everywhere.
|
||||||
|
|
||||||
|
Examples of current workarounds:
|
||||||
|
- Haskell `Maybe a` → `{:tag "Just" :value x}` / `{:tag "Nothing"}`
|
||||||
|
- Prolog terms → `{:tag "functor" :name "foo" :args (list x y)}`
|
||||||
|
- Lua result type → `{:tag "ok" :value v}` / `{:tag "err" :msg s}`
|
||||||
|
- Common Lisp `cons` pairs → `{:tag "cons" :car a :cdr b}`
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Syntax
|
||||||
|
|
||||||
|
### `define-type`
|
||||||
|
|
||||||
|
```lisp
|
||||||
|
(define-type Name
|
||||||
|
(Ctor1 field1 field2 ...)
|
||||||
|
(Ctor2 field1 ...)
|
||||||
|
...)
|
||||||
|
```
|
||||||
|
|
||||||
|
Creates:
|
||||||
|
- Constructor functions: `Ctor1`, `Ctor2`, … (callable like normal functions)
|
||||||
|
- Type predicate: `Name?` — returns true for any value of type `Name`
|
||||||
|
- Constructor predicates: `Ctor1?`, `Ctor2?`, … (optional, auto-generated)
|
||||||
|
- Field accessors: `Ctor1-field1`, `Ctor1-field2`, … (optional, auto-generated)
|
||||||
|
|
||||||
|
Examples:
|
||||||
|
|
||||||
|
```lisp
|
||||||
|
(define-type Maybe
|
||||||
|
(Just value)
|
||||||
|
(Nothing))
|
||||||
|
|
||||||
|
(define-type Result
|
||||||
|
(Ok value)
|
||||||
|
(Err message))
|
||||||
|
|
||||||
|
(define-type Tree
|
||||||
|
(Leaf)
|
||||||
|
(Node left value right))
|
||||||
|
|
||||||
|
(define-type List-of
|
||||||
|
(Nil-of)
|
||||||
|
(Cons-of head tail))
|
||||||
|
```
|
||||||
|
|
||||||
|
Constructors with no fields are zero-argument constructors (singletons by value):
|
||||||
|
|
||||||
|
```lisp
|
||||||
|
(Nothing) ; => #<Nothing>
|
||||||
|
(Leaf) ; => #<Leaf>
|
||||||
|
```
|
||||||
|
|
||||||
|
### `match`
|
||||||
|
|
||||||
|
```lisp
|
||||||
|
(match expr
|
||||||
|
((Ctor1 a b) body)
|
||||||
|
((Ctor2 x) body)
|
||||||
|
((Ctor3) body)
|
||||||
|
(else body))
|
||||||
|
```
|
||||||
|
|
||||||
|
- Clauses are tried in order; first match wins.
|
||||||
|
- `else` clause is optional but suppresses exhaustiveness warnings.
|
||||||
|
- Pattern variables (`a`, `b`, `x`) are bound in the body scope.
|
||||||
|
- Wildcard `_` discards the matched value.
|
||||||
|
- Literal patterns: `42`, `"str"`, `true`, `nil` — match by value equality.
|
||||||
|
- Nested patterns: `((Node left (Leaf) right) body)` — nested constructor patterns.
|
||||||
|
|
||||||
|
Examples:
|
||||||
|
|
||||||
|
```lisp
|
||||||
|
(match result
|
||||||
|
((Ok v) (str "got: " v))
|
||||||
|
((Err m) (str "error: " m)))
|
||||||
|
|
||||||
|
(match tree
|
||||||
|
((Leaf) 0)
|
||||||
|
((Node l v r) (+ 1 (tree-depth l) (tree-depth r))))
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## CEK Dispatch
|
||||||
|
|
||||||
|
### Runtime representation
|
||||||
|
|
||||||
|
ADT values are OCaml records (not dicts) — opaque, non-inspectable via `get`:
|
||||||
|
|
||||||
|
```ocaml
|
||||||
|
type adt_value = {
|
||||||
|
av_type : string; (* type name, e.g. "Maybe" *)
|
||||||
|
av_ctor : string; (* constructor name, e.g. "Just" *)
|
||||||
|
av_fields: value array; (* positional fields *)
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
In JS: `{ _adt: true, _type: "Maybe", _ctor: "Just", _fields: [v] }`.
|
||||||
|
|
||||||
|
`typeOf` returns the ADT type name (e.g. `"Maybe"`).
|
||||||
|
|
||||||
|
### `define-type` — special form
|
||||||
|
|
||||||
|
`stepSfDefineType(args, env, kont)`:
|
||||||
|
|
||||||
|
1. Parse `Name` and list of `(CtorN field...)` clauses.
|
||||||
|
2. For each constructor `CtorK` with fields `[f1, f2, …]`:
|
||||||
|
- Register `CtorK` as a `NativeFn` that takes `|fields|` args and returns an `AdtValue`.
|
||||||
|
- Register `CtorK?` as a predicate (`AdtValue` with matching ctor name → `true`).
|
||||||
|
- Register `CtorK-fN` as field accessor (returns `av_fields[N]`).
|
||||||
|
3. Register `Name?` as a predicate (`AdtValue` with matching type name → `true`).
|
||||||
|
4. All bindings go into the current environment via `env-bind!`.
|
||||||
|
5. Returns `Nil`.
|
||||||
|
|
||||||
|
This is an environment mutation — no new frame needed. Evaluates in one step.
|
||||||
|
|
||||||
|
### `match` — special form
|
||||||
|
|
||||||
|
`stepSfMatch(args, env, kont)`:
|
||||||
|
|
||||||
|
1. Push `MatchFrame` with `clauses` and `env` onto kont.
|
||||||
|
2. Return state evaluating the scrutinee `expr`.
|
||||||
|
3. `MatchFrame` continue: receive scrutinee value, walk clauses:
|
||||||
|
- For each `((CtorN vars...) body)`:
|
||||||
|
- If scrutinee is an `AdtValue` with `av_ctor = "CtorN"` and `av_fields.length = |vars|`:
|
||||||
|
- Bind `vars[i]` → `av_fields[i]` in fresh child env.
|
||||||
|
- Return state evaluating `body` in that env.
|
||||||
|
- `(else body)` — always matches, body evaluated in current env.
|
||||||
|
- Literal `42`/`"str"` patterns: match by value equality.
|
||||||
|
- Wildcard `_`: always matches, binds nothing.
|
||||||
|
4. If no clause matched and no `else`: raise `"match: no clause matched <value>"`.
|
||||||
|
|
||||||
|
Frame type: `"match"` — stores `cf_remaining` (clauses), `cf_env` (enclosing env).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Interaction with `cond` / `case`
|
||||||
|
|
||||||
|
`match` is the primary dispatch form for ADTs. `cond` / `case` remain unchanged:
|
||||||
|
|
||||||
|
- `cond` tests arbitrary boolean expressions — still useful for non-ADT dispatch.
|
||||||
|
- `case` matches on equality to literal values — unchanged.
|
||||||
|
- `match` is the new form: structural pattern matching on ADT constructors.
|
||||||
|
|
||||||
|
They are orthogonal. A `match` clause can contain a `cond`; a `cond` clause can contain a `match`.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Exhaustiveness checking
|
||||||
|
|
||||||
|
Emit a **warning** (not an error) when:
|
||||||
|
- A `match` has no `else` clause, AND
|
||||||
|
- Not all constructors of the scrutinee's type are covered.
|
||||||
|
|
||||||
|
Detection: when `define-type` runs, it registers the constructor set in a global table
|
||||||
|
`_adt_registry: type_name → [ctor_names]`. At `match` compile/evaluation time:
|
||||||
|
- If the scrutinee's type is in `_adt_registry` and not all ctors appear as patterns:
|
||||||
|
- `console.warn("[sx] match: non-exhaustive — missing: Ctor3, Ctor4 for type Maybe")`
|
||||||
|
- Execution continues (warning, not error).
|
||||||
|
|
||||||
|
This is best-effort: the scrutinee type is only known at runtime. The warning fires on
|
||||||
|
first non-exhaustive match evaluation, not at definition time.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Recursive types
|
||||||
|
|
||||||
|
Recursive types work because constructors are registered as functions, and function bodies
|
||||||
|
are evaluated lazily:
|
||||||
|
|
||||||
|
```lisp
|
||||||
|
(define-type Tree
|
||||||
|
(Leaf)
|
||||||
|
(Node left value right))
|
||||||
|
|
||||||
|
; Recursive function over a recursive type:
|
||||||
|
(define (depth tree)
|
||||||
|
(match tree
|
||||||
|
((Leaf) 0)
|
||||||
|
((Node l v r) (+ 1 (max (depth l) (depth r))))))
|
||||||
|
```
|
||||||
|
|
||||||
|
No special treatment needed — the type definition doesn't need to know about recursion.
|
||||||
|
The constructor `Node` accepts any values, including other `Node` or `Leaf` values.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Pattern variables
|
||||||
|
|
||||||
|
In `match` clauses, identifiers in constructor position that are NOT constructor names are
|
||||||
|
treated as pattern variables (bound to matched field values):
|
||||||
|
|
||||||
|
```lisp
|
||||||
|
(match x
|
||||||
|
((Just v) v) ; v bound to the wrapped value
|
||||||
|
((Nothing) nil))
|
||||||
|
|
||||||
|
(match pair
|
||||||
|
((Cons-of h t) (list h t))) ; h, t bound to head and tail
|
||||||
|
```
|
||||||
|
|
||||||
|
**Wildcard**: `_` is always a wildcard — matches anything, binds nothing.
|
||||||
|
|
||||||
|
```lisp
|
||||||
|
(match x
|
||||||
|
((Just _) "has value")
|
||||||
|
((Nothing) "empty"))
|
||||||
|
```
|
||||||
|
|
||||||
|
**Nested patterns**:
|
||||||
|
|
||||||
|
```lisp
|
||||||
|
(match tree
|
||||||
|
((Node (Leaf) v (Leaf)) (str "leaf node: " v))
|
||||||
|
((Node l v r) (str "inner node: " v)))
|
||||||
|
```
|
||||||
|
|
||||||
|
Nested patterns are matched recursively: the inner `(Leaf)` pattern checks that the
|
||||||
|
`left` field is itself a `Leaf` ADT value.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Implementation Plan
|
||||||
|
|
||||||
|
### Phase 6a — `define-type` + basic `match` (no nested patterns, no exhaustiveness)
|
||||||
|
|
||||||
|
1. OCaml: add `AdtValue of adt_value` to `sx_types.ml`.
|
||||||
|
2. Evaluator: add `step-sf-define-type` — parse clauses, register ctor fns + predicates + accessors.
|
||||||
|
3. Evaluator: add `step-sf-match` + `MatchFrame` — linear scan of clauses, flat patterns only.
|
||||||
|
4. JS: same (AdtValue as plain object with `_adt`/`_type`/`_ctor`/`_fields` props).
|
||||||
|
|
||||||
|
### Phase 6b — nested patterns (separate fire)
|
||||||
|
|
||||||
|
Recursive `matchPattern(pattern, value, env)` helper that:
|
||||||
|
- Returns `{matched: bool, bindings: map}`
|
||||||
|
- Recursively matches sub-patterns against ADT fields.
|
||||||
|
|
||||||
|
### Phase 6c — exhaustiveness warnings (separate fire)
|
||||||
|
|
||||||
|
`_adt_registry` global + warning emission on first non-exhaustive match.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Open questions (deferred to review)
|
||||||
|
|
||||||
|
1. **Accessor auto-generation**: should `Ctor-field` accessors be generated always, or only on demand? Risk: name collisions if two types have constructors with same field names.
|
||||||
|
2. **Singleton constructors**: `(Nothing)` — zero-arg ctor — should these be interned (same object every call) or fresh each time? Interning enables `eq?` checks but requires a global table.
|
||||||
|
3. **Printing/inspect**: `inspect` on an AdtValue should show `(Just 42)` not `#<adt:Just>`. Implement in `inspect` function or via `display`/`write` (Phase 17 ports).
|
||||||
|
4. **Pattern-matching on non-ADT values**: should `match` handle list patterns `(a . b)` and literal patterns in clause heads? Deferred — add only if needed by a language implementation.
|
||||||
Reference in New Issue
Block a user