Design + ops scaffolding for the next phase of work, none of it touching
substrate or guest code.
lib-guest.md: rewrites Architectural framing as a 5-layer stack
(substrate → lib/guest → languages → shared/ → applications),
recursive dependency-direction rule, scaled two-consumer rule. Adds
Phase B (long-running stratification) with sub-layer matrix
(core/typed/relational/effects/layout/lazy/oo), language profiles, and
the long-running-discipline section. Preserves existing Phase A
progress log and rules.
ocaml-on-sx.md: scope reduced to substrate validation + HM + reference
oracle. Phases 1-5 + minimal stdlib slice + vendored testsuite slice.
Dream carved out into dream-on-sx.md; Phase 8 (ReasonML) deferred.
Records lib-guest sequencing dependency.
datalog-on-sx.md: adds Phase 4 built-in predicates + body arithmetic,
Phase 6 magic sets, safety analysis in Phase 3, Non-goals section.
New chisel plans (forward-looking, not yet launchable):
kernel-on-sx.md — first-class everything, env-as-value endgame
idris-on-sx.md — dependent types, evidence chisel
probabilistic-on-sx.md — weighted nondeterminism + traces
maude-on-sx.md — rewriting as primitive
linear-on-sx.md — resource model, artdag-relevant
Loop briefings (4 active, 1 cold):
minikanren-loop.md, ocaml-loop.md, datalog-loop.md, elm-loop.md, koka-loop.md
Restore scripts mirror the loop pattern:
restore-{minikanren,ocaml,datalog,jit-perf,lib-guest}.sh
Each captures worktree state, plan progress, MCP health, tmux status.
Includes the .mcp.json absolute-path patch instruction (fresh worktrees
have no _build/, so the relative mcp_tree path fails on first launch).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
318 lines
24 KiB
Markdown
318 lines
24 KiB
Markdown
# lib/guest — the metatheory layer for SX-hosted languages
|
||
|
||
Extract the duplicated plumbing across `lib/{haskell,common-lisp,erlang,prolog,js,lua,smalltalk,tcl,forth,ruby,apl,hyperscript}` into a small, composable kit so language N+1 costs ~200 lines instead of ~2000, without regressing any existing conformance scoreboard.
|
||
|
||
**This is a long-running, accreting plan.** Phase 0–3 (below) is the bootstrapping extraction — pulling the most obvious shared plumbing out of existing guests. Phase B onwards (see Stratification) is the ongoing accretion: codifying the universal patterns rose-ash's languages share, stratified by audience, refined continuously by pairs of new language consumers. The plan does not have a "done" state. The closest equivalent is "no two languages currently disagree about an abstraction in lib/guest" — and that's a moving target as new languages come online.
|
||
|
||
Branch: `architecture`. SX files via `sx-tree` MCP only. Never edit generated files.
|
||
|
||
## Thesis
|
||
|
||
The substrate (CEK, hygienic macros, records, delimited continuations, IO suspension, reactivity) was chosen with multi-paradigm hosting in mind, but each guest currently re-rolls its own tokeniser, recursive-descent loop, conformance harness, and primitive-rename layer. Extracting these shared layers does not reduce conformance bug-finding pressure — it only removes plumbing — so it is pure win.
|
||
|
||
**Canaries:** Lua (small, conventional expression-grammar — exercises lex/Pratt/AST) and Prolog (paradigm-different — exercises pattern-match/unification). The two-canary rule prevents Lua-shaped abstractions.
|
||
|
||
**Two-language rule:** no extraction is merged until **two** guests consume it. The rule scales with the universality claim — see *Stratification* for layer-appropriate pairs.
|
||
|
||
## Architectural framing — the layered stack
|
||
|
||
Rose-ash stratifies into five layers, each with a different invariant, audience, and time horizon. The same operating principles (dependency direction, two-consumer rule, layered editorial bar) work at every layer.
|
||
|
||
| Layer | rose-ash location | Time horizon | Audience |
|
||
|-------|-------------------|--------------|----------|
|
||
| **Substrate** (SX) | `spec/`, `hosts/` | years | platform maintainers |
|
||
| **lib/guest** (language metatheory) | `lib/guest/` | years, slower than substrate | guest-language authors |
|
||
| **Languages** | `lib/<lang>/` | months–years | application authors |
|
||
| **shared/** (application metatheory) | `shared/` | months | service authors |
|
||
| **Applications** | `blog/`, `market/`, `cart/`, `events/`, `federation/`, `account/`, `orders/`, `artdag/` | weeks–months | village members |
|
||
|
||
What each layer *is*:
|
||
|
||
- **Substrate (SX)** — values, evaluation, continuations, effects, hygienic macros, reactivity. The *physics* of the platform. Bugs here are catastrophic for everyone.
|
||
- **lib/guest** — patterns that recur across paradigms: pattern matching, lexical primitives, precedence parsing, type inference, layout algorithms, effect handler protocols, module dispatch. *Applied PL theory.* Bugs only affect adopters; non-consumers don't care.
|
||
- **Languages** — specific syntactic and semantic commitments that *are* a particular language. The user-facing surface for code authoring.
|
||
- **shared/ (application metatheory)** — patterns that recur across domains: app factory, OAuth flow, ActivityPub, internal HMAC channel, fragments, sessions. Mature counterpart of what lib/guest is becoming, just at the application layer. The two-consumer rule there is already passed (every service is a consumer).
|
||
- **Applications** — the village system itself. Federation, blog, market, events, etc. The proof point that justifies all the layers below.
|
||
|
||
This five-layer separation is unusually clean. Most platforms collapse two adjacent layers — JVM and BEAM are pure substrate (no shared metatheory layer), Lisp images and Smalltalk environments bundle substrate + metatheory, conventional web stacks merge "shared infrastructure" with "applications." Racket's `#lang` machinery is the closest analogue at the lib/guest boundary. Treating each layer as a deliberately separate stratum is a design choice, not a code-organisation accident.
|
||
|
||
### Dependency direction (strict, at every boundary)
|
||
|
||
Higher layers may use lower; lower layers must not know higher exists.
|
||
|
||
- Applications import from `shared/`. `shared/` doesn't know which application is using it.
|
||
- `shared/` and applications import from languages (via SX modules and the host runtime). Languages don't know what application calls them.
|
||
- Languages import from lib/guest. lib/guest doesn't know which language is consuming it.
|
||
- lib/guest uses SX primitives. SX doesn't import from lib/guest.
|
||
|
||
Same invariant that makes substrate/metatheory separation work in PL theory, applied recursively up the stack. Violations show up as cyclic imports or as suspiciously language-specific code in `lib/guest/`, suspiciously domain-specific code in `shared/`, etc.
|
||
|
||
### Two-consumer rule, recursive
|
||
|
||
The pair-validation discipline applies at every layer, with audience-appropriate pairs:
|
||
|
||
- An entry in `lib/guest/core/` needs two consumers from different paradigms (e.g. lua + prolog).
|
||
- An entry in `lib/guest/typed/` needs two typed consumers.
|
||
- A pattern in `shared/` needs two services using it (largely already enforced — auth/HMAC/AP are used everywhere).
|
||
- An application's reusable abstraction promotion to `shared/` should happen only after a second domain wants the same shape.
|
||
|
||
At every layer, "shared between two consumers we happen to have" is not enough — the pair must be appropriate to the universality being claimed.
|
||
|
||
### Editorial bar
|
||
|
||
An entry belongs at layer N only if it codifies a piece of universal-or-near-universal pattern *for that layer's audience*. Same bar at every level; just the meaning of "universal" changes — universal-across-paradigms for lib/guest, universal-across-services for shared/, universal-across-domains for application metatheory.
|
||
|
||
### Leverage versus concreteness
|
||
|
||
The two directions matter at every layer.
|
||
|
||
- **Leverage compounds downward.** A substrate fix benefits every layer above. A lib/guest fix benefits every consuming language. A `shared/` fix benefits every service. So the highest-leverage work is always the layer that *enables* the most above it.
|
||
- **Concreteness flows upward.** Applications are what the village actually uses; substrate is invisible to them. Each layer is judged by its appropriate audience: substrate by correctness and speed, lib/guest by paradigm-coverage, languages by ergonomic fit, `shared/` by service reuse, applications by real users on real use cases.
|
||
|
||
The pleasant property: once you internalise the operating discipline at one layer, you know how to operate at every other. Pair-driven extraction. Two-consumer rule scaled to the layer's universality. Higher-uses-lower invariant. Codify-don't-just-deduplicate. The lib/guest plan is a working example of these principles applied at the metatheory layer; the same playbook applies all the way up.
|
||
|
||
## Current baseline
|
||
|
||
The loop fills these in on its first iteration by running every `*/conformance.sh` and `*/test.sh` and copying each `scoreboard.json` to `lib/guest/baseline/<lang>.json`. Until then:
|
||
|
||
| Guest | Suite | Baseline |
|
||
|--------------|--------------------|----------|
|
||
| lua | `bash lib/lua/test.sh` | 185 / 185 |
|
||
| prolog | `bash lib/prolog/conformance.sh` | 590 / 590 |
|
||
| haskell | `bash lib/haskell/conformance.sh` | 156 / 156 (was reported 0/18 by the buggy old script) |
|
||
| common-lisp | `bash lib/common-lisp/conformance.sh` | 518 / 518 (Phase 2 +182 and Phase 6 +27 were previously under-counted) |
|
||
| erlang | `bash lib/erlang/conformance.sh` | 0 / 0 (suite all-zero) |
|
||
| js | `bash lib/js/conformance.sh` | 94 / 148 (test262-slice) |
|
||
| smalltalk | `bash lib/smalltalk/conformance.sh` | 625 / 629 |
|
||
| tcl | `bash lib/tcl/conformance.sh` | 3 / 4 (programs) |
|
||
| forth | `bash lib/forth/test.sh` | 64 / 64 |
|
||
| ruby | `bash lib/ruby/test.sh` | 76 / 76 |
|
||
| apl | `bash lib/apl/test.sh` | 73 / 73 |
|
||
|
||
The baseline only needs to be re-snapshotted when the substrate (`spec/**`, `hosts/**`) changes underneath this loop.
|
||
|
||
---
|
||
|
||
## Phase A — Bootstrapping extraction (Phases 0–3 below)
|
||
|
||
The following four phases (0/1/2/3) are the bootstrap — pulling the most obvious shared plumbing out of existing guests. Largely shipped; partial-status entries are deferred ports waiting for their natural consumer (datalog, minikanren, ocaml, etc.) to close them. Phase B (Stratification) is the long-running successor.
|
||
|
||
## Phase 0 — Baseline snapshot (one-shot)
|
||
|
||
### Step 0: Snapshot every guest's scoreboard
|
||
|
||
Create `lib/guest/baseline/`. Run every guest's conformance/test runner. Copy each `scoreboard.json` (or extract pass/fail counts from `test.sh` output for guests without a scoreboard) into `lib/guest/baseline/<lang>.json`. Fill in the table above.
|
||
|
||
**Verify:** `ls lib/guest/baseline/*.json` shows one per guest. Plan table populated.
|
||
|
||
---
|
||
|
||
## Phase 1 — Cheap, zero-semantic-risk extractions
|
||
|
||
### Step 1: `lib/guest/conformance.sx` — config-driven test runner
|
||
|
||
Replace the 6+ near-identical `*/conformance.sh` scripts with one driver that takes a config dict:
|
||
|
||
```
|
||
{:lang "prolog"
|
||
:loads ("lib/prolog/tokenizer.sx" "lib/prolog/parser.sx" ...)
|
||
:suites (("parse" "lib/prolog/tests/parse.sx" "pl-parse-tests-run!") ...)}
|
||
```
|
||
|
||
The driver locates `sx_server.exe`, runs the epoch protocol, collects pass/fail per suite, and writes `scoreboard.{json,md}`. The per-language `conformance.sh` becomes a 3-line stub that points at its config.
|
||
|
||
**Port to:** `lib/prolog/conformance.sh` and `lib/haskell/conformance.sh`. Two consumers required for merge.
|
||
|
||
**Verify:** both `bash lib/prolog/conformance.sh` and `bash lib/haskell/conformance.sh` produce scoreboard JSONs equal to baseline.
|
||
|
||
### Step 2: `lib/guest/prefix.sx` — prefix-rename macro
|
||
|
||
One macro that takes a prefix and a list of SX symbols and binds prefixed aliases:
|
||
|
||
```
|
||
(prefix-rename "cl-" '(null? pair? even? odd? zero? ...))
|
||
```
|
||
|
||
Replaces hundreds of hand-written `(define (cl-null? x) (= x nil))`-style wrappers in `common-lisp/runtime.sx`, `lua/runtime.sx`, `erlang/runtime.sx`.
|
||
|
||
**Port to:** `common-lisp/runtime.sx` (largest user) and `lua/runtime.sx`. Two consumers.
|
||
|
||
**Verify:** common-lisp + lua scoreboards equal baseline.
|
||
|
||
---
|
||
|
||
## Phase 2 — Lex / parse kit
|
||
|
||
### Step 3: `lib/guest/lex.sx` — character-class + tokeniser primitives
|
||
|
||
- Source-position tracking (line/col/offset).
|
||
- Character-class predicates (`whitespace?`, `digit?`, `alpha?`, `ident-start?`, `ident-rest?`).
|
||
- Number recognisers (decimal, hex, float, scientific).
|
||
- String recognisers (quoted, escapes, raw).
|
||
- Comment recognisers (line, block, nestable).
|
||
- Token record `{:type :value :pos :end :line}`.
|
||
|
||
**Port to:** `lua/tokenizer.sx` and `tcl/tokenizer.sx`. Two consumers.
|
||
|
||
**Verify:** lua + tcl scoreboards equal baseline.
|
||
|
||
### Step 4: `lib/guest/pratt.sx` — Pratt / operator-precedence parser
|
||
|
||
Prefix / infix / postfix tables, left/right associativity, precedence climbing. Grammar is a dict, not hardcoded `cond`.
|
||
|
||
**Port to:** Lua expression parser (`lua/parser.sx`) and Prolog operator table (`prolog/parser.sx` — Prolog ops are the stress test). Two consumers.
|
||
|
||
**Verify:** lua + prolog scoreboards equal baseline.
|
||
|
||
### Step 5: `lib/guest/ast.sx` — canonical AST node shapes
|
||
|
||
Standard constructors and predicates for: `literal`, `var`, `app`, `lambda`, `let`, `letrec`, `if`, `match-clause`, `module`, `import`. Optional — guests may keep their own AST — but using the canonical shape lets cross-language tooling (formatters, highlighters, debuggers) work without per-language adapters.
|
||
|
||
**Port to:** lua + prolog AST emitters. Two consumers.
|
||
|
||
**Verify:** lua + prolog scoreboards equal baseline.
|
||
|
||
---
|
||
|
||
## Phase 3 — Semantic extractions (highest leverage, highest risk)
|
||
|
||
### Step 6: `lib/guest/match.sx` — pattern-match + unification engine
|
||
|
||
Single engine for:
|
||
- Literal patterns (numbers, strings, symbols, nil, booleans).
|
||
- Wildcard `_`.
|
||
- Constructor patterns (ADT-shaped — depends on Phase 3 of `sx-improvements.md` if available, otherwise dict-tagged).
|
||
- Variable binding.
|
||
- **Unification** (Prolog flavour): symmetric, occurs-check toggle, substitution returned.
|
||
- **Match** (Haskell flavour): asymmetric pattern→value, bindings returned.
|
||
|
||
**Port to:** `haskell/match.sx` and `prolog/query.sx` unification core. Two consumers.
|
||
|
||
**Verify:** haskell + prolog scoreboards equal baseline. **Highest-risk extraction** — if either regresses by 1 test, revert and redesign.
|
||
|
||
### Step 7: `lib/guest/layout.sx` — significant-whitespace / off-side rule
|
||
|
||
Generalised layout-sensitive lexer. Configurable: which keywords open layout blocks, whether semicolons are inserted, brace insertion rules.
|
||
|
||
**Port to:** `haskell/layout.sx` (existing). Second consumer: write a synthetic test fixture that exercises a Python-ish layout to prove the kit is not Haskell-shaped. Two consumers.
|
||
|
||
**Verify:** haskell scoreboard equal baseline; synthetic layout fixture passes.
|
||
|
||
### Step 8: `lib/guest/hm.sx` — Hindley-Milner type inference
|
||
|
||
Extract from `haskell/infer.sx`. Algorithm W or J, generalisation, instantiation, occurs-check, principal types.
|
||
|
||
**Sequencing:** this step is **paired with `plans/ocaml-on-sx.md` Phase 5**. The natural order is lib-guest Steps 0–7 → OCaml-on-SX Phases 1–5 → lib-guest Step 8. With OCaml-on-SX Phase 5 done, the two-language rule is satisfied for real (Haskell + OCaml). Without it, accept "second user TBD" — the alternative is letting the inference stay locked inside Haskell forever.
|
||
|
||
**Port to:** `haskell/infer.sx` and (preferred) `lib/ocaml/types.sx`.
|
||
|
||
**Verify:** haskell scoreboard equal baseline; if OCaml-on-SX Phase 5 has shipped, OCaml type-inference tests equal baseline too.
|
||
|
||
---
|
||
|
||
## Phase B — Stratification (long-running)
|
||
|
||
lib/guest itself decomposes by audience. Phase B accepts that and codifies the decomposition. Sub-layers emerge as paradigms reveal which abstractions are real; nothing in this section is fully fleshed out — it's the editorial direction, not a concrete queue.
|
||
|
||
### Proposed sub-layer shape
|
||
|
||
| Sub-layer | Purpose | Pair-validation requirement |
|
||
|-----------|---------|------------------------------|
|
||
| `lib/guest/core/` | True universals: lex, pratt, ast, match, prefix-rename, conformance harness | Two consumers from *different paradigms* (e.g. lua + prolog) |
|
||
| `lib/guest/typed/` | HM, generalisation, kind system, type-class-style dispatch | Two typed consumers (e.g. ocaml + haskell) |
|
||
| `lib/guest/relational/` | Unification beyond core match, occurs-check toggles, substitution composition, search strategies | Two relational consumers (e.g. minikanren + datalog) |
|
||
| `lib/guest/effects/` | Handler stacks, perform/resume protocols, dynamic-extent tracking | Two effect-typed consumers (e.g. koka + future) |
|
||
| `lib/guest/layout/` | Off-side rule, semicolon insertion, brace insertion | Two whitespace-sensitive consumers (e.g. haskell + elm or python-shape) |
|
||
| `lib/guest/lazy/` | Thunk wrapping, force/delay protocols, sharing semantics | Two lazy consumers (e.g. haskell + future lazy guest) |
|
||
| `lib/guest/oo/` | Message dispatch, method tables, super lookup | Two message-passing consumers (e.g. smalltalk + ruby) |
|
||
|
||
Future rows as paradigms emerge: constraint-domain solvers, gradual typing, capability-based effect systems, dependent types, etc. Layers should not be created speculatively — wait for two real consumers in the same paradigm before opening a sub-layer.
|
||
|
||
### Re-homing the Phase A entries
|
||
|
||
The Phase 0–3 entries currently at the lib/guest root are the candidates for re-homing under sub-layers as the stratification settles. Initial mapping (subject to refinement):
|
||
|
||
- `conformance.sx`, `prefix.sx` → stay at root (true infrastructure, not paradigm-specific)
|
||
- `lex.sx`, `pratt.sx`, `ast.sx`, `match.sx` → `core/`
|
||
- `layout.sx` → `layout/`
|
||
- `hm.sx` → `typed/` (currently the most overclaiming entry at root — has no plausible non-typed consumer)
|
||
|
||
### Two-consumer rule, scaled
|
||
|
||
The flat "two guests must consume it" rule scales with the layer's universality claim:
|
||
|
||
- A `core/` extraction must be cross-paradigm-validated (lua + prolog, not lua + tcl which are both dynamic-imperative).
|
||
- A `typed/` extraction needs two typed consumers; that's a tighter audience but still real.
|
||
- A `relational/` extraction needs two relational consumers, etc.
|
||
- Each layer's bar is *exactly* the universality it claims, no more, no less. An abstraction that claims universality (root level) but only has typed consumers belongs in `typed/`, not at root.
|
||
|
||
### Language profiles
|
||
|
||
Each language ends up consuming a *profile* of which sub-layers it uses. Profiles are aspirational until each language ports — but the matrix tells you which sub-layers to invest in based on consumer demand, and serves as a quick design document for new languages ("which existing profile does it most resemble?").
|
||
|
||
| Language | core | typed | relational | effects | layout | lazy | oo |
|
||
|----------|:----:|:-----:|:----------:|:-------:|:------:|:----:|:--:|
|
||
| ocaml | ✓ | ✓ | | | | | |
|
||
| haskell | ✓ | ✓ | | | ✓ | ✓ | |
|
||
| elm | ✓ | ✓ | | | ✓ | | |
|
||
| reasonml | ✓ | ✓ | | | | | |
|
||
| minikanren | ✓ | | ✓ | | | | |
|
||
| datalog | ✓ | | ✓ | | | | |
|
||
| prolog | ✓ | | ✓ | | | | |
|
||
| koka | ✓ | ✓ | | ✓ | | | |
|
||
| erlang | ✓ | | | (msg) | | | |
|
||
| elixir | ✓ | | | (msg) | | | |
|
||
| smalltalk | ✓ | | | | | | ✓ |
|
||
| ruby | ✓ | | | | | | ✓ |
|
||
| common-lisp | ✓ | | | | | | (CLOS) |
|
||
| lua / tcl / forth / apl | ✓ | | | | | | |
|
||
| js | ✓ | | | (async) | | | |
|
||
|
||
`(msg)`, `(async)`, `(CLOS)` denote shapes that *might* live in `effects/` or `oo/` once the paradigm gets a second consumer to validate against.
|
||
|
||
## Long-running discipline
|
||
|
||
This plan does not have a "done" state. The operating mode is *continuous pair-driven refactoring*:
|
||
|
||
- When a new guest reaches the same shape as an existing one → look for shared abstraction → consider extraction.
|
||
- When two existing consumers diverge on how they use a kit → consider a sub-layer split or a redesign.
|
||
- When a sub-layer accumulates more than ~5 entries → consider further stratification.
|
||
- When a kit has *never* been refactored after a second consumer ported → suspicious; the second port probably reshaped expectations and the kit should have flexed. Audit it.
|
||
- When a Phase A entry (currently at root) gets a second consumer in a narrower paradigm than "universal" → re-home into the appropriate sub-layer, don't wait for a third.
|
||
|
||
**Substrate work and lib/guest work feed each other.** Substrate fixes (sx-improvements queue) raise lib/guest's ceiling — every kit gets faster and more correct. lib/guest exposes substrate gaps that wouldn't show up in single-guest work — when two paradigms can't share an abstraction cleanly, the substrate may be missing a primitive. Treat lib/guest issues as substrate-investigation prompts before papering them over with kit-side workarounds.
|
||
|
||
**Extraction is not the goal — codification is.** "I refactored 800 lines of duplication into 200 lines of shared kit" is the bootstrapping mode. The long-running mode is "I codified a piece of language theory in working SX form, validated by N paradigms." The same line-count delta means very different things in those two modes. Keep the bar at codification, not just deduplication.
|
||
|
||
---
|
||
|
||
## Progress log
|
||
|
||
| Step | Status | Commit | Delta |
|
||
|------|--------|--------|-------|
|
||
| 0 — baseline snapshot | [done] | 2f7f8189 | 11 guests captured: lua 185/185, forth 64/64, ruby 76/76, apl 73/73, prolog 590/590, common-lisp 309/309, smalltalk 625/629, tcl 3/4, haskell 0/18 programs, js 94/148 (slice), erlang 0/0 |
|
||
| 1 — conformance.sx (prolog + haskell) | [done] | 58dcff26 | Prolog 590/590 (matches baseline). Haskell 156/156 — old script was broken (0/18 was an artefact of a never-matching grep), driver reveals true counts; baseline updated. |
|
||
| 2 — prefix.sx (common-lisp + lua) | [partial — pending lua] | 2ef773a3 | common-lisp/runtime.sx ported (47 aliases collapsed into 13 prefix-rename calls); 518/518 vs 309/309 baseline (improvement, no regression). lua/runtime.sx has no pure same-name aliases — every lua- definition wraps custom logic; second consumer pending. |
|
||
| 3 — lex.sx (lua + tcl) | [done] | 559b0df9 | lex.sx exports nil-safe char-class predicates + token record. lua/tokenizer.sx (7 preds) and tcl/tokenizer.sx (5 preds) collapsed into prefix-rename calls. lua 185/185, tcl 342/342, tcl-conf 3/4 — all = baseline. |
|
||
| 4 — pratt.sx (lua + prolog) | [done] | da27958d | Extracted operator-table format + lookup only — climbing loops stay per-language because lua and prolog use opposite prec conventions. lua/parser.sx: 18-clause cond → 15-entry table. prolog/parser.sx: pl-op-find deleted, pl-op-lookup wraps pratt-op-lookup. lua 185/185, prolog 590/590 — both = baseline. |
|
||
| 5 — ast.sx (lua + prolog) | [partial — pending real consumers] | a774cd26 | Kit + 33 self-tests shipped (10 canonical kinds, predicates, accessors). Step is "Optional" per brief; lua/prolog parsers untouched (185/185 + 590/590). Datalog-on-sx will be the natural first real consumer; lua/prolog converters can land later. |
|
||
| 6 — match.sx (haskell + prolog) | [partial — kit shipped; ports deferred] | 863e9d93 | Pure-functional unify + match kit (canonical wire format + cfg-driven adapters) + 25 self-tests. Existing prolog/haskell engines untouched (structurally divergent — mutating-symmetric vs pure-asymmetric — would risk 746 passing tests under brief's revert-on-regression rule). Real consumer is minikraken/datalog work in flight. |
|
||
| 7 — layout.sx (haskell + synthetic) | [partial — haskell port deferred] | d75c61d4 | Configurable kit (haskell-style keyword-opens + python-style trailing-`:`-opens) + 6 self-tests covering both flavours. Synthetic Python-ish fixture passes; haskell/layout.sx untouched (kit not yet a drop-in for Haskell 98 Note 5 etc.; haskell still 156/156 baseline). |
|
||
| 8 — hm.sx (haskell + TBD) | [partial — algebra shipped; assembly deferred] | ab2c40c1 | HM foundations: types/schemes/ftv/apply/compose/generalize/instantiate/fresh-tv on top of match.sx unify, plus literal inference rule. 24/24 self-tests. Algorithm W lambda/app/let assembly deferred to host code — paired sequencing per brief: lib/ocaml/types.sx (OCaml-on-SX Phase 5) + haskell/infer.sx port. Haskell still 156/156 baseline. |
|
||
|
||
---
|
||
|
||
## Rules
|
||
|
||
- **Branch:** `architecture`. Commit locally. **Never push.** **Never touch `main`.**
|
||
- **Scope:** ONLY `lib/guest/**`, `lib/{lua,prolog,haskell,common-lisp,tcl}/**` (canaries + extraction targets), `plans/lib-guest.md`, `plans/agent-briefings/lib-guest-loop.md`. No `spec/`, `hosts/`, `web/`, `shared/`.
|
||
- **SX files:** `sx-tree` MCP tools only. `sx_validate` after every edit.
|
||
- **No raw dune.** Use `sx_build target="ocaml"` MCP tool.
|
||
- **Two-language rule (scaled by claim):** never merge an extraction until two guests consume it. The pair must be appropriate to the layer's universality claim — `core/` needs cross-paradigm pair, `typed/` needs two typed consumers, `relational/` needs two relational consumers, etc. (See *Phase B — Stratification* for the matrix.) Step 8 (Phase A) excepted with explicit OCaml-paired note.
|
||
- **Conformance baseline is the bar.** Any port whose scoreboard regresses by ≥1 test → revert, mark blocked, move on.
|
||
- **Substrate change → re-snapshot.** If `spec/` or `hosts/` changes underneath this loop, re-run Step 0 before continuing.
|
||
- **One step per code commit.** Plan updates as a separate commit. Short message with delta.
|
||
- **No alias chains** to paper over drift between extraction and consumer (`feedback_no_alias_bloat`).
|
||
- **Partial extraction is OK** if the canary works and a pending consumer is identified — mark `[partial — pending <consumer>]`.
|
||
- **Hard timeout:** if stuck >45 min on a step, mark `blocked (<reason>)` and move on.
|