rose-ash/plans/lib-guest.md
giles 9dd9fb9c37 plans: layered-stack framing + chisel sequence + loop scaffolding
Design + ops scaffolding for the next phase of work, none of it touching
substrate or guest code.

lib-guest.md: rewrites Architectural framing as a 5-layer stack
  (substrate → lib/guest → languages → shared/ → applications),
  recursive dependency-direction rule, scaled two-consumer rule. Adds
  Phase B (long-running stratification) with sub-layer matrix
  (core/typed/relational/effects/layout/lazy/oo), language profiles, and
  the long-running-discipline section. Preserves existing Phase A
  progress log and rules.

ocaml-on-sx.md: scope reduced to substrate validation + HM + reference
  oracle. Phases 1-5 + minimal stdlib slice + vendored testsuite slice.
  Dream carved out into dream-on-sx.md; Phase 8 (ReasonML) deferred.
  Records lib-guest sequencing dependency.

datalog-on-sx.md: adds Phase 4 built-in predicates + body arithmetic,
  Phase 6 magic sets, safety analysis in Phase 3, Non-goals section.

New chisel plans (forward-looking, not yet launchable):
  kernel-on-sx.md       — first-class everything, env-as-value endgame
  idris-on-sx.md        — dependent types, evidence chisel
  probabilistic-on-sx.md — weighted nondeterminism + traces
  maude-on-sx.md        — rewriting as primitive
  linear-on-sx.md       — resource model, artdag-relevant

Loop briefings (4 active, 1 cold):
  minikanren-loop.md, ocaml-loop.md, datalog-loop.md, elm-loop.md, koka-loop.md

Restore scripts mirror the loop pattern:
  restore-{minikanren,ocaml,datalog,jit-perf,lib-guest}.sh
  Each captures worktree state, plan progress, MCP health, tmux status.
  Includes the .mcp.json absolute-path patch instruction (fresh worktrees
  have no _build/, so the relative mcp_tree path fails on first launch).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 22:27:50 +00:00


lib/guest — the metatheory layer for SX-hosted languages

Extract the duplicated plumbing across lib/{haskell,common-lisp,erlang,prolog,js,lua,smalltalk,tcl,forth,ruby,apl,hyperscript} into a small, composable kit so language N+1 costs ~200 lines instead of ~2000, without regressing any existing conformance scoreboard.

This is a long-running, accreting plan. Phases 0-3 (below) are the bootstrapping extraction: pulling the most obvious shared plumbing out of existing guests. Phase B onwards (see Stratification) is the ongoing accretion: codifying the universal patterns rose-ash's languages share, stratified by audience, refined continuously by pairs of new language consumers. The plan does not have a "done" state. The closest equivalent is "no two languages currently disagree about an abstraction in lib/guest", and that is a moving target as new languages come online.

Branch: architecture. SX files via sx-tree MCP only. Never edit generated files.

Thesis

The substrate (CEK, hygienic macros, records, delimited continuations, IO suspension, reactivity) was chosen with multi-paradigm hosting in mind, but each guest currently re-rolls its own tokeniser, recursive-descent loop, conformance harness, and primitive-rename layer. Extracting these shared layers does not reduce conformance bug-finding pressure (it only removes plumbing), so it is a pure win.

Canaries: Lua (small, conventional expression-grammar — exercises lex/Pratt/AST) and Prolog (paradigm-different — exercises pattern-match/unification). The two-canary rule prevents Lua-shaped abstractions.

Two-language rule: no extraction is merged until two guests consume it. The rule scales with the universality claim — see Stratification for layer-appropriate pairs.

Architectural framing — the layered stack

Rose-ash stratifies into five layers, each with a different invariant, audience, and time horizon. The same operating principles (dependency direction, two-consumer rule, layered editorial bar) work at every layer.

| Layer | rose-ash location | Time horizon | Audience |
|---|---|---|---|
| Substrate (SX) | spec/, hosts/ | years | platform maintainers |
| lib/guest (language metatheory) | lib/guest/ | years, slower than substrate | guest-language authors |
| Languages | lib/<lang>/ | months-years | application authors |
| shared/ (application metatheory) | shared/ | months | service authors |
| Applications | blog/, market/, cart/, events/, federation/, account/, orders/, artdag/ | weeks-months | village members |

What each layer is:

  • Substrate (SX) — values, evaluation, continuations, effects, hygienic macros, reactivity. The physics of the platform. Bugs here are catastrophic for everyone.
  • lib/guest — patterns that recur across paradigms: pattern matching, lexical primitives, precedence parsing, type inference, layout algorithms, effect handler protocols, module dispatch. Applied PL theory. Bugs only affect adopters; non-consumers don't care.
  • Languages — specific syntactic and semantic commitments that are a particular language. The user-facing surface for code authoring.
  • shared/ (application metatheory) — patterns that recur across domains: app factory, OAuth flow, ActivityPub, internal HMAC channel, fragments, sessions. Mature counterpart of what lib/guest is becoming, just at the application layer. The two-consumer rule there is already passed (every service is a consumer).
  • Applications — the village system itself. Federation, blog, market, events, etc. The proof point that justifies all the layers below.

This five-layer separation is unusually clean. Most platforms collapse two adjacent layers — JVM and BEAM are pure substrate (no shared metatheory layer), Lisp images and Smalltalk environments bundle substrate + metatheory, conventional web stacks merge "shared infrastructure" with "applications." Racket's #lang machinery is the closest analogue at the lib/guest boundary. Treating each layer as a deliberately separate stratum is a design choice, not a code-organisation accident.

Dependency direction (strict, at every boundary)

Higher layers may use lower; lower layers must not know higher exists.

  • Applications import from shared/. shared/ doesn't know which application is using it.
  • shared/ and applications import from languages (via SX modules and the host runtime). Languages don't know what application calls them.
  • Languages import from lib/guest. lib/guest doesn't know which language is consuming it.
  • lib/guest uses SX primitives. SX doesn't import from lib/guest.

Same invariant that makes substrate/metatheory separation work in PL theory, applied recursively up the stack. Violations show up as cyclic imports or as suspiciously language-specific code in lib/guest/, suspiciously domain-specific code in shared/, etc.
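
The direction rule can be checked mechanically. The following is a minimal Python sketch, not part of the plan's tooling: the layer names come from the table above, while `LAYERS`, `RANK`, `violations`, and the example edge list are hypothetical names chosen for illustration.

```python
# Layer order, lowest first; names follow the layer table.
LAYERS = ["substrate", "lib/guest", "languages", "shared", "applications"]
RANK = {name: i for i, name in enumerate(LAYERS)}


def violations(imports):
    """Return the (importer, imported) edges where a lower layer reaches upward."""
    return [(src, dst) for src, dst in imports if RANK[src] < RANK[dst]]


# Hypothetical import edges; the last one breaks the rule on purpose.
edges = [
    ("applications", "shared"),
    ("languages", "lib/guest"),
    ("lib/guest", "substrate"),
    ("lib/guest", "languages"),
]
print(violations(edges))
```

Same-layer imports are allowed in this sketch; only edges that point from a lower layer to a higher one are reported.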

Two-consumer rule, recursive

The pair-validation discipline applies at every layer, with audience-appropriate pairs:

  • An entry in lib/guest/core/ needs two consumers from different paradigms (e.g. lua + prolog).
  • An entry in lib/guest/typed/ needs two typed consumers.
  • A pattern in shared/ needs two services using it (largely already enforced — auth/HMAC/AP are used everywhere).
  • Promoting an application's reusable abstraction to shared/ should happen only after a second domain wants the same shape.

At every layer, "shared between two consumers we happen to have" is not enough — the pair must be appropriate to the universality being claimed.

Editorial bar

An entry belongs at layer N only if it codifies a piece of universal-or-near-universal pattern for that layer's audience. Same bar at every level; just the meaning of "universal" changes — universal-across-paradigms for lib/guest, universal-across-services for shared/, universal-across-domains for application metatheory.

Leverage versus concreteness

The two directions matter at every layer.

  • Leverage compounds downward. A substrate fix benefits every layer above. A lib/guest fix benefits every consuming language. A shared/ fix benefits every service. So the highest-leverage work is always the layer that enables the most above it.
  • Concreteness flows upward. Applications are what the village actually uses; substrate is invisible to them. Each layer is judged by its appropriate audience: substrate by correctness and speed, lib/guest by paradigm-coverage, languages by ergonomic fit, shared/ by service reuse, applications by real users on real use cases.

The pleasant property: once you internalise the operating discipline at one layer, you know how to operate at every other. Pair-driven extraction. Two-consumer rule scaled to the layer's universality. Higher-uses-lower invariant. Codify-don't-just-deduplicate. The lib/guest plan is a working example of these principles applied at the metatheory layer; the same playbook applies all the way up.

Current baseline

The loop fills these in on its first iteration by running every */conformance.sh and */test.sh and copying each scoreboard.json to lib/guest/baseline/<lang>.json. Until then:

| Guest | Suite | Baseline |
|---|---|---|
| lua | bash lib/lua/test.sh | 185 / 185 |
| prolog | bash lib/prolog/conformance.sh | 590 / 590 |
| haskell | bash lib/haskell/conformance.sh | 156 / 156 (was reported 0/18 by the buggy old script) |
| common-lisp | bash lib/common-lisp/conformance.sh | 518 / 518 (Phase 2 +182 and Phase 6 +27 were previously under-counted) |
| erlang | bash lib/erlang/conformance.sh | 0 / 0 (suite all-zero) |
| js | bash lib/js/conformance.sh | 94 / 148 (test262-slice) |
| smalltalk | bash lib/smalltalk/conformance.sh | 625 / 629 |
| tcl | bash lib/tcl/conformance.sh | 3 / 4 (programs) |
| forth | bash lib/forth/test.sh | 64 / 64 |
| ruby | bash lib/ruby/test.sh | 76 / 76 |
| apl | bash lib/apl/test.sh | 73 / 73 |

The baseline only needs to be re-snapshotted when the substrate (spec/**, hosts/**) changes underneath this loop.
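
A regression check against the snapshot could look like the Python sketch below. The real scoreboard.json layout is not specified in this plan, so the `(passed, total)` tuples, the `regressions` helper, and the sample `current` numbers are assumptions made for illustration.

```python
def regressions(baseline, current):
    """Guests whose pass count dropped below the baseline snapshot."""
    dropped = {}
    for guest, (passed, _total) in baseline.items():
        # A guest missing from the current run counts as zero passes.
        now_passed, _ = current.get(guest, (0, 0))
        if now_passed < passed:
            dropped[guest] = (passed, now_passed)
    return dropped


# Baseline values are taken from the table above; the current run is made up.
baseline = {"lua": (185, 185), "prolog": (590, 590), "js": (94, 148)}
current = {"lua": (185, 185), "prolog": (589, 590), "js": (95, 148)}
print(regressions(baseline, current))
```

An improvement (js going from 94 to 95 here) is not flagged; only a drop of one or more tests is.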


Phase A: Bootstrapping extraction (Phases 0-3 below)

The following four phases (0/1/2/3) are the bootstrap — pulling the most obvious shared plumbing out of existing guests. Largely shipped; partial-status entries are deferred ports waiting for their natural consumer (datalog, minikanren, ocaml, etc.) to close them. Phase B (Stratification) is the long-running successor.

Phase 0 — Baseline snapshot (one-shot)

Step 0: Snapshot every guest's scoreboard

Create lib/guest/baseline/. Run every guest's conformance/test runner. Copy each scoreboard.json (or extract pass/fail counts from test.sh output for guests without a scoreboard) into lib/guest/baseline/<lang>.json. Fill in the table above.

Verify: ls lib/guest/baseline/*.json shows one per guest. Plan table populated.


Phase 1 — Cheap, zero-semantic-risk extractions

Step 1: lib/guest/conformance.sx — config-driven test runner

Replace the 6+ near-identical */conformance.sh scripts with one driver that takes a config dict:

{:lang "prolog"
 :loads ("lib/prolog/tokenizer.sx" "lib/prolog/parser.sx" ...)
 :suites (("parse" "lib/prolog/tests/parse.sx" "pl-parse-tests-run!") ...)}

The driver locates sx_server.exe, runs the epoch protocol, collects pass/fail per suite, and writes scoreboard.{json,md}. The per-language conformance.sh becomes a 3-line stub that points at its config.

Port to: lib/prolog/conformance.sh and lib/haskell/conformance.sh. Two consumers required for merge.

Verify: both bash lib/prolog/conformance.sh and bash lib/haskell/conformance.sh produce scoreboard JSONs equal to baseline.
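
To make the shape of a config-driven driver concrete, here is a minimal Python sketch under stated assumptions. It does not launch sx_server.exe or speak the epoch protocol; the suite runner is injected as a function. `run_conformance`, `run_suite`, `fake_runner`, and the scoreboard field names are hypothetical, and only the config keys mirror the dict shown above.

```python
import json


def run_conformance(config, run_suite):
    """Run every suite in config and return a scoreboard dict.

    run_suite(loads, path, entry) -> (passed, failed) is supplied by the
    caller; a real driver would invoke the SX server at this point.
    """
    suites = {}
    for name, path, entry in config["suites"]:
        passed, failed = run_suite(config["loads"], path, entry)
        suites[name] = {"pass": passed, "fail": failed}
    total_pass = sum(s["pass"] for s in suites.values())
    total = total_pass + sum(s["fail"] for s in suites.values())
    return {"lang": config["lang"], "pass": total_pass, "total": total, "suites": suites}


config = {
    "lang": "prolog",
    "loads": ["lib/prolog/tokenizer.sx", "lib/prolog/parser.sx"],
    "suites": [("parse", "lib/prolog/tests/parse.sx", "pl-parse-tests-run!")],
}


def fake_runner(loads, path, entry):
    # Hypothetical stand-in that pretends 3 tests passed and 1 failed.
    return (3, 1)


board = run_conformance(config, fake_runner)
print(json.dumps(board, indent=2))
```

Keeping the runner injectable is a design choice of this sketch: it lets the aggregation logic be tested without the server binary.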

Step 2: lib/guest/prefix.sx — prefix-rename macro

One macro that takes a prefix and a list of SX symbols and binds prefixed aliases:

(prefix-rename "cl-" '(null? pair? even? odd? zero? ...))

Replaces hundreds of hand-written (define (cl-null? x) (= x nil))-style wrappers in common-lisp/runtime.sx, lua/runtime.sx, erlang/runtime.sx.

Port to: common-lisp/runtime.sx (largest user) and lua/runtime.sx. Two consumers.

Verify: common-lisp + lua scoreboards equal baseline.
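
The effect of a prefix-rename can be illustrated outside SX. The Python sketch below models an environment as a dict and binds each prefixed name to the same object as the original; `prefix_rename` and `env` are hypothetical names, and the real macro presumably operates on SX bindings rather than a dict.

```python
def prefix_rename(namespace, prefix, names):
    """Bind prefix+name in namespace to the same object as name."""
    for name in names:
        namespace[prefix + name] = namespace[name]


# A toy environment with two predicates.
env = {"null?": lambda x: x is None, "zero?": lambda x: x == 0}
prefix_rename(env, "cl-", ["null?", "zero?"])

print(env["cl-null?"](None), env["cl-zero?"](3))
```

This only covers pure same-name aliases. A wrapper that adds custom logic around the primitive is not an alias and would not be replaced by such a call.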


Phase 2 — Lex / parse kit

Step 3: lib/guest/lex.sx — character-class + tokeniser primitives

  • Source-position tracking (line/col/offset).
  • Character-class predicates (whitespace?, digit?, alpha?, ident-start?, ident-rest?).
  • Number recognisers (decimal, hex, float, scientific).
  • String recognisers (quoted, escapes, raw).
  • Comment recognisers (line, block, nestable).
  • Token record {:type :value :pos :end :line}.

Port to: lua/tokenizer.sx and tcl/tokenizer.sx. Two consumers.

Verify: lua + tcl scoreboards equal baseline.
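
A rough Python sketch of the kit's two core pieces, character-class predicates and a positioned token record, follows. The token fields mirror the record listed above; the predicate spellings, the `Token` dataclass, and the toy `tokenize` loop are assumptions for illustration and are not the lex.sx API.

```python
from dataclasses import dataclass


def is_digit(c):
    # Nil-safe: None or "" is never a digit.
    return bool(c) and "0" <= c <= "9"


def is_ident_start(c):
    return bool(c) and (c.isalpha() or c == "_")


def is_ident_rest(c):
    return is_ident_start(c) or is_digit(c)


@dataclass
class Token:
    type: str
    value: str
    pos: int    # offset of the first character
    end: int    # offset one past the last character
    line: int


def tokenize(src):
    """Split src into number, ident, and single-character punct tokens."""
    tokens, i, line = [], 0, 1
    while i < len(src):
        c = src[i]
        if c == "\n":
            line += 1
            i += 1
        elif c.isspace():
            i += 1
        elif is_digit(c) or is_ident_start(c):
            rest = is_digit if is_digit(c) else is_ident_rest
            start = i
            while i < len(src) and rest(src[i]):
                i += 1
            kind = "number" if is_digit(c) else "ident"
            tokens.append(Token(kind, src[start:i], start, i, line))
        else:
            tokens.append(Token("punct", c, i, i + 1, line))
            i += 1
    return tokens


for tok in tokenize("x1 = 42\ny"):
    print(tok)
```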

Step 4: lib/guest/pratt.sx — Pratt / operator-precedence parser

Prefix / infix / postfix tables, left/right associativity, precedence climbing. Grammar is a dict, not hardcoded cond.

Port to: Lua expression parser (lua/parser.sx) and Prolog operator table (prolog/parser.sx — Prolog ops are the stress test). Two consumers.

Verify: lua + prolog scoreboards equal baseline.
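
"Grammar is a dict" can be shown with a small precedence-climbing sketch in Python. The operator table, its binding powers, and `parse_expr` are hypothetical; the sketch handles only atoms and infix operators, and uses one precedence convention (higher number binds tighter) that a given guest may not share.

```python
# Grammar as data: operator -> (binding power, associativity).
INFIX = {
    "+": (10, "left"), "-": (10, "left"),
    "*": (20, "left"), "/": (20, "left"),
    "^": (30, "right"),
}


def parse_expr(tokens, pos=0, min_bp=0):
    """Precedence climbing over a token list; returns (ast, next_pos)."""
    left = tokens[pos]          # atoms only, for brevity
    pos += 1
    while pos < len(tokens):
        op = tokens[pos]
        entry = INFIX.get(op)
        if entry is None:
            break
        bp, assoc = entry
        if bp < min_bp:
            break
        # Left-associative operators refuse an equal-power operator on the right.
        next_min = bp + 1 if assoc == "left" else bp
        right, pos = parse_expr(tokens, pos + 1, next_min)
        left = (op, left, right)
    return left, pos


print(parse_expr(["1", "+", "2", "*", "3"])[0])
print(parse_expr(["2", "^", "3", "^", "2"])[0])
```

Adding an operator means adding a table entry; the loop itself does not change.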

Step 5: lib/guest/ast.sx — canonical AST node shapes

Standard constructors and predicates for: literal, var, app, lambda, let, letrec, if, match-clause, module, import. Optional — guests may keep their own AST — but using the canonical shape lets cross-language tooling (formatters, highlighters, debuggers) work without per-language adapters.

Port to: lua + prolog AST emitters. Two consumers.

Verify: lua + prolog scoreboards equal baseline.


Phase 3 — Semantic extractions (highest leverage, highest risk)

Step 6: lib/guest/match.sx — pattern-match + unification engine

Single engine for:

  • Literal patterns (numbers, strings, symbols, nil, booleans).
  • Wildcard _.
  • Constructor patterns (ADT-shaped — depends on Phase 3 of sx-improvements.md if available, otherwise dict-tagged).
  • Variable binding.
  • Unification (Prolog flavour): symmetric, occurs-check toggle, substitution returned.
  • Match (Haskell flavour): asymmetric pattern→value, bindings returned.

Port to: haskell/match.sx and prolog/query.sx unification core. Two consumers.

Verify: haskell + prolog scoreboards equal baseline. Highest-risk extraction — if either regresses by 1 test, revert and redesign.
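
The difference between the two flavours is easier to see side by side. Below is a minimal Python sketch, assuming terms are tuples, variables are `Var` objects, and the wildcard is the string "_"; none of these encodings are taken from match.sx, and `walk`, `occurs`, `unify`, and `match` are illustrative names.

```python
class Var:
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return "?" + self.name


def walk(t, s):
    """Follow variable bindings in substitution s until a non-bound term."""
    while isinstance(t, Var) and t in s:
        t = s[t]
    return t


def occurs(v, t, s):
    t = walk(t, s)
    if t is v:
        return True
    return isinstance(t, tuple) and any(occurs(v, x, s) for x in t)


def unify(a, b, s, occurs_check=True):
    """Symmetric unification; returns an extended substitution or None."""
    a, b = walk(a, s), walk(b, s)
    if a is b:
        return s
    if isinstance(a, Var):
        if occurs_check and occurs(a, b, s):
            return None
        return {**s, a: b}
    if isinstance(b, Var):
        return unify(b, a, s, occurs_check)
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            s = unify(x, y, s, occurs_check)
            if s is None:
                return None
        return s
    return s if a == b else None


def match(pattern, value, bindings=None):
    """Asymmetric match: only the pattern may contain variables."""
    bindings = {} if bindings is None else bindings
    if isinstance(pattern, str) and pattern == "_":
        return bindings
    if isinstance(pattern, Var):
        if pattern in bindings:
            return bindings if bindings[pattern] == value else None
        return {**bindings, pattern: value}
    if isinstance(pattern, tuple) and isinstance(value, tuple) and len(pattern) == len(value):
        for p, v in zip(pattern, value):
            bindings = match(p, v, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == value else None


X, Y = Var("X"), Var("Y")
print(unify(("f", X, "b"), ("f", "a", Y), {}))
print(match(("cons", X, "_"), ("cons", 1, 2)))
```

Both functions are pure: they return a new dict rather than mutating the one passed in, which keeps backtracking trivial at the cost of copying.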

Step 7: lib/guest/layout.sx — significant-whitespace / off-side rule

Generalised layout-sensitive lexer. Configurable: which keywords open layout blocks, whether semicolons are inserted, brace insertion rules.

Port to: haskell/layout.sx (existing). Second consumer: write a synthetic test fixture that exercises a Python-ish layout to prove the kit is not Haskell-shaped. Two consumers.

Verify: haskell scoreboard equal baseline; synthetic layout fixture passes.
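
As a rough illustration of a configurable off-side rule, the Python sketch below turns indentation into explicit "{", "}" and ";" tokens, with the block-opening test passed in as a parameter. The `layout` function and its `opens_block` parameter are hypothetical, and the sketch ignores misaligned dedents, tabs, and the finer points of the Haskell layout algorithm.

```python
def layout(lines, opens_block=lambda text: text.endswith(":")):
    """Insert explicit '{', '}' and ';' tokens based on indentation."""
    out, stack = [], [0]
    expect_block = False
    for raw in lines:
        if not raw.strip():
            continue                      # blank lines carry no layout
        indent = len(raw) - len(raw.lstrip(" "))
        text = raw.strip()
        if expect_block and indent > stack[-1]:
            stack.append(indent)          # deeper line after an opener
            out.append("{")
        else:
            while indent < stack[-1]:     # close every block we left
                stack.pop()
                out.append("}")
            if out:
                out.append(";")           # separate sibling statements
        out.append(text)
        expect_block = opens_block(text)
    while len(stack) > 1:                 # close blocks still open at EOF
        stack.pop()
        out.append("}")
    return out


python_ish = ["if x:", "  a", "  b", "c"]
print(layout(python_ish))

haskell_ish = ["main = do", "  foo", "  bar"]
opens = lambda text: text.split()[-1] in {"do", "of", "where"}
print(layout(haskell_ish, opens))
```

The same loop serves both inputs; only the predicate that decides what opens a block differs.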

Step 8: lib/guest/hm.sx — Hindley-Milner type inference

Extract from haskell/infer.sx. Algorithm W or J, generalisation, instantiation, occurs-check, principal types.

Sequencing: this step is paired with plans/ocaml-on-sx.md Phase 5. The natural order is lib-guest Steps 0-7 → OCaml-on-SX Phases 1-5 → lib-guest Step 8. With OCaml-on-SX Phase 5 done, the two-language rule is satisfied for real (Haskell + OCaml). Without it, accept "second user TBD"; the alternative is letting the inference stay locked inside Haskell forever.

Port to: haskell/infer.sx and (preferred) lib/ocaml/types.sx.

Verify: haskell scoreboard equal baseline; if OCaml-on-SX Phase 5 has shipped, OCaml type-inference tests equal baseline too.
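
The generalisation and instantiation steps named above can be sketched in a few lines of Python. The encoding is an assumption made for illustration (a type variable is a plain string, a constructor is a tuple, a scheme is a pair of quantified variables and a type), and `ftv`, `apply`, `compose`, `generalize`, and `instantiate` are not the hm.sx signatures.

```python
import itertools

# Assumed encoding: a type variable is a str; a constructor is a tuple
# ("name", arg1, ...), for example ("->", "a", ("int",)).


def ftv(t):
    """Free type variables of a type."""
    if isinstance(t, str):
        return {t}
    return set().union(*(ftv(arg) for arg in t[1:]))


def apply(subst, t):
    """Apply a substitution (dict from variable to type) to a type."""
    if isinstance(t, str):
        return subst.get(t, t)
    return (t[0],) + tuple(apply(subst, arg) for arg in t[1:])


def compose(s2, s1):
    """Substitution equivalent to applying s1 first, then s2."""
    out = {v: apply(s2, t) for v, t in s1.items()}
    for v, t in s2.items():
        out.setdefault(v, t)
    return out


def generalize(env_ftv, t):
    """Quantify the variables of t that are not free in the environment."""
    return (frozenset(ftv(t) - env_ftv), t)


_fresh = itertools.count()


def instantiate(scheme):
    """Replace each quantified variable with a fresh one (named t0, t1, ...)."""
    quantified, t = scheme
    subst = {v: f"t{next(_fresh)}" for v in sorted(quantified)}
    return apply(subst, t)


scheme = generalize(set(), ("->", "a", "a"))   # forall a. a -> a
inst = instantiate(scheme)
print(scheme, inst)
```

Fresh names of the form t0, t1 could clash with user-chosen variable names in this toy encoding; a real implementation would reserve a namespace for them.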


Phase B — Stratification (long-running)

lib/guest itself decomposes by audience. Phase B accepts that and codifies the decomposition. Sub-layers emerge as paradigms reveal which abstractions are real; nothing in this section is fully fleshed out — it's the editorial direction, not a concrete queue.

Proposed sub-layer shape

| Sub-layer | Purpose | Pair-validation requirement |
|---|---|---|
| lib/guest/core/ | True universals: lex, pratt, ast, match, prefix-rename, conformance harness | Two consumers from different paradigms (e.g. lua + prolog) |
| lib/guest/typed/ | HM, generalisation, kind system, type-class-style dispatch | Two typed consumers (e.g. ocaml + haskell) |
| lib/guest/relational/ | Unification beyond core match, occurs-check toggles, substitution composition, search strategies | Two relational consumers (e.g. minikanren + datalog) |
| lib/guest/effects/ | Handler stacks, perform/resume protocols, dynamic-extent tracking | Two effect-typed consumers (e.g. koka + future) |
| lib/guest/layout/ | Off-side rule, semicolon insertion, brace insertion | Two whitespace-sensitive consumers (e.g. haskell + elm or python-shape) |
| lib/guest/lazy/ | Thunk wrapping, force/delay protocols, sharing semantics | Two lazy consumers (e.g. haskell + future lazy guest) |
| lib/guest/oo/ | Message dispatch, method tables, super lookup | Two message-passing consumers (e.g. smalltalk + ruby) |

Future rows as paradigms emerge: constraint-domain solvers, gradual typing, capability-based effect systems, dependent types, etc. Layers should not be created speculatively — wait for two real consumers in the same paradigm before opening a sub-layer.

Re-homing the Phase A entries

The Phase 0-3 entries currently at the lib/guest root are the candidates for re-homing under sub-layers as the stratification settles. Initial mapping (subject to refinement):

  • conformance.sx, prefix.sx → stay at root (true infrastructure, not paradigm-specific)
  • lex.sx, pratt.sx, ast.sx, match.sx → core/
  • layout.sx → layout/
  • hm.sx → typed/ (currently the most overclaiming entry at root: it has no plausible non-typed consumer)

Two-consumer rule, scaled

The flat "two guests must consume it" rule scales with the layer's universality claim:

  • A core/ extraction must be cross-paradigm-validated (lua + prolog, not lua + tcl which are both dynamic-imperative).
  • A typed/ extraction needs two typed consumers; that's a tighter audience but still real.
  • A relational/ extraction needs two relational consumers, etc.
  • Each layer's bar is exactly the universality it claims, no more, no less. An abstraction that claims universality (root level) but only has typed consumers belongs in typed/, not at root.
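
The scaled rule can be stated as a small predicate. In the Python sketch below, the paradigm tags in `PROFILE` are hypothetical and deliberately coarse, and `pair_ok` is an illustrative name; it treats "cross-paradigm" as "the two consumers share no tag", which is a simplification of the editorial judgement described above.

```python
# Hypothetical paradigm tags, for illustration only.
PROFILE = {
    "lua": {"imperative"},
    "tcl": {"imperative"},
    "prolog": {"relational"},
    "haskell": {"typed", "lazy"},
    "ocaml": {"typed"},
}


def pair_ok(layer, a, b):
    """Is (a, b) an acceptable validating pair for an entry in this layer?"""
    if a == b:
        return False
    if layer == "core":
        # Cross-paradigm: the two consumers must share no paradigm tag.
        return not (PROFILE[a] & PROFILE[b])
    # Sub-layers: both consumers must belong to the paradigm being claimed.
    return layer in PROFILE[a] and layer in PROFILE[b]


print(pair_ok("core", "lua", "prolog"), pair_ok("core", "lua", "tcl"))
print(pair_ok("typed", "haskell", "ocaml"))
```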

Language profiles

Each language ends up consuming a profile of which sub-layers it uses. Profiles are aspirational until each language ports — but the matrix tells you which sub-layers to invest in based on consumer demand, and serves as a quick design document for new languages ("which existing profile does it most resemble?").

Language core typed relational effects layout lazy oo
ocaml
haskell
elm
reasonml
minikanren
datalog
prolog
koka
erlang (msg)
elixir (msg)
smalltalk
ruby
common-lisp (CLOS)
lua / tcl / forth / apl
js (async)

(msg), (async), (CLOS) denote shapes that might live in effects/ or oo/ once the paradigm gets a second consumer to validate against.

Long-running discipline

This plan does not have a "done" state. The operating mode is continuous pair-driven refactoring:

  • When a new guest reaches the same shape as an existing one → look for shared abstraction → consider extraction.
  • When two existing consumers diverge on how they use a kit → consider a sub-layer split or a redesign.
  • When a sub-layer accumulates more than ~5 entries → consider further stratification.
  • When a kit has never been refactored after a second consumer ported → suspicious; the second port probably reshaped expectations and the kit should have flexed. Audit it.
  • When a Phase A entry (currently at root) gets a second consumer in a narrower paradigm than "universal" → re-home into the appropriate sub-layer, don't wait for a third.

Substrate work and lib/guest work feed each other. Substrate fixes (sx-improvements queue) raise lib/guest's ceiling — every kit gets faster and more correct. lib/guest exposes substrate gaps that wouldn't show up in single-guest work — when two paradigms can't share an abstraction cleanly, the substrate may be missing a primitive. Treat lib/guest issues as substrate-investigation prompts before papering them over with kit-side workarounds.

Extraction is not the goal — codification is. "I refactored 800 lines of duplication into 200 lines of shared kit" is the bootstrapping mode. The long-running mode is "I codified a piece of language theory in working SX form, validated by N paradigms." The same line-count delta means very different things in those two modes. Keep the bar at codification, not just deduplication.


Progress log

| Step | Status | Commit | Delta |
|---|---|---|---|
| 0 - baseline snapshot | [done] | 2f7f8189 | 11 guests captured: lua 185/185, forth 64/64, ruby 76/76, apl 73/73, prolog 590/590, common-lisp 309/309, smalltalk 625/629, tcl 3/4, haskell 0/18 programs, js 94/148 (slice), erlang 0/0 |
| 1 - conformance.sx (prolog + haskell) | [done] | 58dcff26 | Prolog 590/590 (matches baseline). Haskell 156/156: the old script was broken (0/18 was an artefact of a never-matching grep), the driver reveals true counts; baseline updated. |
| 2 - prefix.sx (common-lisp + lua) | [partial - pending lua] | 2ef773a3 | common-lisp/runtime.sx ported (47 aliases collapsed into 13 prefix-rename calls); 518/518 vs 309/309 baseline (improvement, no regression). lua/runtime.sx has no pure same-name aliases (every lua- definition wraps custom logic); second consumer pending. |
| 3 - lex.sx (lua + tcl) | [done] | 559b0df9 | lex.sx exports nil-safe char-class predicates + token record. lua/tokenizer.sx (7 preds) and tcl/tokenizer.sx (5 preds) collapsed into prefix-rename calls. lua 185/185, tcl 342/342, tcl-conf 3/4; all = baseline. |
| 4 - pratt.sx (lua + prolog) | [done] | da27958d | Extracted operator-table format + lookup only; climbing loops stay per-language because lua and prolog use opposite prec conventions. lua/parser.sx: 18-clause cond → 15-entry table. prolog/parser.sx: pl-op-find deleted, pl-op-lookup wraps pratt-op-lookup. lua 185/185, prolog 590/590; both = baseline. |
| 5 - ast.sx (lua + prolog) | [partial - pending real consumers] | a774cd26 | Kit + 33 self-tests shipped (10 canonical kinds, predicates, accessors). Step is "Optional" per brief; lua/prolog parsers untouched (185/185 + 590/590). Datalog-on-sx will be the natural first real consumer; lua/prolog converters can land later. |
| 6 - match.sx (haskell + prolog) | [partial - kit shipped; ports deferred] | 863e9d93 | Pure-functional unify + match kit (canonical wire format + cfg-driven adapters) + 25 self-tests. Existing prolog/haskell engines untouched (structurally divergent, mutating-symmetric vs pure-asymmetric; porting would risk 746 passing tests under brief's revert-on-regression rule). Real consumer is minikanren/datalog work in flight. |
| 7 - layout.sx (haskell + synthetic) | [partial - haskell port deferred] | d75c61d4 | Configurable kit (haskell-style keyword-opens + python-style trailing-:-opens) + 6 self-tests covering both flavours. Synthetic Python-ish fixture passes; haskell/layout.sx untouched (kit not yet a drop-in for Haskell 98 Note 5 etc.; haskell still 156/156 baseline). |
| 8 - hm.sx (haskell + TBD) | [partial - algebra shipped; assembly deferred] | ab2c40c1 | HM foundations: types/schemes/ftv/apply/compose/generalize/instantiate/fresh-tv on top of match.sx unify, plus literal inference rule. 24/24 self-tests. Algorithm W lambda/app/let assembly deferred to host code (paired sequencing per brief: lib/ocaml/types.sx (OCaml-on-SX Phase 5) + haskell/infer.sx port). Haskell still 156/156 baseline. |

Rules

  • Branch: architecture. Commit locally. Never push. Never touch main.
  • Scope: ONLY lib/guest/**, lib/{lua,prolog,haskell,common-lisp,tcl}/** (canaries + extraction targets), plans/lib-guest.md, plans/agent-briefings/lib-guest-loop.md. No spec/, hosts/, web/, shared/.
  • SX files: sx-tree MCP tools only. sx_validate after every edit.
  • No raw dune. Use sx_build target="ocaml" MCP tool.
  • Two-language rule (scaled by claim): never merge an extraction until two guests consume it. The pair must be appropriate to the layer's universality claim — core/ needs cross-paradigm pair, typed/ needs two typed consumers, relational/ needs two relational consumers, etc. (See Phase B — Stratification for the matrix.) Step 8 (Phase A) excepted with explicit OCaml-paired note.
  • Conformance baseline is the bar. Any port whose scoreboard regresses by ≥1 test → revert, mark blocked, move on.
  • Substrate change → re-snapshot. If spec/ or hosts/ changes underneath this loop, re-run Step 0 before continuing.
  • One step per code commit. Plan updates as a separate commit. Short message with delta.
  • No alias chains to paper over drift between extraction and consumer (feedback_no_alias_bloat).
  • Partial extraction is OK if the canary works and a pending consumer is identified — mark [partial — pending <consumer>].
  • Hard timeout: if stuck >45 min on a step, mark blocked (<reason>) and move on.