Documents what's already done (kit + Kernel 7 files) and what's left
across 7 guests (35 std-pattern files + variant flavours in Tcl/APL).
Each guest is its own commit due to local naming and shape variants.
Prolog is the biggest single migration (23 files). Tcl and APL need
small variant adapters because their failure-records hold strings or
use slightly different signatures.
Reference: /tmp/migrate_harness.py is the regex-driven mechanical
migration tool; works on the standard pattern, skips variants for
human review.
plans/kernel-on-sx.md — Phase 7 header updated from "partial" to
"env.sx EXTRACTED 2026-05-12"; second-consumer-found checkbox ticked
for env.sx specifically. Other five files (combiner, evaluator,
hygiene, quoting, short-circuit) stay blocked pending their own
second consumers.
plans/lib-guest-reflective.md — Phases 1-3 ticked off with date
stamps; Outcome section added summarising the three commits, file
stats (124 LoC, within 80-200 bound), and the third-consumer
adoption protocol (cfg with five keys, no changes to env.sx).
The kernel-on-sx loop documented six candidate reflective API files
gated on the two-consumer rule. This plan opens that block by
selecting Tcl's existing uplevel/upvar machinery as the second
consumer for env.sx specifically (the highest-fit candidate).
Discovery: Kernel and Tcl have identical scope-chain semantics but
diverge on mutable-vs-functional update. Solution: adapter-cfg
pattern, same as lib/guest/match.sx. Canonical wire shape with
mutable defaults for Kernel; Tcl provides its own cfg keeping
the functional model.
Roadmap: env.sx extracted, both consumers migrated, all tests green.
The other five candidate files (combiner, evaluator, hygiene,
quoting, short-circuit) stay deferred — Tcl has no operatives.
Loop closer documenting what 18 feature commits produced. Kernel-on-SX
is 1,398 LoC substrate + 1,747 LoC tests = 3,145 LoC total. Zero
substrate fixes required across the loop. R-1RK core + extras
implemented. Six proposed lib/guest/reflective/ files awaiting second
consumer. Substrate verdict: env-as-value generalises to
evaluator-as-value; the m-eval demo proves it.
Five type predicates (number?, string?, list?, boolean?, symbol?).
New tests/metacircular.sx: m-eval defined in Kernel walks expressions
itself, recursing on applicative-call args and delegating to host
eval only for operatives and symbol lookup. 14 demo tests.
The demo surfaced a real bug: map/filter/reduce called kernel-combine
on applicative head-vals directly, which re-evaluates already-
evaluated element values; nested-list elements crashed. Fix: extracted
knl-apply-op (unwrap-applicative-or-pass-through) and use it in all
three combinators before kernel-combine. Mirrors apply's approach.
Added knl-apply-op as a proposed entry in the reflective combiner.sx
API. 322 tests total.
(apply F (list V1 V2 V3)) ≡ (F V1 V2 V3). Unwrap applicative first to
skip auto-eval (args are values), then kernel-combine with the
underlying operative. Universal pattern in reflective Lisps —
sketched into the combiner.sx API. 296 tests total.
Added kernel-make-primitive-applicative-with-env in eval.sx — IMPL
receives (args dyn-env), needed by combinators that re-enter the
evaluator. map/filter/reduce in runtime.sx use it to call user-supplied
combiners on each element with the caller's dynamic env preserved.
Sketched the env-blind vs env-aware applicative split as a new entry
in the proposed combiner.sx reflective API. 289 tests total.
Standard Kernel control flow. $cond walks clauses in order with `else`
catch-all; clauses past the first match are NOT evaluated. $when/$unless
are simple guards. 12 tests, 242 total.
kernel-quasiquote-operative walks the template via mutually-recursive
knl-quasi-walk ↔ knl-quasi-walk-list. $unquote forms eval in dyn-env;
$unquote-splicing splices list-valued results. No depth tracking
(nested quasiquotes flatten). 8 new tests, 230 total. Sketched the
universal reflective quoting kit API for the eventual Phase 7 extraction.
:body slot holds a LIST of forms now (was single expression). New
knl-eval-body in eval.sx evaluates each form in sequence, returning
the last. $vau and $lambda accept (formals env-param body...) /
(formals body...). No $sequence dependency. 223 tests total.
Parser now reads 'expr, \`expr, ,expr, ,@expr as the four standard
shorthands. Quote uses existing $quote operative; quasiquote /
unquote / unquote-splicing recognised but not yet expanded at runtime
(left for first consumer to drive). 218 tests total across six suites.
Hygiene-by-default was already present: user operatives close over
static-env and bind formals + body $define!s in (extend STATIC-ENV),
caller's env untouched. $let evaluates values in caller env, binds
in fresh child env, runs body there. $define-in! explicitly targets
an env. Full scope-set / frame-stamp hygiene is research-grade
and documented as deferred future work in the reflective API notes.
kernel-eval/kernel-combine dispatch on tagged values: operatives see
un-evaluated args + dynamic env; applicatives evaluate args then recurse.
No hardcoded special forms — $if/$quote tested as ordinary operatives
built on the fly. Pure-SX env representation
{:knl-tag :env :bindings DICT :parent P}, surfaced as a candidate
lib/guest/reflective/env.sx API since SX make-env is HTTP-mode only.
Captures the work left on the shelf after the loops/minikanren squash
merge:
Piece A — Phase 7 SLG (cyclic patho, mutual recursion). The hardest
piece; the brief's "research-grade complexity" caveat
still stands. Plan documents the in-progress sentinel +
answer-accumulator + fixed-point-driver design.
Piece B — Phase 6 polish: bounds-consistency for fd-plus / fd-times
in the (var var var) case. Math is straightforward
interval reasoning; low risk, self-contained.
Piece C — =/= disequality with a constraint store. Generalises
nafc / fd-neq to logic terms via a pending-disequality
list re-checked after each ==.
Piece D — Bigger CLP(FD) demos: send-more-money and Sudoku 4x4.
Both validate Piece B once it lands.
Suggested ordering: B (low risk, unlocks D) → D (concrete validation)
→ C (independent track) → A (highest risk, do last).
Operating ground rules carried over from the original loop brief:
loops/minikanren branch, sx-tree MCP only, one feature per commit,
test count must monotonically grow.
Replaces the watchdog-bump approach with an automated check. The next 5× (or
worse) substrate regression will trip the alarm at build time instead of
hiding behind a deadline bump and only being noticed weeks later.
Components:
* lib/perf-smoke.sx — four micro-benchmarks chosen for distinct substrate
failure modes: function-call dispatch (fib), env construction (let-chain),
HO-form dispatch + lambda creation (map-sq), TCO + primitive dispatch
(tail-loop). Warm-up pass populates JIT cache before the timed pass so we
measure the steady state.
* scripts/perf-smoke.sh — pipes lib/perf-smoke.sx to sx_server.exe, parses
per-bench wall-time, asserts each is within FACTOR× of the recorded
reference (default 5×). `--update` rewrites the reference in-place.
* scripts/sx-build-all.sh — perf-smoke wired in as a post-step after JS
tests. Hard fail if any benchmark regressed beyond budget.
Reference numbers: minimum across 6 back-to-back runs on this dev machine
under typical concurrent-loop contention (load ~9, 2 vCPU, 7.6 GiB RAM,
OCaml 5.2.0, architecture @ 92f6f187). Documented in
plans/jit-perf-regression.md including how to update them.
The 5× factor is chosen so contention noise (~1–2× variance) doesn't trigger
false alarms but a real ≥5× substrate regression — the kind that motivated
this whole investigation — fails the build immediately.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Conflict in lib/tcl/test.sh: architecture had bumped `timeout 2400 → 7200`,
this branch had restored it to `timeout 300` based on the Phase 1
quiet-machine measurement (376/376 in 57.8s wall, 16.3s user). Resolved by
keeping `timeout 300` — the 7200s bump was preemptive against contention,
not against an actual substrate regression. Phase 1 confirms the original
180s deadline is comfortable; 300s gives 5× headroom for moderate noise.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 1 of the jit-perf-regression plan reproduced and quantified the alleged
30× substrate slowdown across 5 guests (tcl, lua, erlang, prolog, haskell). On
a quiet machine all five suites pass cleanly:
tcl test.sh 57.8s wall, 16.3s user, 376/376 ✓
lua test.sh 27.3s wall, 4.2s user, 185/185 ✓
erlang conformance 3m25s wall, 36.8s user, 530/530 ✓ (needs ≥600s budget)
prolog conformance 3m54s wall, 1m08s user, 590/590 ✓
haskell conformance 6m59s wall, 2m37s user, 156/156 ✓
Per-test user-time at architecture HEAD vs pre-substrate-merge baseline
(83dbb595) is essentially flat (tcl 0.83×, lua 1.4×, prolog 0.82×). The
symptoms reported in the plan (test timeouts, OOMs, 30-min hangs) were heavy
CPU contention from concurrent loops + one undersized internal `timeout 120`
in erlang's conformance script. There is no substrate regression to bisect.
Changes:
* lib/tcl/test.sh: `timeout 2400` → `timeout 300`. The original 180s deadline
is comfortable on a quiet machine (3.1× headroom); 300s gives some safety
margin for moderate contention without masking real regressions.
* lib/erlang/conformance.sh: `timeout 120` → `timeout 600`. The 120s budget
was actually too tight for the full 9-suite chain even before this work.
* lib/erlang/scoreboard.{json,md}: 0/0 → 530/530 — populated by a successful
conformance run with the new deadline. The previous 0/0 was a stale
artefact of the run timing out before parsing any markers.
* plans/jit-perf-regression.md: full Phase 1 progress log including
per-guest perf table, quiet-machine re-measurement, and conclusion.
Phases 2–4 (bisect, diagnose, fix) skipped — there is no substrate regression
to find. Phase 6 (perf-regression alarm) still planned to catch the next
quadratic blow-up early instead of via watchdog bumps.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This session cleared 15 of the 18 documented skips:
- Toggle parser ambiguity (1) — 2-token lookahead in parse-toggle
- Throttled-at modifier (1) — parser + emit-on wrap + runtime hs-throttle!/hs-debounce!
- Tokenizer-stream API (13) — hs-stream wrapper + 15 stream primitives
Plus a perf fix in compiler.sx (hoisted throttle/debounce helpers to
module level so they don't get JIT-recompiled per emit-on call). Wall
time for full batched suite: 28m45s, was 26m17s before sync (so net
+18 tests cost only +2m even though 3x more work).
Remaining skips (3):
- Template-component scope tests (2) — needs <script type="text/
hyperscript-template"> custom-element bootstrap registrar.
- Async event dispatch (1) — repeat until event needs the OCaml
kernel to release the JS event loop between iterations.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
parser.sx parse-toggle-cmd: when seeing 'toggle .foo for', peek the
following two tokens. If they are '<ident> in', it is a for-in loop
and toggle does NOT consume 'for' as a duration clause. Restores the
trailing for-in to the command list.
parser.sx parse-on (handler modifiers): recognize 'throttled at <ms>'
and 'debounced at <ms>' as handler modifiers. Captured as :throttle /
:debounce kwargs in the on-form parts list.
compiler.sx emit-on: pre-extract :throttle / :debounce from parts via
new _strip-throttle-debounce helper before scan-on, then wrap the built
handler with (hs-throttle! handler ms) or (hs-debounce! handler ms).
runtime.sx: hs-throttle! — closure with __hs-last-fire timestamp,
fires immediately and drops events arriving within ms of the last fire.
hs-debounce! — closure with __hs-timer, clears any pending timer and
schedules a new setTimeout(handler, ms) so only the last burst event
fires.
Both formerly-architectural skips now pass:
- "toggle does not consume a following for-in loop"
- "throttled at <time> drops events within the window"
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Brings the architecture branch (559 commits ahead — R7RS step 4-6, JIT
expansion, host_error wrapping, bytecode compiler, etc.) into the
loops/haskell line of work. Conflict in lib/haskell/conformance.sh:
architecture replaced the inline driver with a thin wrapper delegating
to lib/guest/conformance.sh + a config file. Resolved by taking the
wrapper and extending lib/haskell/conformance.conf with all programs
added under loops/haskell (caesar, runlength-str, showadt, showio,
partial, statistics, newton, wordfreq, mapgraph, uniquewords, setops,
shapes, person, config, counter, accumulate, safediv, trycatch) plus
adding map.sx and set.sx to PRELOADS.
plans/haskell-completeness.md gains three new follow-up phases:
- Phase 17 — parser polish (`(x :: Int)` annotations, mid-file imports)
- Phase 18 — one ambitious conformance program (lambda-calc / Dijkstra /
JSON parser candidate list)
- Phase 19 — conformance speed (batch all suites in one sx_server
process to compress the 25-min run to single-digit minutes)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
hk-bind-exceptions! in eval.sx registers throwIO, throw, evaluate, catch,
try, handle, displayException. SomeException constructor pre-registered
in runtime.sx (arity 1, type SomeException).
throwIO and the existing error primitive both raise via SX `raise` with a
uniform "hk-error: msg" string. catch/try/handle parse it back into a
SomeException via hk-exception-of, which strips nested
'Unhandled exception: "..."' host wraps (CEK's host_error formatter) and
the "hk-error: " prefix.
catch and handle evaluate the handler outside the guard scope (build an
"ok"/"exn" outcome tag inside guard, then dispatch outside) so that a
re-throw from the handler propagates past this catch — matching Haskell
semantics rather than infinite-looping in the same guard.
14 unit tests in tests/exceptions.sx (catch success, catch error, try
Right/Left, handle, throwIO + catch/try, evaluate, nested catch, do-bind
through catch, branch on try result, IORef-mutating handler).
Conformance: safediv.hs (8/8) and trycatch.hs (8/8). Scoreboard now
285/285 tests, 36/36 programs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Trivial wrapper: apl-run-file = apl-run ∘ file-read, where
file-read is built-in to OCaml SX.
Tests verify primes.apl, life.apl, quicksort.apl all parse
end-to-end (their last form is a :dfn AST). Source-then-call
test confirms the loaded file's defined fn is callable, even
when the algorithm itself can't fully execute (primes' inline
⍵ rebinding still missing — :glyph-token, not :name-token).
Parser: :name clause now detects 'name ← rhs' patterns inside
expressions. When seen, consumes the remaining tokens as RHS,
parses recursively, and emits a (:assign-expr name parsed-rhs)
value segment.
Eval-ast :dyad and :monad: when the right operand is an
:assign-expr node, capture the binding into env before
evaluating the left operand. This realises the primes idiom:
apl-run "(2 = +⌿ 0 = a ∘.| a) / a ← ⍳ 30"
→ 2 3 5 7 11 13 17 19 23 29
Also: top-level x←5 now evaluates to scalar 5 (apl-eval-ast
:assign just unwraps to its RHS value).
Caveat: ⍵-rebinding (the original primes.apl uses
'⍵←⍳⍵') is a :glyph-token; only :name-tokens are handled.
A regular variable name (like 'a') works.
Haskell strings are [Char]. Calling reverse / head / length on a SX raw
string transparently produces a cons-list of char codes (via hk-str-head /
hk-str-tail in runtime.sx), but (==) then compared the original raw string
against the char-code cons-list and always returned False — so
"racecar" == reverse "racecar" was False.
Added hk-try-charlist-to-string and hk-normalize-for-eq in eval.sx; routed
== and /= through hk-normalize-for-eq so a string compares equal to any
cons-list whose elements are valid Unicode code points spelling the same
characters, and "[]" ↔ "".
palindrome.hs lifts from 9/12 → 12/12; conformance 33/34 → 34/34 programs,
266/269 → 269/269 tests.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Investigation of the long-standing 'why does the runner say 1494/1494 not
1496/1496?' question. The answer is in tests/hs-run-filtered.js:969 — two
tests are skipped via _SKIP_TESTS for documented architectural reasons:
1. 'until event keyword works' — uses 'repeat until event click from #x',
which suspends the OCaml kernel waiting for a click that is never
dispatched from outside K.eval. The sync test runner has no way to
fire the click while the kernel is suspended.
2. 'throttled at <time> drops events within the window' — the HS parser
does not implement the 'throttled at <ms>' modifier. The compiled SX
for the handler is malformed: handler body is the literal symbol
'throttled', the time expression dangles outside the closure as
stray (do 200 ...). Genuinely needs parser+compiler+runtime work,
not just a deadline bump.
Both are documented at the skip site with a comment explaining why they
can't run synchronously. The conformance number is 1494/1494 = 100% on
counted tests, with 2 explicit, justified skips out of 1496 total.
This was the source of the cumulative-vs-isolated test-count discrepancy.
Suite filter runs see them as 'not in this suite,' batched runs see them
as 'continued past'. Either way: not failures.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>