Commit Graph

23 Commits

Author SHA1 Message Date
8ba68e0365 W14: F10 expected-failures baseline gate (test-only)
The OCaml suite's permanent ~273-failure band (in-progress hs-* plus the
r7rs radix shadow) has become normalized, so real regressions hide in the
red noise (conformance.md F-10). A runner skip-list would rewrite the hs
loops' scoreboards mid-flight, so instead this commit pins the band:

scripts/test-suite-baseline.sh runs the full suite and diffs its FAIL set
against spec/tests/known-failures.txt (273 entries, identity =
"suite > name", error text stripped). Red on a NEW failure (regression)
AND red on a vanished failure (fix landed — delete it from the baseline,
locking in the win). The band still prints as FAIL lines for the teams
working through it; nothing in the runner changes.
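
The two-way comparison described above could look roughly like the sketch below. It is written in Python rather than the shell the real script uses, and the `FAIL: suite > name: error` line format and all function names are assumptions made for illustration; only the identity rule ("suite > name", error text stripped) comes from this message.

```python
def failure_id(line):
    """Return 'suite > name' for a FAIL line, or None for any other line."""
    prefix = "FAIL: "  # assumed runner line format, not taken from the repo
    if not line.startswith(prefix):
        return None
    suite, sep, rest = line[len(prefix):].partition(" > ")
    if not sep:
        return None
    name = rest.split(": ", 1)[0]  # strip the trailing error text
    return f"{suite.strip()} > {name.strip()}"


def failure_ids(runner_output):
    """Collect the identities of every FAIL line in the runner output."""
    ids = (failure_id(line) for line in runner_output.splitlines())
    return {i for i in ids if i is not None}


def compare_to_baseline(current, baseline):
    """Return (new, vanished); the gate is green only when both are empty."""
    new = sorted(current - baseline)        # regressions
    vanished = sorted(baseline - current)   # fixes that should leave the baseline
    return new, vanished
```

Treating a vanished failure as red is what keeps the baseline file from accumulating stale entries: a fix only counts once its line is deleted from the pinned set.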

Bonus capture: 2 of the 273 have EMPTY suite labels (can-map-an-array,
string->number) — live evidence for C9, the next checklist item.

Validated end-to-end: GREEN on current tree (5800p/273f — 38 net passes
above dc7aa709's 5762 from this loop's added pins). Runtime ~12 min.

Test-only: no semantics edits, no push.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-04 04:10:55 +00:00
ca4ad404f1 W14: C3-C7 epoch-protocol ledger + seeded fuzz-liveness (test-only)
All five protocol quirks are OPEN server-side, so the suite pins CURRENT
behavior (verified live) as a bidirectional ledger in
scripts/test-protocol-gate.sh:
- C3: stray (io-response ...) answered as Unknown command (dead guard)
- C4: malformed (epoch) errors and leaves the epoch stale (envelope
  changed since the finding: the dc7aa709 guard answers rather than kills)
- C5: decreasing epoch accepted silently (no monotonic enforcement)
- C6: two commands on one line -> one error, neither executed
- C7: vm-trace without compiler -> opaque "Not callable: nil"

Plus the fuzz property that matters: 60 deterministically-seeded hostile
lines (unbalanced parens, control chars, unicode, 2KB lines, stray
io-responses, epoch mutations) followed by a well-formed command — the
server must still answer and exit cleanly. protocol-gate: 11/11.
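
A sketch of how such a deterministic hostile-line generator might be built, in Python for illustration. The line categories follow the list above; the exact payloads (for example `(io-response 1 nil)` and the probe command) are hypothetical, and the real gate script may generate its lines quite differently.

```python
import random

# Control characters except newline and carriage return, which would end the line.
CONTROL_CHARS = [chr(c) for c in range(1, 32) if c not in (10, 13)]


def hostile_line(rng):
    """Produce one malformed command line; the payloads are illustrative only."""
    kind = rng.choice(["parens", "control", "unicode", "long", "io-response", "epoch"])
    if kind == "parens":
        return "(" * rng.randint(1, 8) + "eval"            # unbalanced parens
    if kind == "control":
        return "".join(rng.choice(CONTROL_CHARS) for _ in range(rng.randint(1, 16)))
    if kind == "unicode":
        return "(eval \"" + "".join(rng.choice("λ☃語é") for _ in range(8)) + "\")"
    if kind == "long":
        return "x" * 2048                                   # 2KB line
    if kind == "io-response":
        return "(io-response 1 nil)"                        # stray io-response
    return rng.choice(["(epoch)", "(epoch -1)", "(epoch abc)"])  # epoch mutations


def hostile_lines(seed, count=60):
    """The same seed always yields the same lines, so a failure can be replayed."""
    rng = random.Random(seed)
    return [hostile_line(rng) for _ in range(count)]


def session_input(seed, probe):
    """Hostile lines followed by one well-formed command the server must answer."""
    return "\n".join(hostile_lines(seed) + [probe]) + "\n"
```

Seeding a private `random.Random` instance, rather than the module-level generator, keeps the sequence reproducible even if other code draws random numbers in between.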

When a server-side fix lands, the matching ledger pin fails loudly and the
ledger is updated to assert the corrected behavior.

Test-only: no semantics edits, no push.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-04 03:43:01 +00:00
4fe6df69b8 W14: F2 WASM corpus runner — spec tests on the SHIPPED browser kernel
conformance.md F-2: no runner fed spec/tests through the shipped
sx_browser.bc.wasm.js — the F-1/F-3 native/WASM divergences existed
undetected because of exactly this gap.

Add hosts/ocaml/browser/run_wasm_corpus.js: boots the shipped kernel
headless in Node (stub block + module preload mirroring
test_wasm_native.js, the blessed boot path), registers the test-framework
hooks, runs ONE test file per process and emits a parseable CORPUS-RESULT
line — process isolation means a hanging file is killed by the driver's
per-file timeout without ending the sweep.
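
The driver side of that isolation might be sketched as follows, in Python for illustration. The `CORPUS-RESULT` prefix is from this message, but the payload after it, the function name, and the demo command are assumptions; the real driver is a shell script.

```python
import subprocess
import sys


def run_one(cmd, timeout_s):
    """Run one test file in its own process and classify the outcome."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so the sweep continues.
        return {"status": "timeout"}
    for line in proc.stdout.splitlines():
        if line.startswith("CORPUS-RESULT "):
            return {"status": "ok", "result": line[len("CORPUS-RESULT "):]}
    # The process exited without reporting, e.g. a load error mid-file.
    return {"status": "no-result"}


# Hypothetical stand-in for "node run_wasm_corpus.js <file>".
demo_cmd = [sys.executable, "-c", "print('CORPUS-RESULT pass=3 fail=0')"]
```

Because each file gets a fresh process, a hang costs at most one timeout and a crash costs at most one file's results.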

Add scripts/test-wasm-corpus.sh: sweeps spec/tests, applies a SKIP /
KNOWN_FAIL ledger (green-flip on a KNOWN_FAIL fails the run so the ledger
cannot rot), gates on everything else.

Empirical baseline (2026-07-04): 83 files, 80 fully green, 5192 passes,
zero test failures on the shipped kernel — including test-gate-pins
(29/29). KNOWN_FAIL: test-hash-table/test-r7rs/test-sets hit an opaque
jsoo load-error mid-file (22/87/30 tests pass first). Full sweep ~13 min;
sx-build-all.sh wiring deferred to the D3 gate-definition decision.

Test-only: no semantics edits, no push.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-04 03:20:48 +00:00
01e5f876bc W14: pin K19 MCP-harness/runtime primitive parity (test-only)
mcp_tree.ml's parallel primitive table drifted from sx_primitives.ml —
the spec-mandated harness verification path silently produced false
findings ((get {:a 1} :a 99) -> nil vs 1, char-class vs substring split,
etc.). dc7aa709 aligned 8 entries as a stopgap; the real fix (linking
sx_primitives) is hosts-lane.

Add scripts/test-harness-parity.sh: drives mcp_tree.exe sx_eval via raw
JSON-RPC and a fresh sx_server.exe via the epoch protocol, runs the
finding's 12-probe battery through both, fails on any divergence (errors
compared by inner message). 12/12 parity today — the stopgap holds and
can no longer rot silently.

Test-only: no semantics edits, no push.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-04 01:41:07 +00:00
f6d584629e W14: env-parity ledger — runner-only bindings vs fresh sx_server (test-only)
Section-B audit, all verified live over the epoch protocol. Runner-only
bindings absent from production: values, call-with-values (run_tests.ml
:1131/:1140), contains-char? (rt.ml:728 + rt.js:85), trim-right (JS runner
ONLY — absent even from the OCaml runner), sha3-256 (rt.ml:745 + rt.js:88;
production's real primitive is crypto-sha3-256).

Consequences pinned: (canonical-serialize 42) on a fresh server errors
"Undefined symbol: contains-char?" — content addressing broken for ANY
number outside the runners. And BOTH runners' sha3-256 are FAKE stubs
(OCaml: Hashtbl.hash), so every test-computed CID differs from production.

scripts/test-env-parity.sh is a bidirectional ledger: MUST_HAVE bindings
going missing fail; a KNOWN_DRIFT binding APPEARING also fails with
instructions to move it to MUST_HAVE and flip the consequence pin — the
ledger cannot rot silently in either direction. 7/7 green.
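
The bidirectional rule can be sketched in a few lines of Python. The binding names are the ones listed above; the function name and the idea of passing in a pre-probed set of bound names are assumptions for illustration, since the real script probes a fresh server over the epoch protocol.

```python
def check_ledger(bound, must_have, known_drift):
    """Return a list of problems; an empty list means the ledger is green."""
    problems = []
    for name in sorted(must_have - bound):
        problems.append(f"MUST_HAVE binding missing: {name}")
    for name in sorted(known_drift & bound):
        problems.append(
            f"KNOWN_DRIFT binding appeared: {name} "
            "(move it to MUST_HAVE and flip its consequence pin)"
        )
    return problems


# Runner-only bindings recorded by this commit as absent from production.
KNOWN_DRIFT = {"values", "call-with-values", "contains-char?", "trim-right", "sha3-256"}
# Production's real hashing primitive, per the audit above.
MUST_HAVE = {"crypto-sha3-256"}
```

Failing when a drifted binding appears forces whoever fixes the drift to update the ledger in the same change.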

Test-only: no semantics edits, no push.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-04 01:17:50 +00:00
d9e452d9bc W14: pin S4 soft-error-page cache exclusion (test-only) — section A complete
Pre-fix, a routing-failure page was stored in the HTTP response cache as
200 and served byte-identically to every later visitor until restart
(cold 2s -> warm 0.0005s). dc7aa709 made http_render_page return
(html, is_error) and gated cache insertion on `not is_err`.
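
As an illustration of that gate, here is a minimal Python sketch. The real code is OCaml (`http_render_page`); the wrapper name and the dict-based cache here are assumptions, and only the `(html, is_error)` return shape and the "do not cache errors" rule come from the fix.

```python
def make_cached_renderer(render_page):
    """Wrap render_page(path) -> (html, is_error) with an error-excluding cache."""
    cache = {}

    def handle(path):
        if path in cache:
            return cache[path]
        html, is_error = render_page(path)
        if not is_error:
            # Only successful renders are stored; error pages re-render each time.
            cache[path] = html
        return html

    return handle, cache
```

With this shape, requesting a nonexistent path twice calls the renderer twice, which is the behavior the HTTP-mode test case asserts.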

Extend scripts/test-protocol-gate.sh with an HTTP-mode case: fresh
sx_server.exe --http on a random port (timeout-bounded, own child killed),
GET the same nonexistent path twice, assert both requests re-render (two
[sx-http] render lines) and the "[cache] ... error page, not cached" gate
line appears. Standalone-worktree caveat (all docs pages render as soft
error pages, so no positive cache control) documented in the script.

5/5 protocol-gate green; 267/0 sx gate pins. All seven section-A test-debt
pins now landed (K18, K20, K09/K11/K39, K49, crit-2, C1/C1b, S4).

Test-only: no semantics edits, no push.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-04 00:53:57 +00:00
ff6c379942 W14: pin C1/C1b command-channel crash guards (test-only)
Pre-fix, one malformed or non-ASCII line on sx_server's top-level command
channel raised an uncaught Parse_error and killed the whole shared process
(bridges + conformance runners). dc7aa709 guards the parse; the server now
answers (error N "Malformed command line: ...") and keeps serving.

Add scripts/test-protocol-gate.sh: per case, spawn a fresh timeout-bounded
sx_server.exe (never touches a shared process) and assert the error
response, the follow-up epoch still evaluating, and a clean exit. Cases:
C1 unterminated list + garbage line, C1b non-ASCII byte (exact review
repros from plans/sx-review/hosts.md), plus a well-formed control. 4/4
green. Structured to grow into W14 section E's protocol fuzz suite (C3-C7).

Test-only: no semantics edits, no push.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-04 00:28:57 +00:00
e8246340fc merge: hs-f into architecture — HS conformance 1514/1514 (100%)
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 41s
2026-05-08 22:19:44 +00:00
59bec68dcc perf: Phase 6 — substrate perf-regression alarm (perf-smoke)
Replaces the watchdog-bump approach with an automated check. The next 5× (or
worse) substrate regression will trip the alarm at build time instead of
hiding behind a deadline bump and only being noticed weeks later.

Components:

* lib/perf-smoke.sx — four micro-benchmarks chosen for distinct substrate
  failure modes: function-call dispatch (fib), env construction (let-chain),
  HO-form dispatch + lambda creation (map-sq), TCO + primitive dispatch
  (tail-loop). Warm-up pass populates JIT cache before the timed pass so we
  measure the steady state.

* scripts/perf-smoke.sh — pipes lib/perf-smoke.sx to sx_server.exe, parses
  per-bench wall-time, asserts each is within FACTOR× of the recorded
  reference (default 5×). `--update` rewrites the reference in-place.

* scripts/sx-build-all.sh — perf-smoke wired in as a post-step after JS
  tests. Hard fail if any benchmark regressed beyond budget.

Reference numbers: minimum across 6 back-to-back runs on this dev machine
under typical concurrent-loop contention (load ~9, 2 vCPU, 7.6 GiB RAM,
OCaml 5.2.0, architecture @ 92f6f187). Documented in
plans/jit-perf-regression.md including how to update them.

The 5× factor is chosen so contention noise (~1–2× variance) doesn't trigger
false alarms but a real ≥5× substrate regression — the kind that motivated
this whole investigation — fails the build immediately.
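
The budget check itself is simple enough to sketch in Python. The benchmark names are the four listed above, but every timing below is invented for illustration, and the function name is hypothetical; the real check lives in scripts/perf-smoke.sh.

```python
def regressions(measured_ms, reference_ms, factor=5.0):
    """Return the benchmarks whose wall time exceeds factor x their reference."""
    return {
        name: (elapsed, reference_ms[name])
        for name, elapsed in measured_ms.items()
        if elapsed > factor * reference_ms[name]
    }


# Made-up numbers: fib is 1.8x its reference (noise), map-sq is over 5x (alarm).
reference = {"fib": 100.0, "let-chain": 40.0, "map-sq": 60.0, "tail-loop": 80.0}
measured = {"fib": 180.0, "let-chain": 41.0, "map-sq": 310.0, "tail-loop": 95.0}
slow = regressions(measured, reference)
```

Here `slow` contains only `map-sq`; a run at exactly 5× its reference would still pass.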

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 14:23:45 +00:00
982b9d6be6 HS: sync upstream → 1514 tests (+18 new), 1496 runnable
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 52s
scripts/extract-upstream-tests.py — new walker that scrapes
/tmp/hs-upstream/test/**/*.js for test('name', ...) patterns. Uses
brace-counting that handles strings, regex, comments, and template
literals. Two modes:
  - merge (default): preserves existing test bodies, only adds new tests
  - --replace: discards old bodies, fully re-extracts (use when bodies
    drift due to upstream cleanup)
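
For readers unfamiliar with the technique, a reduced sketch of literal-aware brace counting follows. It is not the walker's actual code: it skips quoted strings, template literals (treated as opaque strings) and comments, but unlike the real walker it does not attempt regex literals, and the function name is hypothetical.

```python
def matching_brace(src, open_idx):
    """Return the index of the '}' that closes the '{' at open_idx, or -1."""
    if src[open_idx] != "{":
        raise ValueError("open_idx must point at '{'")
    depth = 0
    i = open_idx
    n = len(src)
    while i < n:
        ch = src[i]
        if ch in "'\"`":
            # Skip a quoted literal, stepping over backslash escapes.
            i += 1
            while i < n and src[i] != ch:
                i += 2 if src[i] == "\\" else 1
        elif src.startswith("//", i):
            # Line comment: jump to the newline (or the end of input).
            nl = src.find("\n", i)
            i = n if nl == -1 else nl
        elif src.startswith("/*", i):
            # Block comment: jump to the '/' of the closing '*/'.
            end = src.find("*/", i + 2)
            i = n if end == -1 else end + 1
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return i
        i += 1
    return -1
```

Braces inside strings and comments never touch `depth`, so a test body such as `{ let s = "}"; }` is still extracted whole.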

Merge mode is what we want for an incremental sync — the old snapshot
had bodies that had been hand-tuned for our auto-translator; raw
re-extraction loses those tweaks and regresses ~250 working tests
back to SKIP (untranslated).

Snapshot updated: spec/tests/hyperscript-upstream-tests.json grows
from 1496 → 1514 tests. All 18 new tests are documented as either
manual bodies (2) or skips (16):

Manual bodies (2):
  - on resize from window: dispatches via host-global "window"
  - toggle between followed by for-in loop works: direct test

Skips for architectural reasons (16):
  - 13× core/tokenizer — upstream exposes a streaming token API
    (matchToken, peekToken, consumeUntil, pushFollow…) that our
    tokenizer doesn't surface. Implementing it = a token-stream
    wrapper primitive over hs-tokenize output.
  - 2× ext/component — template-based components via
    <script type="text/hyperscript-template">. We use defcomp directly;
    no template-bootstrap path.
  - 1× toggle does not consume a following for-in loop — parser
    ambiguity in 'toggle .foo for <X>'. Parser must distinguish
    'for <duration>ms' from 'for <ident> in <expr>'. The 'toggle
    between' variant works (different parse path).

Net per-suite status: every individual suite passes 100% of its counted
tests (skips excluded). 1496 of the 1514 tests are runnable, and all 1496
pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-07 23:48:41 +00:00
985671cd76 hs: query targets, prolog hook, loop scripts, new plans, WASM regen
Hyperscript compiler/runtime:
- query target support in set/fire/put commands
- hs-set-prolog-hook! / hs-prolog-hook / hs-prolog in runtime
- runtime log-capture cleanup

Scripts: sx-loops-up/down, sx-hs-e-up/down, sx-primitives-down
Plans: datalog, elixir, elm, go, koka, minikanren, ocaml, hs-bucket-f,
       designs (breakpoint, null-safety, step-limit, tell, cookies, eval,
       plugin-system)
lib/prolog/hs-bridge.sx: initial hook-based bridge draft
lib/common-lisp/tests/runtime.sx: CL runtime tests

WASM: regenerate sx_browser.bc.js from updated hs sources

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-06 09:19:56 +00:00
8328e96ff6 primitives-loop: push to origin/architecture after each commit
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 13s
2026-04-26 19:33:27 +00:00
fb72c4ab9c sx-loops: add common-lisp, apl, ruby, tcl (12 slots)
Plans + briefings for four new language loops, each with a delcc/JIT
showcase that the runtime already supports natively:

- common-lisp — conditions + restarts on delimited continuations
- apl — rank-polymorphic primitives + 6 operators on the JIT
- ruby — fibers as delcc, blocks/yield as escape continuations
- tcl — uplevel/upvar via first-class env chain, the Dodekalogue

Launcher scripts now spawn 12 windows (was 8).
2026-04-25 09:25:30 +00:00
6a00df2609 smalltalk: plan + briefing + sx-loops 8th slot
Showcase: blocks with non-local return on captured method-return
continuation. ANSI-ish Smalltalk-80 subset, SUnit + Pharo Kernel-Tests
slice, 7 phases. Worktree: /root/rose-ash-loops/smalltalk on
branch loops/smalltalk.
2026-04-25 00:05:31 +00:00
30d76537d1 sx-loops: each language runs in its own git worktree
Previous version ran all 7 claude sessions in the main working tree on
branch 'architecture'. That would race on git operations and cross-
contaminate commits between languages even though their file scopes
don't overlap. Now each session runs in /root/rose-ash-loops/<lang> on
branch loops/<lang>, created from the current architecture HEAD.

sx-loops-down.sh gains --clean to remove the worktrees; loops/<lang>
branches stay unless explicitly deleted.

Also: second Enter keystroke after the /loop command, since Claude's
input box sometimes interprets the first newline as a soft break.
2026-04-24 16:50:27 +00:00
d7070ee901 Local sx-loops tmux launcher: 7 claude sessions, one per language
sx-loops-up.sh spawns a tmux session 'sx-loops' with 7 windows (lua,
prolog, forth, erlang, haskell, js, hs). Each window runs 'claude'
and then /loop against its briefing at plans/agent-briefings/<x>-loop.md.
Optional arg is the interval (e.g. 15m); omit for model-self-paced.

Each loop does ONE iteration per fire: pick the first unchecked [ ] item,
implement, test, commit, tick, log — then stop. Commits push to
origin/loops/<lang> (safe; not main).

sx-loops-down.sh sends /exit to each window and kills the session.

Attach with: tmux a -t sx-loops
2026-04-24 16:43:40 +00:00
e67852ca96 Scheduled-loop infra: lockfile guard + release + fire log
- scripts/loop-guard.sh — atomic claim with 30-min staleness overtake,
  appends NDJSON event to .loop-logs/<lang>.ndjson. Exit 0 = go ahead,
  exit 1 = another run is live, skip.
- scripts/loop-release.sh — clear lock, log release with exit status.

Intended for 7 per-language /schedule routines firing every 15 minutes.
Lock detects overlap so tight cadences are safe; stale lock (>30 min)
overtaken automatically if an agent dies mid-run.
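
A rough Python sketch of the claim logic, assuming the lock is a plain file whose modification time marks when it was taken. The real guard is a shell script; the function names here are hypothetical, and the stale-lock overtake below is deliberately simplified (two agents overtaking at the same instant could both succeed).

```python
import os
import time

STALE_AFTER_S = 30 * 60  # the 30-minute staleness window from the message


def try_claim(lock_path, now=None):
    """Return True if this run may proceed, False if another run is live."""
    now = time.time() if now is None else now
    try:
        # O_CREAT | O_EXCL makes creation atomic: exactly one caller succeeds.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        age = now - os.path.getmtime(lock_path)
        if age <= STALE_AFTER_S:
            return False
        # Stale lock: take it over by refreshing its timestamp (simplified).
        os.utime(lock_path, (now, now))
        return True
    os.write(fd, str(os.getpid()).encode())
    os.close(fd)
    return True


def release(lock_path):
    """Clear the lock; a missing file is not an error."""
    try:
        os.remove(lock_path)
    except FileNotFoundError:
        pass
```

A caller would map `True` to exit status 0 and `False` to exit status 1, matching the guard's contract.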
2026-04-24 16:39:17 +00:00
6528ce78b9 Scripts: page migration helpers for one-per-file layout
Python + shell tooling used to split grouped index.sx files into
one-directory-per-page layout (see the hyperscript gallery migration).
name-mapping.json records the rename table; strip_names.py is a helper
for extracting component names from .sx sources.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 09:09:15 +00:00
a64b693a09 Remove old CSSX system — ~tw is the sole CSS engine
Phase 1 Step 1 of the architecture roadmap. The old cssx.sx
(cssx-resolve, cssx-process-token, cssx-template, old tw function)
is superseded by the ~tw component system in tw.sx.

- Delete shared/sx/templates/cssx.sx
- Remove cssx.sx from all load lists (sx_server.ml, run_tests.ml,
  mcp_tree.ml, compile-modules.js, bundle.sh, sx-build-all.sh)
- Replace (tw "tokens") inline style calls with (~tw :tokens "tokens")
  in layouts.sx and not-found.sx
- Remove _css-hash / init-css-tracking / SX-Css header plumbing
  (dead code — ~tw/flush + flush-collected-styles handle CSS now)
- Remove sx-css-classes param and meta tag from shell template
- Update stale data-cssx references to data-sx-css in tests

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 16:18:07 +00:00
d40a9c6796 sx-tools: WASM kernel updates, TW/CSSX rework, content refresh, new debugging tools
Build tooling: updated OCaml bootstrapper, compile-modules, bundle.sh, sx-build-all.
WASM browser: rebuilt sx_browser.bc.js/wasm, sx-platform-2.js, .sxbc bytecode files.
CSSX/Tailwind: reworked cssx.sx templates and tw-layout, added tw-type support.
Content: refreshed essays, plans, geography, reactive islands, docs, demos, handlers.
New tools: bisect_sxbc.sh, test-spa.js, render-trace.sx, morph playwright spec.
Tests: added test-match.sx, test-examples.sx, updated test-tw.sx and web tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 11:31:57 +00:00
9ce8659f74 Fix signal-add-sub! losing subscribers after remove, fix build pipeline
signal-add-sub! called (append! subscribers f) and discarded the result,
but for an immutable List append! returns a new list instead of mutating
in place. Once signal-remove-sub! had replaced the subscribers list via
dict-set!, re-adding a subscriber silently did nothing. The counter
island only worked once (0→1, then stuck).

Fix: use (dict-set! s "subscribers" (append ...)) to explicitly update
the dict field, matching signal-remove-sub!'s pattern.
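
The bug and the fix can be mimicked in Python as an analogy. This is not SX code: a Python list stands in for a mutable list, a tuple for the immutable List, and `append_bang` is a hypothetical model of how append! is described as behaving.

```python
def append_bang(seq, item):
    """Model of append!: mutate a list in place, return a new tuple otherwise."""
    if isinstance(seq, list):
        seq.append(item)
        return seq
    return seq + (item,)


def remove_sub(signal, f):
    # Replaces the field with a fresh immutable sequence, as dict-set! does.
    signal["subscribers"] = tuple(s for s in signal["subscribers"] if s is not f)


def add_sub_buggy(signal, f):
    # Result discarded: a no-op once the field holds an immutable sequence.
    append_bang(signal["subscribers"], f)


def add_sub_fixed(signal, f):
    # Always store the result back into the dict field.
    signal["subscribers"] = append_bang(signal["subscribers"], f)
```

The buggy version works until the first removal, which matches the observed symptom of a counter that increments once and then sticks.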

Build pipeline fixes:
- sx-build-all.sh now bundles spec→dist and recompiles .sxbc bytecode
- compile-modules.js syncs .sx source files alongside .sxbc to wasm/sx/
- Per-file cache busting: wasm, platform JS, and sxbc each get own hash
- bundle.sh adds cssx.sx to dist

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 07:36:36 +00:00
03c2115f0d Fix WASM kernel deploy: 3.6MB js_of_ocaml → 68KB wasm_of_ocaml
The deployed sx_browser.bc.wasm.js was actually the js_of_ocaml output
(pure JS), not the wasm_of_ocaml loader. Nothing synced the correct
build output from _build/ to shared/static/wasm/.

- sx_build target=ocaml now auto-syncs WASM kernel + JS fallback + assets
- sx-build-all.sh syncs after dune build
- Correct 68KB WASM loader replaces 3.6MB JS imposter

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 13:10:13 +00:00
dab81fc571 Add 6 VM/bytecode debugging and build tools
OCaml server commands:
- vm-trace: step-by-step bytecode execution trace (opcode, stack, depth)
- bytecode-inspect: disassemble compiled function (opcodes, constants, arity)
- deps-check: strict symbol resolution (resolved vs unresolved symbols)
- prim-check: verify CALL_PRIM opcodes match real primitives

Scripts:
- hosts/ocaml/browser/test_boot.sh: WASM boot test in Node.js
- scripts/sx-build-all.sh: full pipeline (OCaml + JS + tests)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 01:23:55 +00:00