Hyperscript conformance → 100%
Goal: take the hyperscript upstream conformance suite from 1213/1496 (81%) to a clean 100%. Queue-driven — single-agent loop on architecture branch, one cluster per commit.
North star
Baseline: 1213/1496 (81.1%)
Target: 1496/1496
Gap: 283 tests (130 real fails + 153 SKIPs)
Track after each iteration via mcp__hs-test__hs_test_run on the relevant suite, not the whole thing (full runs take 10+min and include hanging tests — 196/199/200/615/1197/1198 hang under the 200k step limit).
How to run tests
mcp__hs-test__hs_test_run(suite="hs-upstream-<cluster>") # fastest, one suite
mcp__hs-test__hs_test_run(start=0, end=195) # early range
mcp__hs-test__hs_test_run(start=201, end=614) # mid range (skip hypertrace hangs)
mcp__hs-test__hs_test_run(start=616, end=1196) # late-1, skip repeat-forever hangs
mcp__hs-test__hs_test_run(start=1199) # late-2 after hangs
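The ranges above can be derived mechanically from the list of hanging test indices. The following is a minimal sketch, not part of the runner: `safeRanges` is a hypothetical helper, and it assumes `start`/`end` are inclusive 0-based indices, which this plan does not state explicitly.

```javascript
// Hypothetical helper: split [0, total - 1] into inclusive runs that skip
// every index listed in `hangs`.
function safeRanges(total, hangs) {
  const skip = new Set(hangs);
  const ranges = [];
  let start = null;
  for (let i = 0; i < total; i++) {
    if (skip.has(i)) {
      // Close the current run just before the hanging test.
      if (start !== null) {
        ranges.push([start, i - 1]);
        start = null;
      }
    } else if (start === null) {
      start = i;
    }
  }
  if (start !== null) ranges.push([start, total - 1]);
  return ranges;
}

// Hanging indices as listed in this plan.
const ranges = safeRanges(1496, [196, 199, 200, 615, 1197, 1198]);
console.log(JSON.stringify(ranges));
```

Under those assumptions the sketch also yields a small 197-198 range, which the hand-written calls above leave out.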
File layout
Runtime/compiler/parser live in lib/hyperscript/*.sx. The test runner at tests/hs-run-filtered.js loads shared/static/wasm/sx/hs-*.sx — after every .sx edit you must cp lib/hyperscript/<file>.sx shared/static/wasm/sx/hs-<file>.sx.
The test fixtures live in spec/tests/test-hyperscript-behavioral.sx, generated from tests/playwright/generate-sx-tests.py. Never edit the behavioral.sx fixture directly — fix the generator or the runtime.
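The source-to-staging mapping described above is easy to get wrong by hand. A small sketch of the path rule follows; `stagingPath` and `syncCommand` are hypothetical helpers that only build the `cp` command as a string and do not touch the filesystem.

```javascript
// Hypothetical helper: map lib/hyperscript/<file>.sx to its staging copy
// shared/static/wasm/sx/hs-<file>.sx.
function stagingPath(srcPath) {
  const m = /^lib\/hyperscript\/([^/]+)\.sx$/.exec(srcPath);
  if (!m) throw new Error("not a lib/hyperscript/*.sx path: " + srcPath);
  return "shared/static/wasm/sx/hs-" + m[1] + ".sx";
}

// Build the sync command to run after editing a source file.
function syncCommand(srcPath) {
  return "cp " + srcPath + " " + stagingPath(srcPath);
}

console.log(syncCommand("lib/hyperscript/runtime.sx"));
```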
Cluster queue
Each cluster below is one commit. Order is rough — a loop agent may skip ahead if a predecessor is blocked. Status: pending / in-progress / done (+N) / blocked (<reason>).
Bucket A: runtime fixes, single-file (low risk, high yield)
- [done (+4)] fetch JSON unwrap: `hs-upstream-fetch`, 4 tests (`can do a simple fetch w/ json` + 3 variants) got `{:__host_handle N}`. Root: `hs-fetch` in `runtime.sx` returns raw host Response object instead of parsing JSON body. Fix: when format is `"json"`, unwrap via `host-get "_json"` and `json-parse`. Expected: +4.
- [done (+1)] element → HTML via outerHTML: `asExpression / converts an element into HTML` (1 test) + unlocks response fetches. Mock DOM `El` class in `tests/hs-run-filtered.js` has no `outerHTML` getter. Add a getter computed from `tagName` + `attributes` + `children` (recurse). Expected: +1 direct, + knock-on in fetch.
- [done (+2)] Values dict insertion order: `asExpression / Values | FormEncoded` + `| JSONString` (2 tests). Form fields come out `lastName, phone, firstName, areaCode`. Root: `hs-values-absorb` in `runtime.sx` uses `dict-set!` but keys iterate in non-insertion order. Investigate `hs-gather-form-nodes` walk: the recursive `kids` traversal silently fails when `children` is a JS Array (not sx-list), so nested inputs arrive via a different path. Fix: either coerce children to sx-list at the gather boundary OR rewrite gather to explicitly use sx-level iteration helpers. Expected: +2.
- [pending] `not` precedence over `or`: `expressions/not`, 3 tests (`not has higher precedence than or`, `not with numeric truthy/falsy`, `not with string truthy/falsy`). Check parser precedence: `not` should bind tighter than `or`. Fix in `parser.sx` expression-level precedence. Expected: +3.
- [pending] `some` selector for nonempty match: `expressions/some / some returns true for nonempty selector` (1 test). `some .class` probably returns the list, not a boolean. Runtime fix. Expected: +1.
- [pending] string template `${x}`: `expressions/strings / string templates work w/ props` + `w/ braces` (2 tests). Template interpolation isn't substituting property accesses. Check `hs-template` runtime. Expected: +2.
- [pending] `put` hyperscript reprocessing: `put / properly processes hyperscript at end/start/content/symbol` (4 tests, all `Expected 42, got 40`). After a put operation, newly inserted HS scripts aren't being activated. Fix: `hs-put-at!` should `hs-boot-subtree!` on the target after DOM insertion. Expected: +4.
- [pending] `select returns selected text` (1 test, `hs-upstream-select`). Likely `select` command needs to return `window.getSelection().toString()` equivalent. Add host-call to selection API in mock. Expected: +1.
- [pending] `wait on event` basics: `wait / can wait on event`, `on another element`, `waiting ... sets it to the event`, `destructure properties in a wait` (4 tests). Event-waiter suspension issue. Expected: +3-4.
- [pending] `swap` variable ↔ property: `swap / can swap a variable with a property` (1 test). Swap command doesn't handle mixed var/prop targets. Expected: +1.
- [pending] `hide` strategy: `hide / can configure hidden as default`, `can hide with custom strategy`, `can set default to custom strategy`, `hide element then show element retains original display` (4 tests). Strategy config plumbing. Expected: +3-4.
- [pending] `show` multi-element + display retention: `show / can show multiple elements with inline-block`, `can filter over a set of elements using the its symbol` (2 tests). Expected: +2.
- [pending] `toggle` multi-class + timed + until-event: `toggle` (3 assertion-fail tests). Expected: +3.
- [pending] `unless` modifier: `unlessModifier / unless can conditionally execute` (1 test). Parser/compiler addition. Expected: +1.
- [pending] `transition` query-ref + multi-prop + initial: `transition`, 3 tests. Expected: +2-3.
- [pending] `send can reference sender`: 1 assertion fail. Expected: +1.
- [pending] `tell` semantics: `tell / attributes refer to the thing being told`, `does not overwrite me symbol`, `your symbol represents thing being told` (3 tests). Expected: +3.
- [pending] `throw respond async/sync`: `throw / can respond to async/sync exceptions in event handler` (2 tests). Expected: +2.
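To make the `not`-over-`or` precedence item concrete, here is a toy recursive-descent evaluator. It is an illustration only and not the real `parser.sx`: the grammar, the `evaluate` function and the boolean-only tokens are all assumptions made for this sketch. The point is structural: `not` is parsed one level below `or`, so it binds tighter.

```javascript
// Toy evaluator over the tokens: true, false, not, or.
// Grammar (assumed for illustration):
//   orExpr  := notExpr ("or" notExpr)*
//   notExpr := "not" notExpr | primary
function evaluate(src) {
  const tokens = src.trim().split(/\s+/);
  let pos = 0;

  function primary() {
    const tok = tokens[pos++];
    if (tok === "true") return true;
    if (tok === "false") return false;
    throw new Error("unexpected token: " + tok);
  }

  // `not` sits below `or` in the grammar, so it only negates its own operand.
  function notExpr() {
    if (tokens[pos] === "not") {
      pos++;
      return !notExpr();
    }
    return primary();
  }

  function orExpr() {
    let left = notExpr();
    while (tokens[pos] === "or") {
      pos++;
      const right = notExpr();
      left = left || right;
    }
    return left;
  }

  const result = orExpr();
  if (pos !== tokens.length) throw new Error("trailing input at token " + pos);
  return result;
}

// Parsed as (not true) or true. A parser that read it as
// not (true or true) would produce false instead.
console.log(evaluate("not true or true"));
```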
Bucket B: parser/compiler additions (medium risk, shared files)
- [pending] `pick` regex + indices: `pick`, 13 tests. Regex match, flags, `of` syntax, start/end, negative indices. Big enough that a single commit might fail; break into pick-regex and pick-indices if needed. Expected: +10-13.
- [pending] `repeat` property for-loops + where: `repeat / basic property for loop`, `can nest loops`, `where clause can use the for loop variable name` (3 tests). Expected: +3.
- [pending] `possessiveExpression` property access via its: `possessive / can access its properties` (1 test, Expected `foo`, got ``). Expected: +1.
- [pending] window global fn fallback: `regressions / can invoke functions w/ numbers in name` + unlocks several others. When calling `foo()` where `foo` isn't SX-defined, fall back to `(host-global "foo")`. Design decision: either compile-time emit `(or foo (host-global "foo"))` via a helper, or add runtime lookup in the dispatch path. Expected: +2-4.
- [pending] `me symbol works in from expressions`: `regressions` (1 test, Expected `Foo`). Check `from` expression compilation. Expected: +1.
- [pending] `properly interpolates values 2`: URL interpolation regression (1 test). Likely template string + property access. Expected: +1.
- [pending] `can support parenthesized commands and features`: `parser` (1 test, Expected `clicked`). Parser needs to accept `(cmd...)` grouping in more contexts. Expected: +1.
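The runtime-lookup option for the window global fallback can be pictured as follows. This is a sketch in JavaScript rather than SX, and `resolveFunction`, `env` and `hostGlobals` are hypothetical names; it only illustrates the lookup order (script-defined first, host global second).

```javascript
// Hypothetical lookup: prefer a script-defined function, else a host global.
function resolveFunction(name, env, hostGlobals) {
  if (Object.prototype.hasOwnProperty.call(env, name)) return env[name];
  const hostFn = hostGlobals[name];
  if (typeof hostFn === "function") return hostFn;
  throw new Error("unknown function: " + name);
}

// Example data, invented for this sketch.
const env = { double: (n) => n * 2 };
const hostGlobals = { fn1: () => "from host", double: () => "shadowed" };

console.log(resolveFunction("double", env, hostGlobals)(21)); // script-defined wins
console.log(resolveFunction("fn1", env, hostGlobals)());      // falls back to host
```

Doing the lookup at runtime keeps the compiler output unchanged, at the cost of one extra check per unresolved call. The compile-time `(or foo (host-global "foo"))` form moves that cost into the emitted code instead.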
Bucket C: feature stubs (DOM observer mocks)
- [pending] resize observer mock + `on resize`: 3 tests. Add a minimal `ResizeObserver` mock to `hs-run-filtered.js`, plus parse/compile `on resize`. Expected: +3.
- [pending] intersection observer mock + `on intersection`: 3 tests. Mock `IntersectionObserver`; compile `on intersection` with margin/threshold modifiers. Expected: +3.
- [pending] `ask`/`answer` + prompt/confirm mock: `askAnswer`, 4 tests. Requires test-name-keyed mock: first test wants `confirm → true`, second `confirm → false`, third `prompt → "Alice"`, fourth `prompt → null`. Keyed via `_current-test-name` in the runner. Expected: +4.
- [pending] `hyperscript:before:init` / `:after:init` / `:parse-error` events: 6 tests in `bootstrap` + `parser`. Fire DOM events at activation boundaries. Expected: +4-6.
- [pending] `logAll` config: 1 test. Global config that console.log's each command. Expected: +1.
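A minimal `ResizeObserver` mock might look like the sketch below. The `observe`/`unobserve`/`disconnect` methods and the `(entries, observer)` callback shape follow the standard DOM API. The class name `MockResizeObserver`, the `instances` registry and the static `trigger` hook are assumptions added so a test can simulate a resize; how the real runner would wire this in is not specified here.

```javascript
// Minimal ResizeObserver stand-in for a non-browser test runner.
class MockResizeObserver {
  constructor(callback) {
    this.callback = callback;
    this.targets = new Set();
    MockResizeObserver.instances.push(this);
  }
  observe(target) {
    this.targets.add(target);
  }
  unobserve(target) {
    this.targets.delete(target);
  }
  disconnect() {
    this.targets.clear();
  }
  // Test-only hook (hypothetical): notify every observer watching `target`.
  static trigger(target, width, height) {
    for (const obs of MockResizeObserver.instances) {
      if (obs.targets.has(target)) {
        obs.callback([{ target, contentRect: { width, height } }], obs);
      }
    }
  }
}
MockResizeObserver.instances = [];

// Usage: observe a fake element, simulate a resize, then disconnect.
const el = { tagName: "DIV" };
const seenWidths = [];
const observer = new MockResizeObserver((entries) => {
  seenWidths.push(entries[0].contentRect.width);
});
observer.observe(el);
MockResizeObserver.trigger(el, 200, 100);
observer.disconnect();
MockResizeObserver.trigger(el, 300, 100); // ignored after disconnect
console.log(seenWidths);
```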
Bucket D: medium features (bigger commits, plan-first)
- [pending] runtime null-safety error reporting: 18 tests in `runtimeErrors`. When accessing `.foo` on nil, emit a structured error with position info. One coordinated fix in the compiler emit paths for property access, function calls, set/put. Expected: +15-18.
- [pending] MutationObserver mock + `on mutation` dispatch: 15 tests in `on`. Add MO mock to runner. Compile `on mutation [of attribute/childList/attribute-specific]`. Expected: +10-15.
- [pending] cookie API: 5 tests in `expressions/cookies`. `document.cookie` mock in runner + `the cookies` + `set the xxx cookie` keywords. Expected: +5.
- [pending] event modifier DSL: 8 tests in `on`. `elsewhere`, `every`, `first click`, count filters (`once / twice / 3 times`, ranges), `from elsewhere`. Expected: +6-8.
- [pending] namespaced `def`: 3 tests. `def ns.foo() ...` creates `ns.foo`. Expected: +3.
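For the cookie API item, the `document.cookie` mock needs the browser's asymmetric accessor behaviour: assigning sets a single cookie, reading returns all of them joined by `; `. The sketch below is one way to approximate that. `makeCookieDocument` is a hypothetical name, and the handling is deliberately simplified (only `max-age` is honoured for deletion; `expires`, `path` and `domain` are ignored).

```javascript
// Simplified document.cookie stand-in (assumption: enough for basic tests).
function makeCookieDocument() {
  const jar = new Map(); // Map keeps insertion order for reads
  const doc = {};
  Object.defineProperty(doc, "cookie", {
    get() {
      return Array.from(jar, ([name, value]) => name + "=" + value).join("; ");
    },
    set(text) {
      const [pair, ...attrs] = String(text).split(";");
      const eq = pair.indexOf("=");
      if (eq < 0) return; // simplification: ignore strings without name=value
      const name = pair.slice(0, eq).trim();
      const value = pair.slice(eq + 1).trim();
      // A zero or negative max-age deletes the cookie.
      const expired = attrs.some((a) => /^\s*max-age\s*=\s*(0|-\d+)\s*$/i.test(a));
      if (expired) jar.delete(name);
      else jar.set(name, value);
    },
  });
  return doc;
}

const doc = makeCookieDocument();
doc.cookie = "foo=bar";
doc.cookie = "baz=qux; path=/";
console.log(doc.cookie);
doc.cookie = "foo=; max-age=0";
console.log(doc.cookie);
```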
Bucket E: subsystems (DO NOT LOOP — human-driven)
- [blocked: needs design] WebSocket + `socket` + rpc proxy: 16 tests. Ship only with intentional design review.
- [blocked: needs design] Tokenizer-as-API: 17 tests. Expose tokens as inspectable SX data.
- [blocked: needs design] SourceInfo API: 4 tests. `(get line N)` / `(get source N)` metadata on compiled AST.
- [blocked: needs design] WebWorker plugin: 1 test.
- [blocked: needs design] Fetch non-2xx / before-fetch event / real response object: 7 tests. Sinon-level route mocks or real fetch interception.
Bucket F: generator translation gaps (after bucket A-D)
Many tests are SKIP (untranslated) because tests/playwright/generate-sx-tests.py bailed with return None. These need patches to the generator to recognize more JS test patterns. Estimated ~25 recoverable tests. Defer to a dedicated generator-repair cluster once the queue above drains.
Ground rules for the loop agent
- One cluster per commit. Don't batch. Short commit message: `HS: <cluster name> (+N tests)`.
- Baseline first, verify at the end. Before starting: record the current pass count for the target suite AND for one smoke range (0-195). After fixing: rerun both. Abort and mark blocked if:
  - Target suite didn't improve by at least +1.
  - Smoke range regressed (any test flipped pass → fail).
- Never edit `.sx` files with `Edit`/`Read`/`Write`. Use sx-tree MCP (`sx_read_subtree`, `sx_replace_node`, `sx_insert_child`, `sx_insert_near`, `sx_replace_by_pattern`, `sx_rename_symbol`, `sx_validate`, `sx_write_file`).
- Sync WASM staging. After every edit to `lib/hyperscript/<f>.sx`, run `cp lib/hyperscript/<f>.sx shared/static/wasm/sx/hs-<f>.sx`.
- Never edit `spec/tests/test-hyperscript-behavioral.sx` directly. Fix the generator or the runtime.
- Scope: `lib/hyperscript/**`, `shared/static/wasm/sx/hs-*`, `tests/hs-run-filtered.js`, `tests/playwright/generate-sx-tests.py`, `plans/hs-conformance-to-100.md`. Do not touch `spec/evaluator.sx`, the broader SX kernel, or unrelated files.
- Commit even partial fixes. If you get +N where N is less than expected, commit what you have and mark the cluster `done (+N) — partial, <what's left>`.
- If stuck >30min on a cluster, mark it `blocked (<reason>)` in the plan and move to the next pending cluster.
- Branch: `architecture`. Commit locally. Never push. Never touch `main`.
- Log every iteration in the Progress log below: one paragraph, what you touched, delta, commit SHA.
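The baseline/verify rule above can be expressed as a small gate. This is a sketch under assumptions: `iterationVerdict` is a hypothetical helper, and each argument is taken to be an array of the test names (or ids) that passed, which is not necessarily the shape `hs_test_run` reports.

```javascript
// Hypothetical gate for one cluster iteration.
// Each argument is an array of identifiers of tests that passed.
function iterationVerdict(targetBefore, targetAfter, smokeBefore, smokeAfter) {
  const smokeNow = new Set(smokeAfter);
  // Any test that passed before and no longer passes is a regression.
  const regressions = smokeBefore.filter((t) => !smokeNow.has(t));
  const delta = targetAfter.length - targetBefore.length;
  if (regressions.length > 0) {
    return { ok: false, reason: "smoke regressed", regressions, delta };
  }
  if (delta < 1) {
    return { ok: false, reason: "no improvement", regressions, delta };
  }
  return { ok: true, reason: "commit", regressions, delta };
}

// Example with invented test ids.
const verdict = iterationVerdict(["t1"], ["t1", "t2"], ["s1", "s2"], ["s1", "s2"]);
console.log(verdict.ok, verdict.delta);
```

Comparing the smoke range by identity rather than by count catches the case where one test flips to fail while another flips to pass, leaving the total unchanged.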
Known gotchas
- `env-bind!` creates bindings; `env-set!` mutates existing ones.
- SX `do` is R7RS iteration; use `begin` for multi-expr sequences.
- `cond`/`when`/`let` clause bodies evaluate only the last expr; wrap in `begin`.
- `list?` in SX checks for `{_type:'list'}`, so it returns false on raw JS Arrays. `host-get node "children"` returns a JS Array in the mock, so recursion via `(list? kids)` silently drops nested elements.
- `append!` on a list-valued scoped var (`:s`) requires `emit-set` in the compiler (done, see commit `1613f551`).
- When symbol target is `the-result`, also sync `it` (done, see emit-set).
- Hypertrace tests (196, 199, 200) and query-template test (615) hang under 200k step limit; always filter around them.
- `repeat forever` tests (1197, 1198) also hang.
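The `list?` gotcha is easiest to see side by side. The sketch below models it in JavaScript: the `_type: "list"` tag comes from the note above, while the `items` field, the helper names and the sample tree are assumptions made for illustration.

```javascript
// Assumption: SX lists are tagged objects; the `items` field is hypothetical.
const isSxList = (v) => v !== null && typeof v === "object" && v._type === "list";
const toSxList = (v) => (Array.isArray(v) ? { _type: "list", items: v } : v);

// Buggy walk: the guard never fires on a raw JS Array, so nothing recurses.
function collectNamesBuggy(node, out = []) {
  if (node.name) out.push(node.name);
  const kids = node.children;
  if (isSxList(kids)) kids.items.forEach((k) => collectNamesBuggy(k, out));
  return out;
}

// Fixed walk: coerce at the boundary, then test.
function collectNames(node, out = []) {
  if (node.name) out.push(node.name);
  const kids = toSxList(node.children);
  if (isSxList(kids)) kids.items.forEach((k) => collectNames(k, out));
  return out;
}

// A mock form whose `children` are raw JS Arrays, as in the mock DOM.
const form = {
  children: [
    { name: "firstName", children: [] },
    { children: [{ name: "areaCode", children: [] }] },
  ],
};

console.log(collectNamesBuggy(form)); // nested inputs are silently dropped
console.log(collectNames(form));
```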
Progress log
(Reverse chronological — newest at top.)
2026-04-23 — cluster 3 Values dict insertion order
`e59c0b8e`: `HS: Values dict insertion order (+2 tests)`. Root cause was the OCaml kernel's dict implementation iterating keys in scrambled (non-insertion) order. Added `_order` hidden list tracked by `hs-values-absorb`, and taught `hs-coerce` FormEncoded/JSONString branches to iterate via `_order` when present (filtering the `_order` marker out). Suite hs-upstream-expressions/asExpression: 28/42 → 30/42. Smoke 0-195: 162/195 unchanged.
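The `_order` technique from this entry can be sketched in JavaScript. The real change lives in SX; here `absorb`, `orderedKeys`, `formEncode` and the sample field values are hypothetical, and `scrambledKeys` only simulates a dict whose native iteration order is not insertion order.

```javascript
// Stand-in for a dict whose native key iteration is NOT insertion order.
const scrambledKeys = (dict) => Object.keys(dict).sort().reverse();

// Record each new key in a hidden `_order` list as it is absorbed.
function absorb(dict, key, value) {
  if (!Object.prototype.hasOwnProperty.call(dict, key)) {
    dict._order = (dict._order || []).concat([key]);
  }
  dict[key] = value;
  return dict;
}

// Iterate via `_order` when present, and never expose the marker itself.
function orderedKeys(dict) {
  const keys = Array.isArray(dict._order) ? dict._order : scrambledKeys(dict);
  return keys.filter((k) => k !== "_order");
}

function formEncode(dict) {
  return orderedKeys(dict)
    .map((k) => encodeURIComponent(k) + "=" + encodeURIComponent(dict[k]))
    .join("&");
}

// Sample values invented for this sketch.
const values = {};
absorb(values, "firstName", "John");
absorb(values, "lastName", "Doe");
absorb(values, "areaCode", "555");
absorb(values, "phone", "1234");
console.log(formEncode(values));
```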
2026-04-23 — cluster 2 element→HTML via outerHTML
`e195b5bd`: `HS: element → HTML via outerHTML (+1 test)`. Added an `outerHTML` getter on the mock `El` class in `tests/hs-run-filtered.js`. Merges `.id`/`.className` (host-set! targets) with `.attributes`, falls back to `innerText`/`textContent`. Suite hs-upstream-expressions/asExpression: 27/42 → 28/42. Smoke 0-195: 162/195 unchanged.
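A getter along the lines described in this entry could look like the sketch below. It is not the committed code: the field layout of `El` (a plain-object `attributes`, an array `children`) is assumed, text content is not escaped, and void elements are not special-cased.

```javascript
// Escape attribute values for double-quoted HTML attributes.
const escapeAttr = (v) => String(v).replace(/&/g, "&amp;").replace(/"/g, "&quot;");

// Assumed shape of the mock element; the real El class may differ.
class El {
  constructor(tagName) {
    this.tagName = tagName;
    this.attributes = {};
    this.children = [];
    this.id = "";
    this.className = "";
    this.textContent = "";
  }

  get outerHTML() {
    const tag = this.tagName.toLowerCase();
    // Merge direct properties into the attribute set.
    const attrs = { ...this.attributes };
    if (this.id) attrs.id = this.id;
    if (this.className) attrs.class = this.className;
    const attrText = Object.entries(attrs)
      .map(([k, v]) => " " + k + '="' + escapeAttr(v) + '"')
      .join("");
    // Recurse into children; fall back to text when there are none.
    const inner = this.children.length > 0
      ? this.children.map((c) => c.outerHTML).join("")
      : this.textContent;
    return "<" + tag + attrText + ">" + inner + "</" + tag + ">";
  }
}

const div = new El("DIV");
div.id = "d1";
div.className = "c1";
const span = new El("SPAN");
span.textContent = "hi";
div.children.push(span);
console.log(div.outerHTML);
```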
2026-04-23 — cluster 1 fetch JSON unwrap
`39a597e9`: `HS: fetch JSON unwrap (+4 tests)`. Added `hs-host-to-sx` helper in `runtime.sx` that converts raw host-handle JS objects/arrays to proper SX dicts/lists via Object.keys/Array walks. `hs-fetch` now calls it on the result when format is `"json"`. Detects host-handle dicts by checking `(host-get v "_type") == "dict"`: genuine SX dicts have the marker, host handles don't. Suite hs-upstream-fetch: 11/23 → 15/23. Smoke 0-195: 162/195 unchanged.