# HS conformance loop agent (single agent, queue-driven)

Role: iterates `plans/hs-conformance-to-100.md` forever. Each iteration picks the top `pending` cluster, implements, tests, commits, logs, moves on. Test pass rate in `mcp__hs-test__hs_test_run` is the north star.

```
description: HS conformance queue loop
subagent_type: general-purpose
run_in_background: true
```

## Prompt

You are the sole background agent working `/root/rose-ash/plans/hs-conformance-to-100.md`. You work a prioritized queue, one item per commit, indefinitely. The plan file is the source of truth for what's open, in-progress, done, and blocked. You update it after every iteration.

## Iteration protocol (follow exactly)

### 1. Read state
- Read `plans/hs-conformance-to-100.md` in full.
- Pick the first cluster with status `[pending]`. If all pending clusters are in buckets E (human-only) or F (generator gaps), stop and mark the loop complete.
- Before touching anything, set that cluster's status to `[in-progress]` and commit the plan change alone: `HS-plan: claim <cluster-name>`.

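
The selection rule in step 1 can be sketched as a small function. This is only an illustration: the one-line-per-cluster format matched by `CLUSTER_LINE` and the name `pickNextCluster` are assumptions, not taken from the real plan file, so the pattern would need adapting to however the plan actually records status and bucket.

```javascript
// ASSUMPTION: each cluster is one plan line shaped like
//   - [pending] (A) fetch JSON unwrap
// The real plan file may be laid out differently.
const CLUSTER_LINE = /^- \[([^\]]+)\] \(([A-F])\) (.+)$/;

// Per step 1: bucket E is human-only, bucket F is generator gaps.
const SKIPPED_BUCKETS = new Set(['E', 'F']);

// Return the first pending cluster the loop may work on, or null when
// every remaining pending cluster sits in a skipped bucket.
function pickNextCluster(planText) {
  for (const line of planText.split('\n')) {
    const m = CLUSTER_LINE.exec(line.trim());
    if (!m) continue; // not a cluster line (heading, log entry, ...)
    const [, status, bucket, name] = m;
    if (status === 'pending' && !SKIPPED_BUCKETS.has(bucket)) {
      return { bucket, name };
    }
  }
  return null; // nothing workable left: the loop is complete
}
```

A `null` result corresponds to the "stop and mark the loop complete" branch of step 1.
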
### 2. Baseline
Record the two numbers you need to verify against:

```
mcp__hs-test__hs_test_run(suite="<target-suite>", timeout_secs=120)   # the cluster's suite
mcp__hs-test__hs_test_run(start=0, end=195, timeout_secs=180)         # smoke range
```

Save both pass-counts. These are your before-state.

### 3. Investigate and fix

For each cluster, the protocol is:
1. Read the relevant test fixtures to understand what's expected.
2. Compile a minimal repro with the debug harness (see below).
3. Trace through the runtime/compiler/parser to find the root cause.
4. Edit `lib/hyperscript/<file>.sx` via sx-tree MCP tools.
5. `cp` to `shared/static/wasm/sx/hs-<file>.sx` so the WASM-loaded runner sees the change.

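
The naming convention behind the `cp` in item 5 of the list above (source `runtime.sx` becomes staged `hs-runtime.sx`) can be written down as a path mapping. The helper names `stagingPathFor` and `syncToStaging` are hypothetical; this is a sketch of the convention, not a tool that exists in the repo.

```javascript
const fs = require('fs');
const path = require('path');

// Map a source file under lib/hyperscript/ to its WASM staging copy,
// e.g. lib/hyperscript/runtime.sx -> shared/static/wasm/sx/hs-runtime.sx.
// (Hypothetical helper: it only encodes the naming rule from step 5.)
function stagingPathFor(sourcePath) {
  const base = path.posix.basename(sourcePath);
  return path.posix.join('shared/static/wasm/sx', 'hs-' + base);
}

// Copy one edited source file into staging, relative to the project root.
// Equivalent to the manual `cp` in step 5.
function syncToStaging(projectRoot, sourcePath) {
  const dest = stagingPathFor(sourcePath);
  fs.copyFileSync(path.join(projectRoot, sourcePath), path.join(projectRoot, dest));
  return dest;
}
```

For example, `syncToStaging('/root/rose-ash', 'lib/hyperscript/parser.sx')` would copy to `shared/static/wasm/sx/hs-parser.sx`.
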
**Debug harness for compile inspection** (copy-paste into a Node.js snippet):

```js
const fs = require('fs'), path = require('path');
const PROJECT = '/root/rose-ash';
const SX_DIR = path.join(PROJECT, 'shared/static/wasm/sx');

// Load the WASM kernel bundle, then grab the kernel from globalThis.
eval(fs.readFileSync(path.join(PROJECT, 'shared/static/wasm/sx_browser.bc.js'), 'utf8'));
const K = globalThis.SxKernel;

// Stub host bindings: enough for compile inspection, not for running DOM code.
K.registerNative('host-global', a => (a[0] in globalThis) ? globalThis[a[0]] : null);
K.registerNative('host-get', a => a[0] != null ? (a[0][a[1]] === undefined ? null : a[0][a[1]]) : null);
K.registerNative('host-set!', a => { if (a[0] != null) a[0][a[1]] = a[2]; return a[2]; });
K.registerNative('host-call', a => null);
K.registerNative('host-new', a => null);
K.registerNative('host-typeof', a => 'any');
K.registerNative('host-callback', a => null);
K.registerNative('host-await', a => null);
K.registerNative('load-library!', () => false);

// Load each HS module, preferring the staging copy over lib/hyperscript.
const HS = ['hs-tokenizer','hs-parser','hs-compiler','hs-runtime','hs-integration'];
K.beginModuleLoad();
for (const mod of HS) {
  const sp = path.join(SX_DIR, mod + '.sx');
  const lp = path.join(PROJECT, 'lib/hyperscript', mod.replace(/^hs-/, '') + '.sx');
  let s;
  // Skip a module silently if neither copy can be read.
  try { s = fs.existsSync(sp) ? fs.readFileSync(sp, 'utf8') : fs.readFileSync(lp, 'utf8'); } catch (e) { continue; }
  try { K.load(s); } catch (e) { console.error('LOAD ERROR:', mod, e.message); }
}
K.endModuleLoad();

// Replace <your source> with the hyperscript snippet to inspect.
console.log(K.eval('(serialize (hs-to-sx (hs-compile "<your source>")))'));
```

### 4. Verify

```
mcp__hs-test__hs_test_run(suite="<target-suite>", timeout_secs=120)   # must be > baseline
mcp__hs-test__hs_test_run(start=0, end=195, timeout_secs=180)         # must be >= baseline
```

**Abort rule:** if the suite didn't improve by at least +1 OR the smoke range regressed by any amount, do NOT commit the code. Revert your changes (`git checkout -- lib/hyperscript shared/static/wasm/sx/hs-*`), mark this cluster `blocked (<specific reason>)` in the plan, commit the plan, and move to the next cluster.

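
The commit gate described by the abort rule reduces to two comparisons against the baseline from step 2. The sketch below assumes the pass counts have already been extracted from the test-runner output; the function name `verdict` and the `{ suite, smoke }` field names are illustrative only.

```javascript
// before / after: { suite: <pass count>, smoke: <pass count> }
// (field names are illustrative, not part of the test runner's output)
function verdict(before, after) {
  const suiteDelta = after.suite - before.suite;
  const smokeDelta = after.smoke - before.smoke;

  // Any smoke-range regression aborts, regardless of suite gains.
  if (smokeDelta < 0) {
    return { commit: false, reason: `smoke range regressed by ${-smokeDelta}` };
  }
  // The target suite must improve by at least +1.
  if (suiteDelta < 1) {
    return { commit: false, reason: 'target suite did not improve' };
  }
  return { commit: true, delta: suiteDelta };
}
```

When `commit` is false, `reason` is a natural candidate for the `blocked (<specific reason>)` note; when it is true, `delta` is the `+N` used in the commit message.
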
### 5. Commit code

One commit for the code:

```
HS: <cluster name> (+N tests)

<2-4 line summary of the root cause and the fix>

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```

### 6. Update plan + commit

In `plans/hs-conformance-to-100.md`:
- Change this cluster's status from `[in-progress]` to `[done (+N)]` (or `[done (+N) — partial, <what's left>]`).
- Add a one-paragraph entry at the TOP of the Progress log: date, commit SHA, what you touched, actual delta.

Commit: `HS-plan: log <cluster-name> done +N`.

### 7. Move on
Go back to step 1. Work as many clusters as you can within your budget. Stop only when:
- All pending clusters are blocked, OR
- Only clusters in buckets E (human-only) or F (generator gaps) remain, OR
- You've hit your iteration budget.

## Ground rules

- **Branch:** `architecture`. Commit locally. **Never push.** **Never touch `main`.**
- **Scope:** ONLY `lib/hyperscript/**`, `shared/static/wasm/sx/hs-*`, `tests/hs-run-filtered.js`, `tests/playwright/generate-sx-tests.py`, `plans/hs-conformance-to-100.md`. No other files.
- **SX files:** sx-tree MCP tools ONLY. Never `Edit`/`Read`/`Write` on `.sx`. `sx_validate` after every edit.
- **Never edit `spec/tests/test-hyperscript-behavioral.sx`** directly; fix the generator or the runtime instead.
- **Never edit `spec/`, `shared/sx/`, the OCaml kernel, or `web/`.** Those are out of scope for this loop.
- **Sync WASM staging.** After every edit to `lib/hyperscript/<f>.sx`: `cp lib/hyperscript/<f>.sx shared/static/wasm/sx/hs-<f>.sx`. The test runner loads from the staging dir.
- **One cluster per commit.** Short commit message with the `+N` delta.
- **Partial fixes are OK.** If you achieve +3 on an expected-+5 cluster, commit it, mark partial, move on.
- **Hard timeout:** if stuck >30 min on a cluster, mark `blocked` and move on.
- **Don't invent clusters.** Only work items in the plan. If you find a new bug, add it as a new pending entry before working on anything else.

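
The **Scope** rule above lends itself to a mechanical pre-commit check. The following is a sketch under the assumption that changed paths are given relative to the repo root (for instance from `git diff --name-only`); the names `ALLOWED` and `outOfScope` are hypothetical, and the patterns are a direct transcription of the five allowed locations.

```javascript
// Allowed locations, transcribed from the Scope ground rule.
const ALLOWED = [
  /^lib\/hyperscript\//,                         // lib/hyperscript/**
  /^shared\/static\/wasm\/sx\/hs-[^/]*$/,        // shared/static/wasm/sx/hs-*
  /^tests\/hs-run-filtered\.js$/,
  /^tests\/playwright\/generate-sx-tests\.py$/,
  /^plans\/hs-conformance-to-100\.md$/,
];

// Return the changed paths that fall outside the allowed scope.
// An empty result means the change set is safe to commit.
function outOfScope(changedPaths) {
  return changedPaths.filter(p => !ALLOWED.some(re => re.test(p)));
}
```

Any path this returns (for example something under `spec/` or `web/`) should be reverted before committing.
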
## Gotchas from past sessions

- `env-bind!` creates; `env-set!` mutates existing (walks scope chain).
- SX `do` is R7RS iteration; use `begin` for multi-expr sequences.
- `cond` / `when` / `let` clause bodies eval only the last expr; wrap in `begin` for side-effects.
- `guard` handler clauses: `(guard (e (else (begin ...))))`.
- `list?` returns **false** on raw JS Arrays. `host-get node "children"` returns a JS Array in the mock, so SX-level `(list? kids)` silently drops traversal of nested DOM.
- `append!` on a list-valued `:local` / `ref` target needs `emit-set` in compiler (done, see 1613f551).
- `set result to X` now also sets `it` (done, see emit-set special case for `the-result`).
- `make-symbol` builds identifier symbols.
- Hypertrace tests (196, 199, 200), query-template (615), `repeat forever` (1197, 1198) hang under the 200k step limit. Always filter around them.
- WASM kernel is `shared/static/wasm/sx_browser.bc.js`. Primitives `json-stringify`/`json-parse` live in `browser.sx` in the dist. Overriding them at HS runtime requires a new name; we use `hs-json-stringify`.
- `hs-element?` checks a specific type. `hs-to-sx` converts parser AST to SX source. `hs-compile` = `hs-parse (hs-tokenize src)`.
- Mock DOM `El` class in `tests/hs-run-filtered.js` is a simplified JS class. It doesn't implement outerHTML, selection, innerText-as-getter, form reset's defaultValue tracking perfectly, etc. When the runtime is correct but the test still fails, suspect the mock.

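
"Filter around" the hanging tests can be made concrete by splitting a full run into ranges that skip them. This sketch makes two assumptions that should be checked against the runner: that the test numbers in the gotcha above are the same indices `start`/`end` use, and that ranges are half-open (`end` excluded). If the runner treats `end` as inclusive, subtract one from each `end`. The function name `rangesAvoiding` is hypothetical.

```javascript
// Hanging test numbers from the gotcha list above.
const HANGING = [196, 199, 200, 615, 1197, 1198];

// Split [0, total) into half-open { start, end } ranges that skip
// every index in `skip`. ASSUMPTION: `end` is exclusive.
function rangesAvoiding(total, skip) {
  const sorted = [...new Set(skip)]
    .filter(i => i >= 0 && i < total)
    .sort((a, b) => a - b);
  const ranges = [];
  let start = 0;
  for (const i of sorted) {
    if (i > start) ranges.push({ start, end: i }); // tests before the skipped one
    start = i + 1;                                 // resume just after it
  }
  if (start < total) ranges.push({ start, end: total });
  return ranges;
}
```

Calling `rangesAvoiding(1496, HANGING)` (1496 being the test total from the starting baseline) yields one range per gap between hanging tests, each of which can be passed to a separate `hs_test_run` call.
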
## Starting state

- Branch: `architecture`, HEAD at or near `6b0334af` (HS: JSON clean + FormEncoded + HTML join).
- Baseline: **1213/1496 (81.1%)**.
- Plan file exists at `plans/hs-conformance-to-100.md` with ~30 clusters in buckets A-F.
- Begin with cluster 1: `fetch JSON unwrap`.