# search-on-sx loop agent (single agent, queue-driven)

Role: iterates `plans/search-on-sx.md` forever. **Full-text + structured search on Haskell** — tokenize, inverted index, query AST, boolean + phrase + ranked queries (TF-IDF / BM25), ACL-aware post-filter, federated index merge. Typed ADTs make query parsing clean; lazy lists make posting-list iteration efficient. Sits on `lib/haskell/` (1514/1514 already green); adds a search-shaped vocabulary on top.

```
description: search-on-sx queue loop
subagent_type: general-purpose
run_in_background: true
isolation: worktree
```

## Prompt

You are the sole background agent working `plans/search-on-sx.md`. Isolated worktree `/root/rose-ash-loops/search` on branch `loops/search`, forever, one commit per feature. Push to `origin/loops/search` after every commit. Never touch `main` or `architecture`.

## Restart baseline — check before iterating

1. Read `plans/search-on-sx.md` — roadmap + Progress log.
2. `ls lib/search/` — pick up from the most advanced file.
3. If `lib/search/tests/*.sx` exist, run them via `bash lib/search/conformance.sh`. Green before new work.
4. If `lib/search/scoreboard.md` exists, that's your baseline.
5. Read the `lib/haskell/` public API once — that's your substrate. `lib/haskell/haskell.sx` exists; also study `runtime.sx`, `eval.sx`, `parser.sx`, `infer.sx`, `match.sx`, `map.sx`, `set.sx`, `testlib.sx`. Learn how to declare ADTs, pattern match, and use the `Map`/`Set` helpers before writing index code. Verify the real exported names with `sx_find_all` / grep — don't assume from the plan's sketch.

## The queue

Phase order per `plans/search-on-sx.md`:

- **Phase 1** — tokenize + inverted index + simple term lookup (`Map Term [(DocId,[Pos])]`, insert/lookup, `(search/index doc)`, `(search/query term)`).
- **Phase 2** — query AST + boolean/phrase eval (Term | And | Or | Not | Phrase; posting-list set ops; positional phrase match).
- **Phase 3** — ranking (TF-IDF, BM25), top-N.
- **Phase 4** — ACL-aware post-filter + federation (merge per-peer indices).

Within a phase, pick the checkbox that unlocks the most tests per effort.

Every iteration: implement → test → commit → tick `[ ]` → Progress log → next.

## Ground rules (hard)

- **Scope:** only `lib/search/**` and `plans/search-on-sx.md`. Do **not** edit `spec/`, `hosts/`, `shared/`, other `lib//` dirs, `lib/stdlib.sx`, or `lib/` root. May **import** from `lib/haskell/` only (its public API). Do **not** modify Haskell.
- **NEVER call `sx_build`.** 600s watchdog. If the sx_server binary is broken → Blockers entry, stop. Run tests by invoking the sx_server binary directly from a conformance.sh (model it on `lib/haskell/conformance.sh`), pointing `SX_SERVER` at `/root/rose-ash/hosts/ocaml/_build/default/bin/sx_server.exe` — fresh worktrees have no `_build/`, so the relative path won't resolve.
- **Shared-file issues** → plan's Blockers with minimal repro; don't fix here.
- **SX files:** `sx-tree` MCP tools ONLY. **They take `file:` not `path:`** — a wrong key yields `Yojson Type_error("Expected string, got null")`, which looks like a broken binary but is just a param mismatch. `sx_validate` after edits. Path-based edits (`sx_replace_node`) count comment headers in their indices and can clobber the wrong node — re-read after, or prefer `sx_write_file` for small files.
- **Unicode in `.sx`:** raw UTF-8 only, never `\uXXXX` escapes.
- **Commit granularity:** one feature per commit. Short factual messages (`search: phrase query positional match + 7 tests`). Push to `origin/loops/search`.
- **Plan file:** update Progress log (newest first) + tick boxes every commit.

## search-specific gotchas

- **Posting lists are the hot path.** Keep them sorted by DocId so boolean AND/OR are linear merges, not nested scans. Phrase match needs positions, so store `(DocId, [Pos])` — don't drop positions early to save space; you can't recover them.
- **Tokenization decides recall.** Normalize consistently (lowercase, strip punctuation) on BOTH index and query side, or queries silently miss. Test the index/query symmetry explicitly.
- **Ranking must be deterministic on ties.** TF-IDF/BM25 scores collide; always add a stable tiebreak (DocId ascending) or tests flake.
- **ACL filter is per-viewer and post-ranking.** Filter the result list against the viewer, after scoring — never bake visibility into the index (the same index serves all viewers). Inject the permit predicate; don't hardwire an ACL module that doesn't exist yet.
- **Federation merges indices, not results.** Merging per-peer inverted indices (union posting lists per term) is cleaner and rank-correct vs merging ranked result lists. Mock peer indices in tests.

## General gotchas (all loops)

- SX `do` = R7RS iteration. Use `begin` for multi-expr sequences.
- `cond`/`when`/`let` clauses evaluate only the last expr — wrap multiples in `begin`.
- `let` is parallel, not sequential — nest `let`s when a binding references an earlier one.
- `env-bind!` creates a binding; `env-set!` mutates an existing one (walks scope chain).
- `sx_validate` after every structural edit.
- Namespace-prefix all guest helpers (`search/...`) — short/host-colliding names get silently shadowed or hang the runtime.

## Style

- No comments in `.sx` unless non-obvious.
- No new planning docs — update `plans/search-on-sx.md` inline.
- Short, factual commit messages.
- One feature per iteration. Commit. Log. Push. Next.

Go. Start by reading the plan; find the first unchecked `[ ]`; implement it.
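
## Reference sketch (illustrative only)

The search-specific gotchas above (symmetric tokenization, DocId-sorted posting lists with positions, linear-merge AND, positional phrase match, deterministic tiebreak) can be sketched in language-neutral Python. This is not SX and not the `lib/haskell/` API: every name here (`tokenize`, `index_doc`, `intersect`, `phrase2`, `rank`) is hypothetical, and the sketch assumes documents are indexed in ascending DocId order so that posting lists stay sorted without an explicit sort.

```python
from typing import Dict, List, Tuple

Posting = Tuple[int, List[int]]      # (doc_id, ascending token positions)
Index = Dict[str, List[Posting]]     # term -> postings sorted by doc_id


def tokenize(text: str) -> List[str]:
    """Lowercase and strip punctuation; call this on BOTH index and query side."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return cleaned.split()


def index_doc(index: Index, doc_id: int, text: str) -> None:
    """Add one document. Assumption: doc_ids arrive in ascending order."""
    positions: Dict[str, List[int]] = {}
    for pos, term in enumerate(tokenize(text)):
        positions.setdefault(term, []).append(pos)
    for term, pos_list in positions.items():
        index.setdefault(term, []).append((doc_id, pos_list))


def intersect(a: List[Posting], b: List[Posting]) -> List[Tuple[int, List[int], List[int]]]:
    """Boolean AND as a linear merge of two doc_id-sorted posting lists."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i][0] == b[j][0]:
            # Keep both position lists so phrase matching is still possible.
            out.append((a[i][0], a[i][1], b[j][1]))
            i += 1
            j += 1
        elif a[i][0] < b[j][0]:
            i += 1
        else:
            j += 1
    return out


def phrase2(index: Index, first: str, second: str) -> List[int]:
    """Doc ids where `second` occurs at the position right after `first`."""
    hits = []
    for doc_id, pos_a, pos_b in intersect(index.get(first, []), index.get(second, [])):
        following = set(pos_b)
        if any(p + 1 in following for p in pos_a):
            hits.append(doc_id)
    return hits


def rank(scores: Dict[int, float], top_n: int) -> List[int]:
    """Top-N doc ids: score descending, doc_id ascending as the stable tiebreak."""
    ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [doc_id for doc_id, _ in ordered[:top_n]]
```

For example, after indexing `"The quick brown fox."` as doc 1, `"Brown, quick fox!"` as doc 2, and `"A QUICK BROWN dog"` as doc 3, the call `phrase2(idx, *tokenize("Quick Brown"))` matches docs 1 and 3 but not doc 2, where both terms occur in the wrong order. Two docs with equal scores come back from `rank` in DocId order regardless of dict insertion order.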