# probabilistic-on-sx loop agent (single agent, queue-driven)

Role: iterates `plans/probabilistic-on-sx.md` forever. **Weighted nondeterminism + traces + inference** — programs declare distributions, the runtime infers. Church-flavoured core. The chisel is *trace*: what it means to record a weighted execution, and how `sample`/`observe` differ from plain nondeterminism. One feature per commit.

```
description: probabilistic-on-sx queue loop
subagent_type: general-purpose
run_in_background: true
isolation: worktree
```

## Prerequisites — check before starting

1. **lib-guest lex + pratt present** — the Scheme-flavoured parser consumes `lib/guest/lex.sx` + `lib/guest/pratt.sx`.
2. **Multi-shot continuations (`perform`/`cek-resume`)** must be real, not a single-shot stub — MH (Phase 6) re-executes from a changed choice point. This is the same capability `koka-on-sx` validates; confirm it before Phase 4.

**Pre-flight:**

```
ls /root/rose-ash/lib/guest/lex.sx /root/rose-ash/lib/guest/pratt.sx
```

If lib-guest is missing, stop and record a Blockers entry. (Phases 1–3 don't need multi-shot; verify multi-shot before starting Phase 4/6.)

## Prompt

You are the sole background agent working `/root/rose-ash/plans/probabilistic-on-sx.md`, in an isolated git worktree on branch `loops/probabilistic`, forever, one commit per feature. Push to `origin/loops/probabilistic` after every commit. Never touch `main` or `architecture`.

## Restart baseline — check before iterating

1. Read `plans/probabilistic-on-sx.md` — Roadmap + Progress log + Blockers.
2. Run the pre-flight; record gaps in Blockers.
3. `ls lib/probabilistic/` — pick up from the most advanced file. No dir → Phase 1.
4. If `lib/probabilistic/tests/*.sx` exist, run them via the epoch protocol against `sx_server.exe`. Green before new work.
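Step 2 of the restart baseline (run the pre-flight, record gaps) can also be expressed as a small script. The following is a minimal sketch in Python, not part of the loop itself; the function name `preflight` and the `BLOCKER:` message format are hypothetical, and only the two file paths come from the pre-flight command above.

```python
from pathlib import Path

# The two lib-guest files named in the pre-flight `ls` command.
REQUIRED = ("lib/guest/lex.sx", "lib/guest/pratt.sx")


def preflight(root="/root/rose-ash"):
    """Return the required lib-guest files that are missing under `root`."""
    return [rel for rel in REQUIRED if not (Path(root) / rel).is_file()]


if __name__ == "__main__":
    missing = preflight()
    if missing:
        # Hypothetical message format; the plan file's Blockers section is
        # where the gap should actually be recorded.
        print("BLOCKER: missing " + ", ".join(missing))
    else:
        print("pre-flight ok")
```

An empty list means lib-guest is present; anything else is a gap to record in Blockers before stopping.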

## The queue

Phase order per `plans/probabilistic-on-sx.md`:

- **Phase 1** — parser + deterministic Scheme core on the CEK
- **Phase 2** — `sample`/`observe` as effects (`perform :sample` / `:observe`); default = forward sampling
- **Phase 3** — distribution library (uniform/normal/gamma/beta/bernoulli/categorical/dirichlet/poisson), each `(sample-fn, log-prob-fn)`
- **Phase 4** — **trace recording + replay** (the chisel: a tracing handler logs `{:id :value :log-weight}`; a replay handler forces recorded values)
- **Phase 5** — importance sampling (run N times, accumulate `observe` log-weights)
- **Phase 6** — Metropolis-Hastings (**multi-shot**: re-execute from a changed choice point; accept/reject by Hastings ratio)
- **Phase 7** — mean-field VI (ELBO + `lib/probabilistic/autodiff.sx`, forward-mode)
- **Phase 8** — stdlib/idioms (mixtures, GPs, HMMs, change-point)
- **Phase 9** — propose `lib/guest/probabilistic/` extraction (wait for a 2nd consumer)

Within a phase, pick the checkbox with the best tests-per-effort ratio.

Every iteration: implement → test → commit → tick `[ ]` → Progress log → push → next.

## Chisel discipline — trace & weight

Two substrate payoffs. (1) **Phase 4 trace/replay** forces SX to articulate what recording an execution means — every `sample` is a labelled, weighted choice in a trace value. (2) **Phase 6 MH** is the multi-shot continuation stress test from the inference side: re-running from a proposed-changed point requires `cek-resume` to resume the *same* captured continuation more than once. If MH gives wrong posteriors and the math checks out, suspect single-shot resumption — write the failing test + Blockers entry (the fix is in `spec/`, not this loop).

Determinism for tests: vary draws by trace `id`/seed passed in, never a wall clock; inference tests assert *approximate* posteriors with tolerances, not exact values.

## Ground rules (hard)

- **Scope:** only `lib/probabilistic/**` and `plans/probabilistic-on-sx.md`.
  Do **not** edit `spec/`, `hosts/`, `shared/`, `lib/guest/**` (read-only), or other `lib//`.
- **Consume `lib/guest/`** (lex, pratt). Inference machinery (IS/MH/VI, autodiff) is yours, in SX.
- **Don't patch the substrate.** Multi-shot misbehavior → failing test + Blockers entry; the fix lives in `spec/evaluator.sx`, not here.
- **NEVER call `sx_build`** (600s watchdog). Broken binary → Blockers, stop.
- **SX files:** `sx-tree` MCP tools ONLY; `sx_validate` after every edit; `file:` not `path:`. Never `Edit`/`Read`/`Write` on `.sx`.
- **Worktree:** commit, then push `origin/loops/probabilistic`. Never `main`/`architecture`.
- **Commits:** one feature per commit (`prob: trace/replay handler + 5 tests`).
- **Plan file:** Progress log + tick boxes every commit.
- **Blocked 2 iterations → Blockers, move on.**

## Probabilistic-specific gotchas

- **`sample` choices ≠ `conde`-style nondeterminism.** A `sample` is a *weighted* choice carrying a log-prob; an `observe` conditions (multiplies in a weight) without branching. Keep weight bookkeeping in the log domain to avoid underflow.
- **Trace identity is the linchpin.** Replay/MH match choices by stable `id` (call site + loop index), not by order — get id assignment deterministic and stable across re-execution or replay silently diverges.
- **MH proposes a local change, then re-executes the tail.** Only the chosen site's value changes; downstream `sample`s are replayed where possible. The accept ratio uses prior × likelihood × proposal — get the Hastings correction right.
- **Inference is approximate.** Never assert exact posteriors; use ESS/tolerance checks. Seed-dependent flakiness means deterministic seeds in tests.
- **Autodiff (Phase 7) is forward-mode minimum** — dual numbers over the arithmetic prims; don't reach for reverse-mode unless a test demands it.

## General gotchas (all loops)

- SX `do` = R7RS iteration; use `begin` for multi-expr sequences.
- `cond`/`when`/`let` clauses evaluate only the last expr — wrap multiples in `begin`.
- `let` is parallel — nest `let`s when one binding references an earlier one.
- `env-bind!` creates a binding; `env-set!` mutates an existing one.
- Namespace-prefix guest helpers (`prob/…`).
- Shell heredoc `||` gets eaten — escape or use `case`.

## Style

- No comments in `.sx` unless non-obvious. No new planning docs — update the plan.
- Short, factual commit messages. One feature per iteration. Commit. Log. Push. Next.

Go. Run the pre-flight. If lib-guest is missing (or multi-shot is unverified before Phase 4), stop and report. Otherwise read the plan, find the first unchecked `[ ]`, implement it.
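
For orientation only, the trace discipline described above (Phase 4 record/replay, Phase 5 log-domain importance weights) can be illustrated outside SX. The following is a minimal sketch in Python, assuming a handler object stands in for the `:sample`/`:observe` effect handlers. Every name here (`Tracer`, `Bernoulli`, `model`, `importance_posterior`) is hypothetical and is not the loop's actual implementation, which lives in SX under `lib/probabilistic/`.

```python
import math
import random


class Bernoulli:
    """Hypothetical distribution: a (sample-fn, log-prob-fn) pair."""

    def __init__(self, p):
        self.p = p

    def draw(self, rng):
        return rng.random() < self.p

    def log_prob(self, value):
        return math.log(self.p if value else 1.0 - self.p)


class Tracer:
    """Records each sample as {id, value, log_weight}; replays forced values by id."""

    def __init__(self, rng, replay=None):
        self.rng = rng               # seeded RNG passed in, never a wall clock
        self.replay = replay or {}   # site id -> forced value
        self.trace = []              # recorded choices, in execution order
        self.log_weight = 0.0        # sum of observe log-probs (log domain)

    def sample(self, site_id, dist):
        # Replay matches by stable id, not by position in the trace.
        if site_id in self.replay:
            value = self.replay[site_id]
        else:
            value = dist.draw(self.rng)
        self.trace.append(
            {"id": site_id, "value": value, "log_weight": dist.log_prob(value)}
        )
        return value

    def observe(self, dist, value):
        # Conditioning adds a log-weight; it never branches.
        self.log_weight += dist.log_prob(value)


def model(t):
    rain = t.sample("rain", Bernoulli(0.2))
    t.observe(Bernoulli(0.9 if rain else 0.1), True)  # wet grass was seen
    return rain


def importance_posterior(n, seed):
    """Estimate P(rain | wet grass) by likelihood-weighted forward sampling."""
    rng = random.Random(seed)
    values, log_ws = [], []
    for _ in range(n):
        t = Tracer(rng)
        values.append(model(t))
        log_ws.append(t.log_weight)
    top = max(log_ws)  # subtract the max before exponentiating to avoid underflow
    ws = [math.exp(lw - top) for lw in log_ws]
    return sum(w for w, v in zip(ws, values) if v) / sum(ws)
```

Replaying a recorded trace with `Tracer(rng, replay={c["id"]: c["value"] for c in first.trace})` should reproduce the same trace regardless of the new RNG's seed, which is the property the replay tests rely on. The exact posterior for this toy model is 0.18/0.26, so a test on `importance_posterior` would assert closeness within a tolerance rather than equality.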