diff --git a/plans/lib-guest-test-runner.md b/plans/lib-guest-test-runner.md
new file mode 100644
index 00000000..24c5b2c2
--- /dev/null
+++ b/plans/lib-guest-test-runner.md
@@ -0,0 +1,83 @@
+# lib/guest/test-runner.sx — extraction plan
+
+## Status
+
+- [x] **Phase 1 — Kit + Kernel.** `lib/guest/test-runner.sx` extracted (5 forms, ~50 LoC). Seven Kernel test files migrated (`parse.sx`, `eval.sx`, `vau.sx`, `standard.sx`, `encap.sx`, `hygiene.sx`, `metacircular.sx`). 322/322 tests unchanged. 84 LoC removed.
+- [ ] **Phase 2 — Per-guest migrations.** 35 additional test files across 7 guests match the standard `(define X-test-pass 0)` pattern; each needs a per-file harness rewrite, and each guest's test scripts need a `(load "lib/guest/test-runner.sx")` line. Variant shapes (Tcl, APL) need *different* extraction targets; Smalltalk uses its own SUnit framework, so the kit is not applicable there.
+
+## The kit
+
+```lisp
+(refl-make-test-suite)              → {:pass 0 :fail 0 :fails (list)}
+(refl-test SUITE NAME ACT EXP)      → mutates suite (dict-set! + append!)
+(refl-test-report SUITE)            → {:total :passed :failed :fails}
+(refl-test-pass? SUITE)             → bool
+(refl-test-suite? V)                → predicate
+```
+
+Each migrated test file collapses from 14 lines of harness boilerplate to:
+```lisp
+(define X-suite (refl-make-test-suite))
+(define X-test  (fn (n a e) (refl-test X-suite n a e)))
+;; ... tests ...
+(define X-tests-run! (fn () (refl-test-report X-suite)))
+```
+
+## Per-guest migration status
+
+| Guest        | Std-pattern files | Variant?            | Migration commit |
+|--------------|------------------:|---------------------|------------------|
+| kernel       | 7                 | -                   | landed (Phase 1) |
+| prolog       | 23                | -                   | pending          |
+| common-lisp  | 4                 | -                   | pending          |
+| erlang       | 4                 | sub-prefix per file | pending          |
+| apl          | 1                 | uses `set!` over `append`, args are `got expected` | needs variant adapter |
+| forth        | 1                 | -                   | pending          |
+| minikanren   | 1                 | -                   | pending          |
+| ruby         | 1                 | -                   | pending          |
+| tcl          | 0 (different shape, see below) | `tcl-X-pass` / `tcl-X-failures`; `append!`s a formatted string | needs variant kit |
+| smalltalk    | 0 (uses SUnit)    | own framework        | not applicable   |
+| haskell, hyperscript, js, lua | 0 | no per-suite harness | not applicable |
+
+## Variant shapes that need their own extractions
+
+### Tcl-flavour (`lib/tcl/tests/*.sx`)
+
+```lisp
+(define tcl-X-pass 0)
+(define tcl-X-fail 0)
+(define tcl-X-failures (list))   ; not -fails
+
+(define tcl-assert
+  (fn (label expected actual)    ; arg order: label, expected, actual
+    (if (= expected actual)
+      (set! tcl-X-pass (+ tcl-X-pass 1))
+      (begin (set! tcl-X-fail (+ tcl-X-fail 1))
+             (append! tcl-X-failures
+                      (str label ": expected=" (str expected) " got=" (str actual)))))))
+```
+
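+Differences from canonical: failures hold formatted *strings*, not dicts, and the argument order is `label expected actual` where the kit's `refl-test` takes `NAME ACT EXP`. The shapes could be unified by passing a formatter config, but the API divergence is real (Tcl's report just lists strings; the kit's `:fails` holds dicts). Either: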
+- Add `refl-make-string-formatting-test-suite` variant, or
+- Migrate Tcl's reports to dict-of-name-expected-actual (preferred — converges the shape).
+
+### APL-flavour
+
+```lisp
+(define apl-test
+  (fn (name got expected)        ; arg order: name, got, expected
+    (if (= got expected) ...
+      (set! apl-test-fails (append apl-test-fails (list {:got got :expected expected :name name}))))))
+```
+
+Differences: uses functional `append` with `set!` rather than `append!`, and names its arguments `got expected` rather than `actual expected`; the argument order itself matches the kit's. Minor; renaming `n a e` at the kit call site to `n got expected` is the only change.
+
+## Migration playbook (for the loop that finishes Phase 2)
+
+1. Pick a guest from the table.
+2. Run `/tmp/migrate_harness.py lib/<guest>/tests/*.sx`. Files reported as SKIP are variant-shaped; leave them untouched or hand-migrate them.
+3. Add `(load "lib/guest/test-runner.sx")` to that guest's `test.sh` and `conformance.sh` (before the first guest `.sx` load that uses the harness).
+4. Run the guest's full test suite; counts should be unchanged.
+5. Commit `test-runner: migrate <guest> — N files, NN LoC saved`.
+
+Estimated effort: 1–2 hours per guest, dominated by verification time. Prolog (23 files) is the biggest single commit but the most mechanical.