Make test.sx self-executing: evaluators run it directly, no codegen

test.sx now defines deftest/defsuite as macros. Any host that provides 5 platform functions (try-call, report-pass, report-fail, push-suite, pop-suite) can evaluate the file directly — no bootstrap compilation step needed for JS. - Added defmacro for deftest (wraps body in thunk, catches via try-call) - Added defmacro for defsuite (push/pop suite context stack) - Created run.js: sx-browser.js evaluates test.sx directly (81/81 pass) - Created run.py: Python evaluator evaluates test.sx directly (81/81 pass) - Deleted bootstrap_test_js.py and generated test_sx_spec.js - Updated testing docs page to reflect self-executing architecture Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 10:50:28 +00:00
parent 754e7557f5
commit e9d86d628b
7 changed files with 299 additions and 508 deletions
--- a/sx/sx/testing.sx
+++ b/sx/sx/testing.sx
@@ -7,49 +7,69 @@
      ;; Intro
      (div :class "space-y-4"
        (p :class "text-lg text-stone-600"
-          "SX tests itself. The test spec is written in SX and defines a complete test framework — assertion helpers, test suites, and 81 test cases covering every language feature. Bootstrap compilers read "
+          "SX tests itself. "
          (code :class "text-violet-700 text-sm" "test.sx")
-          " and emit native test runners for each host platform.")
+          " is a self-executing test spec — it defines "
+          (code :class "text-violet-700 text-sm" "deftest")
+          " and "
+          (code :class "text-violet-700 text-sm" "defsuite")
+          " as macros, writes 81 test cases, and runs them. Any host that provides five platform functions can evaluate the file directly.")
        (p :class "text-stone-600"
          "This is not a test "
          (em "of") " SX — it is a test " (em "in") " SX. The same s-expressions that define how "
          (code :class "text-violet-700 text-sm" "if")
          " works are used to verify that "
          (code :class "text-violet-700 text-sm" "if")
-          " works. The language proves its own correctness, on every platform it compiles to."))
+          " works. No code generation, no intermediate files — the evaluator runs the spec."))

      ;; How it works
      (div :class "space-y-3"
        (h2 :class "text-2xl font-semibold text-stone-800" "Architecture")
+        (p :class "text-stone-600"
+          "The test framework needs five platform functions. Everything else — macros, assertion helpers, test suites — is pure SX:")
        (div :class "not-prose bg-stone-100 rounded-lg p-5 mx-auto max-w-3xl"
          (pre :class "text-sm leading-relaxed whitespace-pre-wrap break-words font-mono text-stone-700"
-"test.sx                    The spec: assertion helpers + 15 suites + 81 tests
+"test.sx                       Self-executing: macros + helpers + 81 tests
    |
-    |--- bootstrap_test.py     Reads test.sx, emits pytest module
+    |--- run.js                 Injects 5 platform fns, evaluates test.sx
    |       |
-    |       +-> test_sx_spec.py    81 pytest test cases
-    |               |
-    |               +-> shared/sx/evaluator.py    (Python SX evaluator)
+    |       +-> sx-browser.js   JS evaluator (bootstrapped from spec)
    |
-    |--- bootstrap_test_js.py  Reads test.sx, emits Node.js TAP script
+    |--- run.py                 Injects 5 platform fns, evaluates test.sx
            |
-            +-> test_sx_spec.js    81 TAP test cases
-                    |
-                    +-> sx-browser.js              (JS SX evaluator)")))
+            +-> evaluator.py    Python evaluator
+
+Platform functions:
+    try-call (thunk)          -> {:ok true} | {:ok false :error \"msg\"}
+    report-pass (name)        -> output pass
+    report-fail (name error)  -> output fail
+    push-suite (name)         -> push suite context
+    pop-suite ()              -> pop suite context")))

      ;; Framework
      (div :class "space-y-3"
        (h2 :class "text-2xl font-semibold text-stone-800" "The test framework")
        (p :class "text-stone-600"
-          "The framework defines two declarative forms and nine assertion helpers, all in pure SX:")
+          "The framework defines two macros and nine assertion helpers, all in SX. The macros are the key — they make "
+          (code :class "text-violet-700 text-sm" "defsuite")
+          " and "
+          (code :class "text-violet-700 text-sm" "deftest")
+          " executable forms, not just declarations:")
        (div :class "rounded-lg border border-stone-200 bg-white p-5 space-y-3"
-          (h3 :class "text-sm font-medium text-stone-500 uppercase tracking-wide" "Forms")
+          (h3 :class "text-sm font-medium text-stone-500 uppercase tracking-wide" "Macros")
          (~doc-code :code
-            (highlight "(defsuite \"name\" ...tests)\n(deftest \"name\" ...body)" "lisp")))
+            (highlight "(defmacro deftest (name &rest body)\n  `(let ((result (try-call (fn () ,@body))))\n     (if (get result \"ok\")\n       (report-pass ,name)\n       (report-fail ,name (get result \"error\")))))\n\n(defmacro defsuite (name &rest items)\n  `(do (push-suite ,name)\n       ,@items\n       (pop-suite)))" "lisp")))
+        (p :class "text-stone-600 text-sm"
+          (code :class "text-violet-700 text-sm" "deftest")
+          " wraps the body in a thunk, passes it to "
+          (code :class "text-violet-700 text-sm" "try-call")
+          " (the one platform function that catches errors), then reports pass or fail. "
+          (code :class "text-violet-700 text-sm" "defsuite")
+          " pushes a name onto the context stack, runs its children, and pops.")
        (div :class "rounded-lg border border-stone-200 bg-white p-5 space-y-3"
          (h3 :class "text-sm font-medium text-stone-500 uppercase tracking-wide" "Assertion helpers")
          (~doc-code :code
-            (highlight "(define assert-equal\n  (fn (expected actual)\n    (assert (equal? expected actual)\n      (str \"Expected \" (str expected) \" but got \" (str actual)))))\n\n(define assert-true  (fn (val) (assert val ...)))\n(define assert-false (fn (val) (assert (not val) ...)))\n(define assert-nil   (fn (val) (assert (nil? val) ...)))\n(define assert-type  (fn (expected-type val) ...))\n(define assert-length (fn (expected-len col) ...))\n(define assert-contains (fn (item col) ...))" "lisp"))))
+            (highlight "(define assert-equal\n  (fn (expected actual)\n    (assert (equal? expected actual)\n      (str \"Expected \" (str expected) \" but got \" (str actual)))))\n\n(define assert-true  (fn (val) (assert val ...)))\n(define assert-false (fn (val) (assert (not val) ...)))\n(define assert-nil   (fn (val) (assert (nil? val) ...)))\n(define assert-type  (fn (expected-type val) ...))\n(define assert-length (fn (expected-len col) ...))\n(define assert-contains (fn (item col) ...))\n(define assert-throws (fn (thunk) ...))" "lisp"))))

      ;; Example tests
      (div :class "space-y-3"
@@ -61,54 +81,49 @@
          (~doc-code :code
            (highlight "(defsuite \"arithmetic\"\n  (deftest \"addition\"\n    (assert-equal 3 (+ 1 2))\n    (assert-equal 0 (+ 0 0))\n    (assert-equal -1 (+ 1 -2))\n    (assert-equal 10 (+ 1 2 3 4)))\n\n  (deftest \"subtraction\"\n    (assert-equal 1 (- 3 2))\n    (assert-equal -1 (- 2 3)))\n\n  (deftest \"multiplication\"\n    (assert-equal 6 (* 2 3))\n    (assert-equal 0 (* 0 100))\n    (assert-equal 24 (* 1 2 3 4)))\n\n  (deftest \"division\"\n    (assert-equal 2 (/ 6 3))\n    (assert-equal 2.5 (/ 5 2)))\n\n  (deftest \"modulo\"\n    (assert-equal 1 (mod 7 3))\n    (assert-equal 0 (mod 6 3))))" "lisp"))))

-      ;; Bootstrapping to Python
+      ;; Running tests — JS
      (div :class "space-y-3"
-        (h2 :class "text-2xl font-semibold text-stone-800" "Bootstrap to Python")
+        (h2 :class "text-2xl font-semibold text-stone-800" "JavaScript: direct evaluation")
        (p :class "text-stone-600"
-          (code :class "text-violet-700 text-sm" "bootstrap_test.py")
-          " parses "
-          (code :class "text-violet-700 text-sm" "test.sx")
-          ", extracts the "
-          (code :class "text-violet-700 text-sm" "define")
-          " forms (assertion helpers) as a preamble, then emits each "
-          (code :class "text-violet-700 text-sm" "defsuite")
-          " as a pytest class and each "
-          (code :class "text-violet-700 text-sm" "deftest")
-          " as a test method.")
-        (div :class "rounded-lg border border-stone-200 bg-white p-5 space-y-3"
-          (h3 :class "text-sm font-medium text-stone-500 uppercase tracking-wide" "Generated Python")
-          (~doc-code :code
-            (highlight "# Auto-generated from test.sx\nfrom shared.sx.parser import parse_all\nfrom shared.sx.evaluator import _eval, _trampoline\n\ndef _make_env():\n    env = {}\n    for expr in parse_all(_PREAMBLE):\n        _trampoline(_eval(expr, env))\n    return env\n\ndef _run(sx_source, env=None):\n    if env is None:\n        env = _make_env()\n    exprs = parse_all(sx_source)\n    for expr in exprs:\n        result = _trampoline(_eval(expr, env))\n    return result\n\nclass TestSpecArithmetic:\n    def test_addition(self):\n        _run('(do (assert-equal 3 (+ 1 2)) ...)')\n\n    def test_subtraction(self):\n        _run('(do (assert-equal 1 (- 3 2)) ...)')" "python")))
-        (div :class "rounded-lg border border-stone-200 bg-white p-5 space-y-3"
-          (h3 :class "text-sm font-medium text-stone-500 uppercase tracking-wide" "Run it")
-          (~doc-code :code
-            (highlight "$ python bootstrap_test.py --output test_sx_spec.py\nParsed 15 suites, 9 preamble defines from test.sx\nTotal test cases: 81\n\n$ pytest test_sx_spec.py -v\n81 passed in 0.15s" "bash"))))
-
-      ;; Bootstrapping to JavaScript
-      (div :class "space-y-3"
-        (h2 :class "text-2xl font-semibold text-stone-800" "Bootstrap to JavaScript")
-        (p :class "text-stone-600"
-          (code :class "text-violet-700 text-sm" "bootstrap_test_js.py")
-          " emits a standalone Node.js script that loads "
          (code :class "text-violet-700 text-sm" "sx-browser.js")
-          " (bootstrapped from the spec), injects any platform-specific primitives into the test env, then runs all 81 tests with TAP output.")
+          " evaluates "
+          (code :class "text-violet-700 text-sm" "test.sx")
+          " directly. The runner injects platform functions and calls "
+          (code :class "text-violet-700 text-sm" "Sx.eval")
+          " on each parsed expression:")
        (div :class "rounded-lg border border-stone-200 bg-white p-5 space-y-3"
-          (h3 :class "text-sm font-medium text-stone-500 uppercase tracking-wide" "Generated JavaScript")
+          (h3 :class "text-sm font-medium text-stone-500 uppercase tracking-wide" "run.js")
          (~doc-code :code
-            (highlight "// Auto-generated from test.sx\nvar Sx = require('./sx-browser.js');\n\nvar _envPrimitives = {\n  'equal?': function(a, b) { return deepEqual(a, b); },\n  'boolean?': function(x) { return typeof x === 'boolean'; },\n  // ... platform primitives injected into env\n};\n\nfunction _makeEnv() {\n  var env = {};\n  for (var k in _envPrimitives) env[k] = _envPrimitives[k];\n  var exprs = Sx.parseAll(PREAMBLE);\n  for (var i = 0; i < exprs.length; i++) Sx.eval(exprs[i], env);\n  return env;\n}\n\nfunction _run(name, sxSource) {\n  var env = _makeEnv();\n  var exprs = Sx.parseAll(sxSource);\n  for (var i = 0; i < exprs.length; i++) Sx.eval(exprs[i], env);\n}" "javascript")))
+            (highlight "var Sx = require('./sx-browser.js');\nvar src = fs.readFileSync('test.sx', 'utf8');\n\nvar env = {\n  'try-call': function(thunk) {\n    try {\n      Sx.eval([thunk], env);   // call the SX lambda\n      return { ok: true };\n    } catch(e) {\n      return { ok: false, error: e.message };\n    }\n  },\n  'report-pass': function(name) { console.log('ok - ' + name); },\n  'report-fail': function(name, err) { console.log('not ok - ' + name); },\n  'push-suite': function(n) { stack.push(n); },\n  'pop-suite':  function()  { stack.pop(); },\n};\n\nvar exprs = Sx.parseAll(src);\nfor (var i = 0; i < exprs.length; i++) Sx.eval(exprs[i], env);" "javascript")))
        (div :class "rounded-lg border border-stone-200 bg-white p-5 space-y-3"
-          (h3 :class "text-sm font-medium text-stone-500 uppercase tracking-wide" "Run it")
+          (h3 :class "text-sm font-medium text-stone-500 uppercase tracking-wide" "Output")
          (~doc-code :code
-            (highlight "$ python bootstrap_test_js.py --output test_sx_spec.js\nParsed 15 suites, 9 preamble defines from test.sx\nTotal test cases: 81\n\n$ node test_sx_spec.js\nTAP version 13\n1..81\nok 1 - literals > numbers are numbers\nok 2 - literals > strings are strings\n...\nok 81 - edge-cases > empty operations\n\n# tests 81\n# pass  81" "bash"))))
+            (highlight "$ node shared/sx/tests/run.js\nTAP version 13\nok 1 - literals > numbers are numbers\nok 2 - literals > strings are strings\n...\nok 81 - edge-cases > empty operations\n\n# tests 81\n# pass  81" "bash"))))
+
+      ;; Running tests — Python
+      (div :class "space-y-3"
+        (h2 :class "text-2xl font-semibold text-stone-800" "Python: direct evaluation")
+        (p :class "text-stone-600"
+          "Same approach — the Python evaluator runs "
+          (code :class "text-violet-700 text-sm" "test.sx")
+          " directly:")
+        (div :class "rounded-lg border border-stone-200 bg-white p-5 space-y-3"
+          (h3 :class "text-sm font-medium text-stone-500 uppercase tracking-wide" "run.py")
+          (~doc-code :code
+            (highlight "from shared.sx.parser import parse_all\nfrom shared.sx.evaluator import _eval, _trampoline\n\ndef try_call(thunk):\n    try:\n        _trampoline(_eval([thunk], {}))\n        return {'ok': True}\n    except Exception as e:\n        return {'ok': False, 'error': str(e)}\n\nenv = {\n    'try-call': try_call,\n    'report-pass': report_pass,\n    'report-fail': report_fail,\n    'push-suite': push_suite,\n    'pop-suite': pop_suite,\n}\n\nfor expr in parse_all(src):\n    _trampoline(_eval(expr, env))" "python")))
+        (div :class "rounded-lg border border-stone-200 bg-white p-5 space-y-3"
+          (h3 :class "text-sm font-medium text-stone-500 uppercase tracking-wide" "Output")
+          (~doc-code :code
+            (highlight "$ python shared/sx/tests/run.py\nTAP version 13\nok 1 - literals > numbers are numbers\n...\nok 81 - edge-cases > empty operations\n\n# tests 81\n# pass  81" "bash"))))

      ;; What it proves
      (div :class "rounded-lg border border-blue-200 bg-blue-50 p-5 space-y-3"
        (h2 :class "text-lg font-semibold text-blue-900" "What this proves")
        (ol :class "list-decimal list-inside text-blue-800 space-y-2 text-sm"
-          (li "The test spec is " (strong "written in SX") " — the same language it tests")
-          (li "The same 81 tests run on " (strong "both Python and JavaScript"))
-          (li "Both hosts produce " (strong "identical results") " from identical SX source")
-          (li "Adding a new host requires only a new bootstrapper — the tests are " (strong "already written"))
+          (li "The test spec is " (strong "written in SX") " and " (strong "executed by SX") " — no code generation")
+          (li "The same 81 tests run on " (strong "both Python and JavaScript") " from the same file")
+          (li "Each host provides only " (strong "5 platform functions") " — everything else is pure SX")
+          (li "Adding a new host means implementing 5 functions, not rewriting tests")
          (li "Platform divergences (truthiness of 0, [], \"\") are " (strong "documented, not hidden"))
          (li "The spec is " (strong "executable") " — it doesn't just describe behavior, it verifies it")))

@@ -188,7 +203,7 @@
        (h2 :class "text-2xl font-semibold text-stone-800" "Full specification source")
        (p :class "text-xs text-stone-400 italic"
          "The s-expression source below is the canonical test specification. "
-          "Bootstrap compilers read this file to generate native test runners.")
+          "Any host that implements the five platform functions can evaluate it directly.")
        (div :class "not-prose bg-stone-100 rounded-lg p-5 mx-auto max-w-3xl"
          (pre :class "text-sm leading-relaxed whitespace-pre-wrap break-words"
            (code (highlight spec-source "sx"))))))))