Make test.sx self-executing: evaluators run it directly, no codegen

test.sx now defines deftest/defsuite as macros. Any host that provides
5 platform functions (try-call, report-pass, report-fail, push-suite,
pop-suite) can evaluate the file directly — no bootstrap compilation
step needed for JS.

- Added defmacro for deftest (wraps body in thunk, catches via try-call)
- Added defmacro for defsuite (push/pop suite context stack)
- Created run.js: sx-browser.js evaluates test.sx directly (81/81 pass)
- Created run.py: Python evaluator evaluates test.sx directly (81/81 pass)
- Deleted bootstrap_test_js.py and generated test_sx_spec.js
- Updated testing docs page to reflect self-executing architecture

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
2026-03-07 10:50:28 +00:00
parent 754e7557f5
commit e9d86d628b
7 changed files with 299 additions and 508 deletions

View File

@@ -7,49 +7,69 @@
;; Intro
(div :class "space-y-4"
(p :class "text-lg text-stone-600"
"SX tests itself. The test spec is written in SX and defines a complete test framework — assertion helpers, test suites, and 81 test cases covering every language feature. Bootstrap compilers read "
"SX tests itself. "
(code :class "text-violet-700 text-sm" "test.sx")
" and emit native test runners for each host platform.")
" is a self-executing test spec — it defines "
(code :class "text-violet-700 text-sm" "deftest")
" and "
(code :class "text-violet-700 text-sm" "defsuite")
" as macros, writes 81 test cases, and runs them. Any host that provides five platform functions can evaluate the file directly.")
(p :class "text-stone-600"
"This is not a test "
(em "of") " SX — it is a test " (em "in") " SX. The same s-expressions that define how "
(code :class "text-violet-700 text-sm" "if")
" works are used to verify that "
(code :class "text-violet-700 text-sm" "if")
" works. The language proves its own correctness, on every platform it compiles to."))
" works. No code generation, no intermediate files — the evaluator runs the spec."))
;; How it works
(div :class "space-y-3"
(h2 :class "text-2xl font-semibold text-stone-800" "Architecture")
(p :class "text-stone-600"
"The test framework needs five platform functions. Everything else — macros, assertion helpers, test suites — is pure SX:")
(div :class "not-prose bg-stone-100 rounded-lg p-5 mx-auto max-w-3xl"
(pre :class "text-sm leading-relaxed whitespace-pre-wrap break-words font-mono text-stone-700"
"test.sx The spec: assertion helpers + 15 suites + 81 tests
"test.sx Self-executing: macros + helpers + 81 tests
|
|--- bootstrap_test.py Reads test.sx, emits pytest module
|--- run.js Injects 5 platform fns, evaluates test.sx
| |
| +-> test_sx_spec.py 81 pytest test cases
| |
| +-> shared/sx/evaluator.py (Python SX evaluator)
| +-> sx-browser.js JS evaluator (bootstrapped from spec)
|
|--- bootstrap_test_js.py Reads test.sx, emits Node.js TAP script
|--- run.py Injects 5 platform fns, evaluates test.sx
|
+-> test_sx_spec.js 81 TAP test cases
|
+-> sx-browser.js (JS SX evaluator)")))
+-> evaluator.py Python evaluator
Platform functions:
try-call (thunk) -> {:ok true} | {:ok false :error \"msg\"}
report-pass (name) -> output pass
report-fail (name error) -> output fail
push-suite (name) -> push suite context
pop-suite () -> pop suite context")))
;; Framework
(div :class "space-y-3"
(h2 :class "text-2xl font-semibold text-stone-800" "The test framework")
(p :class "text-stone-600"
"The framework defines two declarative forms and nine assertion helpers, all in pure SX:")
"The framework defines two macros and nine assertion helpers, all in SX. The macros are the key — they make "
(code :class "text-violet-700 text-sm" "defsuite")
" and "
(code :class "text-violet-700 text-sm" "deftest")
" executable forms, not just declarations:")
(div :class "rounded-lg border border-stone-200 bg-white p-5 space-y-3"
(h3 :class "text-sm font-medium text-stone-500 uppercase tracking-wide" "Forms")
(h3 :class "text-sm font-medium text-stone-500 uppercase tracking-wide" "Macros")
(~doc-code :code
(highlight "(defsuite \"name\" ...tests)\n(deftest \"name\" ...body)" "lisp")))
(highlight "(defmacro deftest (name &rest body)\n `(let ((result (try-call (fn () ,@body))))\n (if (get result \"ok\")\n (report-pass ,name)\n (report-fail ,name (get result \"error\")))))\n\n(defmacro defsuite (name &rest items)\n `(do (push-suite ,name)\n ,@items\n (pop-suite)))" "lisp")))
(p :class "text-stone-600 text-sm"
(code :class "text-violet-700 text-sm" "deftest")
" wraps the body in a thunk, passes it to "
(code :class "text-violet-700 text-sm" "try-call")
" (the one platform function that catches errors), then reports pass or fail. "
(code :class "text-violet-700 text-sm" "defsuite")
" pushes a name onto the context stack, runs its children, and pops.")
(div :class "rounded-lg border border-stone-200 bg-white p-5 space-y-3"
(h3 :class "text-sm font-medium text-stone-500 uppercase tracking-wide" "Assertion helpers")
(~doc-code :code
(highlight "(define assert-equal\n (fn (expected actual)\n (assert (equal? expected actual)\n (str \"Expected \" (str expected) \" but got \" (str actual)))))\n\n(define assert-true (fn (val) (assert val ...)))\n(define assert-false (fn (val) (assert (not val) ...)))\n(define assert-nil (fn (val) (assert (nil? val) ...)))\n(define assert-type (fn (expected-type val) ...))\n(define assert-length (fn (expected-len col) ...))\n(define assert-contains (fn (item col) ...))" "lisp"))))
(highlight "(define assert-equal\n (fn (expected actual)\n (assert (equal? expected actual)\n (str \"Expected \" (str expected) \" but got \" (str actual)))))\n\n(define assert-true (fn (val) (assert val ...)))\n(define assert-false (fn (val) (assert (not val) ...)))\n(define assert-nil (fn (val) (assert (nil? val) ...)))\n(define assert-type (fn (expected-type val) ...))\n(define assert-length (fn (expected-len col) ...))\n(define assert-contains (fn (item col) ...))\n(define assert-throws (fn (thunk) ...))" "lisp"))))
;; Example tests
(div :class "space-y-3"
@@ -61,54 +81,49 @@
(~doc-code :code
(highlight "(defsuite \"arithmetic\"\n (deftest \"addition\"\n (assert-equal 3 (+ 1 2))\n (assert-equal 0 (+ 0 0))\n (assert-equal -1 (+ 1 -2))\n (assert-equal 10 (+ 1 2 3 4)))\n\n (deftest \"subtraction\"\n (assert-equal 1 (- 3 2))\n (assert-equal -1 (- 2 3)))\n\n (deftest \"multiplication\"\n (assert-equal 6 (* 2 3))\n (assert-equal 0 (* 0 100))\n (assert-equal 24 (* 1 2 3 4)))\n\n (deftest \"division\"\n (assert-equal 2 (/ 6 3))\n (assert-equal 2.5 (/ 5 2)))\n\n (deftest \"modulo\"\n (assert-equal 1 (mod 7 3))\n (assert-equal 0 (mod 6 3))))" "lisp"))))
;; Bootstrapping to Python
;; Running tests — JS
(div :class "space-y-3"
(h2 :class "text-2xl font-semibold text-stone-800" "Bootstrap to Python")
(h2 :class "text-2xl font-semibold text-stone-800" "JavaScript: direct evaluation")
(p :class "text-stone-600"
(code :class "text-violet-700 text-sm" "bootstrap_test.py")
" parses "
(code :class "text-violet-700 text-sm" "test.sx")
", extracts the "
(code :class "text-violet-700 text-sm" "define")
" forms (assertion helpers) as a preamble, then emits each "
(code :class "text-violet-700 text-sm" "defsuite")
" as a pytest class and each "
(code :class "text-violet-700 text-sm" "deftest")
" as a test method.")
(div :class "rounded-lg border border-stone-200 bg-white p-5 space-y-3"
(h3 :class "text-sm font-medium text-stone-500 uppercase tracking-wide" "Generated Python")
(~doc-code :code
(highlight "# Auto-generated from test.sx\nfrom shared.sx.parser import parse_all\nfrom shared.sx.evaluator import _eval, _trampoline\n\ndef _make_env():\n env = {}\n for expr in parse_all(_PREAMBLE):\n _trampoline(_eval(expr, env))\n return env\n\ndef _run(sx_source, env=None):\n if env is None:\n env = _make_env()\n exprs = parse_all(sx_source)\n for expr in exprs:\n result = _trampoline(_eval(expr, env))\n return result\n\nclass TestSpecArithmetic:\n def test_addition(self):\n _run('(do (assert-equal 3 (+ 1 2)) ...)')\n\n def test_subtraction(self):\n _run('(do (assert-equal 1 (- 3 2)) ...)')" "python")))
(div :class "rounded-lg border border-stone-200 bg-white p-5 space-y-3"
(h3 :class "text-sm font-medium text-stone-500 uppercase tracking-wide" "Run it")
(~doc-code :code
(highlight "$ python bootstrap_test.py --output test_sx_spec.py\nParsed 15 suites, 9 preamble defines from test.sx\nTotal test cases: 81\n\n$ pytest test_sx_spec.py -v\n81 passed in 0.15s" "bash"))))
;; Bootstrapping to JavaScript
(div :class "space-y-3"
(h2 :class "text-2xl font-semibold text-stone-800" "Bootstrap to JavaScript")
(p :class "text-stone-600"
(code :class "text-violet-700 text-sm" "bootstrap_test_js.py")
" emits a standalone Node.js script that loads "
(code :class "text-violet-700 text-sm" "sx-browser.js")
" (bootstrapped from the spec), injects any platform-specific primitives into the test env, then runs all 81 tests with TAP output.")
" evaluates "
(code :class "text-violet-700 text-sm" "test.sx")
" directly. The runner injects platform functions and calls "
(code :class "text-violet-700 text-sm" "Sx.eval")
" on each parsed expression:")
(div :class "rounded-lg border border-stone-200 bg-white p-5 space-y-3"
(h3 :class "text-sm font-medium text-stone-500 uppercase tracking-wide" "Generated JavaScript")
(h3 :class "text-sm font-medium text-stone-500 uppercase tracking-wide" "run.js")
(~doc-code :code
(highlight "// Auto-generated from test.sx\nvar Sx = require('./sx-browser.js');\n\nvar _envPrimitives = {\n 'equal?': function(a, b) { return deepEqual(a, b); },\n 'boolean?': function(x) { return typeof x === 'boolean'; },\n // ... platform primitives injected into env\n};\n\nfunction _makeEnv() {\n var env = {};\n for (var k in _envPrimitives) env[k] = _envPrimitives[k];\n var exprs = Sx.parseAll(PREAMBLE);\n for (var i = 0; i < exprs.length; i++) Sx.eval(exprs[i], env);\n return env;\n}\n\nfunction _run(name, sxSource) {\n var env = _makeEnv();\n var exprs = Sx.parseAll(sxSource);\n for (var i = 0; i < exprs.length; i++) Sx.eval(exprs[i], env);\n}" "javascript")))
(highlight "var Sx = require('./sx-browser.js');\nvar src = fs.readFileSync('test.sx', 'utf8');\n\nvar env = {\n 'try-call': function(thunk) {\n try {\n Sx.eval([thunk], env); // call the SX lambda\n return { ok: true };\n } catch(e) {\n return { ok: false, error: e.message };\n }\n },\n 'report-pass': function(name) { console.log('ok - ' + name); },\n 'report-fail': function(name, err) { console.log('not ok - ' + name); },\n 'push-suite': function(n) { stack.push(n); },\n 'pop-suite': function() { stack.pop(); },\n};\n\nvar exprs = Sx.parseAll(src);\nfor (var i = 0; i < exprs.length; i++) Sx.eval(exprs[i], env);" "javascript")))
(div :class "rounded-lg border border-stone-200 bg-white p-5 space-y-3"
(h3 :class "text-sm font-medium text-stone-500 uppercase tracking-wide" "Run it")
(h3 :class "text-sm font-medium text-stone-500 uppercase tracking-wide" "Output")
(~doc-code :code
(highlight "$ python bootstrap_test_js.py --output test_sx_spec.js\nParsed 15 suites, 9 preamble defines from test.sx\nTotal test cases: 81\n\n$ node test_sx_spec.js\nTAP version 13\n1..81\nok 1 - literals > numbers are numbers\nok 2 - literals > strings are strings\n...\nok 81 - edge-cases > empty operations\n\n# tests 81\n# pass 81" "bash"))))
(highlight "$ node shared/sx/tests/run.js\nTAP version 13\nok 1 - literals > numbers are numbers\nok 2 - literals > strings are strings\n...\nok 81 - edge-cases > empty operations\n\n# tests 81\n# pass 81" "bash"))))
;; Running tests — Python
(div :class "space-y-3"
(h2 :class "text-2xl font-semibold text-stone-800" "Python: direct evaluation")
(p :class "text-stone-600"
"Same approach — the Python evaluator runs "
(code :class "text-violet-700 text-sm" "test.sx")
" directly:")
(div :class "rounded-lg border border-stone-200 bg-white p-5 space-y-3"
(h3 :class "text-sm font-medium text-stone-500 uppercase tracking-wide" "run.py")
(~doc-code :code
(highlight "from shared.sx.parser import parse_all\nfrom shared.sx.evaluator import _eval, _trampoline\n\ndef try_call(thunk):\n try:\n _trampoline(_eval([thunk], {}))\n return {'ok': True}\n except Exception as e:\n return {'ok': False, 'error': str(e)}\n\nenv = {\n 'try-call': try_call,\n 'report-pass': report_pass,\n 'report-fail': report_fail,\n 'push-suite': push_suite,\n 'pop-suite': pop_suite,\n}\n\nfor expr in parse_all(src):\n _trampoline(_eval(expr, env))" "python")))
(div :class "rounded-lg border border-stone-200 bg-white p-5 space-y-3"
(h3 :class "text-sm font-medium text-stone-500 uppercase tracking-wide" "Output")
(~doc-code :code
(highlight "$ python shared/sx/tests/run.py\nTAP version 13\nok 1 - literals > numbers are numbers\n...\nok 81 - edge-cases > empty operations\n\n# tests 81\n# pass 81" "bash"))))
;; What it proves
(div :class "rounded-lg border border-blue-200 bg-blue-50 p-5 space-y-3"
(h2 :class "text-lg font-semibold text-blue-900" "What this proves")
(ol :class "list-decimal list-inside text-blue-800 space-y-2 text-sm"
(li "The test spec is " (strong "written in SX") " — the same language it tests")
(li "The same 81 tests run on " (strong "both Python and JavaScript"))
(li "Both hosts produce " (strong "identical results") " from identical SX source")
(li "Adding a new host requires only a new bootstrapper — the tests are " (strong "already written"))
(li "The test spec is " (strong "written in SX") " and " (strong "executed by SX") " — no code generation")
(li "The same 81 tests run on " (strong "both Python and JavaScript") " from the same file")
(li "Each host provides only " (strong "5 platform functions") " — everything else is pure SX")
(li "Adding a new host means implementing 5 functions, not rewriting tests")
(li "Platform divergences (truthiness of 0, [], \"\") are " (strong "documented, not hidden"))
(li "The spec is " (strong "executable") " — it doesn't just describe behavior, it verifies it")))
@@ -188,7 +203,7 @@
(h2 :class "text-2xl font-semibold text-stone-800" "Full specification source")
(p :class "text-xs text-stone-400 italic"
"The s-expression source below is the canonical test specification. "
"Bootstrap compilers read this file to generate native test runners.")
"Any host that implements the five platform functions can evaluate it directly.")
(div :class "not-prose bg-stone-100 rounded-lg p-5 mx-auto max-w-3xl"
(pre :class "text-sm leading-relaxed whitespace-pre-wrap break-words"
(code (highlight spec-source "sx"))))))))