fix: cek_run propagates IO suspension via _cek_io_suspend_hook
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 1m52s
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 1m52s
When a `perform` fired inside a tree-walked eval_expr path — sf_letrec init
exprs / non-last body exprs, expand_macro body, qq_expand unquote,
sf_dynamic_wind / sf_scope / sf_provide bodies — cek_run raised
"IO suspension in non-IO context" and swallowed the suspension. The hook
that converts the CEK suspended state to VmSuspended (so the outer driver
sees it as a resumable suspension object) was defined in sx_vm.ml but
never invoked from cek_run.
Repro in Node.js (hosts/ocaml/browser/test_letrec_resume.js):
(letrec ((x (perform {:op "io"}))) "ok") ;; threw the error
(letrec ((x 1)) (perform {:op "io"}) "after") ;; threw the error
The originally reported browser symptom — "[sx] resume: Not callable: nil"
after hs-wait resumes inside a letrec — was the same root cause showing
through the JIT/VM resume path instead of as a top-level error.
Fix: cek_run and cek_run_iterative now check !_cek_io_suspend_hook and
invoke it when the loop terminates in a suspended state. The hook (set by
sx_vm.ml in the browser, by run_tests.ml in the test runner) converts the
suspension to VmSuspended / resolves IO synchronously. When the hook is
unset (pure-CEK harness), the legacy Eval_error is raised so misuse stays
visible.
Also patches:
- hosts/ocaml/bootstrap.py — regex-patches the transpiled cek_run on regen
so the fix survives a fresh `python3 hosts/ocaml/bootstrap.py` cycle.
- hosts/ocaml/browser/sx_browser.ml — api_eval / api_eval_vm / api_eval_expr
now catch VmSuspended and surface a clean error string (K.eval has no
driver to resume; callers who want resumption use callFn).
Tests:
- spec/tests/test-letrec-resume-treewalk.sx — 7 CEK-level regression tests
covering letrec init / non-last body, scope/provide bodies, sibling
fn-after-perform. All 7 fail in baseline ("IO suspension in non-IO
context"), all 7 pass with the fix.
- hosts/ocaml/browser/test_letrec_resume.js — 13 WASM kernel tests via
callFn driveSync, including the wait-boot pattern from the briefing.
All 13 pass.
Suite results: 4557 pass / 1338 fail (was 4550 / 1339); +7 new passes,
-1 flaky timeout (hs-upstream-if sieve), no regressions.
This commit is contained in:
@@ -82,7 +82,10 @@ let cek_run_iterative state =
|
||||
s := cek_step !s
|
||||
done;
|
||||
(match cek_suspended_p !s with
|
||||
| Bool true -> raise (Eval_error "IO suspension in non-IO context")
|
||||
| Bool true ->
|
||||
(match !_cek_io_suspend_hook with
|
||||
| Some hook -> hook !s
|
||||
| None -> raise (Eval_error "IO suspension in non-IO context"))
|
||||
| _ -> cek_value !s)
|
||||
with Eval_error msg ->
|
||||
_last_error_kont_ref := cek_kont !s;
|
||||
@@ -308,6 +311,23 @@ def compile_spec_to_ml(spec_dir: str | None = None) -> str:
|
||||
output
|
||||
)
|
||||
|
||||
# Patch transpiled cek_run to invoke _cek_io_suspend_hook on suspension
|
||||
# instead of unconditionally raising Eval_error. This is the fix for the
|
||||
# tree-walk eval_expr path: sf_letrec init exprs / non-last body exprs,
|
||||
# macro bodies, qq_expand, dynamic-wind / scope / provide bodies all use
|
||||
# `trampoline (eval_expr ...)` and were swallowing CEK suspensions as
|
||||
# "IO suspension in non-IO context" errors. With the hook, the suspension
|
||||
# propagates as VmSuspended to the outer driver (browser callFn / server
|
||||
# eval_expr_io). When the hook is unset (pure-CEK harness), the legacy
|
||||
# error is preserved as the fallback.
|
||||
output = re.sub(
|
||||
r'\(raise \(Eval_error \(value_to_str \(String "IO suspension in non-IO context"\)\)\)\)',
|
||||
'(match !_cek_io_suspend_hook with Some hook -> hook final | None -> '
|
||||
'(raise (Eval_error (value_to_str (String "IO suspension in non-IO context")))))',
|
||||
output,
|
||||
count=1,
|
||||
)
|
||||
|
||||
return output
|
||||
|
||||
|
||||
|
||||
Reference in New Issue
Block a user