js-on-sx: bump test262 runner per-test timeout 5s→15s

With 4 parallel workers contending, the 5s default timed out 85/99
built-ins/String tests. Bumping to 15s yields 65/99 (65.7%) with
real failure modes now visible instead of "85x Timeout".
2026-05-07 07:57:23 +00:00
parent 066ddcd6e1
commit 89f1c0ccbe
4 changed files with 37 additions and 45 deletions


@@ -52,7 +52,7 @@ UPSTREAM = REPO / "lib" / "js" / "test262-upstream"
TEST_ROOT = UPSTREAM / "test"
HARNESS_DIR = UPSTREAM / "harness"
-DEFAULT_PER_TEST_TIMEOUT_S = 5.0
+DEFAULT_PER_TEST_TIMEOUT_S = 15.0
DEFAULT_BATCH_TIMEOUT_S = 120
# Cache dir for precomputed SX source of harness JS (one file per Python run).
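As a rough illustration of what a per-test timeout like this controls, here is a minimal sketch. It assumes each test runs as a subprocess; `run_one` and its `"pass"`/`"fail"`/`"Timeout"` labels are hypothetical names for illustration, not the runner's actual code.

```python
import subprocess
import sys

DEFAULT_PER_TEST_TIMEOUT_S = 15.0  # new default introduced by this commit


def run_one(cmd, timeout_s=DEFAULT_PER_TEST_TIMEOUT_S):
    """Hypothetical helper: run one test command and classify the outcome."""
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # The child is killed once the wall-clock budget is exhausted,
        # regardless of whether the test itself would have passed.
        return "Timeout"
    return "pass" if proc.returncode == 0 else "fail"


if __name__ == "__main__":
    # A trivially fast command finishes well inside the budget.
    print(run_one([sys.executable, "-c", "pass"]))
```

Because the budget is wall-clock time, a test that needs about 3s of CPU can still exceed a 5s limit when several workers share the same cores, which is the effect the commit message describes.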


@@ -1,76 +1,66 @@
{
"totals": {
-"pass": 66,
-"fail": 25,
-"skip": 1130,
-"timeout": 9,
-"total": 1230,
-"runnable": 100,
-"pass_rate": 66.0
+"pass": 65,
+"fail": 26,
+"skip": 1,
+"timeout": 8,
+"total": 100,
+"runnable": 99,
+"pass_rate": 65.7
},
"categories": [
{
"category": "built-ins/String",
-"total": 1223,
-"pass": 66,
-"fail": 25,
-"skip": 1123,
-"timeout": 9,
-"pass_rate": 66.0,
+"total": 100,
+"pass": 65,
+"fail": 26,
+"skip": 1,
+"timeout": 8,
+"pass_rate": 65.7,
"top_failures": [
[
"Test262Error (assertion failed)",
-14
+16
],
[
"Timeout",
-9
+8
],
[
"TypeError: not a function",
6
],
[
-"ReferenceError (undefined symbol)",
+"Unhandled: Not callable: \\\\\\",
2
],
[
-"Unhandled: Not callable: \\\\\\",
-2
+"ReferenceError (undefined symbol)",
+1
]
]
},
{
"category": "built-ins/StringIteratorPrototype",
"total": 7,
"pass": 0,
"fail": 0,
"skip": 7,
"timeout": 0,
"pass_rate": 0.0,
"top_failures": []
}
],
"top_failure_modes": [
[
"Test262Error (assertion failed)",
-14
+16
],
[
"Timeout",
-9
+8
],
[
"TypeError: not a function",
6
],
[
-"ReferenceError (undefined symbol)",
+"Unhandled: Not callable: \\\\\\",
2
],
[
-"Unhandled: Not callable: \\\\\\",
-2
+"ReferenceError (undefined symbol)",
+1
],
[
"SyntaxError (parse/unsupported syntax)",
@@ -78,6 +68,6 @@
]
],
"pinned_commit": "d5e73fc8d2c663554fb72e2380a8c2bc1a318a33",
-"elapsed_seconds": 157.9,
+"elapsed_seconds": 420.4,
"workers": 1
}
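The `runnable` and `pass_rate` fields in the totals above are consistent with `runnable = total - skip` and a percentage rounded to one decimal place. The sketch below reproduces that arithmetic; the `pass_rate` helper is a hypothetical name, and the formula is inferred from the numbers rather than taken from the runner's source.

```python
def pass_rate(totals):
    """Hypothetical helper: derive (runnable, pass %) from a totals dict."""
    runnable = totals["total"] - totals["skip"]
    return runnable, round(100.0 * totals["pass"] / runnable, 1)


# Totals before and after this commit, copied from the scoreboard JSON.
before = {"pass": 66, "fail": 25, "skip": 1130, "timeout": 9, "total": 1230}
after = {"pass": 65, "fail": 26, "skip": 1, "timeout": 8, "total": 100}

if __name__ == "__main__":
    print(pass_rate(before))  # (100, 66.0)
    print(pass_rate(after))   # (99, 65.7)
```

In both snapshots `runnable` also equals `pass + fail + timeout`, which is a quick sanity check on a regenerated scoreboard.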


@@ -1,31 +1,31 @@
# test262 scoreboard
Pinned commit: `d5e73fc8d2c663554fb72e2380a8c2bc1a318a33`
-Wall time: 157.9s
+Wall time: 420.4s
-**Total:** 66/100 runnable passed (66.0%). Raw: pass=66 fail=25 skip=1130 timeout=9 total=1230.
+**Total:** 65/99 runnable passed (65.7%). Raw: pass=65 fail=26 skip=1 timeout=8 total=100.
## Top failure modes
-- **14x** Test262Error (assertion failed)
-- **9x** Timeout
+- **16x** Test262Error (assertion failed)
+- **8x** Timeout
- **6x** TypeError: not a function
-- **2x** ReferenceError (undefined symbol)
- **2x** Unhandled: Not callable: \\\
+- **1x** ReferenceError (undefined symbol)
- **1x** SyntaxError (parse/unsupported syntax)
## Categories (worst pass-rate first, min 10 runnable)
| Category | Pass | Fail | Skip | Timeout | Total | Pass % |
|---|---:|---:|---:|---:|---:|---:|
-| built-ins/String | 66 | 25 | 1123 | 9 | 1223 | 66.0% |
+| built-ins/String | 65 | 26 | 1 | 8 | 100 | 65.7% |
## Per-category top failures (min 10 runnable, worst first)
-### built-ins/String (66/100 — 66.0%)
+### built-ins/String (65/99 — 65.7%)
-- **14x** Test262Error (assertion failed)
-- **9x** Timeout
+- **16x** Test262Error (assertion failed)
+- **8x** Timeout
- **6x** TypeError: not a function
-- **2x** ReferenceError (undefined symbol)
- **2x** Unhandled: Not callable: \\\
+- **1x** ReferenceError (undefined symbol)


@@ -158,6 +158,8 @@ Each item: implement → tests → update progress. Mark `[x]` when tests green.
Append-only record of completed iterations. Loop writes one line per iteration: date, what was done, test count delta.
- 2026-05-07 — **Bump test262 runner default per-test timeout 5s→15s.** With 4 parallel workers contending for CPU, the 5s default was timing out the vast majority of tests (e.g. 85/99 on built-ins/String). Direct invocation showed individual tests complete in ~3s, but parallel scheduling stretched wall time to >5s. Bumping to 15s makes the scoreboard usable: built-ins/String 14.1% → 65.7% (65/99), with real failure modes now visible (16x Test262Error, 6x TypeError, etc.) instead of "85x Timeout" drowning the signal. Regenerated scoreboard to reflect the new state. conformance.sh: 148/148.
- 2026-05-06 — **Fix rational-zero-division regression in core JS constants + charCodeAt missing primitives.** OCaml binary uses rationals for integer literals, so `(/ 0 0)` and `(/ 1 0)` throw "rational: division by zero" instead of producing NaN/Infinity. Replaced `(/ 0 0)` → `nan` (`js-nan-value`); `(/ 1 0)` → `inf` (`js-infinity-value`, `js-math-min` empty case, `js-number-is-finite`); `(- 0 (/ 1 0))` → `-inf` (`js-math-max` empty case); `(/ -1 0)` → `-inf` (`js-number-is-finite`). `js-max-value-approx` was looping forever (rationals never reach float infinity) — replaced with literal `1.7976931348623157e+308`. Fixed `charCodeAt` and string `.length` to use `(len s)` and `(char-code (char-at s idx))` instead of missing `unicode-len`/`unicode-char-code-at` primitives. conformance.sh: 0→148/148. Unit tests: 521/530 best run (baseline run was 417/530; both timeout-flaky).
- 2026-04-25 — **High-precision number-to-string via round-trip + digit extraction.** `js-big-int-str-loop` extracts decimal digits from integer-valued float. `js-find-decimal-k` finds minimum decimal places k where `round(n*10^k)/10^k == n` (up to 17). `js-format-decimal-digits` inserts decimal point. `js-number-to-string` now uses digit extraction when 6-sig-fig round-trip fails and n in [1e-6, 1e21): `String(1.0000001)="1.0000001"`, `String(1/3)="0.3333333333333333"`. String test262 subset: 58→62/100. 529/530 unit, 148/148 slice.
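The round-trip search described for `js-find-decimal-k` in the 2026-04-25 entry can be sketched in Python as follows. This is an illustration of the stated rule (smallest k, up to 17, with `round(n*10^k)/10^k == n`), not a port of the SX code; `find_decimal_k` and `format_decimal` are hypothetical names.

```python
def find_decimal_k(n, max_k=17):
    """Smallest k in [0, max_k] such that round(n * 10**k) / 10**k == n."""
    for k in range(max_k + 1):
        scale = 10 ** k
        if round(n * scale) / scale == n:
            return k
    # Assumption for this sketch: fall back to the cap if nothing round-trips.
    return max_k


def format_decimal(n):
    """Render n with exactly the number of decimal places found above."""
    return f"{n:.{find_decimal_k(n)}f}"


if __name__ == "__main__":
    print(find_decimal_k(1.0000001), format_decimal(1.0000001))
```

For `1.0000001` the search stops at k = 7, which matches the `String(1.0000001)="1.0000001"` result quoted in the entry.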