host: SX-native HTML→SX converter (the radar migrator) + first-class HTML import

lib/host/htmlsx.sx — a pure-SX HTML → SX converter (char-level tokenizer + stack parser):
host/html->sx turns a post's HTML into an (article …) tree that host/blog--decompose! consumes
— img / p / figure+figcaption / iframe / headings / blockquote / lists, inline strong/em/a kept
nested (decompose flattens to text), entities decoded to UTF-8, comments+doctype skipped. This
replaces the one-off external Python converter used for the nt-live-encore import.

import-post! now accepts a raw "html" field (converted via html->sx, serialized to sx_content,
decomposed) alongside "sx_content" — so importing real Ghost HTML is first-class. Wired
htmlsx.sx into conformance.sh + serve.sh module lists (loads in conformance AND live).

New htmlsx suite 8/8 (text/entities/void/nested/figure/iframe/comments + an html→sx→decompose→
typed-cards round-trip); blog 197/197 (+ import-from-html test).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This commit is contained in:
2026-07-01 15:32:06 +00:00
parent a99e64b661
commit 7e2275b90c
6 changed files with 205 additions and 6 deletions

View File

@@ -94,6 +94,7 @@ MODULES=(
"lib/host/relations.sx"
"lib/host/compose.sx"
"lib/host/execute.sx"
"lib/host/htmlsx.sx"
"lib/host/blog.sx"
"lib/host/page.sx"
"lib/host/server.sx"
@@ -109,6 +110,7 @@ SUITES=(
"feed host-fd-tests-run! lib/host/tests/feed.sx"
"relations host-rl-tests-run! lib/host/tests/relations.sx"
"blog host-bl-tests-run! lib/host/tests/blog.sx"
"htmlsx host-ht-tests-run! lib/host/tests/htmlsx.sx"
"compose host-cp-tests-run! lib/host/tests/compose.sx"
"execute host-ex-tests-run! lib/host/tests/execute.sx"
"session host-se-tests-run! lib/host/tests/session.sx"