rose-ash/lib/blogimport/import.sx
giles a4d93c61cc
blogimport: lexical->persist genesis-import + at-rest parity verifier (55/55)
Implements plans/migration/data-migration.md (the un-started long-pole) and the
data-layer half of slice-01-blog §4. Host-ops migration module composing
content-on-sx + persist public APIs; isolated from lib/host and lib/content.

- lexical.sx: Ghost lexical (as SX dicts) -> content block list, deterministic ids
- import.sx: genesis import into content:<id> op-log, idempotent, + postmeta stream
- verify.sx: replay-and-diff vs row-derived oracle (proves round-trip lossless)

Inline formatting flattens to plain text (Phase-5 runs swap-point isolated in
lex-inline-text); live Postgres source (Q-M4) + improved-converter re-import (Q-M5)
flagged in README. 55/55 conformance: lexical 23, import 21, verify 11.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-30 13:14:30 +00:00

; lib/blogimport/import.sx
; Genesis import: a blog Post row -> a persist content op-log stream.
;
; Per plans/migration/data-migration.md §3-5: for each Post, convert its lexical
; body to content blocks and commit them as genesis insert ops into the
; content:<id> stream, idempotently, with post metadata recorded as an event in a
; sibling stream. The same code runs on mem and durable persist backends (every fn
; takes the backend `b`, the acl.sx design principle).
;
; A `post` is a dict mirroring the blog Post row:
; {:id "uuid" :slug "hello" :title "Hello" :status "published"
; :visibility "public" :tags (list "a") :authors (list "u1")
; :lexical <lexical-doc-as-sx-dict>}
; Reading real rows (internal-data query vs direct Postgres, Q-M4) is the live-source
; edge and is out of scope here; this module drives content/commit-all! given a
; `post` dict.
; --- genesis ops: insert each block in document order (deterministic) -----------
; first block after nil (prepend), each subsequent after the previous block's id,
; reproducing source order so re-import yields the same sequence (data-migration §5).
(define
blogimport/genesis-ops
(fn (blocks)
(let ((ids (map blk-id blocks)))
(map-indexed
(fn (i blk) (op-insert blk (if (= i 0) nil (nth ids (- i 1)))))
blocks))))
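; Sketch of the chaining above, assuming three hypothetical blocks A, B, C whose
; blk-id values are "a", "b", "c" (the names are illustrative, not from the source):
;   (blogimport/genesis-ops (list A B C))
;   ; builds, in order:
;   ;   (op-insert A nil)   first block, inserted after nil (prepend)
;   ;   (op-insert B "a")   inserted after A's id
;   ;   (op-insert C "b")   inserted after B's id
; The result depends only on the block order and ids, so the same input list
; produces the same op sequence on every run.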
; --- post metadata (title/slug/status/visibility/tags/authors) ------------------
(define
blogimport/post-meta
(fn (post)
{:title (or (get post :title) "")
:slug (or (get post :slug) "")
:status (or (get post :status) "")
:visibility (or (get post :visibility) "")
:tags (or (get post :tags) (list))
:authors (or (get post :authors) (list))}))
; metadata is not a content op, so it rides a sibling event stream postmeta:<id>;
; latest event wins (LWW). Replayable + durable like the block op-log.
(define blogimport/meta-stream (fn (id) (str "postmeta:" id)))
(define
blogimport/commit-meta!
(fn (b id meta at)
(persist/append b (blogimport/meta-stream id) "post-meta" at meta)))
(define
blogimport/load-meta
(fn (b id)
(let ((evs (persist/read b (blogimport/meta-stream id))))
(if (= (len evs) 0) nil (persist/event-data (nth evs (- (len evs) 1)))))))
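; Sketch of the LWW read, assuming persist/read returns events in append order and
; persist/event-data returns the dict passed to persist/append. The id "p1", the
; `at` values 1 and 2, and the dicts are illustrative only:
;   (blogimport/commit-meta! b "p1" {:title "Draft"} 1)
;   (blogimport/commit-meta! b "p1" {:title "Final"} 2)
;   (blogimport/load-meta b "p1")   ; should yield the later dict, {:title "Final"}
;   (blogimport/load-meta b "p2")   ; nil, because postmeta:p2 holds no events
; Under the import-once rule below, import-post! writes a single meta event per
; post, so the last-event read only matters once later metadata events are appended.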
; --- idempotency: a stream already holding events is already imported -----------
; (host-persist guarantees monotonic seq but NOT dedupe — skip-if-exists is the
; importer's dedupe, so re-running the backfill never double-imports. data-migration
; §5.) Re-import with an improved converter (Q-M5) is future work — superseding,
; not duplicating; this build is import-once.
(define
blogimport/imported?
(fn (b id) (> (content/version-count b id) 0)))
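; --- import one post ------------------------------------------------------------
; Returns {:id id :imported true :blocks n} on a fresh import, or
; {:id id :imported false :reason "exists"} when the content stream already has
; versions. Blocks are committed first, then the metadata event.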
(define
blogimport/import-post!
(fn (b post at)
(let ((id (get post :id)))
(if
(blogimport/imported? b id)
{:id id :imported false :reason "exists"}
(let ((blocks (blogimport/lex-blocks (get post :lexical))))
(begin
(content/commit-all! b id (blogimport/genesis-ops blocks) at)
(blogimport/commit-meta! b id (blogimport/post-meta post) at)
{:id id :imported true :blocks (len blocks)}))))))
; --- import many: coverage scoreboard -------------------------------------------
(define
blogimport/import-all!
(fn (b posts at)
(let ((results (map (fn (p) (blogimport/import-post! b p at)) posts)))
{:total (len results)
:imported (len (filter (fn (r) (get r :imported)) results))
:skipped (len (filter (fn (r) (not (get r :imported))) results))})))
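; Sketch of the scoreboard on a re-run, assuming `posts` holds two hypothetical
; posts with distinct ids and at least one block each, so that the first commit
; leaves content/version-count above 0 (`t0` and `t1` are illustrative `at` values):
;   (blogimport/import-all! b posts t0)   ; fresh backend: {:total 2 :imported 2 :skipped 0}
;   (blogimport/import-all! b posts t1)   ; same backend:  {:total 2 :imported 0 :skipped 2}
; The second call skips both posts because blogimport/imported? finds their
; content streams already populated.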