rose-ash/plans/blogimport-pickup.md
giles 8f8688805e host: stage lib/blogimport pickup — persist-backed blog content (Phase 4)
Staged cross-loop hand-off (not started here): when the cards-as-types work lands, swap
host/blog-lookup's in-memory registry for content/head over content:<id> streams
populated by lib/blogimport (merged to local architecture a746b6ab, 76/76). Adds a
Phase 4 checklist item + plans/blogimport-pickup.md with concrete steps (merge
architecture, apply blog-side published-posts draft, inject fetch_data as fetch-fn,
backfill, swap lookup, sync-verify parity gate).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-30 14:57:24 +00:00


Staged pickup — persist-backed blog content via lib/blogimport

Staged for the host loop (2026-06-30) by the migration/blogimport work. Pick this up after the cards-as-types work lands — it's the data half that makes the live blog read endpoint serve real posts instead of the in-memory registry.

What's ready

lib/blogimport is merged into local architecture (a746b6ab, 76/76 conformance: lexical 23, import 21, verify 11, source 20/21). It is the blog Postgres→persist data-migration tooling (plans/migration/data-migration.md, Q-M4 resolved):

  • blogimport/lex-blocks doc — Ghost lexical (as SX dicts) → content-on-sx block list.
  • blogimport/import-post! b post at / import-all! — genesis import into the content:<id> op-log (idempotent) + metadata in postmeta:<id>.
  • blogimport/verify-post|verify-all — replay-and-diff parity check at rest.
  • blogimport/backfill! b fetch-fn at / sync-verify b fetch-fn — live source via an injected fetch-fn (Q-M4 = internal-data query).
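The import and verify behaviour listed above can be modelled in a few lines. This is a minimal Python sketch, not the SX implementation: `Backend`, `convert`, `import_post`, `head` and `verify_post` are hypothetical stand-ins, and the op and row shapes are assumptions chosen only to show the skip-if-exists import and the replay-and-diff check.

```python
from dataclasses import dataclass, field


@dataclass
class Backend:
    """Hypothetical in-memory stand-in for the persist backend."""
    streams: dict = field(default_factory=dict)  # stream name -> list of ops
    meta: dict = field(default_factory=dict)     # stream name -> metadata dict


def convert(lexical_doc):
    """Toy converter: one block per top-level lexical node (assumed shape)."""
    return [{"kind": node["type"], "text": node.get("text", "")}
            for node in lexical_doc["children"]]


def import_post(b, post, at):
    """Genesis import: write the op-log once, skip if the stream exists."""
    name = f"content:{post['id']}"
    if name in b.streams:  # skip-if-exists keeps repeated imports idempotent
        return "skipped"
    b.streams[name] = [
        {"op": "genesis", "at": at, "blocks": convert(post["lexical"])}
    ]
    b.meta[f"postmeta:{post['id']}"] = {"slug": post["slug"]}
    return "imported"


def head(b, post_id):
    """Replay the op-log; with only a genesis op, the head is its block list."""
    blocks = []
    for op in b.streams[f"content:{post_id}"]:
        if op["op"] == "genesis":
            blocks = op["blocks"]
    return blocks


def verify_post(b, post):
    """Replay-and-diff: the stored head must equal a fresh conversion."""
    return head(b, post["id"]) == convert(post["lexical"])
```

Running `import_post` twice on the same post leaves a single stream with a single op, which is why a backfill can be re-run safely.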

To get it here: this worktree (loops/host) is behind local architecture. Running git merge architecture brings lib/blogimport (and the rest of the backlog) in. No origin push is involved.

The exact seam in this codebase

Phase 4's blog endpoint (lib/host/blog.sx, GET /<slug>/) renders a CtDoc via content/html, but host/blog-lookup is an in-memory slug→doc registry (the plan already says "swap for a persist-backed content stream later, handler/route unchanged"). lib/blogimport populates exactly those streams. The pickup is that swap.
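The shape of that swap can be illustrated in Python. This is a sketch under assumptions, not the host code: `make_memory_lookup`, `make_persist_lookup`, `render` and `handle_get` are hypothetical names, and `content_head` stands in for whatever returns the head of a content:<id> stream. The point is that the handler depends only on a slug-to-doc function, so replacing the registry does not touch it.

```python
def make_memory_lookup(registry):
    """Current behaviour: slug -> doc straight from an in-memory dict."""
    def lookup(slug):
        return registry.get(slug)
    return lookup


def make_persist_lookup(slug_index, content_head):
    """After the swap: slug -> post id -> head of the content stream."""
    def lookup(slug):
        post_id = slug_index.get(slug)
        if post_id is None:
            return None
        return content_head(post_id)
    return lookup


def render(doc):
    """Placeholder renderer; the real endpoint renders the doc to HTML."""
    return "".join(doc)


def handle_get(lookup, slug):
    """The handler only sees `lookup`, so either implementation plugs in."""
    doc = lookup(slug)
    if doc is None:
        return 404, "not found"
    return 200, render(doc)
```

Both lookups return `None` for an unknown slug, so the 404 path stays the same before and after the swap.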

Steps

  1. Merge local architecture into loops/host (gets lib/blogimport + deps: dream-json is the only new load dependency for the source layer).
  2. Apply the blog-side draft (Python, on the blog app) so the live source query exists: lib/blogimport/drafts/published-posts.sx (defquery) + drafts/README.md (the SqlBlogService.list_published_posts provider, which returns published rows including raw lexical; the current post DTO exposes sx_content/html but not lexical).
  3. Inject the transport: pass the host's HMAC fetch_data wrapper as blogimport's fetch-fn (GET /internal/data/published-posts). That wrapper is host territory.
  4. Backfill: run blogimport/backfill! b fetch-fn at against the durable persist backend → every published post becomes a content:<id> stream.
  5. Swap host/blog-lookup: resolve slug → post-id, then return (content/head b post-id) instead of the in-memory doc. Handler/route unchanged. (Slug→id: from the backfilled postmeta:<id> slug field, or a small slug index.)
  6. Parity gate (before fronting users): blogimport/sync-verify b fetch-fn must be all-ok — same discipline as A1/the slice cutover. Pairs with the still-open Phase 4 item "proxy-to-Quart fallback for un-migrated paths" (slice-01-blog's Caddy fall-through-on-404 cutover).
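Steps 3 and 6 can be sketched together in Python. Everything here is an assumption made for illustration: the real fetch_data wrapper's signing scheme is not described in this plan, so the SHA-256 HMAC over the path, the `X-Signature` header name, the `transport` callable and the row shape are all hypothetical. The sketch only shows how an injected fetch-fn keeps the transport on the host side while the parity gate consumes its rows.

```python
import hashlib
import hmac
import json


def make_fetch_fn(secret, transport):
    """Build a fetch-fn that signs each request and delegates to `transport`.

    `transport(path, headers)` is a hypothetical callable returning JSON text.
    """
    def fetch(query_name):
        path = f"/internal/data/{query_name}"
        # Assumed scheme: hex HMAC-SHA256 of the request path.
        sig = hmac.new(secret, path.encode(), hashlib.sha256).hexdigest()
        body = transport(path, {"X-Signature": sig})  # header name is assumed
        return json.loads(body)
    return fetch


def sync_verify(fetch, stored_heads, convert):
    """Parity gate: every live row must convert to exactly the stored head."""
    rows = fetch("published-posts")
    report = {
        row["id"]: stored_heads.get(row["id"]) == convert(row)
        for row in rows
    }
    return all(report.values()), report
```

The gate returns both an overall flag and a per-post report, so a failed run shows which posts diverged rather than only that something did.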

Notes / limits (carried from blogimport)

  • Inline formatting (bold/italic/links) currently flattens to plain text — content-on-sx Phase-5 rich runs aren't on architecture yet. Swap-point is isolated in lib/blogimport/lexical.sx lex-inline-text; no host change needed when it lands.
  • source.sx's response contract (parse-row) is the executable spec in lib/blogimport/tests/source.sx — confirm the live published-posts response matches.
  • Re-import with an improved converter (Q-M5) is not supported yet: import is once-only today (skip-if-exists).
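The inline flattening in the first note can be pictured with a small Python sketch. It is not lex-inline-text itself: `inline_text` is a hypothetical name, and the node shape (text nodes carrying a "text" field, containers such as links carrying "children") is an assumption about the lexical input.

```python
def inline_text(node):
    """Flatten an inline subtree to plain text, dropping all formatting."""
    if node.get("type") == "text":
        return node.get("text", "")  # bold/italic flags are ignored
    # Containers (paragraph, link, ...) contribute only their children's text.
    return "".join(inline_text(child) for child in node.get("children", []))


paragraph = {
    "type": "paragraph",
    "children": [
        {"type": "text", "text": "Hello ", "format": 1},
        {"type": "link", "url": "https://example.com",
         "children": [{"type": "text", "text": "world"}]},
    ],
}
flat = inline_text(paragraph)
```

Here `flat` is "Hello world": the bold flag and the link URL are both lost, which is the limitation the note describes.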