host: stage lib/blogimport pickup — persist-backed blog content (Phase 4)

Staged cross-loop hand-off (not started here): when the cards-as-types work lands, swap
host/blog-lookup's in-memory registry for content/head over content:<id> streams
populated by lib/blogimport (merged to local architecture a746b6ab, 76/76). Adds a
Phase 4 checklist item + plans/blogimport-pickup.md with concrete steps (merge
architecture, apply blog-side published-posts draft, inject fetch_data as fetch-fn,
backfill, swap lookup, sync-verify parity gate).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-30 14:57:24 +00:00
parent a88ceda9d6
commit 8f8688805e
2 changed files with 65 additions and 0 deletions


@@ -0,0 +1,59 @@
# Staged pickup — persist-backed blog content via `lib/blogimport`
Staged for the host loop (2026-06-30) by the migration/blogimport work. **Pick this up
after the cards-as-types work lands** — it's the data half that makes the live blog read
endpoint serve *real* posts instead of the in-memory registry.
## What's ready
`lib/blogimport` is **merged into local `architecture`** (`a746b6ab`, 76/76 conformance:
lexical 23, import 21, verify 11, source 20/21). It is the blog Postgres→persist
data-migration tooling (`plans/migration/data-migration.md`, Q-M4 resolved):
- `blogimport/lex-blocks doc` — Ghost lexical (as SX dicts) → content-on-sx block list.
- `blogimport/import-post! b post at` / `import-all!` — genesis import into the
`content:<id>` op-log (idempotent) + metadata in `postmeta:<id>`.
- `blogimport/verify-post|verify-all` — replay-and-diff parity check at rest.
- `blogimport/backfill! b fetch-fn at` / `sync-verify b fetch-fn` — live source via an
**injected `fetch-fn`** (Q-M4 = internal-data query).
To get it here: this worktree (`loops/host`) is behind local `architecture`, so `git merge
architecture` brings `lib/blogimport` (and the rest of the backlog) in. No `origin` push
is involved.
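
As a rough illustration of the idempotent genesis import and the injected `fetch-fn`, here is a minimal Python model. It is not the SX API: `import_post`, `backfill`, `fake_fetch`, the dict-based backend, and the row shape are all hypothetical stand-ins, and the lexical→block conversion is omitted (rows already carry `blocks`).

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical stand-in for the persist backend: stream name -> list of ops.
Backend = Dict[str, List[dict]]
# Hypothetical shape of the injected transport: query name -> rows.
FetchFn = Callable[[str], Iterable[dict]]


def import_post(b: Backend, post: dict, at: str) -> bool:
    """Write a genesis op for one post unless its content stream already exists."""
    stream = f"content:{post['id']}"
    if stream in b:
        return False  # skip-if-exists is what makes a re-run a no-op
    b[stream] = [{"op": "genesis", "at": at, "blocks": post["blocks"]}]
    b[f"postmeta:{post['id']}"] = [{"op": "genesis", "at": at, "slug": post["slug"]}]
    return True


def backfill(b: Backend, fetch_fn: FetchFn, at: str) -> int:
    """Import every row the injected fetch function returns; count new streams."""
    imported = 0
    for row in fetch_fn("published-posts"):
        if import_post(b, row, at):
            imported += 1
    return imported


def fake_fetch(query: str) -> List[dict]:
    """Test double for the live source; the real transport belongs to the host."""
    return [
        {"id": "p1", "slug": "hello", "blocks": ["Hello"]},
        {"id": "p2", "slug": "again", "blocks": ["Again"]},
    ]


backend: Backend = {}
first = backfill(backend, fake_fetch, "2026-06-30T00:00:00Z")
second = backfill(backend, fake_fetch, "2026-06-30T00:00:00Z")
print(first, second)  # prints "2 0": the second run finds both streams present
```

Because the transport is passed in rather than imported, the same backfill logic can run against a test double here and against the host's wrapper later without changes.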
## The exact seam in this codebase
Phase 4's blog endpoint (`lib/host/blog.sx`, `GET /<slug>/`) renders a `CtDoc` via
`content/html`, but `host/blog-lookup` is an **in-memory slug→doc registry** (the plan
already says "swap for a persist-backed content stream later, handler/route unchanged").
`lib/blogimport` populates exactly those streams. The pickup is that swap.
## Steps
1. **Merge** local `architecture` into `loops/host` (gets `lib/blogimport` + deps:
`dream-json` is the only new load dependency for the source layer).
2. **Apply the blog-side draft** (Python, on the blog app) so the live source query
exists: `lib/blogimport/drafts/published-posts.sx` (defquery) +
`drafts/README.md` (the `SqlBlogService.list_published_posts` provider returning
published rows **incl. raw `lexical`** — the current post DTO exposes
`sx_content`/`html` but not `lexical`).
3. **Inject the transport**: pass the host's HMAC `fetch_data` wrapper as `blogimport`'s
`fetch-fn` (`GET /internal/data/published-posts`). That wrapper is host territory.
4. **Backfill**: run `blogimport/backfill! b fetch-fn at` against the durable persist
backend → every published post becomes a `content:<id>` stream.
5. **Swap `host/blog-lookup`**: resolve `slug → post-id`, then return
`(content/head b post-id)` instead of the in-memory doc. Handler/route unchanged.
(Slug→id: from the backfilled `postmeta:<id>` slug field, or a small slug index.)
6. **Parity gate** (before fronting users): `blogimport/sync-verify b fetch-fn` must be
all-ok — same discipline as A1/the slice cutover. Pairs with the still-open Phase 4
item "proxy-to-Quart fallback for un-migrated paths" (slice-01-blog's Caddy
fall-through-on-404 cutover).
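
Steps 5 and 6 can be sketched in Python as follows. This is a model under stated assumptions, not the host code: `content_head`, `build_slug_index`, `blog_lookup`, and `sync_verify` are hypothetical names, the backend is a plain dict, and "head" is simplified to the latest op's blocks rather than a real op-log replay.

```python
from typing import Callable, Dict, Iterable, List, Optional

# Hypothetical stand-in for the persist backend: stream name -> list of ops.
Backend = Dict[str, List[dict]]


def content_head(b: Backend, post_id: str) -> Optional[list]:
    """Simplified head: the blocks of the latest op, or None if no stream exists."""
    ops = b.get(f"content:{post_id}")
    return ops[-1]["blocks"] if ops else None


def build_slug_index(b: Backend) -> Dict[str, str]:
    """Derive slug -> post-id from the slug field of each postmeta stream."""
    index: Dict[str, str] = {}
    for name, ops in b.items():
        if name.startswith("postmeta:") and ops:
            index[ops[-1]["slug"]] = name[len("postmeta:"):]
    return index


def blog_lookup(b: Backend, index: Dict[str, str], slug: str) -> Optional[list]:
    """Resolve slug -> post-id, then return the stream head (None if unknown)."""
    post_id = index.get(slug)
    return None if post_id is None else content_head(b, post_id)


def sync_verify(b: Backend, fetch_fn: Callable[[str], Iterable[dict]]) -> List[str]:
    """Return ids whose stored head differs from the live source; empty means ok."""
    return [
        row["id"]
        for row in fetch_fn("published-posts")
        if content_head(b, row["id"]) != row["blocks"]
    ]


def fake_fetch(query: str) -> List[dict]:
    """Test double: p2 is published at the source but was never backfilled."""
    return [
        {"id": "p1", "slug": "hello", "blocks": ["Hello"]},
        {"id": "p2", "slug": "new", "blocks": ["New"]},
    ]


backend: Backend = {
    "content:p1": [{"blocks": ["Hello"]}],
    "postmeta:p1": [{"slug": "hello"}],
}
index = build_slug_index(backend)
found = blog_lookup(backend, index, "hello")
missing = blog_lookup(backend, index, "nope")
drift = sync_verify(backend, fake_fetch)
print(found, missing, drift)  # prints "['Hello'] None ['p2']"
```

In this model the gate passes only when `sync_verify` returns an empty list, and an unknown slug yields `None`, which a handler could map to a 404 so the fall-through path still applies.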
## Notes / limits (carried from blogimport)
- Inline formatting (bold/italic/links) currently **flattens to plain text** because
content-on-sx Phase-5 rich runs aren't on `architecture` yet. The swap point is isolated
in `lib/blogimport/lexical.sx` (`lex-inline-text`); no host change is needed when it lands.
- `source.sx`'s response contract (`parse-row`) is the executable spec in
`lib/blogimport/tests/source.sx` — confirm the live `published-posts` response matches.
- Re-import with an improved converter (Q-M5) is not supported yet: import is once-only
today (skip-if-exists).


@@ -185,6 +185,12 @@ lib/host/sxtp.sx subsystem APIs (feed/search/commerce/…
welcome/` renders real HTML through Caddy. Needs Smalltalk+persist+content
preloads + `(st-bootstrap-classes!)`+`(content/bootstrap!)` (self-bootstraps
at load).
- [ ] **persist-backed blog content via `lib/blogimport`** (STAGED, pick up after the
cards-as-types work). Swap `host/blog-lookup`'s in-memory registry for
`(content/head b post-id)` over `content:<id>` streams populated by `lib/blogimport`
(merged to local `architecture` `a746b6ab`, 76/76 — `git merge architecture` to
get it). Resolves Q-M4 (live source via injected `fetch-fn` = host `fetch_data`).
Full steps incl. the blog-side draft query + parity gate: `plans/blogimport-pickup.md`.
- [ ] proxy-to-Quart fallback for un-migrated paths (strangler requirement before
a real subdomain fronts users).
- [ ] internal-HMAC middleware on `/internal/*` (service-to-service auth; protocol