# Blog-side draft: the `published-posts` migration query

The one blog-app change needed to make `lib/blogimport`'s live source (Q-M4) real. Two parts: an SX **defquery** (`published-posts.sx` in this dir) and a Python **provider** it binds to. Both go in the **blog app** (production `blog/` tree); they are drafted here so the importer ships with its dependency spelled out. Apply on the blog app's branch, not on this migration branch.

## Why a new query (not reuse post-by-id)

`blogimport/source.sx` needs, for every published post: `id, slug, title, status, visibility, tags, authors, lexical`. The existing providers (`SqlBlogService.get_post_by_*` in `blog/services/__init__.py`) return a `PostDTO` via `_post_to_dto`, which exposes `sx_content`/`html` but **not `lexical`**, and the canonical migration path is lexical→blocks (slice-01-blog Q-B1), not sx_content. So a dedicated migration provider that returns full rows including the raw lexical body is the minimal, honest change. One batch call covers both enumeration (Q-D2 corpus) and bodies.

## 1. defquery (→ `blog/queries.sx`)

See `published-posts.sx` in this directory:

```lisp
(defquery published-posts ()
  "Enumerate every published, non-page blog post as a full row INCLUDING
   the raw lexical body — the SX migration corpus (Q-D2). Read-only ..."
  (service "blog" "list-published-posts"))
```

The kebab→snake convention (as for `get-post-by-slug` → `get_post_by_slug`) binds `"list-published-posts"` to the `SqlBlogService.list_published_posts` method below.

## 2. Python provider (→ `blog/services/__init__.py`, in `SqlBlogService`)

```python
from sqlalchemy.orm import selectinload  # add to imports

# Method of SqlBlogService; `select`, `AsyncSession` and `Post` come from the
# module's existing imports.
async def list_published_posts(self, session: AsyncSession) -> list[dict]:
    """Migration corpus: every published, non-page post as a full row
    INCLUDING the raw lexical body (Q-D2). Read-only; consumed by the SX
    blogimport backfill/verify.
    Mirrors ghost_db.list_posts() base visibility filters."""
    result = await session.execute(
        select(Post)
        .where(
            Post.deleted_at.is_(None),     # not soft-deleted
            Post.status == "published",    # drafts excluded
            Post.is_page.is_(False),       # pages excluded
        )
        # Load both collections up front so the comprehension below never
        # triggers a lazy load on the async session.
        .options(selectinload(Post.tags), selectinload(Post.authors))
        .order_by(Post.published_at.desc().nullslast())
    )
    return [
        {
            "id": p.id,
            "uuid": p.uuid,
            "slug": p.slug,
            "title": p.title,
            "status": p.status,
            "visibility": p.visibility,
            "lexical": p.lexical,
            "tags": [t.slug for t in p.tags],
            "authors": [a.slug for a in p.authors],
        }
        for p in result.scalars().unique().all()
    ]
```

**Confirm before applying:**

- The relationship names on `Post` (`tags`, `authors`): check `blog/models/content.py` and its join tables (`post_tags`, `post_authors`); adjust `selectinload` and the comprehensions if they differ. `.unique()` is not strictly required with `selectinload`, which fetches each collection in a separate `SELECT ... IN` query and so does not fan out the primary rows. It is harmless to keep, and it becomes mandatory if the loaders are ever switched to `joinedload`.
- `Post.uuid` and `Post.lexical` columns exist (`models/content.py` ~lines 61-63).
- Visibility filters match `ghost_db.list_posts()` (drafts excluded, pages excluded) so the corpus is exactly the published read-path set.

## 3. Verify the contract

After applying, the response shape must match `blogimport/parse-row` (`lib/blogimport/source.sx`): keys `:uuid|:id :slug :title :status :visibility :tags :authors :lexical`, with `:lexical` a JSON string (parsed via `dream-json-parse`). The mock in `lib/blogimport/tests/source.sx` is the executable spec of this contract.

## 4. Then wire the transport (host loop)

`blogimport/backfill!`/`sync-verify` take an injected `fetch-fn`. In production that is the host's HMAC `fetch_data` wrapper (`GET /internal/data/published-posts`); that wiring lives in `lib/host`, not here.
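
A quick Python-side check of the section 3 contract might look like the sketch below. It is only an illustration: `check_row` and `REQUIRED_KEYS` are hypothetical names that do not exist in the blog app or in `lib/blogimport`, and the rule that `lexical` must always be a parseable JSON string is read off the contract text rather than confirmed against `blogimport/parse-row`. The mock in `lib/blogimport/tests/source.sx` remains the authoritative spec.

```python
import json

# Keys the importer reads, per section 3. The identifier is handled separately
# because the contract accepts either `uuid` or `id`.
REQUIRED_KEYS = ("slug", "title", "status", "visibility", "tags", "authors", "lexical")


def check_row(row: dict) -> list[str]:
    """Return the contract violations for one provider row (empty list if OK)."""
    problems = []
    if "uuid" not in row and "id" not in row:
        problems.append("missing identifier: need 'uuid' or 'id'")
    for key in REQUIRED_KEYS:
        if key not in row:
            problems.append(f"missing key: {key}")
    # tags/authors are emitted as lists of slug strings by the provider.
    for key in ("tags", "authors"):
        if key in row:
            value = row[key]
            if not (isinstance(value, list) and all(isinstance(s, str) for s in value)):
                problems.append(f"{key} must be a list of slug strings")
    # Assumption: lexical is always a JSON *string*, never a pre-parsed dict or None.
    if "lexical" in row:
        lexical = row["lexical"]
        if not isinstance(lexical, str):
            problems.append("lexical must be a JSON string")
        else:
            try:
                json.loads(lexical)
            except json.JSONDecodeError:
                problems.append("lexical is not valid JSON")
    return problems


# Hypothetical sample row, shaped like one element of the provider's return value.
row = {
    "id": 1, "uuid": "hypothetical-uuid", "slug": "hello", "title": "Hello",
    "status": "published", "visibility": "public",
    "lexical": '{"root": {"children": []}}',
    "tags": ["news"], "authors": ["alice"],
}
print(check_row(row))  # []
```

Returning a list of problems rather than raising on the first one lets a single pass over the corpus report every malformed row, which fits a one-off verification better than fail-fast behaviour.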