blogimport: Q-M4 live source — internal-data query adapter (75/75)
Some checks failed
Test, Build, and Deploy / test-build-deploy (push) Failing after 1m5s
source.sx: live-source adapter resolving Q-M4 (internal-data query, not direct PG).

Injected fetch-fn transport port (hexagonal seam). parse-row maps a blog post-row to the importer post dict and parses the :lexical JSON string via dream-json-parse.

End-to-end drivers: backfill! (enumerate->fetch->import) and sync-verify (enumerate->fetch->verify), plus backfill-ids! as the explicit-id fallback.

Tests mock the transport against the documented response contract, including a real lexical JSON string.

README flags the one blog-side gap (add a published-posts enumeration query) and the production fetch_data wiring (lives in lib/host).

source 20/20; total 75/75.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
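The seam described in this message can be illustrated outside SX. The following is a minimal Python sketch, not the project's code: `FetchFn`, `parse_lexical`, `parse_row`, `fetch_post`, and `mock_fetch` are hypothetical Python names that mirror the SX functions in this commit, and `json.loads` stands in for `dream-json-parse`. It assumes a row is a plain dict and that missing string fields default to `""`, as `parse-row` does.

```python
import json
from typing import Any, Callable

# Hypothetical stand-in for the injected port: (query, params) -> response data.
FetchFn = Callable[[str, dict], Any]


def parse_lexical(lx: Any) -> Any:
    """Accept the lexical body as None, a JSON string, or an already-parsed dict."""
    if lx is None:
        return {"root": {"children": []}}
    if isinstance(lx, str):
        return json.loads(lx)  # stand-in for dream-json-parse
    return lx  # already structured: used as-is


def parse_row(row: dict) -> dict:
    """Map a service post-row to the importer's post dict."""
    return {
        "id": row.get("uuid") or row.get("id"),
        "slug": row.get("slug") or "",
        "title": row.get("title") or "",
        "status": row.get("status") or "",
        "visibility": row.get("visibility") or "",
        "tags": row.get("tags") or [],
        "authors": row.get("authors") or [],
        "lexical": parse_lexical(row.get("lexical")),
    }


def fetch_post(fetch_fn: FetchFn, query: str, params: dict) -> dict:
    """The adapter only sees the port; it never knows how the bytes arrive."""
    return parse_row(fetch_fn(query, params))


def mock_fetch(query: str, params: dict) -> Any:
    """Test double for the transport, returning one canned row."""
    if query == "post-by-id" and params.get("id") == "post-1":
        body = '{"root":{"children":[{"type":"paragraph","children":[]}]}}'
        return {"uuid": "post-1", "title": "Live", "lexical": body}
    return None


post = fetch_post(mock_fetch, "post-by-id", {"id": "post-1"})
```

Because the transport is a parameter, swapping the mock for a signed HTTP client would not touch `parse_row` or `fetch_post`; that is the point of the hexagonal seam.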
@@ -7,7 +7,8 @@ composes the public APIs of content-on-sx (`lib/content`) and persist
 (`lib/persist`). Kept in its own module (not `lib/host`, not `lib/content`) so it
 doesn't collide with the loops that own those.
 
-Status: **machinery complete, 55/55 conformance** (lexical 23, import 21, verify 11).
+Status: **machinery complete + live-source wired, 75/75 conformance**
+(lexical 23, import 21, verify 11, source 20).
 
 ## What it does
 
@@ -16,6 +17,7 @@ Status: **machinery complete, 55/55 conformance** (lexical 23, import 21, verify
 | `lexical.sx` | `blogimport/lex-blocks doc` — Ghost **lexical** body (as SX dicts) → content-on-sx **block list**, ids deterministic by position (`b0,b1,…`). |
 | `import.sx` | `blogimport/import-post! b post at` — genesis import: convert the post's lexical, commit blocks as ordered `op-insert`s into the `content:<id>` op-log stream, record metadata in a sibling `postmeta:<id>` stream. Idempotent (skip-if-exists). `import-all!` → coverage scoreboard. |
 | `verify.sx` | `blogimport/verify-post b post` — replay the stream → block model, diff vs the row-derived oracle with `=`. `verify-all` → `{:total :ok :mismatched}` coverage. |
+| `source.sx` | **Live source (Q-M4 = internal-data query).** Injected `fetch-fn` transport port; `parse-row` maps a service post-row → importer `post` dict and parses the `:lexical` JSON string (`dream-json-parse`). `backfill! b fetch-fn at` = enumerate → fetch → import; `sync-verify b fetch-fn` = enumerate → fetch → verify. `backfill-ids!` is the explicit-id fallback. |
 
 ## What is proven
 
@@ -32,16 +34,28 @@ is *detected*, not silently passed.
 The single swap-point is `lex-inline-text` in `lexical.sx` — return runs there once
 content-on-sx Phase 5 lands on `architecture`. Bold/italic/links currently collapse
 to their plain concatenation (drift-proof, == `asText`). (slice-01-blog Q-B1.)
-- **Oracle is the in-memory lexical→blocks, not the live Python block model.** This
-proves round-trip fidelity through persist. The "does SX match Python" half of Q-D2
-needs the **live source**: read real `Post` rows via the internal-data query
-(`/internal/data/…`) or direct Postgres (**Q-M4**, undecided) and feed them as `post`
-dicts. The diff plumbing here is the twin that step reuses.
+- **Q-M4 RESOLVED — live source = internal-data query** (`source.sx`), via an injected
+`fetch-fn` port. The remaining real-world wiring is operational, not design:
+1. **One blog-side query must be added**: `blog/queries.sx` has fetch-by-id/slug/ids
+but **no enumeration query**. Add a `published-posts` defquery returning the
+published ids/slugs (Python `list_posts(status="published")`,
+`blog/bp/blog/ghost_db.py:102`). Until then, drive `backfill-ids!` with an explicit
+id list. `source.sx` is mocked against this contract in `tests/source.sx`.
+2. **Production `fetch-fn`** = the host's HMAC-signed `fetch_data` wrapper
+(`GET /internal/data/{query}`). That wiring lives in `lib/host` (the host loop's
+territory); `source.sx` only needs the port injected.
+3. **Confirm the response field names** of the live `get-post-by-*` data handler
+against `parse-row`'s contract (`:uuid|:id :slug :title :status :visibility :tags
+:authors :lexical`); a mismatch is a one-line field fix.
+- **Oracle is the lexical→blocks of the SAME post, not the live Python block model.**
+This proves round-trip fidelity through persist (no corruption at rest). The "does SX
+match the *Python render*" half of Q-D2 would additionally diff against the Python
+side's own block derivation — deferred with the read-path cutover.
 - **Re-import with an improved converter (Q-M5)** is import-once today (skip-if-exists).
 Superseding prior genesis events (vs truncate+re-import) is future work.
 
 ## Run
 
 ```bash
-bash lib/blogimport/conformance.sh # 55/55; writes scoreboard.{json,md}
+bash lib/blogimport/conformance.sh # 75/75; writes scoreboard.{json,md}
 ```
@@ -16,7 +16,7 @@ if [ ! -x "$SX_SERVER" ]; then
   fi
 fi
 
-SUITES=(lexical import verify)
+SUITES=(lexical import verify source)
 
 OUT_JSON="lib/blogimport/scoreboard.json"
 OUT_MD="lib/blogimport/scoreboard.md"
@@ -49,9 +49,11 @@ run_suite() {
 (load "lib/content/callout.sx")
 (load "lib/content/media.sx")
 (load "lib/content/store.sx")
+(load "lib/dream/json.sx")
 (load "lib/blogimport/lexical.sx")
 (load "lib/blogimport/import.sx")
 (load "lib/blogimport/verify.sx")
+(load "lib/blogimport/source.sx")
 (epoch 2)
 (eval "(define bi-test-pass 0)")
 (eval "(define bi-test-fail 0)")
@@ -2,9 +2,10 @@
   "suites": {
     "lexical": {"pass": 23, "fail": 0},
     "import": {"pass": 21, "fail": 0},
-    "verify": {"pass": 11, "fail": 0}
+    "verify": {"pass": 11, "fail": 0},
+    "source": {"pass": 20, "fail": 0}
   },
-  "total_pass": 55,
+  "total_pass": 75,
   "total_fail": 0,
-  "total": 55
+  "total": 75
 }
@@ -7,4 +7,5 @@ _Generated by `lib/blogimport/conformance.sh`_
 | lexical | 23 | 0 | 23 |
 | import | 21 | 0 | 21 |
 | verify | 11 | 0 | 11 |
-| **Total** | **55** | **0** | **55** |
+| source | 20 | 0 | 20 |
+| **Total** | **75** | **0** | **75** |
92
lib/blogimport/source.sx
Normal file
@@ -0,0 +1,92 @@
+; lib/blogimport/source.sx
+; Live source adapter — Q-M4 RESOLVED: import via the blog INTERNAL-DATA QUERY
+; surface (decoupled), not direct Postgres. Reuses the existing query contracts
+; (blog/queries.sx: post-by-id/post-by-slug/posts-by-ids) and keeps the importer in
+; the SX/host world (plans/migration/data-migration.md §7 recommended default).
+;
+; TRANSPORT SEAM (hexagonal, like every other subsystem): a `fetch-fn` port is
+; INJECTED. Contract:
+; (fetch-fn query-name params-dict) -> response-data
+; In production `fetch-fn` is the host's HMAC-signed fetch_data wrapper
+; (GET /internal/data/{query}); in tests it's a mock. The importer never knows how
+; the bytes arrive.
+;
+; RESPONSE CONTRACT (one published-post row), the blog `get-post-by-*` data handler:
+; {:uuid|:id :slug :title :status :visibility :tags :authors :lexical}
+; :lexical is the Ghost body as a JSON STRING (the Post.lexical DB column) — parsed
+; here with dream-json-parse into the SX dict shape blogimport/lex-blocks expects.
+; (If a handler returns :lexical already-structured, it is used as-is.)
+;
+; REQUIRED BLOG-SIDE ADDITION (the one gap): blog/queries.sx exposes fetch-by-id/slug
+; but NO enumeration query. The corpus (Q-D2 = every published post) needs a
+; `published-posts` query returning the published ids/slugs (Python: list_posts(
+; status="published"), blog/bp/blog/ghost_db.py:102). Flagged for the blog app; mocked
+; in tests. Until it exists, callers can pass an explicit id list to backfill-ids!.
+
+(define blogimport/dep-json-parse dream-json-parse)
+
+; --- lexical field -> SX dict (string from DB column, or already structured) -----
+(define
+  blogimport/parse-lexical
+  (fn (lx)
+    (cond
+      ((equal? lx nil) {:root {:children (list)}})
+      ((string? lx) (blogimport/dep-json-parse lx))
+      (else lx))))
+
+; --- service post-row -> importer `post` dict -----------------------------------
+(define
+  blogimport/parse-row
+  (fn (row)
+    {:id (or (get row :uuid) (get row :id))
+     :slug (or (get row :slug) "")
+     :title (or (get row :title) "")
+     :status (or (get row :status) "")
+     :visibility (or (get row :visibility) "")
+     :tags (or (get row :tags) (list))
+     :authors (or (get row :authors) (list))
+     :lexical (blogimport/parse-lexical (get row :lexical))}))
+
+; --- fetch one post via an internal-data query ----------------------------------
+(define
+  blogimport/fetch-post
+  (fn (fetch-fn query params)
+    (blogimport/parse-row (fetch-fn query params))))
+
+; --- enumerate published post ids (needs the `published-posts` query) -----------
+(define
+  blogimport/published-ids
+  (fn (fetch-fn) (fetch-fn "published-posts" {})))
+
+; --- fetch all published posts as importer `post` dicts -------------------------
+(define
+  blogimport/source-posts
+  (fn (fetch-fn)
+    (map
+      (fn (id) (blogimport/fetch-post fetch-fn "post-by-id" {:id id}))
+      (blogimport/published-ids fetch-fn))))
+
+; --- fetch an explicit id list (fallback before the enumeration query lands) ----
+(define
+  blogimport/source-posts-by-ids
+  (fn (fetch-fn ids)
+    (map (fn (id) (blogimport/fetch-post fetch-fn "post-by-id" {:id id})) ids)))
+
+; --- end-to-end drivers ---------------------------------------------------------
+; backfill = enumerate -> fetch -> genesis-import (idempotent). Re-runnable as the
+; one-way DB->persist sync (data-migration.md Strategy 1).
+(define
+  blogimport/backfill!
+  (fn (b fetch-fn at)
+    (blogimport/import-all! b (blogimport/source-posts fetch-fn) at)))
+
+(define
+  blogimport/backfill-ids!
+  (fn (b fetch-fn ids at)
+    (blogimport/import-all! b (blogimport/source-posts-by-ids fetch-fn ids) at)))
+
+; sync-verify = enumerate -> fetch -> shadow-diff the persisted streams at rest.
+(define
+  blogimport/sync-verify
+  (fn (b fetch-fn)
+    (blogimport/verify-all b (blogimport/source-posts fetch-fn))))
83
lib/blogimport/tests/source.sx
Normal file
@@ -0,0 +1,83 @@
+; lib/blogimport/tests/source.sx — live-source adapter (Q-M4 internal-data query)
+(st-bootstrap-classes!)
+(content-bootstrap-blocks!)
+(content-bootstrap-doc!)
+(content-bootstrap-callout!)
+(content-bootstrap-media!)
+
+; ---- canned service responses (lexical arrives as a JSON STRING, the DB column) ----
+(define
+  lex1
+  "{\"root\":{\"children\":[{\"type\":\"heading\",\"tag\":\"h2\",\"children\":[{\"type\":\"text\",\"text\":\"Live\"}]},{\"type\":\"paragraph\",\"children\":[{\"type\":\"text\",\"text\":\"from db\"}]}]}}")
+(define
+  row1
+  {:uuid "post-1" :slug "live" :title "Live" :status "published"
+   :visibility "public" :tags (list "x") :authors (list "u") :lexical lex1})
+(define
+  row2
+  {:uuid "post-2" :slug "two" :title "Two" :status "published"
+   :lexical "{\"children\":[{\"type\":\"paragraph\",\"children\":[{\"type\":\"text\",\"text\":\"second\"}]}]}"})
+
+; ---- mock transport: (fetch-fn query params) -> response ----
+(define
+  mock-fetch
+  (fn (query params)
+    (cond
+      ((equal? query "published-posts") (list "post-1" "post-2"))
+      ((equal? query "post-by-id")
+        (cond
+          ((equal? (get params :id) "post-1") row1)
+          ((equal? (get params :id) "post-2") row2)
+          (else nil)))
+      (else nil))))
+
+; ---- parse-row maps fields + parses the lexical JSON string ----
+(define post1 (blogimport/parse-row row1))
+(bi-test "parse-row id from uuid" (get post1 :id) "post-1")
+(bi-test "parse-row title" (get post1 :title) "Live")
+(bi-test "parse-row tags" (get post1 :tags) (list "x"))
+(bi-test "parse-row lexical parsed to blocks"
+  (map blk-type (blogimport/lex-blocks (get post1 :lexical))) (list "heading" "text"))
+
+; ---- id fallback (:id when no :uuid) + structured (non-string) lexical ----
+(define
+  post3
+  (blogimport/parse-row
+    {:id "post-3" :slug "s3"
+     :lexical {:children (list {:type "paragraph" :children (list {:type "text" :text "x"})})}}))
+(bi-test "parse-row id fallback" (get post3 :id) "post-3")
+(bi-test "parse-row structured lexical used as-is"
+  (map blk-type (blogimport/lex-blocks (get post3 :lexical))) (list "text"))
+
+; ---- enumeration + source-posts ----
+(bi-test "published-ids" (blogimport/published-ids mock-fetch) (list "post-1" "post-2"))
+(bi-test "source-posts ids"
+  (map (fn (p) (get p :id)) (blogimport/source-posts mock-fetch))
+  (list "post-1" "post-2"))
+
+; ---- end-to-end backfill from the live source ----
+(define B (persist/open))
+(define cov (blogimport/backfill! B mock-fetch 10))
+(bi-test "backfill total" (get cov :total) 2)
+(bi-test "backfill imported" (get cov :imported) 2)
+(bi-test "backfill post-1 version-count" (content/version-count B "post-1") 2)
+(bi-test "backfill post-1 head ids" (doc-ids (content/head B "post-1")) (list "b0" "b1"))
+(bi-test "backfill post-1 body text"
+  (str (blk-send (doc-find (content/head B "post-1") "b1") "text")) "from db")
+(bi-test "backfill meta title" (get (blogimport/load-meta B "post-1") :title) "Live")
+
+; ---- backfill is idempotent (one-way sync re-run) ----
+(define cov2 (blogimport/backfill! B mock-fetch 11))
+(bi-test "backfill rerun skipped" (get cov2 :skipped) 2)
+
+; ---- sync-verify: persisted streams match the live-source oracle ----
+(define sv (blogimport/sync-verify B mock-fetch))
+(bi-test "sync-verify total" (get sv :total) 2)
+(bi-test "sync-verify ok" (get sv :ok) 2)
+(bi-test "sync-verify no mismatch" (get sv :mismatched) (list))
+
+; ---- explicit-id fallback path (before the enumeration query lands) ----
+(define B2 (persist/open))
+(define covx (blogimport/backfill-ids! B2 mock-fetch (list "post-2") 10))
+(bi-test "backfill-ids imported" (get covx :imported) 1)
+(bi-test "backfill-ids post-2 ids" (doc-ids (content/head B2 "post-2")) (list "b0"))
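The enumerate, fetch, import flow and its skip-if-exists idempotency, which the tests above exercise, can be sketched in Python. This is an illustration under stated assumptions, not the project's implementation: the plain dict `store` is a hypothetical stand-in for the persist backend, `backfill` and `source_posts` are hypothetical names modelled on `backfill!` and `source-posts`, and the coverage keys (`total`, `imported`, `skipped`) follow the ones the tests read.

```python
from typing import Any, Callable

# Hypothetical stand-in for the injected port: (query, params) -> response data.
FetchFn = Callable[[str, dict], Any]


def source_posts(fetch_fn: FetchFn) -> list:
    """Enumerate published ids, then fetch each post through the same port."""
    ids = fetch_fn("published-posts", {})
    return [fetch_fn("post-by-id", {"id": pid}) for pid in ids]


def backfill(store: dict, fetch_fn: FetchFn, at: int) -> dict:
    """Import every enumerated post once; posts already in the store are skipped."""
    cov = {"total": 0, "imported": 0, "skipped": 0}
    for post in source_posts(fetch_fn):
        cov["total"] += 1
        if post["id"] in store:
            cov["skipped"] += 1  # skip-if-exists keeps re-runs harmless
        else:
            store[post["id"]] = {"at": at, "post": post}
            cov["imported"] += 1
    return cov


def mock_fetch(query: str, params: dict) -> Any:
    """Test double answering both the enumeration and the per-id query."""
    rows = {
        "post-1": {"id": "post-1", "title": "Live"},
        "post-2": {"id": "post-2", "title": "Two"},
    }
    if query == "published-posts":
        return list(rows)
    if query == "post-by-id":
        return rows.get(params["id"])
    return None


store: dict = {}
first = backfill(store, mock_fetch, 10)
second = backfill(store, mock_fetch, 11)
```

Running the driver twice against the same store imports on the first pass and skips on the second, which is the property that lets the backfill double as a re-runnable one-way sync.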