# content-on-sx: Documents, blocks & collaborative editing on Smalltalk

> **DRAFT outline.** The CMS vertical — blog, WYSIWYG editor, Ghost sync. Depends
> on `persist-on-sx` (document history as an event log). Ghost/CMS sync stays a thin
> external adapter (Python/FFI) until a native replacement exists.

rose-ash's `blog` domain is content management: a block-based WYSIWYG editor, navigation, Ghost CMS sync. A document is a tree of live blocks; editing is a stream of operations; collaboration needs conflict-free merge. That is an object model — blocks are objects, edits are messages, and a document is the object graph responding to them. Smalltalk's "everything is an object responding to messages" maps directly to a block/WYSIWYG model, and a semilattice (CRDT) merge keeps concurrent edits conflict-free.

End-state: a Smalltalk-on-SX document model (typed blocks, structural ops), operation log + CRDT merge for collaborative editing, versioning/history via the event store, and a render boundary to HTML/SX. External CMS (Ghost) sync is an injected adapter, not core.

## Status (rolling)

`bash lib/content/conformance.sh` → **551/551** (Phases 1–4 COMPLETE + extensions: HTML/SX escaping, Markdown render + import/export incl. tables & frontmatter (full round-trip), CRDT replication, tree-aware validation, snapshot cache, doc metadata, plain-text render, nested block trees + deep tree editing, doc stats, table block, HTML page wrapper + SEO page, doc composition, portable data + wire serialization)

## Ground rules

- **Scope:** only `lib/content/**` and `plans/content-on-sx.md`. May **import** from `lib/smalltalk/`, and (once it exists) `lib/persist/`. Do not edit substrates.
- **Architecture:** a document is an ordered tree of blocks (objects); an edit is a message (`insert`/`update`/`move`/`delete`); concurrent edits merge via a commutative (CRDT/semilattice) operation so order doesn't matter. History is the `persist` event stream; any version is a replay.
- **Determinism:** merge must be commutative + idempotent (test: apply ops in any order / twice → same document).
- **Commits:** one feature per commit. Progress log + tick boxes.

## Architecture sketch

```
Edit op                         Rendered document
(insert block after id)   ...   HTML / SX tree
        │                               ▲
        ▼                               │
lib/content/block.sx            lib/content/render.sx
— typed blocks as objects       — block tree → HTML/SX
— heading/text/image/embed      — (reuses SX render boundary)
        │                               ▲
        ▼                               │
lib/content/doc.sx              lib/content/merge.sx
— ordered block tree            — CRDT/semilattice op merge
— apply op, structural moves    — concurrent-edit reconciliation
        │                               ▲
        ▼                               │
lib/content/api.sx ── (content/edit) (content/render) (content/history) ──┐
        │                                                                  │
        ├── op log + versions → persist                                   │
        └── Ghost/CMS sync → injected external adapter (thin, non-core) ──┘
```

## Phase 1 — Block document model

- [x] `block.sx` — typed block objects
- [x] `doc.sx` — ordered tree, apply edit op, structural moves
- [x] `render.sx` — block tree → HTML/SX
- [x] `api.sx` + tests + scoreboard + conformance.sh

## Phase 2 — Op log + versioning

- [x] edit ops as `persist` events; replay to any version
- [x] `(content/history doc)`, diff between versions

## Phase 3 — Collaborative merge (CRDT)

- [x] commutative/idempotent op merge
- [x] concurrent-edit tests (any order, double-apply → identical)

## Phase 4 — External sync + federation

- [x] Ghost/CMS sync via injected adapter (import/export)
- [x] federated documents (peer-authored blocks) — trust-gated stub
- [x] tests: round-trip import/export, conflict on concurrent external edit

## Extensions (post-roadmap)

- [x] HTML escaping at the render boundary (`String>>htmlEscaped`: & < > ")
- [x] asSx wire string-escaping (`String>>sxEscaped`: \ and " in SX literals)
- [x] Markdown render mode (`asMarkdown:` / `content/render doc "md"`)
- [x] durable CRDT replication (`crdt-store.sx`: ops on persist, replay + converge)
- [x] document validation (`validate.sx`: ids, per-type fields, duplicate ids; tree-aware — descends into sections,
tree-wide dup ids, section field check)
- [x] Markdown import adapter (`md-import.sx`: text → blocks, round-trips export; incl. pipe tables + frontmatter → metadata)
- [x] Markdown doc export (`md-doc.sx`: content/markdown-doc, frontmatter from metadata, full round-trip)
- [x] snapshot cache over replay (`snapshot.sx`: cache-not-primary, transparent)
- [x] document metadata (`meta.sx`: title/slug/tags + Ghost title plumbing)
- [x] plain-text render + excerpt (`text.sx`: asText, content/excerpt)
- [x] nested block trees (`section.sx`: CtSection container, recursive render, deep-find)
- [x] document statistics (`stats.sx`: word/char/block counts, reading time)
- [x] table block (`table.sx`: CtTable, renders html/sx/text/md, validated)
- [x] HTML page wrapper (`page.sx`: content/page, escaped title from metadata)
- [x] SEO page (`page-full.sx`: content/page-full, lang + meta description from excerpt)
- [x] document composition (`compose.sx`: concat/prepend/concat-all/wrap-section)
- [x] deep tree editing (`tree-edit.sx`: doc-deep-update/replace/delete/insert-into)
- [x] portable data serialization (`data.sx`: content/to-data + from-data, round-trips tree)
- [x] wire serialization (`wire.sx`: content/to-wire + from-wire, SX-text on the wire)

## Progress log

- 2026-06-07 — Extension: deep tree editing (`tree-edit.sx`). `doc-deep-update` / `doc-deep-replace` / `doc-deep-delete` / `doc-deep-insert-into` mutate blocks anywhere in the nested tree (descending into CtSection children), completing tree mutation to match the deep-find read path; all immutable. 17 tests; suite 551/551.
- 2026-06-07 — Extension: on-the-wire serialization (`wire.sx`). `content/to-wire` serialises a document to a transmittable SX-text string (data form + SX serializer); `content/from-wire` parses it back into a live document.
The whole-document format for persistence / HTTP / federation (distinct from the per-op persist log); round-trips nested trees + tables; reads externally authored wire strings. 11 tests; suite 534/534.
- 2026-06-07 — Extension: portable data serialization (`data.sx`). `content/to-data` converts a document to plain SX data (`{:id :title :slug :tags :blocks [{:id :type :fields}]}`, sections recursing); `content/from-data` reconstructs real block objects (section/table handled specially, others generically via mk-block). Round-trips the full tree + metadata (render-equal), decoupling storage/transport from the Smalltalk instance shape. 21 tests; suite 523/523.
- 2026-06-07 — Extension: document composition (`compose.sx`). `content/concat` / `content/prepend` / `content/concat-all` combine documents (keeping the first's id + metadata, concatenating blocks, immutable); `content/wrap-section` collapses a doc's blocks into a single nested section. For assembling pages from header/body/footer parts and templates. 17 tests; suite 502/502.
- 2026-06-07 — Extension: SEO-complete page (`page-full.sx`). `content/page-full` extends content/page with `<html lang>` and a `<meta name="description">` drawn from the document excerpt (plain text, escaped, 160 chars), composing the page/metadata/text layers into the SEO-ready artifact. 4 tests; suite 485/485.
- 2026-06-07 — Extension: Markdown document export (`md-doc.sx`). `content/markdown-doc` emits a `---` frontmatter block from metadata (title/slug/tags, only present fields) ahead of the Markdown body, or plain asMarkdown when there's no metadata. Completes the metadata round-trip: `md/import ∘ content/markdown-doc` preserves title/slug/tags + blocks. 12 tests; suite 481/481.
- 2026-06-07 — Extension: Markdown frontmatter. `md/import` parses a leading `---` / `key: value` / `---` block into document metadata (title, slug, comma-separated tags via `doc-with-meta`) before parsing the body; a `---` elsewhere stays a divider.
Ties the Markdown importer to the metadata layer the way real blog posts work. +9 tests; suite 469/469.
- 2026-06-07 — Extension: Markdown table import. `md-import.sx` now recognizes a `| … |` header row followed by a `| --- |` separator and parses a `CtTable` (cells trimmed, mixed with other blocks via blank-line separation), completing the Markdown table round-trip (import∘export == identity). +5 tests; suite 460/460.
- 2026-06-07 — Extension: HTML page wrapper (`page.sx`). `content/page` composes metadata + render into a minimal valid HTML5 document — escaped `<title>` from doc metadata (falling back to id) and the rendered blocks as the body. `content/page-title`. The shippable artifact the blog serves. 7 tests; suite 455/455.
- 2026-06-07 — Extension: table block (`table.sx`). `CtTable` holds headers + rows (string lists); answers asHTML (escaped `<table>`), asSx, asText, and asMarkdown: (pipe table with dashed separator row) by folding rows×cells via nested `inject:into:`. Self-contained (no edits to block.sx/render.sx); `mk-table`, `table?`, `table-headers/rows`. validate.sx gained a `table` field case (headers/rows must be lists). 15 tests; suite 448/448.
- 2026-06-07 — Extension: document statistics (`stats.sx`). `content/stats` returns `{:words :chars :blocks :reading-minutes}`; word/char counts derive from the tree-accurate `asText` projection, block count from an inline tree walk (no section.sx dep), reading time at 200 wpm rounded up. Counts descend into nested sections. 17 tests; suite 433/433.
- 2026-06-07 — Refinement: tree-aware validation. `validate.sx` now flattens the whole block tree (descending into `CtSection` children, guarding malformed non-list children) so field checks and duplicate-id detection cover nested blocks and span section boundaries; added a `section` field-type case. Inline tree detection (class + st-iv-get) keeps it free of a section.sx dependency. +6 tests; suite 416/416.
- 2026-06-07 — Extension: nested block trees (`section.sx`).
`CtSection` is a block whose `children` ivar is a list of blocks (incl. nested sections → arbitrary depth), turning the flat document into the ordered TREE from the architecture sketch. Self-contained: it answers asHTML/asSx/asText/asMarkdown: by folding children's renderings (pure polymorphic recursion — no changes to block.sx/render.sx). `mk-section`, `section-children`, `section-append` (copy-on-write), and tree traversal `doc-deep-find` / `doc-tree-ids` / `doc-tree-count` that descend into sections. 25 tests; suite 410/410.
- 2026-06-07 — Extension: plain-text render + excerpts (`text.sx`). Fourth boundary format via polymorphic `asText` (heading/text/code/quote→text, image→alt, embed/divider→"", list→", "-joined); the document joins non-empty child texts with a space. `content/render doc "text"`, `content/text`, `content/excerpt doc n` (first n chars + "…" if truncated). For previews, meta-descriptions, search indexing. 20 tests; suite 385/385.
- 2026-06-07 — Extension: document metadata (`meta.sx`). CtDoc gained optional title/slug/tags ivars (declared in doc.sx, default nil/empty, no effect on block ops). Reads via message dispatch; copy-on-write setters (`doc-with-title/slug/tags`, `doc-add-tag`, `doc-with-meta`, `doc-new-meta`) and `content/*` aliases; `doc-meta` returns the metadata dict. Ghost adapter now carries `:title` through import/export/round-trip. 27 tests; suite 365/365.
- 2026-06-07 — Extension: snapshot cache over op-log replay (`snapshot.sx`). Snapshots are a cache, never primary state — the log stays the source of truth. `content/snapshot!` stores a materialised head at a seq in the persist KV; `content/head-cached` / `content/at-cached` start from the nearest snapshot and replay only the tail, returning a document IDENTICAL to a full replay (tests assert transparency before/after snapshot, across versions, and after drop-snapshot fallback). `content/has-snapshot?` / `snapshot-seq` / `drop-snapshot!`. 20 tests; suite 338/338.
- 2026-06-07 — Extension: Markdown import adapter (`md-import.sx`), inverse of asMarkdown. Line-based parser: ATX headings, fenced code (`` ```lang ``), blockquotes, unordered/ordered lists (grouping consecutive items), thematic breaks, paragraphs (consecutive plain lines joined with a space). Sequential ids b0,b1…. `md/import` / `content/from-markdown` / `markdown-adapter` (import + asMarkdown export). Round-trips canonical Markdown (import∘export == identity); imported docs pass validation. 24 tests; suite 318/318.
- 2026-06-07 — Extension: document validation (`validate.sx`). `content/validate` returns issue dicts `{:id :kind :detail}` (empty = valid); `content/valid?` and `content/issue-kinds` convenience. Checks block id (non-empty string), per-type required fields/types (heading level number, image src/alt strings, list ordered boolean + items list, etc.), unknown block types, and document-level duplicate ids. Guards imports/edits/federated input. 17 tests; suite 294/294.
- 2026-06-07 — Extension: durable CRDT replication (`crdt-store.sx`), uniting Phase 2 (persist) + Phase 3 (CvRDT). Each replica appends its CRDT ops to its own stream (`crdt:<doc>:<replica>`); `crdt/replay` folds one log into a state, `crdt/converge` merges every replica's replayed state, `crdt/document` / `crdt/order` materialise. Converged result is identical regardless of replica order or duplicate delivery (join + idempotent apply) → offline-capable, eventually-consistent editing. 14 tests; suite 277/277.
- 2026-06-07 — Extension: Markdown render mode (`markdown.sx`). Third boundary format alongside asHTML/asSx via the same polymorphic dispatch; blocks answer `asMarkdown: nl` (boundary supplies the newline — this Smalltalk dialect has no Character newline ctor). `content/render doc "md"`/`"markdown"`/`:md`, `content/markdown`, `asMarkdown`. headings (`#`×level), fenced code, `> ` quote, `![alt](src)`, `- `/`1. ` lists, `---`; doc joins blocks with a blank line. No MD escaping yet.
20 tests; suite 263/263.
- 2026-06-07 — Extension: asSx wire string-escaping. Added `String>>sxEscaped` (escapes `\`→`\\` then `"`→`\"`) and routed every `asSx` text/attr/list-item through it, so the SX wire format stays valid when content contains quotes or backslashes. +5 render tests (expected strings built from `q`/`bs` helpers to avoid escaping miscounts). Suite 243/243.
- 2026-06-07 — Extension: HTML escaping at the render boundary. Added `String>>htmlEscaped` (recursive char walk escaping & < > ", order-safe so & isn't double-escaped) and routed every `asHTML` text/attr through it — heading, text, code body + language, quote, image src/alt, embed url, list items. Render stays fully polymorphic in Smalltalk; escaping lives at the boundary. +8 render tests (incl. `<script>` payloads, attr breakout, ampersand-once). asSx wire-escaping deferred to next. Suite 238/238.
- 2026-06-07 — Phase 4 `fed.sx` (**Phase 4 COMPLETE — roadmap done**): trust-gated federation. Peer ops carry provenance (`:author`, `:sig` stub); none are auto-accepted. The trust gate is a pluggable predicate (acl-on-sx hook) with a trusted-actor-list convenience stub. `content/merge-peer[-with]` applies only accepted ops through the CvRDT and quarantines the rest (`{:state :accepted :rejected}`). Concurrent local/external edits reconcile deterministically: same-field LWW by (ts,actor), commutative, idempotent; untrusted ops never touch state. 20 tests; suite 230/230.
- 2026-06-07 — Phase 4 `sync.sx` (cb1): external CMS sync via an injected adapter. Core defines the shape — `{:import :export}` — and delegates; `content/import` / `content/export` / `content/round-trip` know nothing about Ghost. A Ghost-flavoured adapter confines all format translation (post `:sections` ↔ content blocks, all 8 kinds). Swapping in a stub `raw-adapter` works identically. Round-trip (export∘import and import∘export) preserves ids, types, fields, order. 14 tests; suite 210/210.
Next: trust-gated federation + concurrent-external-edit conflict (via CRDT).
- 2026-06-07 — Phase 3 `crdt.sx` (**Phase 3 complete**): collaborative merge as a state-based CvRDT. Merge is a join (lub) on a semilattice → commutative, associative, idempotent by construction. Ordering = unique dense Logoot position keys (cell = (digit actor), lexicographic); presence = OR-tombstones (remove-wins); each field = an LWW-Register keyed by logical (ts, actor). Every op contributes a PARTIAL element and per-id state is their join, so update-/delete-before-insert are not lost. `crdt-materialize` bridges back to a Phase-1 `CtDoc` (sort live elements by pos → blocks). Tests prove: ops in any order converge, double-apply is a no-op, merge commutes/associates/is idempotent, concurrent inserts order deterministically, same-field LWW by (ts,actor), disjoint fields both survive, two divergent replicas converge both ways. 34 tests; suite 196/196.
- 2026-06-07 — Phase 2 `store.sx` (**Phase 2 complete**): op log + versioning over the persist event stream. `content/commit!` appends an edit op as a persist event to the doc's stream (`content:<id>`); the log is the source of truth. `content/head` / `content/at b id seq` replay the op stream to the latest / any version (materialised doc is a cache, never primary state). `content/history` returns per-version metadata; `content/diff` / `content/diff-versions` report added/removed/changed block ids. Backend is injected via `(persist/open)` — content knows nothing about which backend. Minimal persist load (event/backend/log/kv/api). 29 tests; suite 162/162.
- 2026-06-07 — Phase 1 `api.sx` (**Phase 1 complete**): `content/*` facade over block + doc + render. `content/bootstrap!` registers the hierarchy; `content/edit` applies one op or an op stream; `content/render` picks the boundary format ("html"/"sx" or keyword). Re-exports `content/new`, `content/append`, `content/insert|update|move|delete`, `content/find`, etc.
`content/op?` distinguishes a single op from a list/block. 26 tests; suite 133/133. content/history deferred to Phase 2 (needs the persist op log).
- 2026-06-07 — Phase 1 `render.sx`: render boundary as polymorphic message dispatch. Every block and `CtDoc` answers `asHTML` / `asSx`; the document folds children via Smalltalk `inject:into:` (works on raw SX lists), so `(asHTML doc)` / `(asSx doc)` are pure sends with zero type-switching in SX. Lists/headings render in Smalltalk source. No HTML escaping yet (noted in render.sx — boundary concern before untrusted content). 29 tests; suite 107/107.
- 2026-06-06 — Phase 1 `doc.sx`: ordered block document (`CtDoc`) as a Smalltalk object holding an ordered block sequence. Edit ops are data dicts (`insert`/`update`/`move`/`delete`) with `op-*` constructors; `doc-apply` / `doc-apply-all` interpret an op stream, each returning a NEW document (input never mutated → replay-safe). Structural moves, insert-after/at, find/index, immutability all tested. 40 tests; suite 78/78.
- 2026-06-06 — Phase 1 `block.sx`: typed block objects as Smalltalk instances (`CtBlock` hierarchy: text/heading/code/quote/image/embed/divider/list). Type tag + accessors are message sends (polymorphic dispatch); fields are immutable copy-on-write via functional `st-iv-set!` (history-safe). Added `mk-*` constructors, `block?` predicate, `lib/content/conformance.sh` + scoreboard. 38/38.

## Blockers

- Smalltalk-only load chain (tokenizer/parser/runtime/eval) does **not** load `lib/r7rs.sx`/`spec/stdlib.sx`, so r7rs aliases (`car`/`cdr`/`null?`) are absent. Use base SX primitives (`first`/`rest`/`(= (len x) 0)`) in `lib/content/**`. Not a substrate bug — just the load surface.
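
As a closing illustration of the determinism ground rule and the Phase 3 design (merge as a join on a semilattice, LWW registers keyed by (ts, actor), remove-wins tombstones), here is a minimal Python sketch. It is not the `crdt.sx` implementation: the function names (`join`, `op_update`, `op_delete`, `apply_all`) and the dict layout are hypothetical, Logoot position keys are left out, and each LWW write is assumed to be a comparable `(ts, actor, value)` tuple. Python is used only because it is easy to run outside the SX toolchain.

```python
from functools import reduce
from itertools import permutations

# Hypothetical state shape (an assumption, not the crdt.sx layout):
#   {block_id: {"fields": {name: (ts, actor, value)}, "deleted": bool}}

def join_block(a, b):
    """Join two partial per-block states."""
    fields = dict(a["fields"])
    for name, write in b["fields"].items():
        # LWW register: the larger (ts, actor, value) tuple wins.
        fields[name] = max(fields[name], write) if name in fields else write
    # Remove-wins tombstone: once deleted on any side, deleted in the join.
    return {"fields": fields, "deleted": a["deleted"] or b["deleted"]}

def join(s, t):
    """Least upper bound of two document states; inputs are not mutated."""
    out = dict(s)
    for block_id, block in t.items():
        out[block_id] = join_block(out[block_id], block) if block_id in out else block
    return out

def op_update(block_id, name, value, ts, actor):
    """An op is a one-element partial state, so applying an op is a join."""
    return {block_id: {"fields": {name: (ts, actor, value)}, "deleted": False}}

def op_delete(block_id):
    return {block_id: {"fields": {}, "deleted": True}}

def apply_all(ops):
    return reduce(join, ops, {})

ops = [
    op_update("b1", "text", "hello", ts=1, actor="alice"),
    op_update("b1", "text", "hola", ts=2, actor="bob"),   # later write to the same field
    op_update("b1", "level", "2", ts=1, actor="alice"),   # disjoint field survives
    op_update("b2", "text", "bye", ts=1, actor="bob"),
    op_delete("b2"),                                      # may arrive before the update
]

# Any delivery order converges to the same state.
results = [apply_all(p) for p in permutations(ops)]
converged = all(r == results[0] for r in results)

# Duplicate delivery is a no-op.
idempotent = apply_all(ops + ops) == apply_all(ops)

final = results[0]
print(converged, idempotent)
```

Because every op is itself a partial state and applying it is a join, a delete that arrives before its block's update is retained rather than lost, which mirrors the "update-/delete-before-insert are not lost" note in the Phase 3 log entry.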