# content-on-sx: Documents, blocks & collaborative editing on Smalltalk
> **DRAFT outline.** The CMS vertical — blog, WYSIWYG editor, Ghost sync. Depends
> on `persist-on-sx` (document history as an event log). Ghost/CMS sync stays a thin
> external adapter (Python/FFI) until a native replacement exists.

rose-ash's `blog` domain is content management: a block-based WYSIWYG editor,
navigation, Ghost CMS sync. A document is a tree of live blocks; editing is a
stream of operations; collaboration needs conflict-free merge. That is an object
model — blocks are objects, edits are messages, and a document is the object graph
responding to them. Smalltalk's "everything is an object responding to messages"
maps directly to a block/WYSIWYG model, and a semilattice (CRDT) merge keeps
concurrent edits conflict-free.

End-state: a Smalltalk-on-SX document model (typed blocks, structural ops),
operation log + CRDT merge for collaborative editing, versioning/history via the
event store, and a render boundary to HTML/SX. External CMS (Ghost) sync is an
injected adapter, not core.
## Status (rolling)
`bash lib/content/conformance.sh` → **615/615** (Phases 1–4 COMPLETE + extensions: HTML/SX escaping, Markdown render + import/export incl. tables & frontmatter (full round-trip), CRDT replication, tree-aware validation, snapshot cache, doc metadata, plain-text render, nested block trees + deep tree editing, doc stats, table block, HTML page wrapper + SEO page, doc composition + id-remap, portable data + wire serialization, block query + transforms + find/replace, TOC rendering, normalization)
## Ground rules
- **Scope:** only `lib/content/**` and `plans/content-on-sx.md`. May **import**
from `lib/smalltalk/`, and (once it exists) `lib/persist/`. Do not edit substrates.
- **Architecture:** a document is an ordered tree of blocks (objects); an edit is a
message (`insert`/`update`/`move`/`delete`); concurrent edits merge via a
commutative (CRDT/semilattice) operation so order doesn't matter. History is the
`persist` event stream; any version is a replay.
- **Determinism:** merge must be commutative + idempotent (test: apply ops in any
order / twice → same document).
- **Commits:** one feature per commit. Progress log + tick boxes.
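
The determinism rule can be illustrated with a small Python sketch (not the SX implementation). It assumes a last-writer-wins map keyed by block id, where every op carries a unique `(clock, replica)` stamp; `Op`, `apply_op`, `apply_all`, and `visible` are hypothetical names, and block ordering is left out for brevity.

```python
from itertools import permutations
from typing import Dict, NamedTuple, Optional, Tuple


class Op(NamedTuple):
    block_id: str
    stamp: Tuple[int, str]   # (logical clock, replica id); assumed unique per op
    text: Optional[str]      # None marks a delete (tombstone)


State = Dict[str, Op]        # block id -> winning op for that block


def apply_op(state: State, op: Op) -> State:
    """Join one op into the state; the larger stamp wins per block."""
    current = state.get(op.block_id)
    if current is not None and current.stamp >= op.stamp:
        return state         # older or duplicate op: nothing changes
    merged = dict(state)
    merged[op.block_id] = op
    return merged


def apply_all(ops) -> State:
    state: State = {}
    for op in ops:
        state = apply_op(state, op)
    return state


def visible(state: State) -> Dict[str, str]:
    """Drop tombstones to get the blocks a reader would see."""
    return {bid: op.text for bid, op in state.items() if op.text is not None}


ops = [
    Op("b1", (1, "alice"), "Hello"),
    Op("b2", (1, "bob"), "World"),
    Op("b1", (2, "bob"), "Hello!"),
    Op("b2", (3, "alice"), None),
]

# Any delivery order converges to one document (commutative).
outcomes = {tuple(sorted(visible(apply_all(p)).items())) for p in permutations(ops)}
assert len(outcomes) == 1

# Delivering every op twice changes nothing (idempotent).
assert apply_all(ops + ops) == apply_all(ops)
```

Taking the per-block maximum over a total order is what makes the join order-independent; if two different ops could share a stamp, the sketch would no longer be commutative.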
## Architecture sketch
```
Edit op                             Rendered document
(insert block after id) ...         HTML / SX tree
       │                                   ▲
       ▼                                   │
lib/content/block.sx                lib/content/render.sx
— typed blocks as objects           — block tree → HTML/SX
— heading/text/image/embed          — (reuses SX render boundary)
       │                                   ▲
       ▼                                   │
lib/content/doc.sx                  lib/content/merge.sx
— ordered block tree                — CRDT/semilattice op merge
— apply op, structural moves        — concurrent-edit reconciliation
       │                                   ▲
       ▼                                   │
lib/content/api.sx ── (content/edit) (content/render) (content/history) ──┐
       │                                                                  │
       ├── op log + versions → persist                                    │
       └── Ghost/CMS sync → injected external adapter (thin, non-core) ───┘
```
## Phase 1 — Block document model
- [x] `block.sx` — typed block objects
- [x] `doc.sx` — ordered tree, apply edit op, structural moves
- [x] `render.sx` — block tree → HTML/SX
- [x] `api.sx` + tests + scoreboard + conformance.sh
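
As a rough illustration of the Phase 1 shape (typed blocks, an ordered document, edit ops, polymorphic render), here is a Python sketch. It is not the SX code: the class and method names are hypothetical, the document is kept flat rather than a tree, and the standard library's `html.escape` stands in for the render-boundary escaping.

```python
from dataclasses import dataclass, replace
from html import escape
from typing import Optional, Tuple


@dataclass(frozen=True)
class Heading:
    id: str
    level: int
    text: str

    def as_html(self) -> str:
        return f"<h{self.level}>{escape(self.text)}</h{self.level}>"


@dataclass(frozen=True)
class Text:
    id: str
    text: str

    def as_html(self) -> str:
        return f"<p>{escape(self.text)}</p>"


@dataclass(frozen=True)
class Doc:
    blocks: Tuple = ()

    def _index(self, block_id: str) -> int:
        for i, block in enumerate(self.blocks):
            if block.id == block_id:
                return i
        raise KeyError(block_id)

    def insert_after(self, anchor_id: Optional[str], block) -> "Doc":
        # anchor_id=None inserts at the front of the document.
        i = 0 if anchor_id is None else self._index(anchor_id) + 1
        return Doc(self.blocks[:i] + (block,) + self.blocks[i:])

    def update(self, block_id: str, **fields) -> "Doc":
        i = self._index(block_id)
        changed = replace(self.blocks[i], **fields)
        return Doc(self.blocks[:i] + (changed,) + self.blocks[i + 1:])

    def delete(self, block_id: str) -> "Doc":
        i = self._index(block_id)
        return Doc(self.blocks[:i] + self.blocks[i + 1:])

    def move_after(self, block_id: str, anchor_id: Optional[str]) -> "Doc":
        block = self.blocks[self._index(block_id)]
        return self.delete(block_id).insert_after(anchor_id, block)

    def as_html(self) -> str:
        # Each block renders itself; the document only concatenates.
        return "".join(block.as_html() for block in self.blocks)


doc = (Doc()
       .insert_after(None, Heading("h1", 1, "Title"))
       .insert_after("h1", Text("p1", "a < b")))
moved = doc.move_after("h1", "p1")
```

Every op returns a new `Doc`, so earlier versions stay valid; that mirrors the "immutable" note attached to most entries in the progress log.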
## Phase 2 — Op log + versioning
- [x] edit ops as `persist` events; replay to any version
- [x] `(content/history doc)`, diff between versions
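
"Any version is a replay" can be sketched in Python as below. This is an illustration only: the op tuples, the dict-based document, and the `replay` / `history` / `diff` helpers are assumptions, and a plain list stands in for the `persist` event stream.

```python
from typing import Dict, List, Optional, Tuple

Op = Tuple[str, str, Optional[str]]      # (kind, block id, text)
Document = Dict[str, str]                # block id -> text


def apply_op(doc: Document, op: Op) -> Document:
    kind, block_id, text = op
    new = dict(doc)
    if kind in ("insert", "update"):
        new[block_id] = text
    elif kind == "delete":
        new.pop(block_id, None)
    else:
        raise ValueError(f"unknown op kind: {kind}")
    return new


def replay(log: List[Op], version: int) -> Document:
    """Document state after the first `version` ops (0 = empty document)."""
    doc: Document = {}
    for op in log[:version]:
        doc = apply_op(doc, op)
    return doc


def history(log: List[Op]) -> List[Document]:
    """Every version, from the empty document up to the head."""
    return [replay(log, v) for v in range(len(log) + 1)]


def diff(old: Document, new: Document) -> Dict[str, List[str]]:
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


log: List[Op] = [
    ("insert", "b1", "Hello"),
    ("insert", "b2", "World"),
    ("update", "b1", "Hi"),
    ("delete", "b2", None),
]
```

For example, `diff(replay(log, 2), replay(log, 4))` reports `b1` as changed and `b2` as removed.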
## Phase 3 — Collaborative merge (CRDT)
- [x] commutative/idempotent op merge
- [x] concurrent-edit tests (any order, double-apply → identical)
## Phase 4 — External sync + federation
- [x] Ghost/CMS sync via injected adapter (import/export)
- [x] federated documents (peer-authored blocks) — trust-gated stub
- [x] tests: round-trip import/export, conflict on concurrent external edit
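
The "injected adapter" idea can be sketched in Python as follows. Nothing here is the real Ghost API: `CmsAdapter`, `InMemoryCms`, the post payload shape (`title` plus `paragraphs`), and the `import_doc` / `export_doc` helpers are all hypothetical, chosen only to show that the core depends on an interface rather than on a CMS.

```python
from typing import Dict, Protocol


class CmsAdapter(Protocol):
    """What the core needs from any external CMS (hypothetical interface)."""

    def fetch(self, slug: str) -> Dict: ...

    def push(self, slug: str, post: Dict) -> None: ...


class InMemoryCms:
    """Test double; a real adapter would talk to the external system."""

    def __init__(self) -> None:
        self.posts: Dict[str, Dict] = {}

    def fetch(self, slug: str) -> Dict:
        return self.posts[slug]

    def push(self, slug: str, post: Dict) -> None:
        self.posts[slug] = post


def import_doc(adapter: CmsAdapter, slug: str) -> Dict:
    """Turn an external post into a block document with sequential ids."""
    post = adapter.fetch(slug)
    blocks = [{"id": f"b{i}", "type": "text", "text": paragraph}
              for i, paragraph in enumerate(post["paragraphs"])]
    return {"title": post["title"], "blocks": blocks}


def export_doc(adapter: CmsAdapter, slug: str, doc: Dict) -> None:
    """Turn a block document back into the external post shape."""
    adapter.push(slug, {
        "title": doc["title"],
        "paragraphs": [block["text"] for block in doc["blocks"]],
    })


cms = InMemoryCms()
cms.push("hello", {"title": "Hello", "paragraphs": ["one", "two"]})
doc = import_doc(cms, "hello")
export_doc(cms, "hello-copy", doc)
```

Because the adapter is passed in, the round-trip test needs no network and no CMS, and a native replacement could later satisfy the same interface.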
## Extensions (post-roadmap)
- [x] HTML escaping at the render boundary (`String>>htmlEscaped`: & < > ")
- [x] asSx wire string-escaping (`String>>sxEscaped`: \ and " in SX literals)
- [x] Markdown render mode (`asMarkdown:` / `content/render doc "md"`)
- [x] durable CRDT replication (`crdt-store.sx`: ops on persist, replay + converge)
- [x] document validation (`validate.sx`: ids, per-type fields, duplicate ids; tree-aware — descends into sections, tree-wide dup ids, section field check)
- [x] Markdown import adapter (`md-import.sx`: text → blocks, round-trips export; incl. pipe tables + frontmatter → metadata)
- [x] Markdown doc export (`md-doc.sx`: content/markdown-doc, frontmatter from metadata, full round-trip)
- [x] snapshot cache over replay (`snapshot.sx`: cache-not-primary, transparent)
- [x] document metadata (`meta.sx`: title/slug/tags + Ghost title plumbing)
- [x] plain-text render + excerpt (`text.sx`: asText, content/excerpt)
- [x] nested block trees (`section.sx`: CtSection container, recursive render, deep-find)
- [x] document statistics (`stats.sx`: word/char/block counts, reading time)
- [x] table block (`table.sx`: CtTable, renders html/sx/text/md, validated)
- [x] HTML page wrapper (`page.sx`: content/page, escaped title from metadata)
- [x] SEO page (`page-full.sx`: content/page-full, lang + meta description from excerpt)
- [x] document composition (`compose.sx`: concat/prepend/concat-all/wrap-section)
- [x] deep tree editing (`tree-edit.sx`: doc-deep-update/replace/delete/insert-into)
- [x] id remapping / clone (`clone.sx`: content/remap-ids + prefix-ids, collision-free compose)
- [x] block query + TOC (`query.sx`: content/select/select-type/count-type/headings)
- [x] block transforms (`transform.sx`: content/map-blocks/map-type/set-field-on)
- [x] TOC rendering (`toc.sx`: content/toc-markdown + toc-html from headings)
- [x] document normalization (`normalize.sx`: content/normalize, drop empty blocks/sections)
- [x] global find/replace (`find-replace.sx`: content/find-replace across text-bearing blocks)
- [x] portable data serialization (`data.sx`: content/to-data + from-data, round-trips tree)
- [x] wire serialization (`wire.sx`: content/to-wire + from-wire, SX-text on the wire)
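
The two escaping extensions at the top of this list depend on replacement order. The Python sketch below shows why; it uses chained `str.replace` calls instead of the recursive character walk the log describes, and the function names are illustrative stand-ins for `String>>htmlEscaped` and `String>>sxEscaped`.

```python
def html_escaped(text: str) -> str:
    # Escape & first: the later replacements introduce new ampersands
    # (&lt; &gt; &quot;) that must not be escaped a second time.
    return (text.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace('"', "&quot;"))


def sx_escaped(text: str) -> str:
    # Escape backslashes first: escaping quotes introduces new backslashes
    # that must not be doubled afterwards.
    return text.replace("\\", "\\\\").replace('"', '\\"')


html_sample = html_escaped('a < b & "c"')
sx_sample = sx_escaped('say "hi" \\ bye')
```

Swapping the order in either function would corrupt the output: escaping `<` before `&` turns `<` into `&amp;lt;`.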
## Progress log
- 2026-06-07 — Extension: global find/replace (`find-replace.sx`).
`content/find-replace` replaces every occurrence of a substring in the text
field of text/heading/code/quote blocks tree-wide (via the transform layer) —
rename a term throughout a doc. Leaves non-text fields (image alt/src) alone;
immutable, case-sensitive. 10 tests; suite 615/615.
- 2026-06-07 — Extension: document normalization (`normalize.sx`).
`content/normalize` drops empty text blocks and empty sections tree-wide;
sections are normalised first so one emptied by the pass is itself removed.
For tidying imported/edited docs; non-text empties (dividers, blank-alt images)
preserved. Inline tree handling; immutable. 11 tests; suite 605/605.
- 2026-06-07 — Extension: table-of-contents rendering (`toc.sx`).
`content/toc-markdown` produces a Markdown bullet list indented by heading
level with `[text](#id)` links; `content/toc-html` produces a `
` of escaped
anchor links (`#id`). Built from `content/headings`; the blog page's TOC
artifact. 8 tests; suite 594/594.
- 2026-06-07 — Extension: tree-wide block transforms (`transform.sx`). The write
counterpart to query: `content/map-blocks` (predicate) / `content/map-type` /
`content/set-field-on` apply a function to every matching block across the tree
(sections rebuilt), for bulk edits (cdn src rewrites, heading-level bumps, text
sanitisation). Inline tree rebuild (no section.sx dep); immutable. 12 tests;
suite 586/586.
- 2026-06-07 — Extension: block query + TOC (`query.sx`). `content/select`
(predicate) / `content/select-type` / `content/count-type` / `content/select-ids`
collect blocks across the whole tree (sections recurse); `content/headings`
derives a table of contents (`{:id :level :text}` per heading, document order).
Inline tree detection (no section.sx dep). 13 tests; suite 574/574.
- 2026-06-07 — Extension: id remapping / clone (`clone.sx`).
`content/remap-ids` deep-rewrites every block id across the tree (sections
recurse) via a function; `content/prefix-ids` prefixes them. Enables
collision-free composition (prefix each doc before concat → validates clean,
where the unprefixed concat has duplicate ids). Content unchanged, only ids;
immutable. 10 tests; suite 561/561.
- 2026-06-07 — Extension: deep tree editing (`tree-edit.sx`). `doc-deep-update`
/ `doc-deep-replace` / `doc-deep-delete` / `doc-deep-insert-into` mutate blocks
anywhere in the nested tree (descending into CtSection children), completing
tree mutation to match the deep-find read path; all immutable. 17 tests; suite
551/551.
- 2026-06-07 — Extension: on-the-wire serialization (`wire.sx`).
`content/to-wire` serialises a document to a transmittable SX-text string (data
form + SX serializer); `content/from-wire` parses it back into a live document.
The whole-document format for persistence / HTTP / federation (distinct from
the per-op persist log); round-trips nested trees + tables; reads externally
authored wire strings. 11 tests; suite 534/534.
- 2026-06-07 — Extension: portable data serialization (`data.sx`).
`content/to-data` converts a document to plain SX data
(`{:id :title :slug :tags :blocks [{:id :type :fields}]}`, sections recursing);
`content/from-data` reconstructs real block objects (section/table handled
specially, others generically via mk-block). Round-trips the full tree +
metadata (render-equal), decoupling storage/transport from the Smalltalk
instance shape. 21 tests; suite 523/523.
- 2026-06-07 — Extension: document composition (`compose.sx`). `content/concat`
/ `content/prepend` / `content/concat-all` combine documents (keeping the
first's id + metadata, concatenating blocks, immutable); `content/wrap-section`
collapses a doc's blocks into a single nested section. For assembling pages
from header/body/footer parts and templates. 17 tests; suite 502/502.
- 2026-06-07 — Extension: SEO-complete page (`page-full.sx`). `content/page-full`
extends content/page with a `lang` attribute and a meta description
drawn from the document excerpt (plain text, escaped, 160 chars), composing the
page/metadata/text layers into the SEO-ready artifact. 4 tests; suite 485/485.
- 2026-06-07 — Extension: Markdown document export (`md-doc.sx`).
`content/markdown-doc` emits a `---` frontmatter block from metadata
(title/slug/tags, only present fields) ahead of the Markdown body, or plain
asMarkdown when there's no metadata. Completes the metadata round-trip:
`md/import ∘ content/markdown-doc` preserves title/slug/tags + blocks. 12
tests; suite 481/481.
- 2026-06-07 — Extension: Markdown frontmatter. `md/import` parses a leading
`---` / `key: value` / `---` block into document metadata (title, slug,
comma-separated tags via `doc-with-meta`) before parsing the body; a `---`
elsewhere stays a divider. Ties the Markdown importer to the metadata layer the
way real blog posts work. +9 tests; suite 469/469.
- 2026-06-07 — Extension: Markdown table import. `md-import.sx` now recognizes a
`| … |` header row followed by a `| --- |` separator and parses a `CtTable`
(cells trimmed, mixed with other blocks via blank-line separation), completing
the Markdown table round-trip (import∘export == identity). +5 tests; suite
460/460.
- 2026-06-07 — Extension: HTML page wrapper (`page.sx`). `content/page` composes
metadata + render into a minimal valid HTML5 document: an escaped `<title>` from
doc metadata (falling back to id) and the rendered blocks as the body.
`content/page-title`. The shippable artifact the blog serves. 7 tests; suite
455/455.
- 2026-06-07 — Extension: table block (`table.sx`). `CtTable` holds headers +
rows (string lists); answers asHTML (escaped `<table>`), asSx, asText, and
asMarkdown: (pipe table with dashed separator row) by folding rows×cells via
nested `inject:into:`. Self-contained (no edits to block.sx/render.sx);
`mk-table`, `table?`, `table-headers/rows`. validate.sx gained a `table` field
case (headers/rows must be lists). 15 tests; suite 448/448.
- 2026-06-07 — Extension: document statistics (`stats.sx`). `content/stats`
returns `{:words :chars :blocks :reading-minutes}`; word/char counts derive
from the tree-accurate `asText` projection, block count from an inline tree
walk (no section.sx dep), reading time at 200 wpm rounded up. Counts descend
into nested sections. 17 tests; suite 433/433.
- 2026-06-07 — Refinement: tree-aware validation. `validate.sx` now flattens the
whole block tree (descending into `CtSection` children, guarding malformed
non-list children) so field checks and duplicate-id detection cover nested
blocks and span section boundaries; added a `section` field-type case. Inline
tree detection (class + st-iv-get) keeps it free of a section.sx dependency.
+6 tests; suite 416/416.
- 2026-06-07 — Extension: nested block trees (`section.sx`). `CtSection` is a
block whose `children` ivar is a list of blocks (incl. nested sections →
arbitrary depth), turning the flat document into the ordered TREE from the
architecture sketch. Self-contained: it answers asHTML/asSx/asText/asMarkdown:
by folding children's renderings (pure polymorphic recursion — no changes to
block.sx/render.sx). `mk-section`, `section-children`, `section-append` (cow),
and tree traversal `doc-deep-find` / `doc-tree-ids` / `doc-tree-count` that
descend into sections. 25 tests; suite 410/410.
- 2026-06-07 — Extension: plain-text render + excerpts (`text.sx`). Fourth
boundary format via polymorphic `asText` (heading/text/code/quote→text,
image→alt, embed/divider→"", list→", "-joined); the document joins non-empty
child texts with a space. `content/render doc "text"`, `content/text`,
`content/excerpt doc n` (first n chars + "…" if truncated). For previews,
meta-descriptions, search indexing. 20 tests; suite 385/385.
- 2026-06-07 — Extension: document metadata (`meta.sx`). CtDoc gained optional
title/slug/tags ivars (declared in doc.sx, default nil/empty, no effect on
block ops). Reads via message dispatch; copy-on-write setters
(`doc-with-title/slug/tags`, `doc-add-tag`, `doc-with-meta`, `doc-new-meta`)
and `content/*` aliases; `doc-meta` returns the metadata dict. Ghost adapter
now carries `:title` through import/export/round-trip. 27 tests; suite 365/365.
- 2026-06-07 — Extension: snapshot cache over op-log replay (`snapshot.sx`).
Snapshots are a cache, never primary state — the log stays the source of truth.
`content/snapshot!` stores a materialised head at a seq in the persist KV;
`content/head-cached` / `content/at-cached` start from the nearest snapshot and
replay only the tail, returning a document IDENTICAL to a full replay (tests
assert transparency before/after snapshot, across versions, and after
drop-snapshot fallback). `content/has-snapshot?` / `snapshot-seq` /
`drop-snapshot!`. 20 tests; suite 338/338.
- 2026-06-07 — Extension: Markdown import adapter (`md-import.sx`), inverse of
asMarkdown. Line-based parser: ATX headings, fenced code (```lang), blockquotes,
unordered/ordered lists (grouping consecutive items), thematic breaks,
paragraphs (consecutive plain lines joined with a space). Sequential ids
b0,b1…. `md/import` / `content/from-markdown` / `markdown-adapter` (import +
asMarkdown export). Round-trips canonical Markdown (import∘export == identity);
imported docs pass validation. 24 tests; suite 318/318.
- 2026-06-07 — Extension: document validation (`validate.sx`). `content/validate`
returns issue dicts `{:id :kind :detail}` (empty = valid); `content/valid?`
and `content/issue-kinds` convenience. Checks block id (non-empty string),
per-type required fields/types (heading level number, image src/alt strings,
list ordered boolean + items list, etc.), unknown block types, and
document-level duplicate ids. Guards imports/edits/federated input. 17 tests;
suite 294/294.
- 2026-06-07 — Extension: durable CRDT replication (`crdt-store.sx`), uniting
Phase 2 (persist) + Phase 3 (CvRDT). Each replica appends its CRDT ops to its
own stream (`crdt::`); `crdt/replay` folds one log into a state,
`crdt/converge` merges every replica's replayed state, `crdt/document` /
`crdt/order` materialise. Converged result is identical regardless of replica
order or duplicate delivery (join + idempotent apply) → offline-capable,
eventually-consistent editing. 14 tests; suite 277/277.
- 2026-06-07 — Extension: Markdown render mode (`markdown.sx`). Third boundary
format alongside asHTML/asSx via the same polymorphic dispatch; blocks answer
`asMarkdown: nl` (boundary supplies the newline — this Smalltalk dialect has
no Character newline ctor). `content/render doc "md"`/`"markdown"`/`:md`,
`content/markdown`, `asMarkdown`. headings (`#`×level), fenced code, `> ` quote,
``, `- `/`1. ` lists, `---`; doc joins blocks with a blank line. No
MD escaping yet. 20 tests; suite 263/263.
- 2026-06-07 — Extension: asSx wire string-escaping. Added `String>>sxEscaped`
(escapes `\`→`\\` then `"`→`\"`) and routed every `asSx` text/attr/list-item
through it, so the SX wire format stays valid when content contains quotes or
backslashes. +5 render tests (expected strings built from `q`/`bs` helpers to
avoid escaping miscounts). Suite 243/243.
- 2026-06-07 — Extension: HTML escaping at the render boundary. Added
`String>>htmlEscaped` (recursive char walk escaping & < > ", order-safe so &
isn't double-escaped) and routed every `asHTML` text/attr through it — heading,
text, code body + language, quote, image src/alt, embed url, list items.
Render stays fully polymorphic in Smalltalk; escaping lives at the boundary.
+8 render tests (incl. `