mono/docs/ghost-removal-plan.md
giles 0bb57136d2 Add sexpr.js runtime plan and comprehensive Ghost removal plan
Two planning documents for the next major architectural steps:
- sexpr-js-runtime-plan: isomorphic JS s-expression runtime for
  client-side rendering, content-addressed component caching,
  and native hypermedia mutations
- ghost-removal-plan: full Ghost CMS replacement covering content
  (Lexical→sexp), membership, newsletters, Stripe subscriptions,
  and media uploads

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 12:53:12 +00:00


# Ghost Removal Plan
**Replace Ghost CMS entirely with native infrastructure.**

---
## What Ghost Currently Provides
Ghost is deeply integrated across three major areas:
### 1. Content Management (blog service)
- Post/page storage in Lexical JSON format
- Author and tag entities with many-to-many relationships
- WYSIWYG editing via Ghost Admin API
- Media uploads (images, audio/video, files) via Ghost's upload endpoints
- OEmbed lookups for embedded media
- Content sync: Ghost → local DB via Content API + webhooks
- ActivityPub publishing triggered after post sync
### 2. Membership & Subscriptions (account service)
- **Members**: Ghost is the member store — users have `ghost_id`, synced bidirectionally
- **Labels**: tagging/segmentation of members (M2M via `GhostLabel` / `UserLabel`)
- **Newsletters**: newsletter entities with per-user subscription tracking (`GhostNewsletter` / `UserNewsletter` with `subscribed` flag)
- **Tiers**: membership levels (free/paid) stored in `GhostTier`
- **Subscriptions**: paid plans with Stripe integration — cadence, price, Stripe customer/subscription IDs stored in `GhostSubscription`
- **Bidirectional sync**: Ghost → DB (`sync_all_membership_from_ghost`, `sync_single_member`) and DB → Ghost (`sync_member_to_ghost`)
### 3. Infrastructure
- JWT token generation for Admin API (`ghost_admin_token.py`)
- Webhook handlers for real-time sync (member, post, page, author, tag events)
- Email campaign sending (newsletter selection on publish, `email_segment` parameter)
- Stripe payment processing for paid subscriptions (handled entirely by Ghost)
- Ghost Docker container (Node.js app alongside our Python stack)
### Environment Variables
```
GHOST_API_URL, GHOST_ADMIN_API_URL, GHOST_PUBLIC_URL
GHOST_CONTENT_API_KEY, GHOST_ADMIN_API_KEY
GHOST_WEBHOOK_SECRET
```
### Ghost-Related Files
```
blog/bp/blog/ghost/ghost_sync.py # Content fetch & sync
blog/bp/blog/ghost/ghost_posts.py # Post CRUD via Admin API
blog/bp/blog/ghost/ghost_admin_token.py # JWT generation
blog/bp/blog/ghost/lexical_validator.py # Lexical JSON validation
blog/bp/blog/ghost/editor_api.py # Media upload proxy
blog/bp/blog/ghost_db.py # Ghost DB client
blog/bp/blog/web_hooks/routes.py # Webhook handlers
shared/infrastructure/ghost_admin_token.py # JWT generation (shared copy)
shared/models/ghost_content.py # Post, Author, Tag, junction tables
shared/models/ghost_membership_entities.py # Label, Newsletter, Tier, Subscription
account/services/ghost_membership.py # Membership sync service
```
### Ghost-Related Database Tables
```
# Content
posts (ghost_id, uuid, slug, title, status, lexical, html, ...)
authors (ghost_id, slug, name, email, ...)
tags (ghost_id, slug, name, ...)
post_authors (post_id, author_id, sort_order)
post_tags (post_id, tag_id, sort_order)
# Membership
ghost_labels (ghost_id, name, slug)
user_labels (user_id, label_id)
ghost_newsletters (ghost_id, name, slug, description)
user_newsletters (user_id, newsletter_id, subscribed)
ghost_tiers (ghost_id, name, slug, type, visibility)
ghost_subscriptions (ghost_id, user_id, status, cadence, price_amount,
                     price_currency, stripe_customer_id, stripe_subscription_id,
                     tier_id, raw)
# User model fields
users.ghost_id, users.ghost_status, users.ghost_subscribed,
users.ghost_note, users.ghost_raw, users.stripe_customer_id
```
---
## Problems
- **Two sources of truth** for content AND membership — constant sync overhead
- Every edit round-trips through Ghost's API — we don't own the write path
- Ghost sync is fragile (advisory locks, error recovery, partial sync states)
- Lexical JSON is opaque — we validate but never truly control the format
- Ghost is an entire Node.js application running alongside our Python stack
- Stripe integration is locked inside Ghost — we can't customize payment flows
- Newsletter/email is Ghost-native — no control over templates, scheduling, deliverability
- Membership tiers are Ghost concepts that don't map cleanly to our cooperative model
---
## Target State
Everything Ghost does is handled natively by our services:
| Ghost Feature | Replacement |
|---|---|
| Post/page content | Sexp in `posts.body_sexp` column |
| Lexical editor | WYSIWYG editor saving sexp directly to DB |
| Media uploads | Direct upload to our storage (S3/local) — blog service endpoint |
| Authors | Already in our DB — just drop `ghost_id` column |
| Tags | Already in our DB — just drop `ghost_id` column |
| Members | Already our `users` table — drop Ghost sync, Ghost fields |
| Labels | Rename `ghost_labels` → `labels`, drop `ghost_id` |
| Newsletters | Native newsletter service (see Phase 7 below) |
| Tiers | Native membership tiers on `account` service |
| Subscriptions | Direct Stripe integration on `orders` service (already has SumUp) |
| Email sending | Transactional email service (Postmark/SES/SMTP) |
| Webhooks | Not needed — we own the write path |
| Ghost Docker container | Removed entirely |
---
## Migration Phases
### Phase 1: Lexical → Sexp Converter
Write a one-time conversion script that transforms Lexical JSON into equivalent sexp.
| Lexical Node | S-expression |
|---|---|
| `paragraph` | `(p ...)` |
| `heading` (level 1-6) | `(h1 ...)` ... `(h6 ...)` |
| `text` (plain) | `"string"` |
| `text` (bold) | `(strong "string")` |
| `text` (italic) | `(em "string")` |
| `text` (bold+italic) | `(strong (em "string"))` |
| `text` (code) | `(code "string")` |
| `link` | `(a :href "url" "text")` |
| `list` (bullet) | `(ul (li ...) ...)` |
| `list` (number) | `(ol (li ...) ...)` |
| `quote` | `(blockquote ...)` |
| `image` | `(use "image" :src "url" :alt "text" :caption "text")` |
| `code-block` | `(pre (code :class "language-x" "..."))` |
| `divider` | `(hr)` |
| `embed` | `(use "embed" :url "..." :type "...")` |
Run against all existing posts and verify fidelity by rendering both the Lexical and the sexp version of each post and comparing the HTML output.
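
A minimal sketch of the converter's core recursion is shown here, covering only the inline and simple block rows of the table. The node type names follow the table above; the field names (`tag`, `listType`, `url`, `children`) and the text-format bit flags are assumptions about the Lexical export and should be checked against real post data. Unmapped node types raise, so they surface during the dry run rather than being silently dropped.

```python
import json

# Assumed Lexical text-format bit flags (bold=1, italic=2, code=16).
BOLD, ITALIC, CODE = 1, 2, 16


def quote(text: str) -> str:
    """Emit a double-quoted string literal with escapes (JSON-style)."""
    return json.dumps(text, ensure_ascii=False)


def form(head: str, node: dict) -> str:
    """Wrap the node's converted children in a (head ...) form."""
    kids = " ".join(to_sexp(child) for child in node.get("children", []))
    return f"({head} {kids})" if kids else f"({head})"


def text_to_sexp(node: dict) -> str:
    """Convert a text node, nesting wrappers as (strong (em (code "..."))))."""
    out = quote(node.get("text", ""))
    fmt = node.get("format", 0)
    if fmt & CODE:
        out = f"(code {out})"
    if fmt & ITALIC:
        out = f"(em {out})"
    if fmt & BOLD:
        out = f"(strong {out})"
    return out


def to_sexp(node: dict) -> str:
    kind = node["type"]
    if kind == "root":
        return "\n".join(to_sexp(child) for child in node.get("children", []))
    if kind == "text":
        return text_to_sexp(node)
    if kind == "paragraph":
        return form("p", node)
    if kind == "heading":
        return form(node["tag"], node)  # tag is assumed to be "h1".."h6"
    if kind == "link":
        return form(f"a :href {quote(node['url'])}", node)
    if kind == "list":
        return form("ol" if node.get("listType") == "number" else "ul", node)
    if kind == "listitem":
        return form("li", node)
    if kind == "quote":
        return form("blockquote", node)
    if kind == "divider":
        return "(hr)"
    # Fail loudly so unmapped nodes show up in the dry run.
    raise ValueError(f"unhandled Lexical node type: {kind!r}")
```

Cards such as `image`, `embed`, and `code-block` would follow the same pattern, emitting `(use ...)` or `(pre (code ...))` forms once their attribute names are confirmed.
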
### Phase 2: Schema Changes — Content
- Add `body_sexp` text column to `Post` model (or repurpose `lexical` column)
- Keep all existing metadata columns (title, slug, status, published_at, feature_image, etc.)
- Drop `ghost_id` from `Post`, `Author`, `Tag` tables (after full migration)
- Drop `mobiledoc` column (legacy Ghost format, unused)
### Phase 3: Editor Integration
Update the WYSIWYG editor to save sexp instead of Lexical JSON:
- Editor toolbar actions produce sexp nodes
- Save endpoint writes directly to our DB (no Ghost Admin API call)
- Preview renders via the same sexp pipeline used for the public view
- Draft/publish workflow stays the same — just a `status` column update
### Phase 4: Media Uploads
Replace Ghost's upload proxy with native endpoints on the blog service:
- `POST /admin/upload/image/` — accept image, store to S3/local, return URL
- `POST /admin/upload/media/` — audio/video
- `POST /admin/upload/file/` — generic files
- `GET /admin/oembed/?url=...` — OEmbed lookup (call providers directly)
The editor already posts to proxy endpoints in `editor_api.py` — just retarget them to store directly rather than forwarding to Ghost.
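
One possible shape for the storage step behind these endpoints, sketched for the local-disk case only. `UPLOAD_ROOT`, `PUBLIC_PREFIX`, the allow-list, and the `store_upload` helper are all hypothetical names, not existing code; an S3 backend would replace the `write_bytes` call.

```python
import hashlib
from pathlib import Path

UPLOAD_ROOT = Path("uploads")   # hypothetical local storage root
PUBLIC_PREFIX = "/media"        # hypothetical URL prefix served by the blog service
# Assumed allow-list; the real set of accepted types is a product decision.
ALLOWED = {"image": {".jpg", ".jpeg", ".png", ".gif", ".webp"}}


def store_upload(kind: str, filename: str, data: bytes, root: Path = UPLOAD_ROOT) -> str:
    """Store an uploaded file under a content-derived name and return its URL."""
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED.get(kind, set()):
        raise ValueError(f"{suffix or 'no extension'} not allowed for {kind} uploads")
    digest = hashlib.sha256(data).hexdigest()
    # Naming by content hash ignores client-supplied paths and deduplicates re-uploads.
    rel = Path(kind) / digest[:2] / f"{digest}{suffix}"
    dest = root / rel
    dest.parent.mkdir(parents=True, exist_ok=True)
    if not dest.exists():
        dest.write_bytes(data)
    return f"{PUBLIC_PREFIX}/{rel.as_posix()}"
```

The returned URL is what the editor would embed in the post body, for example as the `:src` of an image component.
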
### Phase 5: Rendering Pipeline
Update `post_data()` and related functions:
- Parse `body_sexp` through the sexp evaluator
- Render to HTML via the existing `shared/sexp/html.py` pipeline
- Components referenced in post content (`use "image-gallery"`, etc.) resolve from the component registry
- Context variables (author data, related posts, etc.) passed as environment bindings
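
To make the parse, render, and component-lookup steps concrete, here is a self-contained toy version. It is a stand-in for illustration only: the real pipeline lives in `shared/sexp/` and its API is not shown in this plan, so `tokenize`, `read`, `render`, and the component signature below are all assumptions. Environment bindings are left out.

```python
import json
import re
from html import escape

TOKEN = re.compile(r'\s*(\(|\)|"(?:\\.|[^"\\])*"|[^\s()"]+)')
VOID = {"hr", "br", "img"}  # elements rendered without a closing tag


class Sym(str):
    """A bare symbol such as p, use, or :src (as opposed to a quoted string)."""


def tokenize(src: str) -> list:
    return TOKEN.findall(src)


def read(tokens: list):
    """Consume one expression from the token list and return it as nested lists."""
    tok = tokens.pop(0)
    if tok == "(":
        items = []
        while tokens[0] != ")":
            items.append(read(tokens))
        tokens.pop(0)  # drop the closing paren
        return items
    if tok.startswith('"'):
        return json.loads(tok)  # assumes JSON-compatible string escapes
    return Sym(tok)


def render(node, components: dict) -> str:
    if isinstance(node, Sym):
        raise ValueError(f"unbound symbol {node!r}: environment lookup is not sketched")
    if isinstance(node, str):
        return escape(node)
    head, *rest = node
    attrs, children = {}, []
    items = iter(rest)
    for item in items:
        if isinstance(item, Sym) and item.startswith(":"):
            attrs[item[1:]] = next(items)  # :key value pair
        else:
            children.append(item)
    if head == "use":
        name, *body = children
        return components[name](attrs, [render(c, components) for c in body])
    attr_html = "".join(f' {k}="{escape(str(v))}"' for k, v in attrs.items())
    if head in VOID:
        return f"<{head}{attr_html}>"
    inner = "".join(render(c, components) for c in children)
    return f"<{head}{attr_html}>{inner}</{head}>"


def image(attrs: dict, _children: list) -> str:
    """Hypothetical registry entry for (use "image" ...)."""
    return f'<img src="{escape(attrs["src"])}" alt="{escape(attrs.get("alt", ""))}">'
```

Calling `render(read(tokenize(body_sexp)), {"image": image})` on a single top-level form returns its HTML; a post body with several top-level forms would loop until the token list is empty.
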
### Phase 6: Membership Decoupling
Migrate membership from Ghost to native account service:
**Labels → native labels:**
- Rename `ghost_labels` → `labels`, drop `ghost_id` column
- `user_labels` stays as-is
- Admin UI manages labels directly (no Ghost sync)
**Tiers → native membership tiers:**
- Rename `ghost_tiers` → `membership_tiers`, drop `ghost_id`
- Add tier management to account admin UI
- Tier assignment logic moves from Ghost webhook handler to account service
**User model cleanup:**
- Drop: `ghost_id`, `ghost_status`, `ghost_subscribed`, `ghost_note`, `ghost_raw`
- Keep: `stripe_customer_id` (needed for direct Stripe integration)
- Add: `membership_tier_id` FK, `membership_status` enum (free/active/cancelled)
### Phase 7: Newsletter System
Replace Ghost's newsletter infrastructure with a native implementation:
**Newsletter model (replaces `ghost_newsletters`):**
```
newsletters (id, name, slug, description, from_email, reply_to, template_sexp, created_at)
user_newsletters (user_id, newsletter_id, subscribed, subscribed_at, unsubscribed_at)
```
**Email sending:**
- Integrate a transactional email provider (Postmark, AWS SES, or direct SMTP)
- Newsletter templates as sexp — rendered to HTML email via the same pipeline
- Send endpoint on account or blog service: select newsletter, select segment (by label/tier), queue sends
- Unsubscribe handling: tokenized unsubscribe links, one-click List-Unsubscribe header
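
The unsubscribe bullet could be implemented roughly as follows: an HMAC-signed token identifies the user and newsletter without a database lookup, and the two headers enable one-click unsubscribe as described in RFC 8058. `SECRET`, `BASE_URL`, and the function names are placeholders for illustration.

```python
import hashlib
import hmac
from email.message import EmailMessage

SECRET = b"change-me"                          # hypothetical signing key
BASE_URL = "https://example.org/unsubscribe"   # hypothetical endpoint


def make_token(user_id: int, newsletter_id: int, secret: bytes = SECRET) -> str:
    """Return 'user.newsletter.signature' so the link needs no server-side state."""
    payload = f"{user_id}.{newsletter_id}"
    sig = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{sig}"


def verify_token(token: str, secret: bytes = SECRET):
    """Return (user_id, newsletter_id) for a valid token, otherwise None."""
    try:
        user_id, newsletter_id, _sig = token.split(".")
        expected = make_token(int(user_id), int(newsletter_id), secret)
    except ValueError:
        return None
    # Constant-time comparison of the whole token, including the signature.
    if not hmac.compare_digest(token.encode(), expected.encode()):
        return None
    return int(user_id), int(newsletter_id)


def add_unsubscribe_headers(msg: EmailMessage, token: str) -> None:
    """Attach the headers mail clients use for one-click unsubscribe."""
    msg["List-Unsubscribe"] = f"<{BASE_URL}?token={token}>"
    msg["List-Unsubscribe-Post"] = "List-Unsubscribe=One-Click"
```

The endpoint behind `BASE_URL` would accept a POST, call `verify_token`, and set `subscribed = false` plus `unsubscribed_at` on the matching `user_newsletters` row.
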
**Post → email campaign:**
- On publish, optionally select newsletter + segment (replaces Ghost's `?newsletter=slug&email_segment=...`)
- Render post body sexp to email-safe HTML (inline styles, table layout for email clients)
- Queue via background task (Celery or async worker)
**What we gain over Ghost:**
- Email templates are sexp — same format as everything else
- Full control over deliverability (SPF/DKIM/DMARC on our domain)
- Segment by any user attribute, not just Ghost's limited filter syntax
- Send analytics stored in our DB
### Phase 8: Subscription & Payment
Replace Ghost's Stripe integration with direct Stripe on the orders service:
**Current state:** Orders service already handles SumUp payments for marketplace/events. Adding Stripe for recurring subscriptions follows the same pattern.
**Implementation:**
- Stripe Checkout for subscription creation (redirect flow, PCI compliant)
- Stripe Webhooks for subscription lifecycle (created, updated, cancelled, payment_failed)
- `subscriptions` table (replaces `ghost_subscriptions`):
```
subscriptions (id, user_id, tier_id, stripe_subscription_id, stripe_customer_id,
               status, cadence, price_amount, price_currency,
               current_period_start, current_period_end, cancelled_at)
```
- Customer portal: link to Stripe's hosted portal for card updates/cancellation
- Webhook handler on orders service (same pattern as SumUp webhooks)
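
As an illustration of the webhook handler's first two steps, the sketch below verifies the `Stripe-Signature` header (an HMAC-SHA256 over `{timestamp}.{payload}`) using only the standard library, then maps subscription events to a status for the `subscriptions` table. In practice the official `stripe` package's `stripe.Webhook.construct_event` would do the verification; the stdlib version is shown so the example runs without dependencies. The function names and the status normalisation are assumptions.

```python
import hashlib
import hmac
import time

TOLERANCE_SECONDS = 300  # reject stale events to limit replay


def verify_signature(payload: bytes, header: str, secret: str, now=None) -> bool:
    """Check a 't=<ts>,v1=<hex>' signature header against the raw request body."""
    timestamp, candidates = None, []
    for item in header.split(","):
        key, _, value = item.strip().partition("=")
        if key == "t":
            timestamp = value
        elif key == "v1":
            candidates.append(value)
    if timestamp is None or not (timestamp.isascii() and timestamp.isdigit()):
        return False
    now = time.time() if now is None else now
    if abs(now - int(timestamp)) > TOLERANCE_SECONDS:
        return False
    signed = timestamp.encode() + b"." + payload
    expected = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()
    return any(hmac.compare_digest(expected.encode(), c.encode()) for c in candidates)


def next_status(event: dict):
    """Return (stripe_subscription_id, status) or None for events we ignore."""
    kind, obj = event["type"], event["data"]["object"]
    if kind in ("customer.subscription.created", "customer.subscription.updated"):
        # Stripe spells it "canceled"; this plan's schema uses "cancelled".
        status = {"canceled": "cancelled"}.get(obj["status"], obj["status"])
        return obj["id"], status
    if kind == "customer.subscription.deleted":
        return obj["id"], "cancelled"
    return None
```

A failed payment normally also arrives as a `customer.subscription.updated` event with a `past_due` status, so the mapping above covers it without reading invoice objects, whose field layout varies by API version.
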
**What we gain:**
- Unified payment handling (SumUp for one-off, Stripe for recurring)
- Custom subscription logic (cooperative membership models, sliding scale, etc.)
- Direct access to Stripe customer data without Ghost intermediary
### Phase 9: Remove Ghost
Delete all Ghost integration code:
| File/Directory | Action |
|---|---|
| `blog/bp/blog/ghost/` | Delete entire directory |
| `blog/bp/blog/ghost_db.py` | Delete |
| `blog/bp/blog/web_hooks/` | Delete |
| `shared/infrastructure/ghost_admin_token.py` | Delete |
| `account/services/ghost_membership.py` | Delete |
| Ghost Docker service | Remove from docker-compose |
| Ghost env vars | Remove all `GHOST_*` variables |
| Ghost webhook blueprint registration | Remove from blog routes |
| Startup sync (`sync_all_content_from_ghost`) | Remove from blog init |
| Startup sync (`sync_all_membership_from_ghost`) | Remove from account init |
| Advisory lock `900001` | Remove from blog init |
Rename model modules and drop the remaining Ghost columns:
- `ghost_content.py` → `content.py`
- `ghost_membership_entities.py` → `membership.py`
- Drop all `ghost_id` columns via Alembic migration
### Phase 10: Content-Addressable Caching (ties into sexpr.js)
Once posts are sexp and the JS client runtime exists:
- Hash post body → content address
- Client caches post bodies in localStorage keyed by hash
- Server sends manifest of slug → hash mappings
- Unchanged posts served entirely from client cache
- Only the data envelope (metadata, component params) travels on repeat visits
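
The server side of this scheme can be sketched in a few lines. SHA-256 over the UTF-8 body is an assumed choice of hash, and the function names are illustrative; the client half belongs to the sexpr.js runtime and is modelled here only as a dict of cached hashes.

```python
import hashlib


def content_address(body_sexp: str) -> str:
    """Hash a post body; identical bodies always map to the same address."""
    return hashlib.sha256(body_sexp.encode("utf-8")).hexdigest()


def build_manifest(posts: dict[str, str]) -> dict[str, str]:
    """Map slug -> content address for every post body the server holds."""
    return {slug: content_address(body) for slug, body in posts.items()}


def stale_slugs(manifest: dict[str, str], cached: dict[str, str]) -> list[str]:
    """Slugs the client must fetch: missing from its cache or changed since."""
    return sorted(slug for slug, digest in manifest.items() if cached.get(slug) != digest)
```

On a repeat visit the client would compare the manifest against its cached hashes and request only the slugs returned by `stale_slugs`.
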
---
## What Stays the Same
- `Post` model and all its metadata fields (minus ghost-specific ones)
- URL structure (`/slug/`)
- Tag, author, and tag group systems
- Draft/publish workflow
- Admin edit UI (updated to save sexp instead of Lexical)
- RSS feeds (rendered from sexp → HTML)
- Search indexing (extract text content from sexp)
- ActivityPub federation (triggered on publish, same as now)
- Alembic migrations (add/modify/drop columns)
- OAuth2 auth system (already independent of Ghost)
---
## Ordering & Dependencies
```
Phase 1-2 (Content schema) ──→ Phase 3 (Editor) ──→ Phase 5 (Rendering)
Phase 4 (Uploads) ──┘
Phase 6 (Membership) ──→ Phase 8 (Payments)
Phase 7 (Newsletters) ── independent, needs email provider choice
Phase 9 (Remove Ghost) ── after all above complete
Phase 10 (Content-addressed) ── after sexpr.js runtime exists
```
Phases 1-5 (content) and Phases 6-8 (membership/payments) can proceed in parallel — they touch different services.
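
The ordering can be checked mechanically by encoding the diagram as a dependency graph and grouping phases into waves that can run in parallel. This is a sketch: the diagram's alignment does not make clear where Phase 4 joins, so treating uploads as a prerequisite of Phase 5 is an assumption, and Phase 10 is left out because it waits on the external sexpr.js runtime.

```python
from graphlib import TopologicalSorter

# Each phase maps to the set of phases it depends on.
DEPENDS_ON = {
    "1-2 content schema": set(),
    "3 editor": {"1-2 content schema"},
    "4 uploads": set(),
    "5 rendering": {"3 editor", "4 uploads"},  # assumption: uploads join here
    "6 membership": set(),
    "7 newsletters": set(),
    "8 payments": {"6 membership"},
}
# Phase 9 runs only after every other phase listed above is complete.
DEPENDS_ON["9 remove ghost"] = set(DEPENDS_ON)


def waves(graph: dict) -> list:
    """Group phases into successive waves whose members have no unmet dependencies."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises CycleError if the plan contradicts itself
    result = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        result.append(ready)
        sorter.done(*ready)
    return result
```

With the graph above, `waves(DEPENDS_ON)` puts the content schema, uploads, membership, and newsletters in the first wave and Ghost removal alone in the last.
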

---
## Risk Mitigation
- **Data safety**: Run Lexical → sexp converter in dry-run mode first, diff HTML output for every post
- **Rollback**: Keep `lexical` column and Ghost running during transition, feature flag to switch renderers
- **Editor UX**: Editor remains WYSIWYG — authors never see sexp syntax
- **SEO continuity**: URLs don't change, HTML output structurally identical
- **Email deliverability**: Set up SPF/DKIM/DMARC before sending first newsletter from our domain
- **Payment migration**: Run Ghost Stripe and direct Stripe in parallel during transition, migrate active subscriptions via Stripe API (change the subscription's application)
- **Membership data**: One-time migration script to clean User model fields, verified against Ghost export
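
For the data-safety check, a dry run could diff the HTML Ghost produced against the HTML rendered from the converted sexp, post by post. The sketch below assumes both renderings are available as strings; the whitespace normalisation is deliberately crude and the function names are illustrative.

```python
import difflib
import re


def normalize(html: str) -> list[str]:
    """Collapse whitespace and break between adjacent tags so diffs stay readable."""
    html = re.sub(r"\s+", " ", html.strip())
    html = re.sub(r">\s*<", ">\n<", html)
    return html.split("\n")


def html_diff(old_html: str, new_html: str, slug: str) -> list[str]:
    """Unified diff between the Lexical-derived and sexp-derived HTML of one post."""
    return list(difflib.unified_diff(
        normalize(old_html), normalize(new_html),
        fromfile=f"{slug} (lexical)", tofile=f"{slug} (sexp)", lineterm="",
    ))


def dry_run(posts) -> dict:
    """posts: iterable of (slug, old_html, new_html); returns diffs for posts that differ."""
    report = {}
    for slug, old_html, new_html in posts:
        diff = html_diff(old_html, new_html, slug)
        if diff:
            report[slug] = diff
    return report
```

An empty report means every post renders identically under both pipelines; anything else lists exactly which posts need a converter fix before the renderer flag is switched.
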
---
## Dependencies
- Stable sexp parser + evaluator (already built: `shared/sexp/`)
- Component registry with post-relevant components: image, embed, gallery, code-block
- Editor sexp serialization (new work)
- Email provider account (Postmark/SES/SMTP)
- Stripe account with recurring billing enabled (may already exist via Ghost)
- Optional: sexpr.js client runtime for content-addressable caching (see `sexpr-js-runtime-plan.md`)