mono/docs/decoupling-plan.md
giles 094b6c55cd Fix AP blueprint cross-DB queries + harden Ghost sync init
AP blueprints (activitypub.py, ap_social.py) were querying federation
tables (ap_actor_profiles etc.) on g.s which points to the app's own DB
after the per-app split. Now uses g._ap_s backed by get_federation_session()
for non-federation apps.

Also hardens Ghost sync before_app_serving to catch/rollback on failure
instead of crashing the Hypercorn worker.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 14:06:42 +00:00


Rose Ash Decoupling Plan

Context

The four Rose Ash apps (blog, market, cart, events) are tightly coupled through:

  • A shared model layer (blog/shared_lib/models/) containing ALL models for ALL apps
  • Cross-app foreign keys (calendars→posts, cart_items→market_places, calendar_entries→orders, etc.)
  • Post as the universal parent — calendars, markets, page_configs all hang off post_id
  • Internal HTTP calls for menu items, cart summaries, and login adoption

This makes it impossible to attach services to anything other than a Post, and means apps can't have independent databases. The goal is to decouple so apps are independently deployable, new services can be added easily, and the composition of "what's attached to what" is defined in a separate glue layer.


Phase 1: Extract shared_lib out of blog/

What: Move shared infrastructure into a top-level shared/ package. Split models by ownership.

New structure

/root/rose-ash/
  shared/                              # Extracted from blog/shared_lib/
    db/base.py, session.py             # Unchanged
    models/                            # ONLY shared models:
      user.py                          #   User (used by all apps)
      kv.py                            #   KV (settings)
      magic_link.py                    #   MagicLink (auth)
      ghost_membership_entities.py     #   Ghost labels/newsletters/tiers/subscriptions
      menu_item.py                     #   MenuItem (temporary, moves to glue in Phase 4)
    infrastructure/                    # Renamed from shared/
      factory.py                       #   create_base_app()
      internal_api.py                  #   HTTP client for inter-app calls
      context.py                       #   base_context()
      user_loader.py, jinja_setup.py, cart_identity.py, cart_loader.py, urls.py, http_utils.py
    browser/                           # Renamed from suma_browser/
      (middleware, templates, csrf, errors, filters, redis, payments, authz)
    config.py, config/
    alembic/, static/, editor/

  blog/models/                         # Blog-owned models
    ghost_content.py                   #   Post, Author, Tag, PostAuthor, PostTag, PostLike
    snippet.py                         #   Snippet
    tag_group.py                       #   TagGroup, TagGroupTag

  market/models/                       # Market-owned models
    market.py                          #   Product, CartItem, NavTop, NavSub, Listing, etc.
    market_place.py                    #   MarketPlace

  cart/models/                         # Cart-owned models
    order.py                           #   Order, OrderItem
    page_config.py                     #   PageConfig

  events/models/                       # Events-owned models
    calendars.py                       #   Calendar, CalendarEntry, CalendarSlot, Ticket, TicketType

Key changes

  • Update path_setup.py in each app to add project root to sys.path
  • Update all from models import X → from blog.models import X / from shared.models import X etc.
  • Update from db.base import Base → from shared.db.base import Base in every model file
  • Update from shared.factory import → from shared.infrastructure.factory import in each app.py
  • Alembic env.py imports from all locations so Base.metadata sees every table
  • Add a transitional compat layer in old location that re-exports everything (remove later)
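
As a rough illustration of the path_setup.py change, the sketch below puts the project root on sys.path so that `shared` and the per-app `models` packages resolve. The helper name `add_project_root`, the example path, and the assumption that each app directory sits directly under the project root are illustrative and not taken from the existing code.

```python
import sys
from pathlib import Path


def add_project_root(app_file: str, levels_up: int = 1) -> str:
    """Put the directory `levels_up` levels above app_file's folder on sys.path.

    Hypothetical helper: assumes a layout like <root>/blog/path_setup.py,
    so the project root is one level above the app directory.
    """
    root = str(Path(app_file).resolve().parents[levels_up])
    if root not in sys.path:
        sys.path.insert(0, root)  # front of the path so `shared` resolves first
    return root


# In a real path_setup.py this would be add_project_root(__file__).
root = add_project_root("/srv/example/rose-ash/blog/path_setup.py")
```

Calling the helper more than once is harmless because it checks for an existing entry before inserting.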

Critical files to modify

  • blog/app.py (line 9: from shared.factory), market/app.py, cart/app.py, events/app.py
  • blog/shared_lib/shared/factory.py → shared/infrastructure/factory.py
  • Every model file (Base import)
  • blog/shared_lib/alembic/env.py → shared/alembic/env.py
  • Each app's path_setup.py

Verify

  • All four apps start without import errors
  • alembic check produces no diff (schema unchanged)
  • All routes return correct responses
  • Internal API calls between apps still work

Phase 2: Event Infrastructure + Logging

What: Add the durable event system (transactional outbox) and shared structured logging.

2a. DomainEvent model (the outbox)

New file: shared/models/domain_event.py

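from sqlalchemy import Column, DateTime, Integer, String, Text, func
from sqlalchemy.dialects.postgresql import JSONB

from shared.db.base import Base


class DomainEvent(Base):
    __tablename__ = "domain_events"

    id = Column(Integer, primary_key=True)
    event_type = Column(String(128), index=True)    # "calendar.created", "order.completed"
    aggregate_type = Column(String(64))             # "calendar", "order"
    aggregate_id = Column(Integer)                  # ID of the thing that changed
    payload = Column(JSONB)                         # Event-specific data
    # State machine: pending → processing → completed | failed
    state = Column(String(20), default="pending")
    attempts = Column(Integer, default=0)
    max_attempts = Column(Integer, default=5)
    last_error = Column(Text, nullable=True)
    created_at = Column(DateTime, server_default=func.now())
    processed_at = Column(DateTime, nullable=True)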

The critical property: emit_event() writes to this table in the same DB transaction as the domain change. If the app crashes after commit, the event is already persisted. If it crashes before commit, neither the domain change nor the event exists. This is atomic.
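
The atomicity described above can be demonstrated with a small self-contained sketch. It uses the standard-library sqlite3 module instead of the project's async SQLAlchemy session, and the `calendars` table, the `create_calendar` function, and the `crash` flag are stand-ins invented for the demonstration.

```python
import json
import sqlite3


def emit_event(conn, event_type, aggregate_type, aggregate_id, payload):
    """Write an outbox row on the caller's connection; does not commit."""
    conn.execute(
        "INSERT INTO domain_events"
        " (event_type, aggregate_type, aggregate_id, payload, state)"
        " VALUES (?, ?, ?, ?, 'pending')",
        (event_type, aggregate_type, aggregate_id, json.dumps(payload)),
    )


def create_calendar(conn, name, crash=False):
    """Insert a calendar and its event in a single transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        cur = conn.execute("INSERT INTO calendars (name) VALUES (?)", (name,))
        emit_event(conn, "calendar.created", "calendar", cur.lastrowid, {"name": name})
        if crash:
            raise RuntimeError("simulated crash before commit")
    return cur.lastrowid


def count(conn, table):
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calendars (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE domain_events (id INTEGER PRIMARY KEY, event_type TEXT,"
    " aggregate_type TEXT, aggregate_id INTEGER, payload TEXT, state TEXT)"
)

create_calendar(conn, "Spring fair")
try:
    create_calendar(conn, "Doomed", crash=True)
except RuntimeError:
    pass

# Only the committed calendar and its event survive.
print(count(conn, "calendars"), count(conn, "domain_events"))  # prints "1 1"
```

The important design point is that emit_event() never commits on its own; the surrounding service function owns the transaction boundary.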

2b. Event bus

New directory: shared/events/

shared/events/
    __init__.py        # exports emit_event, register_handler, EventProcessor
    bus.py             # emit_event(session, event_type, aggregate_type, aggregate_id, payload)
                       # register_handler(event_type, async_handler_fn)
    processor.py       # EventProcessor: polls domain_events table, dispatches to handlers

emit_event(session, ...): called within service functions; writes to the outbox in the current transaction.

register_handler(event_type, fn): called at app startup (by the glue layer) to register handlers.

EventProcessor runs a background polling loop:

  1. SELECT ... FROM domain_events WHERE state='pending' FOR UPDATE SKIP LOCKED
  2. Run all registered handlers for that event_type
  3. Mark completed or retry on failure
  4. Runs as an asyncio.create_task within each app process (started in factory.py)
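
The dispatch-and-retry behaviour of the loop above might look roughly like the following sketch. It keeps events in a plain Python list rather than selecting rows with FOR UPDATE SKIP LOCKED, and the `Event` dataclass, `process_once`, and `run_until_settled` names are assumptions made for the example.

```python
import asyncio
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

_handlers = defaultdict(list)


def register_handler(event_type, fn):
    _handlers[event_type].append(fn)


@dataclass
class Event:  # in-memory stand-in for a domain_events row
    event_type: str
    payload: dict
    state: str = "pending"
    attempts: int = 0
    max_attempts: int = 5
    last_error: Optional[str] = None


async def process_once(events):
    """One polling pass over pending events."""
    for ev in events:
        if ev.state != "pending":
            continue
        ev.state = "processing"
        ev.attempts += 1
        try:
            for fn in _handlers[ev.event_type]:
                await fn(ev.payload)
            ev.state = "completed"
        except Exception as exc:
            ev.last_error = str(exc)
            # Back to pending for a retry unless the attempt budget is spent.
            ev.state = "failed" if ev.attempts >= ev.max_attempts else "pending"


async def run_until_settled(events, max_passes=10):
    for _ in range(max_passes):
        await process_once(events)
        if all(ev.state in ("completed", "failed") for ev in events):
            break


calls = {"n": 0}


async def flaky(payload):
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("temporary failure")


async def always_fails(payload):
    raise RuntimeError("permanent failure")


register_handler("calendar.created", flaky)
register_handler("order.created", always_fails)

events = [
    Event("calendar.created", {"id": 1}),
    Event("order.created", {"id": 2}, max_attempts=2),
]
asyncio.run(run_until_settled(events))
```

Because a retry re-runs every handler registered for the event type, handlers in this style need to be idempotent.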

2c. Structured logging

New directory: shared/logging/

shared/logging/
    __init__.py
    setup.py           # configure_logging(app_name), get_logger(name)

  • JSON-structured output to stdout (timestamp, level, app, message, plus optional fields: event_type, user_id, request_id, duration_ms)
  • configure_logging(app_name) called in create_base_app()
  • All apps get consistent log format; in production these go to a log aggregator
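
One possible shape for setup.py, using only the standard-library logging module, is sketched below. The `JsonFormatter` class and the optional `stream` parameter are assumptions added so the output can be captured in the example; the plan itself only names configure_logging(app_name) and get_logger(name).

```python
import io
import json
import logging
import sys
from datetime import datetime, timezone

OPTIONAL_FIELDS = ("event_type", "user_id", "request_id", "duration_ms")


class JsonFormatter(logging.Formatter):
    def __init__(self, app_name):
        super().__init__()
        self.app_name = app_name

    def format(self, record):
        entry = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "app": self.app_name,
            "message": record.getMessage(),
        }
        # Fields passed via `extra=` become attributes on the record.
        for key in OPTIONAL_FIELDS:
            if hasattr(record, key):
                entry[key] = getattr(record, key)
        return json.dumps(entry)


def configure_logging(app_name, stream=None):
    """Route all logging through one JSON handler (stdout by default)."""
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonFormatter(app_name))
    root = logging.getLogger()
    for existing in list(root.handlers):
        root.removeHandler(existing)
    root.addHandler(handler)
    root.setLevel(logging.INFO)


def get_logger(name):
    return logging.getLogger(name)


# Demo: capture one log line in a buffer and parse it back.
buffer = io.StringIO()
configure_logging("cart", stream=buffer)
get_logger("checkout").info("order created", extra={"user_id": 7, "duration_ms": 12})
record = json.loads(buffer.getvalue())
```

In the real factory the call would simply be configure_logging(name), leaving the stream at its stdout default.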

2d. Integration

Update shared/infrastructure/factory.py:

  • Call configure_logging(name) at app creation
  • Start EventProcessor as background task in @app.before_serving
  • Stop it in @app.after_serving
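
The start/stop lifecycle could be sketched with plain asyncio as follows. The `EventProcessor` here is a simplified stand-in whose `start()` and `stop()` methods are assumed names; in the real factory they would be called from the @app.before_serving and @app.after_serving hooks rather than from a demo coroutine.

```python
import asyncio


class EventProcessor:
    """Simplified stand-in: counts polling passes instead of reading the outbox."""

    def __init__(self, poll_interval=0.01):
        self.poll_interval = poll_interval
        self.polls = 0
        self._task = None

    async def _run(self):
        while True:
            self.polls += 1  # a real pass would select and dispatch pending events
            await asyncio.sleep(self.poll_interval)

    async def start(self):
        self._task = asyncio.create_task(self._run())

    async def stop(self):
        if self._task is None:
            return
        self._task.cancel()
        try:
            await self._task  # wait for the cancellation to finish
        except asyncio.CancelledError:
            pass
        self._task = None


async def demo():
    processor = EventProcessor()
    await processor.start()    # what a before_serving hook would do
    await asyncio.sleep(0.05)  # the app serves requests in the meantime
    await processor.stop()     # what an after_serving hook would do
    return processor


processor = asyncio.run(demo())
```

Awaiting the cancelled task in stop() keeps shutdown orderly, since the worker does not exit while a polling pass is still unwinding.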

Verify

  • domain_events table exists after migration
  • Call emit_event() in a test, verify row appears in table
  • EventProcessor picks up pending events and marks them completed
  • JSON logs appear on stdout with correct structure
  • No behavioral changes — this is purely additive infrastructure

Phase 3: Generic Container Concept

What: Replace cross-app post_id FKs with container_type + container_id soft references.

Models to change

Calendar (events/models/calendars.py):

# REMOVE: post_id = Column(Integer, ForeignKey("posts.id"), ...)
# REMOVE: post = relationship("Post", ...)
# ADD:
container_type = Column(String(32), nullable=False)  # "page", "market", etc.
container_id = Column(Integer, nullable=False)

MarketPlace (market/models/market_place.py):

# Same pattern: remove post_id FK, add container_type + container_id

PageConfig (cart/models/page_config.py):

# Same pattern

CalendarEntryPost → rename to CalendarEntryContent:

# REMOVE: post_id FK
# ADD: content_type + content_id (generic reference)

From Post model (blog/models/ghost_content.py), remove:

  • calendars relationship
  • markets relationship
  • page_config relationship
  • calendar_entries relationship (via CalendarEntryPost)
  • menu_items relationship (moves to glue in Phase 4)

Helper in shared/containers.py:

class ContainerType:
    PAGE = "page"
    # Future: MARKET = "market", GROUP = "group", etc.

def container_filter(model, container_type, container_id):
    """Return SQLAlchemy filter clauses."""
    return [model.container_type == container_type, model.container_id == container_id]

Three-step migration (non-breaking)

  1. Add columns (nullable) — keeps old post_id FK intact
  2. Backfill: UPDATE calendars SET container_type='page', container_id=post_id; then make both columns NOT NULL
  3. Drop old FK — remove post_id column and FK constraint
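
The first two steps can be tried out against an in-memory SQLite database, as in the sketch below. This is a demonstration of the sequence only: the real migration would be an Alembic revision against the production database, and the sample rows are invented. Step 3 is left as a comment because it should run only after the backfill has been verified.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calendars (id INTEGER PRIMARY KEY, name TEXT, post_id INTEGER)")
with conn:
    conn.executemany(
        "INSERT INTO calendars (name, post_id) VALUES (?, ?)",
        [("Markets", 10), ("Workshops", 11)],
    )

# Step 1: add the new columns as nullable; post_id stays in place.
conn.execute("ALTER TABLE calendars ADD COLUMN container_type TEXT")
conn.execute("ALTER TABLE calendars ADD COLUMN container_id INTEGER")

# Step 2: backfill from the old column in one transaction.
with conn:
    conn.execute("UPDATE calendars SET container_type = 'page', container_id = post_id")

# Check that tightening to NOT NULL would be safe: no row may be left unfilled.
unfilled = conn.execute(
    "SELECT COUNT(*) FROM calendars"
    " WHERE container_type IS NULL OR container_id IS NULL"
).fetchone()[0]

rows = conn.execute(
    "SELECT post_id, container_type, container_id FROM calendars ORDER BY id"
).fetchall()

# Step 3 (dropping post_id and its FK constraint) would follow once
# unfilled == 0 and every query reads the new columns.
```

Old code keeps working between steps because post_id is untouched until the final step.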

Update all queries

Key files that reference Calendar.post_id, MarketPlace.post_id, PageConfig.post_id:

  • events/app.py (~line 108)
  • market/app.py (~line 119)
  • cart/app.py (~line 131)
  • cart/bp/cart/services/checkout.py (lines 77-85, 160-163) — resolve_page_config() and create_order_from_cart()
  • cart/bp/cart/services/page_cart.py
  • cart/bp/cart/api.py

All change from X.post_id == post.id to X.container_type == "page", X.container_id == post.id.

Verify

  • Creating a calendar/market/page_config uses container_type + container_id
  • Cart checkout still resolves correct page config via container references
  • No cross-app FKs remain for these three models
  • Alembic migration is clean

Phase 4: Glue Layer

What: New top-level glue/ package that owns container relationships, navigation, and event handlers.

Structure

/root/rose-ash/glue/
    __init__.py
    models/
        container_relation.py      # Parent-child container relationships
        menu_node.py               # Navigation tree (replaces MenuItem)
    services/
        navigation.py              # Build menu from relationship tree
        relationships.py           # attach_child(), get_children(), detach_child()
    handlers/
        calendar_handlers.py       # on calendar attached → rebuild nav
        market_handlers.py         # on market attached → rebuild nav
        order_handlers.py          # on order completed → confirm calendar entries
        login_handlers.py          # on login → adopt anonymous cart/calendar items
    setup.py                       # Registers all handlers at app startup

ContainerRelation model

class ContainerRelation(Base):
    __tablename__ = "container_relations"
    id, parent_type, parent_id, child_type, child_id, sort_order, label, created_at, deleted_at
    # Unique constraint: (parent_type, parent_id, child_type, child_id)

This is the central truth about "what's attached to what." A page has calendars and markets attached to it — defined here, not by FKs on the calendar/market tables.

MenuNode model (replaces MenuItem)

class MenuNode(Base):
    __tablename__ = "menu_nodes"
    id, container_type, container_id,
    parent_id (self-referential tree), sort_order, depth,
    label, slug, href, icon, feature_image,
    created_at, updated_at, deleted_at

This is a cached navigation tree built FROM ContainerRelations. A page doesn't know it has markets — but its MenuNode has child MenuNodes for the market because the glue layer put them there.

Navigation service (glue/services/navigation.py)

  • get_navigation_tree(session) → nested dict for templates (replaces /internal/menu-items API)
  • rebuild_navigation(session) → reads ContainerRelations, creates/updates MenuNodes
  • Called by event handlers when relationships change
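
To make the nesting step concrete, here is a small sketch that turns flat MenuNode-like rows into a nested structure for templates. It works on plain dicts instead of ORM objects, returns a list of nested dicts, and uses the name `build_navigation_tree`; all three choices are assumptions for the example, as are the sample rows.

```python
from collections import defaultdict


def build_navigation_tree(nodes):
    """nodes: iterable of dicts with id, parent_id, sort_order, label, href."""
    children = defaultdict(list)
    for node in nodes:
        children[node["parent_id"]].append(node)

    def subtree(parent_id):
        ordered = sorted(children.get(parent_id, []), key=lambda n: n["sort_order"])
        return [
            {"label": n["label"], "href": n["href"], "children": subtree(n["id"])}
            for n in ordered
        ]

    return subtree(None)  # root nodes have no parent


nodes = [
    {"id": 1, "parent_id": None, "sort_order": 1, "label": "Home", "href": "/"},
    {"id": 2, "parent_id": None, "sort_order": 0, "label": "About", "href": "/about/"},
    {"id": 3, "parent_id": 1, "sort_order": 0, "label": "Market", "href": "/market/"},
]
tree = build_navigation_tree(nodes)
```

Grouping by parent_id first means each node is visited once, so a single query for all live MenuNodes would be enough to render the menu.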

Relationship service (glue/services/relationships.py)

  • attach_child(session, parent_type, parent_id, child_type, child_id) → creates ContainerRelation + emits container.child_attached event
  • get_children(session, parent_type, parent_id, child_type=None) → query children
  • detach_child(...) → soft delete + emit container.child_detached event
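
The three functions above could behave roughly as in the following in-memory sketch. The `RelationStore` class replaces the database session and the outbox, and reviving a soft-deleted row on re-attach is an assumption made so that the unique constraint on (parent_type, parent_id, child_type, child_id) is never violated.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ContainerRelation:
    parent_type: str
    parent_id: int
    child_type: str
    child_id: int
    deleted_at: Optional[datetime] = None


class RelationStore:
    """In-memory stand-in for the container_relations table plus the outbox."""

    def __init__(self):
        self.relations = []
        self.events = []  # (event_type, payload) tuples in emission order

    def _find(self, key):
        for rel in self.relations:
            if (rel.parent_type, rel.parent_id, rel.child_type, rel.child_id) == key:
                return rel
        return None

    def _emit(self, event_type, key):
        names = ("parent_type", "parent_id", "child_type", "child_id")
        self.events.append((event_type, dict(zip(names, key))))

    def attach_child(self, parent_type, parent_id, child_type, child_id):
        key = (parent_type, parent_id, child_type, child_id)
        rel = self._find(key)
        if rel is not None and rel.deleted_at is None:
            return rel  # already attached: no duplicate row, no duplicate event
        if rel is None:
            rel = ContainerRelation(*key)
            self.relations.append(rel)
        rel.deleted_at = None  # revive a soft-deleted row instead of inserting
        self._emit("container.child_attached", key)
        return rel

    def get_children(self, parent_type, parent_id, child_type=None):
        return [
            rel for rel in self.relations
            if rel.deleted_at is None
            and (rel.parent_type, rel.parent_id) == (parent_type, parent_id)
            and (child_type is None or rel.child_type == child_type)
        ]

    def detach_child(self, parent_type, parent_id, child_type, child_id):
        key = (parent_type, parent_id, child_type, child_id)
        rel = self._find(key)
        if rel is None or rel.deleted_at is not None:
            return False
        rel.deleted_at = datetime.now(timezone.utc)  # soft delete
        self._emit("container.child_detached", key)
        return True


store = RelationStore()
store.attach_child("page", 1, "calendar", 5)
store.attach_child("page", 1, "market", 9)
store.attach_child("page", 1, "market", 9)  # repeat call is a no-op
store.detach_child("page", 1, "calendar", 5)
remaining = [(rel.child_type, rel.child_id) for rel in store.get_children("page", 1)]
```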

Event handlers (the "real code" in the glue layer)

# glue/handlers/calendar_handlers.py
@handler("container.child_attached")
async def on_child_attached(payload, session):
    if payload["child_type"] in ("calendar", "market"):
        await rebuild_navigation(session)

# glue/handlers/order_handlers.py  (Phase 5 but registered here)
@handler("order.created")
async def on_order_created(payload, session):
    # Confirm calendar entries for this order
    ...

# glue/handlers/login_handlers.py  (Phase 5 but registered here)
@handler("user.logged_in")
async def on_user_logged_in(payload, session):
    # Adopt anonymous cart items and calendar entries
    ...

Replace menu_items flow

Old: Each app calls GET /internal/menu-items → coop queries MenuItem → returns JSON

New: Each app calls glue.services.navigation.get_navigation_tree(g.s) → direct DB query of MenuNode

Update context functions in all four app.py files:

# REMOVE: menu_data = await api_get("coop", "/internal/menu-items")
# ADD:    from glue.services.navigation import get_navigation_tree
#         ctx["menu_items"] = await get_navigation_tree(g.s)

Data migration

  • Backfill menu_nodes from existing menu_items + posts
  • Backfill container_relations from existing calendar/market/page_config container references
  • Deprecate (then remove) old MenuItem model and /internal/menu-items endpoint
  • Update menu admin UI (blog/bp/menu_items/) to manage ContainerRelations + MenuNodes

Verify

  • Navigation renders correctly in all four apps without HTTP calls
  • Adding a market to a page (via ContainerRelation) triggers nav rebuild and market appears in menu
  • Adding a calendar to a page does the same
  • Menu admin UI works with new models

Phase 5: Event-Driven Cross-App Workflows

What: Replace remaining cross-app FKs and HTTP calls with event-driven flows.

5a. Replace CalendarEntry.order_id FK with soft reference

# REMOVE: order_id = Column(Integer, ForeignKey("orders.id"), ...)
# ADD:    order_ref_id = Column(Integer, nullable=True, index=True)
# (No FK constraint — just stores the order ID as an integer)

Same for Ticket.order_id. Three-step migration (add, backfill, drop FK).

5b. Replace CartItem.market_place_id FK with soft reference

# REMOVE: market_place_id = ForeignKey("market_places.id")
# ADD:    market_ref_id = Column(Integer, nullable=True, index=True)

5c. Event-driven order completion

Currently cart/bp/cart/services/checkout.py line 166 directly writes CalendarEntry rows (cross-domain). Replace:

# In create_order_from_cart(), instead of direct UPDATE on CalendarEntry:
await emit_event(session, "order.created", "order", order.id, {
    "order_id": order.id,
    "user_id": user_id,
    "calendar_entry_ids": [...],
})

Glue handler picks it up and updates calendar entries via events-domain code.
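
A glue handler for this event might look roughly like the sketch below. The dict of calendar entries stands in for the events-domain tables, and the "pending" and "confirmed" state values are hypothetical, since the plan does not say how a confirmed entry is represented.

```python
import asyncio

# In-memory stand-in for calendar_entries rows, keyed by entry ID.
calendar_entries = {
    1: {"state": "pending", "order_ref_id": None},
    2: {"state": "pending", "order_ref_id": None},
    3: {"state": "pending", "order_ref_id": None},
}


async def on_order_created(payload, session):
    """Confirm the calendar entries named in the event payload."""
    for entry_id in payload["calendar_entry_ids"]:
        entry = session.get(entry_id)
        if entry is None:
            continue  # entry no longer exists; nothing to confirm
        entry["state"] = "confirmed"
        entry["order_ref_id"] = payload["order_id"]  # soft reference, no FK


event_payload = {"order_id": 42, "user_id": 7, "calendar_entry_ids": [1, 2, 99]}
asyncio.run(on_order_created(event_payload, calendar_entries))
```

Assigning fixed values makes the handler safe to run twice, which matters because the event processor retries failed events.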

5d. Event-driven login adoption

Currently blog/bp/auth/routes.py line 265 calls POST /internal/cart/adopt. Replace:

# In magic() route, instead of api_post("cart", "/internal/cart/adopt"):
await emit_event(session, "user.logged_in", "user", user_id, {
    "user_id": user_id,
    "anonymous_session_id": anon_session_id,
})

Glue handler adopts cart items and calendar entries.

5e. Remove cross-domain ORM relationships

From models, remove:

  • Order.calendar_entries (relationship to CalendarEntry)
  • CalendarEntry.order (relationship to Order)
  • Ticket.order (relationship to Order)
  • CartItem.market_place (relationship to MarketPlace)

5f. Move cross-domain queries to glue services

cart/bp/cart/services/checkout.py currently imports CalendarEntry, Calendar, MarketPlace directly. Move these queries to glue service functions that bridge the domains:

  • glue/services/cart_calendar.py — query calendar entries for a cart identity
  • glue/services/page_resolution.py — determine which page/container a cart belongs to using ContainerRelation

Final FK audit after Phase 5

All remaining FKs are either:

  • Within the same app domain (Order→OrderItem, Calendar→CalendarSlot, etc.)
  • To shared models (anything→User)
  • One pragmatic exception: OrderItem.product_id → products.id (cross cart→market, but OrderItem already snapshots title/price, so this FK is just for reporting)
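
The audit rules above can be expressed as a small check, sketched here over a hand-written list of foreign keys. The table names in `TABLE_OWNER` and the sample list are partly assumed (for example order_items, calendar_slots, and users); a real audit would read the foreign keys from the database schema instead.

```python
# Which app (or "shared") owns each table. Table names are illustrative.
TABLE_OWNER = {
    "users": "shared",
    "orders": "cart",
    "order_items": "cart",
    "calendars": "events",
    "calendar_slots": "events",
    "calendar_entries": "events",
    "products": "market",
}

# The one pragmatic exception: OrderItem.product_id → products.id
ALLOWED_EXCEPTIONS = {("order_items", "products")}


def cross_app_fks(foreign_keys):
    """Return the (source_table, target_table) pairs that break the rules."""
    offenders = []
    for source, target in foreign_keys:
        src_owner, dst_owner = TABLE_OWNER[source], TABLE_OWNER[target]
        if src_owner == dst_owner:
            continue  # within the same app domain
        if dst_owner == "shared":
            continue  # anything may reference shared models
        if (source, target) in ALLOWED_EXCEPTIONS:
            continue
        offenders.append((source, target))
    return offenders


foreign_keys = [
    ("order_items", "orders"),
    ("calendar_slots", "calendars"),
    ("orders", "users"),
    ("order_items", "products"),
    ("calendar_entries", "orders"),  # the FK that step 5a removes
]
offenders = cross_app_fks(foreign_keys)
```

Run before Phase 5 the check would flag calendar_entries → orders; after step 5a that pair disappears from the input and the list comes back empty.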

Verify

  • Login triggers user.logged_in event → cart/calendar adoption happens via glue handler
  • Order creation triggers order.created event → calendar entries confirmed via glue handler
  • No cross-app FKs remain (except the pragmatic OrderItem→Product)
  • All apps could theoretically point at separate databases
  • Event processor reliably processes and retries all events

Execution Order

Each phase leaves the system fully working. No big-bang migration.

| Phase | Risk | Size | Depends On |
|---|---|---|---|
| 1. Extract shared_lib | Low (mechanical refactor) | Medium | Nothing |
| 2. Event infra + logging | Low (purely additive) | Small | Phase 1 |
| 3. Generic containers | Medium (schema + query changes) | Medium | Phase 1 |
| 4. Glue layer | Medium (new subsystem, menu migration) | Large | Phases 2 + 3 |
| 5. Event-driven workflows | Medium (behavioral change in checkout/login) | Medium | Phase 4 |

Phases 2 and 3 can run in parallel after Phase 1. Phase 4 needs both. Phase 5 needs Phase 4.
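
The ordering can be checked mechanically from the dependency column. The sketch below uses the standard-library graphlib module (Python 3.9+) to group the phases into waves that can run in parallel; the short `phaseN` keys and the `waves` helper are naming choices for the example.

```python
from graphlib import TopologicalSorter

# Each phase maps to the set of phases it depends on.
DEPENDS_ON = {
    "phase1": set(),
    "phase2": {"phase1"},
    "phase3": {"phase1"},
    "phase4": {"phase2", "phase3"},
    "phase5": {"phase4"},
}


def waves(graph):
    """Group nodes into successive waves; nodes in one wave can run in parallel."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the plan is circular
    result = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        result.append(ready)
        sorter.done(*ready)
    return result


schedule = waves(DEPENDS_ON)
print(schedule)
# prints [['phase1'], ['phase2', 'phase3'], ['phase4'], ['phase5']]
```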