From 1a179de5479611a90aa37dc1a384599f3d1aeb60 Mon Sep 17 00:00:00 2001 From: giles Date: Fri, 27 Feb 2026 09:05:02 +0000 Subject: [PATCH] Add s-expression architecture transformation plan Vision document for migrating rose-ash to an s-expression-based architecture where pages, media renders, and LLM-generated content share a unified DAG execution model with content-addressed caching on IPFS/IPNS. Co-Authored-By: Claude Opus 4.6 --- docs/sexp-architecture-plan.md | 600 +++++++++++++++++++++++++++++++++ 1 file changed, 600 insertions(+) create mode 100644 docs/sexp-architecture-plan.md diff --git a/docs/sexp-architecture-plan.md b/docs/sexp-architecture-plan.md new file mode 100644 index 0000000..0efbdb9 --- /dev/null +++ b/docs/sexp-architecture-plan.md @@ -0,0 +1,600 @@ +# Rose-Ash S-Expression Architecture Transformation + +## Context + +Rose-ash is a federated cooperative platform built as a Quart microservice monorepo (blog, market, cart, events, federation, account, likes, relations). It currently uses Jinja2 templates for rendering, ad-hoc Python code for fragment composition, and HTTP-based inter-service communication (data, actions, fragments, inbox). + +Separately, the art-dag and art-dag-mono repos contain a DAG execution engine, s-expression parser/evaluator, Celery rendering pipeline, and 104k lines of media processing primitives — but are not integrated with rose-ash. + +**The transformation**: Unify these into a single architecture where s-expressions are the application language. Pages and media renders become the same computation — DAG execution over content-addressed nodes. Python drops to the role of runtime primitives (DB, HTTP, IPFS, GPU). The app logic, layouts, components, routes, and data bindings are all expressed in s-expressions. + +**Key insight 1**: Rendering a page and rendering a video are the same function — walk a DAG of content-addressed nodes, check cache, compute what's missing, assemble the result. 
The only difference is the leaf executor (Jinja/HTML vs JAX/pixels). + +**Key insight 2**: S-expressions are the natural output language for an LLM. An LLM trained on the primitive vocabulary generates s-expressions in response to natural language prompts and HTTP requests. The primitives are the guardrails — the LLM can compose freely but can only invoke what's registered. The s-expressions are then resolved and rendered as HTML pages, media assets, or any other output format. The app becomes a conversation: user speaks natural language → LLM speaks s-expressions → resolver renders results. The LLM learns continually from the data flowing through the system — the s-expressions it generates, the content they resolve to, the user interactions they produce. + +--- + +## Architecture Overview + +``` +┌──────────────────────────────────────────────┐ +│ Natural language (the interface) │ +│ user prompts, HTTP requests, AP activities │ +├──────────────────────────────────────────────┤ +│ LLM (the compiler) │ +│ natural language → s-expressions │ +│ trained on primitives, learns from data │ +├──────────────────────────────────────────────┤ +│ S-expressions (the application) │ +│ components, pages, routes, data bindings, │ +│ media effects, composition, federation │ +├──────────────────────────────────────────────┤ +│ Resolver (the engine) │ +│ parse → analyze → plan → execute → cache │ +│ content-addressed at every node │ +├──────────────────────────────────────────────┤ +│ Python primitives (the runtime) │ +│ HTTP/DB/Redis/IPFS/Celery/JAX/AP/OAuth │ +└──────────────────────────────────────────────┘ + +Storage tiers: + Hot: Redis (seconds-minutes, user-specific, ephemeral) + Warm: IPFS (immutable, content-addressed, shared) + Pointers: IPNS (mutable "current version" references) + +Feedback loop: + LLM generates s-expression → resolver executes → result cached + → user interacts → interaction data feeds back to LLM training + → LLM improves its s-expression generation 
+``` + +--- + +## Phase 1: S-Expression Core Library + +**Goal**: A standalone s-expression parser, evaluator, and primitive registry that every service can import. Pure data manipulation — no I/O, no rendering, no HTTP. + +**Source material**: `~/art-dag-mono/core/artdag/sexp/` (parser.py, evaluator.py, compiler.py, primitives.py) + +**Deliverables**: +``` +shared/sexp/ + __init__.py + parser.py # tokenize + parse s-expressions to AST + types.py # SExp, Symbol, Keyword, Atom types + evaluator.py # evaluate with environments, closures, let-bindings + primitives.py # register_primitive decorator + base primitives + env.py # environment/scope management +``` + +**Base primitives** (no I/O, pure transforms): +- `seq`, `list`, `map`, `filter`, `reduce` +- `let`, `lambda`, `defcomp`, `if`/`when`/`cond` +- `str`, `concat`, `format` +- `slot` (access keyword fields from data) +- Arithmetic, comparison, logic + +**Tasks**: +1. Port parser from art-dag-mono, adapt to rose-ash conventions +2. Port evaluator, add `defcomp` (component definition) and `defroute` forms +3. Define type system (SExp, Symbol, Keyword, String, Number, List, Nil) +4. Implement environment/scope chain +5. Write primitive registry with `@register_primitive` decorator +6. Unit tests + +**Verification**: Pure unit tests — parse → evaluate → assert result. + +--- + +## Phase 2: HTML Renderer + +**Goal**: An HSX-style renderer that walks an s-expression tree and emits HTML strings. Handles elements, attributes, components, fragments, raw HTML, escaping. + +**Reference**: HSX (Common Lisp) — s-expressions map directly to HTML elements. 
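As a rough illustration of the Phase 2 goal, the sketch below renders an already-parsed tree to an HTML string. It is not the HSX implementation or the planned `html.py`: the `Symbol` and `Keyword` classes are hypothetical stand-ins for the types Phase 1 would provide, and the tree is assumed to be plain nested Python lists.

```python
from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class Symbol:
    """Hypothetical stand-in for the Symbol type planned in types.py."""
    name: str


@dataclass(frozen=True)
class Keyword:
    """Hypothetical stand-in for the Keyword type (written :name in source)."""
    name: str


VOID_ELEMENTS = {"img", "br", "input", "meta", "link"}


def render(node) -> str:
    """Render a parsed s-expression tree (nested lists) to an HTML string."""
    if node is None:
        return ""
    if isinstance(node, str):
        # Text child: html.escape covers &, <, >, " and '.
        return escape(node, quote=True)
    if isinstance(node, (int, float)):
        return str(node)
    if not (isinstance(node, list) and node and isinstance(node[0], Symbol)):
        raise TypeError(f"cannot render {node!r}")

    tag, rest = node[0].name, node[1:]
    if tag == "raw!":
        # Trusted HTML is passed through without escaping.
        return "".join(str(part) for part in rest)

    attrs, children = [], []
    i = 0
    while i < len(rest):
        item = rest[i]
        if isinstance(item, Keyword):
            if i + 1 >= len(rest):
                raise ValueError(f":{item.name} has no value")
            value = rest[i + 1]
            if value is True:
                attrs.append(f" {item.name}")  # boolean attribute
            elif value is not False and value is not None:
                attrs.append(f' {item.name}="{escape(str(value), quote=True)}"')
            i += 2
        else:
            children.append(render(item))
            i += 1

    if tag == "<>":
        return "".join(children)  # fragment: no wrapper element
    open_tag = f"<{tag}{''.join(attrs)}>"
    if tag in VOID_ELEMENTS:
        return open_tag  # void elements have no closing tag
    return f"{open_tag}{''.join(children)}</{tag}>"
```

Under these assumptions, `render([Symbol("p"), "<script>"])` yields escaped text, while `raw!` is the only path that skips escaping, which keeps the trusted-HTML boundary in one place.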
+ +**Deliverables**: +``` +shared/sexp/ + html.py # s-expression → HTML string renderer + escape.py # attribute and text escaping (XSS prevention) + components.py # defcomp registry, component resolution +``` + +**Syntax conventions**: +```scheme +(div :class "foo" :id "bar" ;; HTML element with attributes + (h1 "Title") ;; text children + (p "Paragraph")) + +(defcomp ~card (&key title children) ;; component (~ prefix) + (div :class "card" + (h2 title) + children)) + +(~card :title "Hello" ;; component invocation + (p "Body")) + +(<> (li "One") (li "Two")) ;; fragment (no wrapper element) + +(raw! "trusted") ;; unescaped HTML + +(when condition (p "shown")) ;; conditional rendering +(map fn items) ;; list rendering +``` + +**Tasks**: +1. Implement HTML element rendering (tag, attributes, children) +2. Implement text escaping (prevent XSS — escape &, <, >, ", ') +3. Implement `raw!` for trusted HTML (existing Jinja `| safe` equivalent) +4. Implement fragment (`<>`) rendering +5. Implement `defcomp` / component registry and invocation +6. Implement void elements (img, br, input, meta, link) +7. Boolean attributes (disabled, checked, required) +8. Unit tests — render s-expression, assert HTML output + +**Verification**: Render existing fragment templates as s-expressions, diff against current Jinja output. + +--- + +## Phase 3: Async Resolver + +**Goal**: Walk an s-expression tree, identify nodes that need I/O (service fragments, data queries), fetch them in parallel, substitute results. This is the DAG execution engine applied to page rendering. 
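The identify, fetch, substitute cycle in this goal could look roughly like the sketch below. It assumes the tree is nested Python lists whose first element is a head string, and that each I/O head maps to an async callable. `fake_frag` is a hypothetical stand-in for the existing `fetch_fragment` helper, whose real signature is not shown in this plan.

```python
import asyncio


def collect_io_nodes(tree, io_heads, found):
    """Walk the tree and collect every list whose head is an I/O primitive."""
    if not isinstance(tree, list) or not tree:
        return
    head = tree[0]
    if isinstance(head, str) and head in io_heads:
        found.append(tree)
        return
    for child in tree:
        collect_io_nodes(child, io_heads, found)


def substitute(tree, resolved):
    """Return a copy of the tree with I/O nodes replaced by their results."""
    if isinstance(tree, list):
        if id(tree) in resolved:
            return resolved[id(tree)]
        return [substitute(child, resolved) for child in tree]
    return tree


async def resolve(tree, fetchers):
    """Fetch all independent I/O nodes concurrently, then substitute results."""
    nodes = []
    collect_io_nodes(tree, fetchers, nodes)
    # asyncio.gather returns results in the same order as its arguments.
    results = await asyncio.gather(
        *(fetchers[node[0]](*node[1:]) for node in nodes)
    )
    resolved = {id(node): result for node, result in zip(nodes, results)}
    return substitute(tree, resolved)


async def fake_frag(service, kind):
    """Hypothetical fetcher; a real one would call the owning service."""
    await asyncio.sleep(0)
    return f"[{service}/{kind}]"


page = ["div", ["frag", "cart", "cart-mini"], ["p", "hello"],
        ["frag", "blog", "nav-tree"]]
resolved_page = asyncio.run(resolve(page, {"frag": fake_frag}))
```

This version treats every I/O node as independent. Nodes whose arguments depend on other fetches would need a second pass or a dependency ordering, which the sketch does not attempt.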
+ +**Source material**: +- `~/art-dag-mono/core/artdag/engine.py` (analyze → plan → execute) +- `~/rose-ash/shared/infrastructure/fragments.py` (fetch_fragment, fetch_fragments, fetch_fragment_batch) + +**Deliverables**: +``` +shared/sexp/ + resolver.py # async tree walker — identify, fetch, substitute + cache.py # content-addressed caching (SHA3-256 → Redis/IPFS) + primitives_io.py # I/O primitives (frag, query, action) +``` + +**I/O primitives** (async, registered separately from pure primitives): +- `(frag service type :key val ...)` → fetch_fragment +- `(query service query-name :key val ...)` → fetch_data +- `(action service action-name :key val ...)` → call_action +- `(current-user)` → load user from request context +- `(htmx-request?)` → check HX-Request header + +**Resolution strategy**: +1. Parse the s-expression tree +2. Walk the tree, identify all `frag`/`query` nodes +3. Group independent fetches, dispatch via `asyncio.gather()` +4. Substitute results into the tree +5. Render resolved tree to HTML + +**Content addressing**: +- Hash each subtree (SHA3-256 of the s-expression text) +- Check Redis (hot cache) → check IPFS (warm cache) → compute +- Cache rendered subtrees at configurable granularity + +**Tasks**: +1. Implement async tree walker with parallel fetch grouping +2. Port content-addressed caching from art-dag-mono/core/artdag/cache.py +3. Implement `frag` primitive (wraps existing fetch_fragment) +4. Implement `query` primitive (wraps existing fetch_data) +5. Implement `action` primitive (wraps existing call_action) +6. Implement request-context primitives (current-user, htmx-request?) +7. Integration tests against running services + +**Verification**: Render a blog post page via resolver, compare output to current Jinja render. + +--- + +## Phase 4: Bridge — Coexistence with Jinja + +**Goal**: Allow s-expression components and Jinja templates to coexist. Migrate incrementally — one component at a time, one page at a time. 
+ +**Deliverables**: +``` +shared/sexp/ + jinja_bridge.py # Jinja filter/global to render s-expressions in templates + # + helper to embed Jinja output in s-expressions via raw! +``` + +**Bridge patterns**: + +```python +# In Jinja: render an s-expression component +{{ sexp('(~link-card :slug "apple" :title "Apple")') | safe }} + +# In s-expression: embed existing Jinja template output +(raw! (jinja "fragments/nav_tree.html" :items nav-items)) +``` + +**Migration order for fragments** (leaf nodes first): +1. `link-card` (blog, market, events, federation) — simplest, self-contained +2. `cart-mini` — small, user-specific +3. `auth-menu` — small, user-specific +4. `nav-tree` — recursive structure, good test of composition +5. `container-nav`, `container-cards` — cross-service composites + +**Tasks**: +1. Implement `sexp()` Jinja global function +2. Implement `jinja` s-expression primitive +3. Rewrite `link-card` fragment as s-expression component (all services) +4. Rewrite `cart-mini` and `auth-menu` fragments +5. Rewrite `nav-tree` fragment +6. Rewrite `container-nav` and `container-cards` fragments +7. Verify each rewritten fragment produces identical HTML + +**Verification**: A/B test — render via Jinja, render via s-expression, diff output. + +--- + +## Phase 5: Page Layouts as S-Expressions + +**Goal**: Replace Jinja template inheritance (`{% extends %}`, `{% block %}`) with s-expression component composition. Layouts become components with slots. + +**Current template hierarchy**: +``` +_types/root/index.html → base HTML shell + _types/root/_index.html → layout with aside, filter, content slots + _types/blog/index.html → blog-specific layout + _types/post/index.html → post page +``` + +**Becomes**: +```scheme +(defcomp ~base-layout (&key title user cart &rest content) + (html :lang "en" + (head (title title) ...) 
+ (body :class "min-h-screen" + (~header :user user :cart cart) + (main :class "flex" content)))) + +(defcomp ~app-layout (&key title user cart aside filter content) + (~base-layout :title title :user user :cart cart + (when filter (div :id "filter" filter)) + (aside :id "aside" aside) + (section :id "main-panel" content))) + +(defcomp ~post-page (&key post nav-items user cart) + (~app-layout + :title (:slot post :title) + :user user :cart cart + :aside (~nav-tree :items nav-items) + :content + (article + (h1 (:slot post :title)) + (div :class "prose" (raw! (:slot post :body)))))) +``` + +**OOB updates** (HTMX partial renders): +```scheme +(defcomp ~post-oob (&key post nav-items) + (<> + (div :id "filter" :hx-swap-oob "outerHTML" + (~post-filter :post post)) + (aside :id "aside" :hx-swap-oob "outerHTML" + (~nav-tree :items nav-items)) + (section :id "main-panel" + (article ...)))) +``` + +**Tasks**: +1. Define `~base-layout` component (replaces `_types/root/index.html`) +2. Define `~app-layout` component (replaces `_types/root/_index.html`) +3. Define `~header` component (replaces header block) +4. Define OOB rendering pattern (replaces `oob_elements.html`) +5. Rewrite blog post page as s-expression +6. Rewrite market product page +7. Rewrite cart page +8. Rewrite events calendar page +9. Update route handlers to use resolver instead of render_template +10. Remove migrated Jinja templates + +**Verification**: Visual comparison — deploy both paths, screenshot diff. + +--- + +## Phase 6: Routes as S-Expressions + +**Goal**: Route definitions move from Python decorators + handler functions to s-expression declarations. Python route handlers become thin dispatchers. 
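One possible shape for the path matching behind such a dispatcher is sketched below. It assumes route patterns use `:name` segments and that the registry is a simple in-memory list. `register_route` is a hypothetical helper, and this `match_route` returns the bound path parameters alongside the expression, which is an assumption rather than a settled interface.

```python
import re

# Hypothetical in-memory registry of (compiled pattern, route expression).
ROUTES = []


def compile_pattern(pattern: str) -> re.Pattern:
    """Turn '/blog/:slug/' into a regex with one named group per ':param'.

    Parameter names must be valid Python identifiers, because they become
    regex group names.
    """
    parts = []
    for segment in pattern.split("/"):
        if segment.startswith(":"):
            parts.append(f"(?P<{segment[1:]}>[^/]+)")
        else:
            parts.append(re.escape(segment))
    return re.compile("^" + "/".join(parts) + "$")


def register_route(pattern: str, expr: str) -> None:
    """Record a route; expr stands in for the parsed route body."""
    ROUTES.append((compile_pattern(pattern), expr))


def match_route(path: str):
    """Return (expr, params) for the first matching route, or None."""
    for regex, expr in ROUTES:
        match = regex.match(path)
        if match:
            return expr, match.groupdict()
    return None
```

First-match-wins keeps the registry trivial, at the cost of making registration order significant when patterns overlap.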
+ +**Current**: +```python +@post_bp.get("/<slug>/") +async def post_view(slug): + post = await services.blog.get_post_by_slug(g.s, slug) + ctx = await post_data(slug, g.s) + if is_htmx_request(): + return render_template("_types/post/_oob_elements.html", **ctx) + return render_template("_types/post/index.html", **ctx) +``` + +**Becomes**: +```scheme +(defroute "/blog/:slug/" + (let ((post (query blog post-by-slug :slug slug)) + (nav (query blog nav-tree)) + (user (current-user)) + (cart (when user (query cart cart-summary :user_id (:slot user :id))))) + (if (htmx-request?) + (render (~post-oob :post post :nav-items nav)) + (render (~post-page :post post :nav-items nav :user user :cart cart))))) +``` + +**Python dispatcher**: +```python +# One generic route handler per service +@bp.route("/<path:path>") +async def dispatch(path): + route_expr = match_route(path) # find matching defroute + return await resolve_and_render(route_expr, request) +``` + +**Tasks**: +1. Implement `defroute` form with path pattern matching +2. Implement route registry (load s-expression route files at startup) +3. Implement request context binding (path params, query params, headers) +4. Write generic Quart dispatcher +5. Migrate blog routes +6. Migrate market routes +7. Migrate cart routes +8. Migrate events routes +9. Migrate account/federation routes + +**Verification**: Full integration test (HTTP requests produce correct responses). + +--- + +## Phase 7: Content Addressing + IPFS + IPNS + +**Goal**: Resolved fragment trees are content-addressed and cached on IPFS. IPNS provides mutable pointers to current versions. Cache invalidation becomes IPNS pointer updates. 
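A minimal sketch of this idea follows, with plain dictionaries standing in for Redis, IPFS, and IPNS. All class and method names are hypothetical. One simplification to flag: a real IPFS CID is derived from the stored bytes rather than from the s-expression, so an actual implementation would need some index from expression hash to CID. Here the expression hash is used directly as the key.

```python
import hashlib


def sexp_key(source: str) -> str:
    """SHA3-256 hex digest of the s-expression text."""
    return hashlib.sha3_256(source.encode("utf-8")).hexdigest()


class TwoTierCache:
    """Dict-backed stand-in for the hot (Redis) and warm (IPFS) tiers."""

    def __init__(self):
        self.hot = {}       # stand-in for Redis: short-lived entries
        self.warm = {}      # stand-in for IPFS: immutable entries
        self.pointers = {}  # stand-in for IPNS: stable name -> current key

    def get_or_compute(self, source, compute):
        """Check hot, then warm, then compute and store in both tiers."""
        key = sexp_key(source)
        if key in self.hot:
            return self.hot[key]
        if key in self.warm:
            self.hot[key] = self.warm[key]  # promote to the hot tier
            return self.warm[key]
        result = compute(source)
        self.warm[key] = result
        self.hot[key] = result
        return result

    def publish(self, name, source):
        """Point a stable name at the current version of some content."""
        self.pointers[name] = sexp_key(source)

    def resolve_name(self, name):
        """Follow the pointer to whatever it currently references."""
        return self.warm[self.pointers[name]]
```

Because entries are keyed by content, changed content never overwrites an old entry. Invalidation reduces to calling `publish` again with the new expression, while older versions stay retrievable by their key. Hashing raw text also means two expressions that differ only in whitespace get different keys, so a canonical printed form would likely be needed before hashing.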
+ +**Source material**: +- `~/rose-ash/shared/utils/ipfs_client.py` (already in rose-ash) +- `~/rose-ash/shared/utils/anchoring.py` (merkle trees, OTS) +- `~/art-dag-mono/core/artdag/cache.py` (content-addressed caching) + +**Two-tier caching**: +``` +Hot tier (Redis): user-specific, short TTL, ephemeral + key: sha3(s-expression) → rendered HTML + examples: cart-mini, auth-menu + +Warm tier (IPFS): deterministic, immutable, shared + CID: sha3(s-expression) → rendered HTML on IPFS + IPNS name: stable identity → current CID + examples: post-body, nav-tree, link-card, full pages +``` + +**Invalidation**: +- Content changes → new s-expression → new hash → new CID +- Service publishes new CID to IPNS name +- ActivityPub `Update` activity propagates to federated instances +- No TTL-based expiry for warm tier — immutable content, versioned pointers + +**Tasks**: +1. Implement SHA3-256 hashing of s-expression subtrees +2. Implement two-tier cache lookup (Redis → IPFS → compute) +3. Implement IPNS name management per fragment type +4. Implement cache warming (pre-render and pin stable content) +5. Wire invalidation into event bus (content change → IPNS update) +6. Wire IPNS updates into AP federation (Update activities) + +**Verification**: Cache hit rates, IPFS pin counts, IPNS resolution latency. + +--- + +## Phase 8: Media Pipeline Integration + +**Goal**: Bring art-dag's Celery rendering pipeline into rose-ash as the `render` service. Same s-expression language, same resolver, different leaf executors (JAX/FFmpeg instead of HTML). 
+ +**Source material**: +- `~/art-dag-mono/l1/` (Celery app, tasks, sexp_effects, streaming) +- `~/art-dag-mono/core/artdag/` (engine, analysis, planning, nodes, effects) + +**Deliverables**: +``` +rose-ash/ + render/ # new service + celery_app.py # from art-dag-mono/l1/celery_app.py + tasks/ # from art-dag-mono/l1/tasks/ + sexp_effects/ # from art-dag-mono/l1/sexp_effects/ + primitives.py # 104k lines of media primitives + interpreter.py + wgsl_compiler.py # GPU shaders + effects/ # effect plugins + streaming/ # video streaming output + Dockerfile + Dockerfile.gpu + docker-compose.yml +``` + +**Integration points**: +- `(render-media expr)` primitive dispatches to Celery task +- Results stored on IPFS, CID returned +- Event bus activity emitted on completion +- Same content-addressing — same s-expression → same output CID + +**Tasks**: +1. Move L1 codebase into `rose-ash/render/` +2. Adapt imports to use `shared/sexp/` parser (replace local copy) +3. Register media primitives alongside web primitives +4. Implement `render-media` primitive (dispatch to Celery) +5. Wire Celery task completion into event bus +6. Integration tests — submit recipe, verify output on IPFS + +**Verification**: Submit an s-expression media recipe via the resolver, get back an IPFS CID with the rendered output. + +--- + +## Phase 9: Unified DAG Executor + +**Goal**: One execution engine that handles both page renders and media renders. The executor dispatches to different primitive sets based on node type, but the resolution, caching, and content-addressing logic is shared. 
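To make the shared-logic claim concrete, here is a small sketch of an executor registry keyed by node type with one cache in front of every executor. The names `register_executor`, `run_html`, and `run_media` are hypothetical, the node types are placeholders, and execution is sequential for brevity where the plan calls for parallel dispatch.

```python
import hashlib

# Hypothetical registry: node type -> callable taking resolved arguments.
EXECUTORS = {}


def register_executor(node_type):
    """Decorator that registers a leaf executor for one node type."""
    def decorator(fn):
        EXECUTORS[node_type] = fn
        return fn
    return decorator


@register_executor("html")
def run_html(args):
    """Placeholder HTML executor: joins its arguments into a paragraph."""
    return "<p>" + " ".join(args) + "</p>"


@register_executor("media")
def run_media(args):
    """Placeholder media executor: a real one would return an output CID."""
    return "media(" + ",".join(args) + ")"


def execute(node, cache):
    """Resolve children first, then run this node through the shared cache."""
    node_type, *args = node
    resolved = [execute(a, cache) if isinstance(a, list) else a for a in args]
    # The key covers the node type and its resolved inputs, so identical
    # work hits the cache whichever executor produced it.
    key = hashlib.sha3_256(repr((node_type, resolved)).encode("utf-8")).hexdigest()
    if key not in cache:
        cache[key] = EXECUTORS[node_type](resolved)
    return cache[key]
```

Calling `execute(["html", "Watch:", ["media", "clip.mp4"]], {})` runs the media node, feeds its result to the HTML node, and stores both results in the same cache, which is the mixed-mode case this phase is aiming for.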
+ +**Deliverables**: +``` +shared/sexp/ + executor.py # unified DAG executor + registry.py # executor registry (HTML executors, media executors) +``` + +**Unified flow**: +``` +input s-expression + → parse (shared/sexp/parser.py) + → analyze (identify node types, owners, dependencies) + → plan (check cache tiers, determine fetch/compute order) + → execute (dispatch to registered executors in parallel) + → cache (store results at appropriate tier) + → return (HTML string, media CID, or composed result) +``` + +**Tasks**: +1. Abstract the resolver into a generic DAG executor +2. Implement executor registry (register by node type/prefix) +3. Register HTML executors (frag, query, render, defcomp) +4. Register media executors (transcode, filter, compose, source) +5. Implement mixed-mode execution (page with embedded media) +6. Provenance tracking (link executor output to AP activities) + +**Verification**: A single `resolve()` call handles a page that contains both HTML components and embedded media references. + +--- + +## Phase 10: Federation of S-Expressions + +**Goal**: S-expression components and pages are first-class ActivityPub objects. Remote instances can fetch, render, cache, and re-style federated content expressed as s-expressions. + +**Integration with existing AP infrastructure**: +- `shared/infrastructure/activitypub.py` — actor endpoints +- `shared/events/bus.py` — activity emission +- `shared/utils/ipfs_client.py` — content storage +- `shared/utils/anchoring.py` — provenance + +**Tasks**: +1. Define AP object type for s-expression content (`rose:SExpression`) +2. Publish component definitions as AP Create activities +3. Federate page updates as AP Update activities with IPNS pointers +4. Implement remote component resolution (fetch s-expr from remote instance) +5. Implement content verification (signature on s-expression CID) +6. 
Implement re-styling (apply local theme to remote s-expression) + +**Verification**: Instance A publishes a post, Instance B resolves and renders it from the federated s-expression. + +--- + +## Phase 11: LLM as S-Expression Compiler + +**Goal**: An LLM trained on the primitive vocabulary generates s-expressions from natural language. Users describe what they want in plain English. The LLM outputs valid s-expressions. The resolver renders them. Pages are generated on the fly from conversation. + +**The LLM speaks s-expressions because**: +- The primitive vocabulary is small and well-defined (unlike HTML/CSS/JS) +- S-expressions are structurally simple — easy for an LLM to generate correctly +- The resolver validates and sandboxes — the LLM can't produce unsafe output +- Every generated s-expression is content-addressed — same prompt → cacheable result +- The primitives are the training data — the LLM learns what `~product-card`, `~nav-tree`, `(query market ...)` do from examples + +**How it works**: + +``` +User: "show me a page of seasonal vegetables under £5" + +LLM generates: +(~app-layout :title "Seasonal Vegetables Under £5" + :content + (let ((products (query market products-search + :category "vegetables" + :seasonal true + :max_price 5.00 + :sort "price-asc"))) + (div :class "grid grid-cols-3 gap-4" + (if (empty? products) + (p :class "text-gray-500" "No vegetables found matching your criteria.") + (map (lambda (p) (~product-card :slug (:slot p :slug))) + products))))) + +Resolver: parse → fetch products → render cards → HTML page +``` + +**Three modes of LLM integration**: + +1. **Generative pages**: User prompt → LLM → s-expression → rendered page + - Conversational UI: user refines via follow-up prompts + - Each generated page is a CID on IPFS — shareable, cacheable + +2. 
**Adaptive layouts**: LLM observes user behavior → generates personalized component arrangements + - Home page adapts: frequent buyer sees cart-heavy layout + - Event organizer sees calendar-first layout + - Same primitives, different composition + +3. **Content authoring**: LLM assists in creating blog posts, product descriptions, event listings + - Author describes intent → LLM generates structured s-expression content + - Content is data (s-expression), not just text — queryable, composable, versionable + +**Training the LLM on primitives**: +- Primitive catalog: every registered primitive with its signature, description, examples +- Component library: every `defcomp` with usage examples +- Query catalog: every `(query service name)` with parameter schemas and return types +- Interaction logs: successful s-expressions that produced good user outcomes +- Continuous learning: new primitives/components automatically extend the vocabulary + +**Safety model**: +- S-expressions can only invoke registered primitives — no arbitrary code execution +- The resolver validates the tree before execution +- I/O primitives respect existing auth (HMAC, OAuth, user context) +- Rate limiting on LLM generation endpoint +- Content-addressed caching prevents regeneration of identical requests +- Generated s-expressions are logged as AP activities (provenance tracking) + +**Deliverables**: +``` +shared/sexp/ + llm.py # LLM integration — prompt → s-expression generation + catalog.py # primitive/component catalog for LLM context + validation.py # validate generated s-expressions before execution + +rose-ash/ + llm/ # LLM service (or integration with external LLM API) + routes.py # conversational endpoint + training.py # continuous learning from interaction data + prompts/ # system prompts with primitive catalog +``` + +**Tasks**: +1. Build primitive catalog generator (introspect registry → structured docs) +2. 
Build component catalog generator (introspect defcomp registry → examples) +3. Build query catalog generator (introspect data endpoints → schemas) +4. Design system prompt that teaches LLM the s-expression grammar + primitives +5. Implement generation endpoint (natural language → s-expression) +6. Implement validation layer (parse + type-check generated expressions) +7. Implement conversational refinement (user feedback → modified s-expression) +8. Implement caching of generated s-expressions (prompt hash → CID) +9. Wire into AP for provenance (LLM-generated content attributed to LLM actor) +10. Implement feedback loop (interaction data → training signal) + +**Verification**: User prompt → generated page → visual inspection + primitive coverage audit. + +--- + +## Summary of Phases + +| Phase | What | Depends On | Scope | +|-------|------|-----------|-------| +| 1 | S-expression core library | — | `shared/sexp/` | +| 2 | HTML renderer (HSX-style) | 1 | `shared/sexp/html.py` | +| 3 | Async resolver | 1, 2 | `shared/sexp/resolver.py` | +| 4 | Jinja bridge + fragment migration | 2, 3 | All services' fragment templates | +| 5 | Page layouts as s-expressions | 4 | All services' page templates | +| 6 | Routes as s-expressions | 5 | All services' route handlers | +| 7 | Content addressing + IPFS/IPNS | 3 | `shared/sexp/cache.py` | +| 8 | Media pipeline integration | 1, 7 | `render/` service | +| 9 | Unified DAG executor | 3, 8 | `shared/sexp/executor.py` | +| 10 | Federation of s-expressions | 7, 9 | AP + IPFS integration | +| 11 | LLM as s-expression compiler | 1-6, 9 | `shared/sexp/llm.py`, `llm/` service | + +**Foundation** (Phases 1-3): The s-expression language, HTML rendering, and async resolver. Everything else builds on this. + +**Migration** (Phases 4-6): Incremental replacement of Jinja templates and Python route handlers with s-expressions. The bridge ensures coexistence — the app never breaks during migration. 
+ +**Infrastructure** (Phases 7-9): Content addressing, IPFS/IPNS caching, media pipeline, and unified DAG execution. Pages and video renders become the same computation. + +**Intelligence** (Phases 10-11): Federation makes s-expressions portable across instances. The LLM makes s-expressions accessible to non-programmers: natural language in, rendered pages out. The system learns from its own data, continuously improving the quality of generated s-expressions. + +Each phase can be deployed on its own once the phases it depends on (see the table above) are in place. The end state: a platform where the application logic is expressed in a small, composable, content-addressed language that humans author, LLMs generate, resolvers execute, IPFS stores, and ActivityPub federates.