giles 2a1d3a34e7 Update CLAUDE.md with test harness and all 44 MCP tools

- Added sx_harness_eval to the MCP tools table
- Added spec/harness.sx to the specification files list
- Added full test harness design section (sessions, interceptors, IO log,
  assertions, extensibility, platform-specific extensions, CID-based
  component/test association)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-26 07:36:30 +00:00

Rose Ash Monorepo

Cooperative web platform: federated content, commerce, events, and media processing. Each domain runs as an independent Quart microservice with its own database, communicating via HMAC-signed internal HTTP and ActivityPub events.

S-expression files — reading and editing protocol

Never use Edit, Read, or Write on .sx or .sxc files. A hook blocks these tools on .sx/.sxc files. Use the sx-tree MCP server tools instead — they operate on the parsed tree, not raw text. Bracket errors are impossible by construction.

Before doing anything in an `.sx` file

1. Call sx_summarise to get a structural overview of the whole file
2. Call sx_read_subtree on the region you intend to work in
3. Call sx_get_context on specific nodes to understand their position
4. Call sx_find_all to locate definitions or patterns by name
5. For project-wide searches, use sx_find_across, sx_comp_list, or sx_comp_usage

Never proceed to an edit without first establishing where you are in the tree using the comprehension tools.

For every s-expression edit

Path-based (when you know the exact path):

1. Call sx_read_subtree on the target region to confirm the correct path
2. Call sx_replace_node / sx_insert_child / sx_delete_node / sx_wrap_node
3. Call sx_validate to confirm structural integrity
4. Call sx_read_subtree again on the edited region to verify the result
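
As an illustration only, the path-based workflow above can be modelled in Python by treating a parsed tree as nested lists and a path as a list of child indexes. The function names echo the tool names but are hypothetical; this is a sketch of the idea, not how the sx-tree server is implemented.

```python
from copy import deepcopy

# Hypothetical model: an SX tree is a nested Python list,
# and a path is a list of child indexes from the root.

def read_subtree(tree, path):
    """Return the node at `path`, reporting which segment was not found."""
    node = tree
    for depth, index in enumerate(path):
        if not isinstance(node, list) or not 0 <= index < len(node):
            raise LookupError(f"path segment {depth} (index {index}) not found")
        node = node[index]
    return node

def replace_node(tree, path, new_node):
    """Return a copy of `tree` with the node at `path` replaced."""
    if not path:
        return deepcopy(new_node)
    result = deepcopy(tree)
    parent = read_subtree(result, path[:-1])
    read_subtree(parent, path[-1:])      # confirms the final segment exists
    parent[path[-1]] = deepcopy(new_node)
    return result

tree = ["define", "greet", ["lambda", ["name"], ["str", "Hello ", "name"]]]
target = [2, 2, 1]

before = read_subtree(tree, target)          # confirm the path first
edited = replace_node(tree, target, "Hi ")   # then edit the tree, never raw text
after = read_subtree(edited, target)         # re-read the region to verify
```

Because the edit operates on the tree rather than on text, a bad path fails with a message naming the missing segment instead of producing unbalanced brackets.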

Pattern-based (when you can describe what to find):

sx_rename_symbol — rename all occurrences of a symbol in a file
sx_replace_by_pattern — find + replace first/all nodes matching a pattern
sx_insert_near — insert before/after a pattern match (top-level)
sx_rename_across — rename a symbol across all .sx files (use dry_run=true first)

Creating new `.sx` files

Use sx_write_file — it validates the source by parsing before writing. Malformed SX is rejected.
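
The validate-before-write idea can be sketched in Python as follows. This is a simplified stand-in under stated assumptions: it checks only delimiter balance and string termination (a real parser does much more), and `check_balanced` and `write_validated` are hypothetical names, not part of the sx-tree server.

```python
from pathlib import Path

CLOSERS = {")": "(", "}": "{"}

def check_balanced(source: str) -> None:
    """Raise ValueError on an unbalanced delimiter; string contents are ignored."""
    stack = []
    in_string = False
    escaped = False
    for offset, char in enumerate(source):
        if in_string:
            if escaped:
                escaped = False          # the character after a backslash is literal
            elif char == "\\":
                escaped = True
            elif char == '"':
                in_string = False
        elif char == '"':
            in_string = True
        elif char in "({":
            stack.append(char)
        elif char in CLOSERS:
            if not stack or stack.pop() != CLOSERS[char]:
                raise ValueError(f"unexpected {char!r} at offset {offset}")
    if in_string or stack:
        raise ValueError("unterminated string or unclosed delimiter")

def write_validated(path: Path, source: str) -> None:
    """Write `source` only if it passes the check; a failure leaves the file untouched."""
    check_balanced(source)
    path.write_text(source, encoding="utf-8")
```

The ordering is the point of the sketch: validation runs first, so malformed input never reaches disk.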

On failure

Read the error carefully. Fragment errors give the parse failure in the new source. Path errors tell you which segment was not found. Fix the specific problem and retry the tree edit. Never fall back to raw file writes.

Available MCP tools (sx-tree server)

Comprehension:

Tool	Purpose
`sx_read_tree`	Annotated tree — auto-summarises large files. Params: `focus` (expand matching subtrees), `max_depth`, `max_lines`/`offset`
`sx_summarise`	Folded overview at configurable depth
`sx_read_subtree`	Expand a specific subtree by path
`sx_get_context`	Enclosing chain from root to target
`sx_find_all`	Search by pattern in one file, returns paths
`sx_get_siblings`	Siblings of a node with target marked
`sx_validate`	Structural integrity checks

Path-based editing:

Tool	Purpose
`sx_replace_node`	Replace node at path with new source
`sx_insert_child`	Insert child at index in a list
`sx_delete_node`	Remove node, siblings shift
`sx_wrap_node`	Wrap in template with `_` placeholder

Smart editing (pattern-based):

Tool	Purpose
`sx_rename_symbol`	Rename all occurrences of a symbol in a file
`sx_replace_by_pattern`	Find + replace first/all nodes matching a pattern. `all=true` for all matches
`sx_insert_near`	Insert before/after a pattern match (top-level). `position="before"` or `"after"`
`sx_rename_across`	Rename symbol across all `.sx` files in a directory. `dry_run=true` to preview

Project-wide:

Tool	Purpose
`sx_find_across`	Search pattern across all `.sx` files in a directory
`sx_comp_list`	List all definitions (defcomp/defisland/defmacro/defpage/define) across files
`sx_comp_usage`	Find all uses of a component/symbol across files

Development:

Tool	Purpose
`sx_pretty_print`	Reformat an `.sx` file with indentation. Also used automatically by all edit tools
`sx_write_file`	Create/overwrite `.sx` file with parse validation
`sx_build`	Build JS bundle (`target="js"`) or OCaml binary (`target="ocaml"`)
`sx_test`	Run test suite (`host="js"` or `"ocaml"`, `full=true` for extensions)
`sx_format_check`	Lint: empty bindings, missing bodies, duplicate params
`sx_macroexpand`	Evaluate expression with a file's macro definitions loaded
`sx_eval`	REPL — evaluate SX expressions in the MCP server env

Git integration:

Tool	Purpose
`sx_changed`	List `.sx` files changed since a ref with structural summaries
`sx_diff_branch`	Structural diff of all `.sx` changes on branch vs base ref
`sx_blame`	Git blame for `.sx` file, optionally focused on a tree path

Test harness:

Tool	Purpose
`sx_harness_eval`	Evaluate SX in a sandboxed harness with mock IO. Returns result + IO trace. Params: `expr`, optional `mock` (SX dict of overrides), optional `file` (load definitions)

Analysis:

Tool	Purpose
`sx_diff`	Structural diff between two `.sx` files (ADDED/REMOVED/CHANGED)
`sx_doc_gen`	Generate component docs from signatures across a directory
`sx_playwright`	Run Playwright browser tests for the SX docs site

Deployment

Do NOT push until explicitly told to. Pushes reload code to dev automatically.
NEVER push to main — pushing to main triggers a PRODUCTION deploy. Only push to main when the user explicitly requests a production deploy. Work on the macros branch by default; merge to main only with explicit permission.

Project Structure

blog/           # Content management, Ghost CMS sync, navigation, WYSIWYG editor
market/         # Product catalog, marketplace pages, web scraping
cart/           # Shopping cart CRUD, checkout (delegates order creation to orders)
events/         # Calendar & event management, ticketing
federation/     # ActivityPub social hub, user profiles
account/        # OAuth2 authorization server, user dashboard, membership
orders/         # Order history, SumUp payment/webhook handling, reconciliation
relations/      # (internal) Cross-domain parent/child relationship tracking
likes/          # (internal) Unified like/favourite tracking across domains
shared/         # Shared library: models, infrastructure, templates, static assets
artdag/         # Art DAG — media processing engine (separate codebase, see below)

Shared Library (`shared/`)

shared/
  models/          # Canonical SQLAlchemy ORM models for all domains
  db/              # Async session management, per-domain DB support, alembic helpers
  infrastructure/  # App factory, OAuth, ActivityPub, fragments, internal auth, Jinja
  services/        # Domain service implementations + DI registry
  contracts/       # DTOs and service protocols
  browser/         # Middleware, Redis caching, CSRF, error handlers
  events/          # Activity bus + background processor (AP-shaped events)
  config/          # YAML config loading (frozen/readonly)
  static/          # Shared CSS, JS, images
  templates/       # Base HTML layouts, partials (inherited by all apps)

Art DAG (`artdag/`)

Federated content-addressed DAG execution engine for distributed media processing.

artdag/
  core/      # DAG engine (artdag package) — nodes, effects, analysis, planning
  l1/        # L1 Celery rendering server (FastAPI + Celery + Redis + PostgreSQL)
  l2/        # L2 ActivityPub registry (FastAPI + PostgreSQL)
  common/    # Shared templates, middleware, models (artdag_common package)
  client/    # CLI client
  test/      # Integration & e2e tests

SX Language — Canonical Reference

The SX language is defined by a self-hosting specification in shared/sx/ref/. Read these files for authoritative SX semantics — they supersede any implementation detail in sx.js or Python evaluators.

Specification files

shared/sx/ref/eval.sx — Core evaluator: types, trampoline (TCO), eval-expr dispatch, special forms (if, when, cond, case, let, and, or, lambda, define, defcomp, defmacro, quasiquote), higher-order forms (map, filter, reduce, some, every?, for-each), macro expansion, function/lambda/component calling.
shared/sx/ref/parser.sx — Tokenizer and parser: grammar, string escapes, dict literals {:key val}, quote sugar (`, ,, ,@), serializer.
shared/sx/ref/primitives.sx — All ~80 built-in pure functions: arithmetic, comparison, predicates, string ops, collection ops, dict ops, format helpers, CSSX style primitives.
shared/sx/ref/render.sx — Three rendering modes: render-to-html (server HTML), render-to-sx/aser (SX wire format for client), render-to-dom (browser). HTML tag registry, void elements, boolean attrs.
shared/sx/ref/bootstrap_js.py — Transpiler: reads the .sx spec files and emits sx-ref.js.
spec/harness.sx — Test harness: mock IO platform for testing components. Sessions, IO interception, log queries, assertions (assert-io-called, assert-io-count, assert-io-args, assert-no-io, assert-state). Extensible — new platforms add entries to the platform dict. Loaded automatically by test runners.
spec/tests/test-harness.sx — Tests for the harness itself (15 tests).

Type system

number, string, boolean, nil, symbol, keyword, list, dict,
lambda, component, macro, thunk (TCO deferred eval)
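
The thunk type exists to support tail-call optimisation through a trampoline. A minimal Python sketch of that general technique is shown here; the `Thunk` class and `trampoline` function are illustrative names and may not match the definitions in eval.sx.

```python
class Thunk:
    """A deferred call: the function and arguments to run on the next bounce."""
    def __init__(self, fn, *args):
        self.fn = fn
        self.args = args

def trampoline(value):
    """Keep running thunks until a non-thunk value comes back."""
    while isinstance(value, Thunk):
        value = value.fn(*value.args)
    return value

def count_down(n, acc=0):
    # The tail call is returned as a Thunk instead of being made directly,
    # so the host stack stays at constant depth.
    if n == 0:
        return acc
    return Thunk(count_down, n - 1, acc + 1)

result = trampoline(count_down(100_000))
```

A direct recursive version of `count_down` would exceed Python's default recursion limit at this depth; the trampolined version runs in a loop.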

Evaluation rules (from eval.sx)

Literals (number, string, boolean, nil) — pass through
Symbols — look up in env, then primitives, then true/false/nil, else error
Keywords — evaluate to their string name
Dicts — evaluate all values recursively
Lists — dispatch on head:
- Special forms (if, when, cond, case, let, lambda, define, defcomp, defmacro, quote, quasiquote, begin/do, set!, ->)
- Higher-order forms (map, filter, reduce, some, every?, for-each, map-indexed)
- Macros — expand then re-evaluate
- Function calls — evaluate head and args, then: native callable → apply, lambda → bind params + TCO thunk, component → parse keyword args + bind params + TCO thunk
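
A compact Python sketch of these rules may help when reading eval.sx. It covers only literals, symbols, keywords, dicts, one special form (`if`), and plain function calls; the `Symbol` and `Keyword` classes and the two-entry primitive table are assumptions made for the example, not the spec's data model.

```python
import operator

class Symbol(str):
    """A name to be looked up."""

class Keyword(str):
    """A keyword such as :title; evaluates to its string name."""

PRIMITIVES = {"+": operator.add, "<": operator.lt}
CONSTANTS = {"true": True, "false": False, "nil": None}

def eval_expr(expr, env):
    if isinstance(expr, Keyword):
        return str(expr)
    if isinstance(expr, Symbol):
        # Lookup order: env, then primitives, then true/false/nil, else error.
        for table in (env, PRIMITIVES, CONSTANTS):
            if expr in table:
                return table[expr]
        raise NameError(f"undefined symbol: {expr}")
    if isinstance(expr, dict):
        return {key: eval_expr(value, env) for key, value in expr.items()}
    if isinstance(expr, list):
        if not expr:
            return []
        head, *rest = expr
        if isinstance(head, Symbol) and head == "if":
            # Special form: only the chosen branch is evaluated.
            test, then, *otherwise = rest
            if eval_expr(test, env):
                return eval_expr(then, env)
            return eval_expr(otherwise[0], env) if otherwise else None
        fn = eval_expr(head, env)        # function call: evaluate head and args
        return fn(*[eval_expr(arg, env) for arg in rest])
    return expr                          # literals pass through unchanged
```

Macros, components, and TCO thunks are deliberately left out; the sketch shows only the dispatch order.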

Component calling convention

(defcomp ~card (&key title subtitle &rest children)
  (div :class "card"
    (h2 title)
    (when subtitle (p subtitle))
    children))

&key params are keyword arguments: (~card :title "Hi" :subtitle "Sub")
&rest children captures positional args as children
The component body is evaluated in a merged env: closure + caller-env + bound-params
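
The keyword/children split can be sketched in Python as below. The `Keyword` class and `parse_component_args` are hypothetical, and the sketch assumes that an unsupplied `&key` parameter binds to nil (suggested by the `(when subtitle ...)` guard above, but not confirmed here).

```python
class Keyword(str):
    """Marks a keyword token such as :title (hypothetical representation)."""

def parse_component_args(args, key_params):
    """Split call arguments into keyword bindings and positional children."""
    bound = {name: None for name in key_params}   # assumption: missing keys are nil
    children = []
    i = 0
    while i < len(args):
        arg = args[i]
        if isinstance(arg, Keyword) and i + 1 < len(args):
            bound[str(arg)] = args[i + 1]         # keyword consumes the next value
            i += 2
        else:
            children.append(arg)                  # everything else is a child
            i += 1
    return bound, children

# Roughly (~card :title "Hi" (p "body")), with the p element as a nested list.
bound, children = parse_component_args(
    [Keyword("title"), "Hi", ["p", "body"]],
    key_params=["title", "subtitle"],
)
```

Here `bound` holds `title` and a nil `subtitle`, and `children` holds the single `p` node.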

Rendering modes (from render.sx)

Mode	Function	Expands components?	Output
HTML	`render-to-html`	Yes (recursive)	HTML string
SX wire	`aser`	No — serializes `(~name ...)`	SX source text
DOM	`render-to-dom`	Yes (recursive)	DOM nodes

The aser (async-serialize) mode evaluates control flow and function calls but serializes HTML tags and component calls as SX source — the client renders them. This is the wire format for HTMX-like responses.
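
The difference between expanding and serializing can be illustrated with a small Python sketch. It is heavily simplified: attributes, control flow, and function calls are omitted, nodes are nested lists, and the `~badge` component and both function bodies are invented for the example rather than taken from render.sx.

```python
import html

# Hypothetical component table: name -> function building a tree from children.
COMPONENTS = {"~badge": lambda children: ["span", *children]}
VOID_ELEMENTS = {"br", "hr", "img", "input"}

def render_to_html(node):
    """HTML mode: expand components recursively and emit an HTML string."""
    if isinstance(node, str):
        return html.escape(node)
    head, *children = node
    if head.startswith("~"):
        return render_to_html(COMPONENTS[head](children))
    if head in VOID_ELEMENTS:
        return f"<{head}>"
    inner = "".join(render_to_html(child) for child in children)
    return f"<{head}>{inner}</{head}>"

def serialize_sx(node):
    """Wire mode: leave tags and component calls unexpanded, emit SX source text."""
    if isinstance(node, str):
        return '"' + node.replace("\\", "\\\\").replace('"', '\\"') + '"'
    head, *children = node
    return "(" + " ".join([head, *map(serialize_sx, children)]) + ")"

page = ["div", ["~badge", "new"], ["br"]]
as_html = render_to_html(page)
as_wire = serialize_sx(page)
```

In this sketch `as_html` contains the expanded `span`, while `as_wire` still contains the `(~badge "new")` call for the client to render.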

Test harness (from harness.sx)

The harness provides sandboxed testing of IO behavior. It's a spec-level facility — works on every host.

Core concepts:

Session — (make-harness &key platform) creates a session with mock IO operations
Interceptor — (make-interceptor session op-name mock-fn) wraps a mock to record calls
IO log — append-only trace of every IO call. Query with io-calls, io-call-count, io-call-args
Assertions — assert-io-called, assert-no-io, assert-io-count, assert-io-args, assert-state

Default platform provides 30+ mock IO operations (fetch, query, action, cookies, DOM, storage, etc.) that return sensible empty values. Override per-test with :platform on make-harness.

Extensibility: New platforms add entries to the platform dict. The harness intercepts any registered operation — no harness code changes needed for new IO types.
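
To make the session, interceptor, and IO log concepts concrete, here is a hypothetical Python analogue. It is not the SX API: the class, method names, and the two default operations are assumptions chosen to mirror the concepts above.

```python
class Harness:
    """Hypothetical Python analogue of a harness session."""

    def __init__(self, platform=None):
        # Defaults return empty values; per-test overrides replace them by name.
        defaults = {"fetch": lambda url: {}, "query": lambda name: []}
        self.log = []                    # append-only IO log
        self.ops = {}
        for name, mock in {**defaults, **(platform or {})}.items():
            self.ops[name] = self._intercept(name, mock)

    def _intercept(self, name, mock):
        """Wrap a mock so that every call is recorded in the log."""
        def recorded(*args):
            result = mock(*args)
            self.log.append({"op": name, "args": args, "result": result})
            return result
        return recorded

    def io_calls(self, name):
        return [entry for entry in self.log if entry["op"] == name]

    def assert_io_count(self, name, expected):
        actual = len(self.io_calls(name))
        assert actual == expected, f"{name}: expected {expected} calls, got {actual}"

    def assert_no_io(self, name):
        self.assert_io_count(name, 0)

session = Harness(platform={"fetch": lambda url: {"status": 200}})
response = session.ops["fetch"]("/api/items")
session.assert_io_count("fetch", 1)
session.assert_no_io("query")
```

Adding a new operation to the platform dict is enough for it to be intercepted, which is the extensibility property described above.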

Platform-specific test extensions live in the platform spec, not the core harness:

web/harness-web.sx — DOM assertions, simulate-click, CSS class checks
web/harness-reactive.sx — signal assertions: assert-signal-value, assert-signal-subscribers

Components ship with tests via deftest forms. Tests reference components by name or CID (:for param). Tests are independent content-addressed objects — anyone can publish tests for any component.

Platform interface

Each target (JS, Python) must provide: type inspection (type-of), constructors (make-lambda, make-component, make-macro, make-thunk), accessors, environment operations (env-has?, env-get, env-set!, env-extend, env-merge), and DOM/HTML rendering primitives.

Tech Stack

Web platform: Python 3.11+, Quart (async Flask), SQLAlchemy (asyncpg), Jinja2, HTMX, PostgreSQL, Redis, Docker Swarm, Hypercorn.

Art DAG: FastAPI, Celery, JAX (CPU/GPU), IPFS/Kubo, Pydantic.

Key Commands

Development

./dev.sh                        # Start all services + infra (db, redis, pgbouncer)
./dev.sh blog market            # Start specific services + infra
./dev.sh --build blog           # Rebuild image then start
./dev.sh down                   # Stop everything
./dev.sh logs blog              # Tail service logs

Deployment

./deploy.sh                     # Auto-detect changed apps, build + push + restart
./deploy.sh blog market         # Deploy specific apps
./deploy.sh --all               # Deploy everything

Art DAG

cd artdag/l1 && pytest tests/              # L1 unit tests
cd artdag/core && pytest tests/            # Core unit tests
cd artdag/test && python run.py            # Full integration pipeline
cd artdag/l1 && ruff check .               # Lint
cd artdag/l1 && mypy app/types.py app/routers/recipes.py tests/

Architecture Patterns

Web Platform

App factory: create_base_app(name, context_fn, before_request_fns, domain_services_fn) in shared/infrastructure/factory.py — creates Quart app with DB, Redis, CSRF, OAuth, AP, session management
Blueprint pattern: Each blueprint exposes register() -> Blueprint, handlers stored in _handlers dict
Per-service database: Each service has own PostgreSQL DB via PgBouncer; cross-domain data fetched via HTTP
Alembic per-service: Each service declares MODELS and TABLES in alembic/env.py, delegates to shared.db.alembic_env.run_alembic()
Inter-service reads: fetch_data(service, query, params) → GET /internal/data/{query} (HMAC-signed, 3s timeout)
Inter-service writes: call_action(service, action, payload) → POST /internal/actions/{action} (HMAC-signed, 5s timeout)
Inter-service AP inbox: send_internal_activity() → POST /internal/inbox (HMAC-signed, AP-shaped activities for cross-service writes)
Fragments: HTML fragments fetched cross-service via fetch_fragments() for composing shared UI (nav, cart mini, auth menu)
Soft deletes: Models use deleted_at column pattern
Context processors: Each app provides its own context_fn that assembles template context from local DB + cross-service fragments
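
For readers unfamiliar with HMAC request signing, the internal calls listed above follow the general pattern sketched here using Python's standard library. Everything specific is an assumption for illustration: the message layout, the SHA-256 digest, the timestamp window, and the function names are not taken from the shared/ library.

```python
import hashlib
import hmac
import time

def sign_request(secret: bytes, method: str, path: str, body: bytes, timestamp: int) -> str:
    """Return a hex HMAC over the parts of the request that must not be altered."""
    message = b"\n".join([method.encode(), path.encode(), str(timestamp).encode(), body])
    return hmac.new(secret, message, hashlib.sha256).hexdigest()

def verify_request(secret, method, path, body, timestamp, signature, max_age=30, now=None):
    """Recompute the signature on the receiving side and compare in constant time."""
    now = int(time.time()) if now is None else now
    if abs(now - timestamp) > max_age:
        return False                     # stale or replayed request
    expected = sign_request(secret, method, path, body, timestamp)
    return hmac.compare_digest(expected, signature)
```

The caller would attach the timestamp and signature to the request (for example as headers) and the receiving service would reject anything that fails `verify_request`.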

Auth

Account is the OAuth2 authorization server; all other apps are OAuth clients
Per-app first-party session cookies (Safari ITP compatible), synchronized via device ID
Grant verification: apps check grant validity against account DB (cached in Redis)
Silent SSO: prompt=none OAuth flow for automatic cross-app login
ActivityPub: RSA signatures, per-app virtual actor projections sharing same keypair

SX Rendering Pipeline

The SX system renders component trees defined in s-expressions. Canonical semantics are in shared/sx/ref/ (see "SX Language" section above). The same AST can be evaluated in different modes depending on where the server/client rendering boundary is drawn:

render_to_html(name, **kw) — server-side, produces HTML. Maps to render-to-html in the spec.
render_to_sx(name, **kw) — server-side, produces SX wire format. Maps to aser in the spec. Component calls stay unexpanded.
render_to_sx_with_env(name, env, **kw) — server-side, expands known components then serializes as SX wire format. Used by layout components that need Python context.
sx_page(ctx, page_sx) — produces the full HTML shell (<!doctype html>...) with component definitions, CSS, and page SX inlined for client-side boot.

See the docstring in shared/sx/async_eval.py for the full evaluation modes table.

Service SX Directory Convention

Each service has two SX-related directories:

{service}/sx/ — service-specific component definitions (.sx files with defcomp). Loaded at startup by load_service_components(). These define layout components, reusable UI fragments, etc.
{service}/sxc/ — page definitions and Python rendering logic. Contains defpage definitions (client-routed pages) and the Python functions that compose headers, layouts, and page content.

Shared components live in shared/sx/templates/ and are loaded by load_shared_components() in the app factory.

Art DAG

3-Phase Execution: Analyze → Plan → Execute (tasks in artdag/l1/tasks/)
Content-Addressed: All data identified by SHA3-256 hashes or IPFS CIDs
S-Expression Effects: Composable effect language in artdag/l1/sexp_effects/
Storage: Local filesystem, S3, or IPFS backends
Auth: scoped JWT tokens between L1 and L2; password + OAuth SSO on L2
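
Content addressing with SHA3-256 can be sketched with the standard library as follows. The canonical JSON encoding and the `node_id` helper are assumptions for illustration; the actual Art DAG hashing scheme may differ.

```python
import hashlib
import json

def content_id(payload: bytes) -> str:
    """Identify a blob by the SHA3-256 hash of its bytes."""
    return hashlib.sha3_256(payload).hexdigest()

def node_id(op: str, params: dict, input_ids: list[str]) -> str:
    """Identify a DAG node by its operation, parameters, and input identifiers.

    Keys are sorted so that equal parameters always hash the same way;
    the order of inputs is kept because it can be significant.
    """
    canonical = json.dumps(
        {"op": op, "params": params, "inputs": input_ids},
        sort_keys=True,
        separators=(",", ":"),
    )
    return content_id(canonical.encode("utf-8"))

source = content_id(b"raw media bytes")
resized = node_id("resize", {"width": 640, "height": 360}, [source])
```

Because the identifier depends only on content, the same operation on the same inputs yields the same ID, which is what makes results cacheable and shareable across nodes.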

Domains

Service	Public URL	Dev Port
blog	blog.rose-ash.com	8001
market	market.rose-ash.com	8002
cart	cart.rose-ash.com	8003
events	events.rose-ash.com	8004
federation	federation.rose-ash.com	8005
account	account.rose-ash.com	8006
relations	(internal only)	8008
likes	(internal only)	8009
orders	orders.rose-ash.com	8010

Dev Container Mounts

Dev bind mounts in docker-compose.dev.yml must mirror the Docker image's COPY paths. When adding a new directory to a service (e.g. {service}/sx/), add a corresponding volume mount (./{service}/sx:/app/sx) or the directory won't be visible inside the dev container. Hypercorn --reload watches for Python file changes; .sx file hot-reload is handled by reload_if_changed() in shared/sx/jinja_bridge.py.

Key Config Files

docker-compose.yml / docker-compose.dev.yml — service definitions, env vars, volumes
deploy.sh / dev.sh — deployment and development scripts
shared/infrastructure/factory.py — app factory (all services use this)
{service}/alembic/env.py — per-service migration config
_config/app-config.yaml — runtime YAML config (mounted into containers)

Tools

Use Context7 MCP for up-to-date library documentation
Playwright MCP is available for browser automation/testing

Service introspection MCP (rose-ash-services)

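Python-based MCP server for understanding the microservice topology. Static analysis where possible: most tools work without running services; svc_status, svc_logs, and svc_schema read from Docker or a running service.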

Tool	Purpose
`svc_status`	Docker container status for all rose-ash services
`svc_routes`	List all HTTP routes for a service by scanning blueprints
`svc_calls`	Map inter-service calls (fetch_data/call_action/send_internal_activity/fetch_fragment)
`svc_config`	Environment variables and config for a service
`svc_models`	SQLAlchemy models, columns, relationships for a service
`svc_schema`	Live defquery/defaction manifest from a running service
`alembic_status`	Migration count and latest migration per service
`svc_logs`	Recent Docker logs for a service
`svc_start`	Start services via dev.sh
`svc_stop`	Stop all services
`svc_queries`	List all defquery definitions from queries.sx files
`svc_actions`	List all defaction definitions from actions.sx files
