diff --git a/sx/sx/essays/hypermedia-age-of-ai.sx b/sx/sx/essays/hypermedia-age-of-ai.sx index 016d476..634cf91 100644 --- a/sx/sx/essays/hypermedia-age-of-ai.sx +++ b/sx/sx/essays/hypermedia-age-of-ai.sx @@ -6,115 +6,101 @@ (defcomp ~essays/hypermedia-age-of-ai/essay-hypermedia-age-of-ai () (~docs/page :title "Hypermedia in the Age of AI" (p :class "text-stone-500 text-sm italic mb-8" - "Nick Blow argues that JSON hypermedia can serve AI agents better than HTML or RPC. He is right about the problem and wrong about the solution. The answer is not a better serialization of links. It is a representation that is simultaneously content, control, and code.") + "Neither JSON nor HTML is hypermedia. There is only the hypermedium — a self-defining representation — and s-expressions are an instance of it.") (~docs/section :title "I. The argument" :id "the-argument" (p :class "text-stone-600" - (a :href "https://nickblow.tech/posts/hypermedia-in-the-age-of-ai" :class "text-violet-600 hover:underline" "Blow's essay") - " starts from a position familiar to anyone who has followed the hypermedia discourse: Carson Gross's contention that sprinkling link objects into JSON does not make an API truly RESTful, because REST demands generic clients capable of interpreting hypermedia controls. Blow agrees that this has merit but argues the position is too restrictive. HTML is not the universal hypermedia format. LLMs choke on it. The real prize is " (em "progressive discovery") " — a client that learns what it can do by following links, not by reading documentation upfront.") + (a :href "https://nickblow.tech/posts/hypermedia-in-the-age-of-ai" :class "text-violet-600 hover:underline" "Nick Blow argues") + " that JSON hypermedia can serve AI agents better than HTML or RPC. Carson Gross contends that sprinkling link objects into JSON does not make an API truly RESTful, because REST demands generic clients capable of interpreting hypermedia controls. 
Blow agrees this has merit but thinks the position too restrictive. HTML is not the universal hypermedia format — LLMs choke on it. The real prize is " (em "progressive discovery") ": a client that learns what it can do by following links, not by reading documentation upfront.") (p :class "text-stone-600" - "He contrasts this with MCP, the Model Context Protocol now dominant for LLM tool use. MCP is RPC in a trench coat: the server declares all its tools upfront, the model receives the full catalogue in its system prompt, and it calls functions by name. This works, but it does not scale. A file management API that exposes create, read, update, delete, list, search, share, unshare, move, copy, rename, and permission management overwhelms the model before it has done anything. MCP forces " (em "total disclosure") " where hypermedia would offer " (em "progressive revelation") ".") + "He contrasts this with MCP, the Model Context Protocol now dominant for LLM tool use. MCP is RPC in a trench coat: the server declares all its tools upfront, the model receives the full catalogue in its system prompt, and it calls functions by name. This works, but it does not scale. MCP forces " (em "total disclosure") " where hypermedia would offer " (em "progressive revelation") ".") (p :class "text-stone-600" - "Blow proposes JSON-flavoured hypermedia as the fix — " (code "vnd.siren+json") ", custom content types, link relations in response payloads — so agents can discover capabilities by doing, not by reading a manual. Resources become state machines. Hyperlinks are state transitions. The agent explores the graph.") + "Blow proposes JSON-flavoured hypermedia as the fix — " (code "vnd.siren+json") ", custom content types, link relations in response payloads. Resources become state machines. Hyperlinks are state transitions. The agent explores the graph.") (p :class "text-stone-600" - "This is the right diagnosis. The prescription is too weak.")) + "This is the right diagnosis. 
But the prescription treats the wrong cause. The problem is not that we need a better wire format for links. The problem is that neither JSON nor HTML is actually hypermedia. And no amount of convention layered on top will make them so, because what makes something hypermedia is not what it carries but whether it can " (em "define itself") ".")) - (~docs/section :title "II. The JSON hypermedia trap" :id "json-trap" + (~docs/section :title "II. The self-definition criterion" :id "self-definition" (p :class "text-stone-600" - "JSON hypermedia has been proposed, specified, and implemented many times. HAL, JSON:API, Siren, Hydra, UBER, Collection+JSON. Each adds a linking convention on top of JSON. Each generates roughly the same amount of adoption: enough for a conference talk, not enough for an ecosystem. The reason is structural, not cultural.") + "What does it mean for a format to be hypermedia? The conventional answer focuses on " (em "affordances") ": a hypermedia format includes controls that tell the client what it can do next. Links, forms, actions. This is the definition that drives the JSON hypermedia proposals — add links to JSON, and JSON becomes hypermedia.") (p :class "text-stone-600" - "JSON is a data serialization format. It represents maps, arrays, strings, numbers, booleans, and null. That is all it represents. To make it hypermedia you must layer conventions on top: " (em "this key means a link") ", " (em "this object means an action") ", " (em "this array means available transitions") ". The client must know the convention. The convention must be specified in a separate document. The separate document must be read before the response makes sense.") + "But this definition is too shallow. It describes a property of the content without asking where the " (em "meaning") " of the content comes from. 
A Siren response includes an " (code "actions") " array — but what makes that array mean \"available actions\" rather than just \"a list of objects with href fields\"? The Siren specification, written in English prose, hosted on a separate document, maintained by a separate community. The meaning is " (em "external") " to the format.") (p :class "text-stone-600" - "This is exactly the problem Blow identifies with MCP — the client needs upfront knowledge — relocated from the system prompt to a media type specification. You have not eliminated the manual. You have moved it.") + "A true hypermedium does not need an external document to explain what it means. It defines its own interpretation. The format carries not just content and controls, but the rules by which content and controls are understood. The map " (em "is") " the territory — not because the map is accurate, but because the map can redraw itself.") (p :class "text-stone-600" - "HTML avoids this because presentation " (em "is") " semantics. A " (code "<form>") " with " (code "method=\"POST\"") " and " (code "action=\"/orders\"") " is self-interpreting to a browser. The browser does not need a separate specification explaining what forms do. The rendering itself conveys the interaction model. This is what makes HTML genuinely hypermedia and JSON-with-links merely data-with-metadata.") - (p :class "text-stone-600" - "But Blow is right that HTML is a terrible format for AI. The " (a :href "/sx/(etc.(essay.sx-and-ai))" :class "text-violet-600 hover:underline" "syntax tax") " is enormous. Attributes split across arbitrary quoting rules. Closing tags that must match opening tags. CSS class strings encoding visual semantics in an opaque blob. JavaScript embedded in event handlers with different escaping rules. An LLM reading an HTML page is doing archaeology, not interpretation.") - (p :class "text-stone-600" - "So: HTML is hypermedia but hostile to machines. JSON is machine-friendly but not hypermedia. The obvious question — is there a format that is both? — is the question nobody in this discourse seems to be asking.")) - (~docs/section :title "III. The third option" :id "third-option" + (~docs/section :title "III. 
JSON cannot define itself" :id "json-cannot" (p :class "text-stone-600" - "S-expressions.") + "You cannot write the JSON specification in JSON.") (p :class "text-stone-600" - "Consider what an SX response looks like when the server sends a page fragment:") - (~docs/code :code (highlight "(div :class \"space-y-4\"\n (h2 \"Your orders\")\n (ul :class \"divide-y\"\n (li :class \"py-3\"\n (span :class \"font-medium\" \"Order #4281\")\n (span :class \"text-stone-500\" \"3 items · £42.00\")\n (div :class \"mt-2 flex gap-2\"\n (a :sx-get \"/orders/4281\"\n :sx-target \"#main\"\n :class \"text-violet-600\"\n \"View details\")\n (button :sx-post \"/orders/4281/reorder\"\n :sx-target \"#main\"\n :class \"text-violet-600\"\n \"Reorder\"))))\n (a :sx-get \"/orders?page=2\"\n :sx-target \"#main\"\n :sx-swap \"innerHTML\"\n :class \"text-violet-600\"\n \"Next page\"))" "lisp")) + "This is not a trivia question. It reveals something fundamental about the format. JSON is a data serialization: maps, arrays, strings, numbers, booleans, null. It has no evaluation semantics. It cannot express a grammar. It cannot express a parser. It cannot express the rules by which a JSON document should be interpreted — because it cannot express " (em "rules") " at all. JSON describes " (em "what") " without any capacity to describe " (em "how") " or " (em "why") ".") (p :class "text-stone-600" - "This is simultaneously:") - (ul :class "space-y-2 text-stone-600" - (li (strong "Content") " — it describes what to display: headings, text, layout.") - (li (strong "Presentation") " — it specifies how to display it: classes, structure, hierarchy.") - (li (strong "Controls") " — it declares what the user " (em "can do next") ": view details, reorder, paginate. Each control carries its method (" (code "sx-get") ", " (code "sx-post") "), target, and swap strategy.") - (li (strong "Code") " — it is a valid program. 
An evaluator can parse it, walk it, extract the controls, understand the state transitions, and act on them.")) + "The JSON specification is " (a :href "https://www.rfc-editor.org/rfc/rfc8259" :class "text-violet-600 hover:underline" "RFC 8259") " — English prose. This is not an accident or a stylistic choice. JSON " (em "cannot") " define itself because it has no mechanisms for definition. It is inert data. Every scrap of meaning attached to a JSON document — this key means a link, this object means an action, this array means available transitions — must come from outside the document. From a spec. From a schema. From shared understanding between producer and consumer.") (p :class "text-stone-600" - "An LLM reading this response does not need a Siren specification to understand the available actions. The actions are " (em "in the content") " — just as they are in HTML, but in a format with " (a :href "/sx/(etc.(essay.sx-and-ai))" :class "text-violet-600 hover:underline" "essentially zero syntax tax") ". No closing tags. No attribute quoting ambiguity. No JavaScript in event handlers. One syntactic form — " (code "(head args...)") " — for everything.") - (p :class "text-stone-600" - "This is not JSON with links bolted on. This is not HTML simplified. This is a representation where content and control are " (em "the same syntax") " — because s-expressions make no distinction between data and code. A " (code "div") " and an " (code "sx-get") " are both list elements. The AI reads them with the same parser, reasons about them with the same model, and generates them with the same grammar.")) + "This is why JSON hypermedia formats never achieve escape velocity. HAL, Siren, Hydra, JSON:API, UBER, Collection+JSON — each adds a linking convention on top of JSON, and each generates the same amount of adoption: enough for a conference talk, not enough for an ecosystem. The problem is not that the conventions are poorly designed. 
The problem is that a convention layered on inert data is " (em "always") " a separate document the client must already understand. You have not eliminated the manual. You have moved it from the system prompt to a media type specification.")) - (~docs/section :title "IV. Progressive discovery, natively" :id "progressive-discovery" + (~docs/section :title "IV. HTML carries but does not define itself" :id "html-carries" (p :class "text-stone-600" - "Blow's strongest argument is against MCP's total-disclosure model. A hypermedia API reveals capabilities incrementally: you see what you can do " (em "from where you are") ", not everything the system supports. This is how the web works for humans — you do not receive a manifest of every URL on a site before you can browse it.") + "HTML is closer to the mark. You " (em "can") " write the HTML specification in HTML — and indeed, the " (a :href "https://html.spec.whatwg.org/" :class "text-violet-600 hover:underline" "WHATWG specification") " is an HTML document. This already puts HTML in a different category from JSON. The format can at least " (em "carry") " its own definition.") (p :class "text-stone-600" - "SX achieves this without any additional protocol machinery. Each response is a tree of content and controls. The controls present in " (em "this") " response are " (em "this") " resource's available transitions. Follow one, get a new tree with new controls. The state machine Blow describes is not layered on top of the data format — it " (em "is") " the data format.") + "But carrying is not defining. The HTML spec rendered in a browser is text and diagrams — documentation that happens to be displayed in the format it documents. The browser interpreting that document was built from a " (em "separate implementation") " — millions of lines of C++ in Chromium, Gecko, WebKit — that was written by reading the spec and translating it into executable code. The spec does not " (em "execute") ". It does not " (em "interpret") ". 
It does not " (em "define itself") " in any operative sense.") (p :class "text-stone-600" - "But SX goes further than any JSON hypermedia format can, because the controls are not just declarations — they are " (em "evaluable") ". Consider a response that includes conditional controls:") - (~docs/code :code (highlight "(div :class \"order-actions\"\n (when can-cancel\n (button :sx-post \"/orders/4281/cancel\"\n :sx-confirm \"Cancel this order?\"\n :class \"text-red-600\"\n \"Cancel order\"))\n (when can-refund\n (button :sx-post \"/orders/4281/refund\"\n :sx-target \"#status\"\n \"Request refund\"))\n (when (and shipped tracking-url)\n (a :href tracking-url\n :class \"text-violet-600\"\n \"Track shipment\")))" "lisp")) + "Put it this way: if you gave the HTML specification (as an HTML document) to a system that had never seen HTML before, could that system learn to render HTML by reading it? No. The document is English prose in HTML tags. The tags are meaningless without a pre-existing renderer. The spec " (em "presupposes") " the very thing it specifies. It is circular, but not " (em "productively") " circular — not metacircular in the way that would let the system bootstrap itself.") (p :class "text-stone-600" - "The available actions depend on server-evaluated state. An order that has shipped shows a tracking link. An order that can be cancelled shows a cancel button. The client — human or AI — sees only the actions that apply " (em "right now") ". This is not progressive discovery bolted onto a data format. It is the server authoring a state machine in the response itself, using the same language as the content.") - (p :class "text-stone-600" - "A JSON hypermedia format would need a separate " (code "actions") " array with method/href/type metadata, plus a schema for each action's payload, plus documentation for what each action does. SX needs none of this. The button is its own documentation. Its label says what it does. Its attributes say how. 
An AI reading " (code "(button :sx-post \"/orders/4281/cancel\" \"Cancel order\")") " knows everything it needs to act.")) + "This matters because it means HTML's meaning is ultimately external too. More deeply embedded than JSON's — the browser internalizes the spec so thoroughly that HTML " (em "feels") " self-interpreting — but still dependent on a vast external system to give it life. The " (code "<form>") " tag is self-describing only to a client that already knows what forms are. To everything else, it is angle brackets.")) - (~docs/section :title "V. The component advantage" :id "component-advantage" + (~docs/section :title "V. SX defines itself" :id "sx-defines-itself" (p :class "text-stone-600" - "Blow does not discuss the composition problem, but it is where his proposal breaks down hardest. JSON hypermedia formats specify individual resources. They do not specify how resources compose into interfaces. A Siren entity has properties, actions, and links — but no concept of a reusable UI fragment that accepts parameters and renders children.") + "The SX specification is written in SX. This sounds like the same trick as HTML — the spec in its own format — but it is categorically different.") (p :class "text-stone-600" - "SX has components:") - (~docs/code :code (highlight "(defcomp ~order-card (&key order &rest actions)\n (div :class \"rounded border p-4\"\n (div :class \"flex justify-between\"\n (span :class \"font-medium\"\n (str \"Order #\" (get order \"id\")))\n (span :class \"text-stone-500\"\n (get order \"status\")))\n (p :class \"text-sm text-stone-600 mt-1\"\n (str (get order \"item-count\") \" items · \"\n (get order \"total\")))\n (div :class \"mt-3 flex gap-2\" actions)))" "lisp")) + (code "eval.sx") " defines the SX evaluator as s-expressions. Not as documentation " (em "about") " the evaluator. As the evaluator " (em "itself") ". 
A bootstrapper reads " (code "eval.sx") " and produces a working evaluator — in JavaScript, in Python, in any target language. " (code "parser.sx") " defines the SX parser as s-expressions. A bootstrapper reads it and produces a working parser. " (code "render.sx") " defines the renderer. " (code "primitives.sx") " defines the primitive operations.") + (~docs/code :code (highlight ";; From eval.sx — the evaluator defining itself:\n\n(define eval-expr\n (fn (expr env mode)\n (cond\n ((number? expr) expr)\n ((string? expr) expr)\n ((boolean? expr) expr)\n ((nil? expr) expr)\n ((symbol? expr) (resolve-symbol expr env))\n ((keyword? expr) (keyword-name expr))\n ((dict? expr) (eval-dict expr env mode))\n ((list? expr)\n (let ((head (first expr)))\n (if (symbol? head)\n (let ((name (symbol-name head)))\n (if (special-form? name)\n (eval-special-form name (rest expr) env mode)\n (eval-call expr env mode)))\n (eval-call expr env mode)))))))" "lisp")) (p :class "text-stone-600" - "This is a hypermedia control in the fullest sense — it takes data and actions as parameters and renders a self-contained interactive unit. An AI generating a page can compose it:") - (~docs/code :code (highlight "(map (fn (order)\n (~order-card :order order\n (when (get order \"can-reorder\")\n (button :sx-post (str \"/orders/\" (get order \"id\") \"/reorder\")\n :sx-target \"#main\"\n \"Reorder\"))))\n orders)" "lisp")) + "This is not documentation. It is a " (em "program that defines the rules of its own interpretation") ". The evaluator that processes SX expressions is itself an SX expression. The spec does not merely " (em "describe") " how SX works — it " (em "is") " how SX works. Give this file to a bootstrapper, and out comes a functioning evaluator. The specification is executable. The definition " (em "is") " the implementation.") (p :class "text-stone-600" - "No JSON hypermedia format supports this. They cannot, because JSON has no function abstraction. 
You can parameterize data — templates, schemas — but you cannot express " (em "a reusable piece of interactive UI") " in JSON. You always need a separate rendering layer that interprets the JSON and produces something visual. SX does not have this separation. The component definition " (em "is") " the rendering.")) + "This is what makes SX a genuine hypermedium. The meaning of an SX document is not external. It is not in a separate RFC. It is not in a browser engine compiled from a prose specification. The meaning is " (em "in the same language as the content") " — and it is the kind of meaning that executes. You do not need prior knowledge of SX to interpret SX, because SX carries the knowledge required to interpret it. You need only a bootstrapper — a minimal bridge to a host language — and the spec bootstraps the rest.") + (p :class "text-stone-600" + "The " (a :href "/sx/(etc.(essay.self-defining-medium))" :class "text-violet-600 hover:underline" "true hypermedium must define itself with itself") ". JSON cannot even attempt this. HTML can carry the words but not the meaning. SX carries the meaning because the meaning is code, and SX is code all the way down.")) - (~docs/section :title "VI. SX URLs as evaluable affordances" :id "evaluable-affordances" + (~docs/section :title "VI. What self-definition gives you" :id "consequences" (p :class "text-stone-600" - "Blow describes REST resources as state machines where hyperlinks represent allowed transitions. This maps cleanly to SX's URL system, which goes further: URLs are not opaque strings but " (a :href "/sx/(applications.(sx-urls))" :class "text-violet-600 hover:underline" "evaluable expressions") ".") - (~docs/code :code (highlight ";; Opaque URL (conventional)\n\"/orders/4281/details\"\n\n;; SX URL (evaluable)\n\"/sx/(etc.(essay.hypermedia-age-of-ai))\"\n\n;; The URL is a program:\n;; 1. Call the 'etc' section function\n;; 2. Which calls the 'essay' page function\n;; 3. 
With slug \"hypermedia-age-of-ai\"\n;; 4. Returns the component AST to render" "lisp")) + "Every practical advantage of SX over JSON and HTML for hypermedia flows from this single property.") (p :class "text-stone-600" - "The URL itself is a composition of functions. An AI examining the URL can understand the content hierarchy — this is an essay, in the 'etc' section, about a specific topic. It can generate new URLs by composing known functions: " (code "(etc.(essay.new-slug))") " follows the same pattern. The addressing scheme is not a convention imposed from outside. It is the language applied to navigation.") + (strong "Progressive discovery") " works because controls are not metadata interpreted by convention — they are expressions evaluated by the same evaluator that processes content. The server renders conditional controls using the language itself:") + (~docs/code :code (highlight "(when can-cancel\n (button :sx-post \"/orders/4281/cancel\"\n :sx-confirm \"Cancel this order?\"\n \"Cancel order\"))" "lisp")) (p :class "text-stone-600" - "This dissolves the distinction between \"following a link\" and \"calling a function\" that bedevils every attempt to make JSON APIs hypermedia. In SX, following a link " (em "is") " calling a function. The URL evaluates to a component tree. The component tree renders to interactive content. The interactive content contains more URLs. The cycle is closed without any protocol machinery beyond HTTP.")) + "The " (code "when") " is not a convention layered on data. It is a special form defined in " (code "eval.sx") ". The server evaluates it. The client sees only the controls that survive evaluation. The state machine is authored in the language, not in metadata " (em "about") " the language.") + (p :class "text-stone-600" + (strong "Components") " work because the language has function abstraction — " (code "defcomp") " — defined in its own spec. 
A component is a reusable hypermedia control that accepts parameters and renders children. JSON cannot have components because JSON cannot define functions. HTML cannot have components natively — Web Components are defined in JavaScript, a " (em "separate") " language. SX components are defined in SX, evaluated by an evaluator defined in SX.") + (p :class "text-stone-600" + (strong "Evaluable URLs") " work because the URL is an s-expression — " (code "(etc.(essay.hypermedia-age-of-ai))") " — evaluated by the same evaluator. Following a link " (em "is") " calling a function. The addressing scheme is not a convention imposed from outside. It is the " (a :href "/sx/(applications.(sx-urls))" :class "text-violet-600 hover:underline" "language applied to navigation") ".") + (p :class "text-stone-600" + (strong "AI legibility") " works because the " (a :href "/sx/(etc.(essay.sx-and-ai))" :class "text-violet-600 hover:underline" "spec fits in a context window") ". An AI agent can load " (code "eval.sx") ", " (code "parser.sx") ", " (code "render.sx") ", and " (code "primitives.sx") " — roughly 3,000 lines — and hold the " (em "complete definition of the language") " alongside the content it is reading. It does not need training data about SX. It does not need documentation. It has the actual executable specification. The language it reads and the language that defines how to read it are the same language.") + (p :class "text-stone-600" + "None of these are features bolted onto a data format. They are consequences of a format that defines itself.")) - (~docs/section :title "VII. The AI agent that reads the page" :id "ai-agent" + (~docs/section :title "VII. What MCP gets wrong" :id "mcp" (p :class "text-stone-600" - "Here is the scenario Blow is really imagining: an LLM agent that interacts with web services not through a bespoke API layer but by reading responses and following controls, the way a human reads a page and clicks links. 
MCP makes this possible through function calling. Blow argues hypermedia would make it better through progressive discovery.") + "MCP's design reflects a fundamental confusion: it treats capability as a catalogue rather than an affordance. The server lists every tool. The model reads the list. The model calls a tool by name. The response is data. The model must maintain its own mental model of the server's state to know which tools are appropriate next.") (p :class "text-stone-600" - "SX makes it natural.") + "This is the exact inversion of the hypermedia principle. In a hypermedia system, the server tells you what you can do " (em "in each response") ". You do not need a mental model of the server because the server renders its own state into the controls it offers you.") (p :class "text-stone-600" - "An SX response is a tree. The AI parses it (trivially — parentheses balance). It extracts all " (code "sx-get") " and " (code "sx-post") " attributes to build a list of available actions. It reads labels and context to understand what each action does. It decides which action to take. It issues the request. It receives a new tree. The loop repeats.") + "But the deeper problem with MCP is not the catalogue model. It is that MCP tools are defined " (em "outside") " the protocol — in Python functions, TypeScript classes, whatever the server implementer chose. The tool definitions are opaque to the client. The model sees names and descriptions, not the actual logic. It is trusting documentation written by a human about code written by a human. Every layer of indirection is a place where meaning can leak.") (p :class "text-stone-600" - "But unlike HTML, the AI does not need to strip out presentational noise to find the semantic signal. There is no " (code "
") " wrapping a " (code "
") " wrapping a " (code "
") " before the actual content. The tree structure " (em "is") " the semantic structure. Every node is either content or control, and the distinction is syntactically obvious: controls have " (code "sx-") " attributes, content does not.") - (p :class "text-stone-600" - "And unlike JSON hypermedia, the AI does not need to understand a media type specification to interpret the controls. The controls are HTML-like elements with method and target attributes — a pattern every LLM already understands from training on web content. SX inherits the discoverability of HTML (controls are self-describing) without the noise (no closing tags, no attribute soup, no CSS class archaeology).") - (p :class "text-stone-600" - "The " (a :href "/sx/(etc.(essay.sx-and-ai))" :class "text-violet-600 hover:underline" "spec fits in a context window") ". The complete SX language — evaluator, parser, renderer, all primitives — is roughly 3,000 lines. An AI agent that loads the spec into its context can not only read SX responses but " (em "generate") " them. It can produce new pages, new components, new interactions — because the language it reads and the language it writes are the same, and both fit in memory at once.")) + "In an SX hypermedia system, the controls in the response " (em "are") " the capability. Not a description of capability. Not a pointer to capability. The control itself — its structure, its attributes, its evaluable URL — carries everything needed to act on it. And the rules for interpreting the control are carried by the same language the control is written in. There is no indirection. There is no separate layer where meaning must be maintained. The response means what it says, and it says what it means, because it carries its own interpreter.")) - (~docs/section :title "VIII. What MCP gets wrong" :id "mcp" + (~docs/section :title "VIII. 
There is only the hypermedium" :id "the-hypermedium" (p :class "text-stone-600" - "MCP's design reflects a fundamental confusion: it treats capability as a catalogue rather than an affordance. The server lists every tool. The model reads the list. The model calls a tool. The response is data. The model must maintain its own mental model of the server's state to know which tools are appropriate next.") + "The hypermedia discourse frames the question as a competition between formats: HTML vs JSON, server-rendered vs client-rendered, REST vs RPC. Blow adds a new axis — human clients vs AI agents — and asks which format serves both. These are all the wrong questions. They assume that hypermedia is a property that a format can have in greater or lesser degree, and the task is to find the format with the most of it.") (p :class "text-stone-600" - "This is the exact inversion of the hypermedia principle. In a hypermedia system, the server tells you what you can do " (em "in each response") ". You do not need a mental model of the server because the server renders its own state into the controls it offers you. Following controls is safe because the server would not offer a control that is not valid right now.") + "Hypermedia is not a spectrum. It is a threshold. Either a format can define its own interpretation, or it cannot. If it cannot, it depends on external meaning — an RFC, a browser engine, a media type specification — and that dependency makes it something less than a " (em "medium") ". A medium is self-sustaining. It does not require a separate system to explain what it is. Sheet music is a medium because a musician can read it and produce sound. A JSON object with " (code "\"type\": \"sheet-music\"") " is not a medium. It is data that requires a separate program, written in a separate language, consulting a separate specification, to become anything at all.") (p :class "text-stone-600" - "SX pages embody this principle. 
Each response contains exactly the controls that are available from the current state. The server evaluates the conditions — " (code "can-cancel") ", " (code "can-refund") ", " (code "shipped") " — and renders only the applicable controls. The client, human or AI, does not need to reason about what is possible. It just reads what is offered.") + "By this criterion, HTML is not hypermedia. It is the closest the web has come — close enough that the browser's deep internalization of the spec creates the " (em "illusion") " of self-interpretation. But the illusion breaks the moment you step outside the browser. Give HTML to a new client — an AI agent, a screen reader, a search crawler — and that client must reimagine the spec from training data, heuristics, and hope. The meaning was never in the HTML. It was in the browser.") (p :class "text-stone-600" - "This is why the htmx camp's insistence on HTML hypermedia is not mere nostalgia. The principle is correct: the server should author the interaction. Where htmx falls short is in choosing a format that is optimized for human visual consumption at the expense of machine interpretability. SX resolves this by using a format that is equally transparent to both — because s-expressions are the " (em "simplest possible") " structured representation, and simplicity is readable by anything.")) - - (~docs/section :title "IX. Beyond the format wars" :id "beyond" + "JSON is not hypermedia and never will be, no matter how many link relations you attach to it. The meaning is always elsewhere.") (p :class "text-stone-600" - "The deeper issue with the \"JSON vs HTML for hypermedia\" debate is the assumption that content and control are separate concerns that a format must somehow reunite. HTML reunites them through rendering semantics. JSON formats try to reunite them through metadata conventions. 
Both accept the premise that there are two things — data and interaction — and the question is how to ship them together.") + "There is only the hypermedium: a representation that defines its own interpretation. A representation where the spec is written in the language, and the spec " (em "is") " the language — executable, bootstrappable, self-sustaining. S-expressions are not the only possible instance of this. Any homoiconic, metacircular language could satisfy the criterion. But s-expressions are the simplest. The minimal syntax — " (code "(head args...)") " — is the minimal overhead between a format and self-definition. McCarthy arrived at this in 1958. It has taken sixty-eight years for the rest of computing to need it.") (p :class "text-stone-600" - "S-expressions reject the premise. In a homoiconic language, data and code are the same thing. A list that describes a button is also an instruction to render a button. A list that describes a link is also a navigable reference. The component that renders an order card " (em "is") " the order card. There is no gap between the representation and the thing represented because the representation is executable.") + "The age of AI makes the need urgent. When your clients are no longer just browsers with built-in specs but arbitrary agents that must learn interpretation on the fly, a format that carries its own interpreter is not a luxury. It is the only thing that works. Not because it is clever. Because everything else depends on meaning that lives somewhere else — and " (em "somewhere else") " is the one place an ad-hoc agent cannot be guaranteed to look.") (p :class "text-stone-600" - "This is what Blow is reaching for when he says resources should be state machines with hyperlinks as transitions. 
He is describing a system where the response does not merely " (em "describe") " the available transitions but " (em "enacts") " them — where clicking a link is not interpreting a JSON object but following a control that the server authored into the response. HTML does this for humans. SX does this for both humans and machines. And SX does it with a grammar so minimal that generating it is nearly trivial for any system that can produce structured text.") - (p :class "text-stone-600" - "The age of AI does not need a new hypermedia format. It needs to notice that the oldest structured notation in computing — McCarthy's parenthesized lists from 1958 — already solves the problem. Content, control, code, in one syntax, readable by anything, generable by anything, evaluable anywhere. The true hypermedium was " (a :href "/sx/(etc.(essay.self-defining-medium))" :class "text-violet-600 hover:underline" "always already here") ".")))) + "SX represents the hypermedium. Not a new format in the format wars. Not a better JSON. Not a simpler HTML. The thing itself — a representation that " (em "means what it says") " because it carries the rules for saying it."))))