diff --git a/sx/sx/essays/hypermedia-age-of-ai.sx b/sx/sx/essays/hypermedia-age-of-ai.sx new file mode 100644 index 0000000..37bab67 --- /dev/null +++ b/sx/sx/essays/hypermedia-age-of-ai.sx @@ -0,0 +1,120 @@ +;; --------------------------------------------------------------------------- +;; Hypermedia in the Age of AI +;; A response to Nick Blow's article on JSON hypermedia and LLM agents. +;; --------------------------------------------------------------------------- + +(defcomp ~essays/hypermedia-age-of-ai/essay-hypermedia-age-of-ai () + (~doc-page :title "Hypermedia in the Age of AI" + (p :class "text-stone-500 text-sm italic mb-8" + "Nick Blow argues that JSON hypermedia can serve AI agents better than HTML or RPC. He is right about the problem and wrong about the solution. The answer is not a better serialization of links. It is a representation that is simultaneously content, control, and code.") + + (~doc-section :title "I. The argument" :id "the-argument" + (p :class "text-stone-600" + (a :href "https://nickblow.tech/posts/hypermedia-in-the-age-of-ai" :class "text-violet-600 hover:underline" "Blow's essay") + " starts from a position familiar to anyone who has followed the hypermedia discourse: Carson Gross's contention that sprinkling link objects into JSON does not make an API truly RESTful, because REST demands generic clients capable of interpreting hypermedia controls. Blow agrees that this has merit but argues the position is too restrictive. HTML is not the universal hypermedia format. LLMs choke on it. The real prize is " (em "progressive discovery") " — a client that learns what it can do by following links, not by reading documentation upfront.") + (p :class "text-stone-600" + "He contrasts this with MCP, the Model Context Protocol now dominant for LLM tool use. MCP is RPC in a trench coat: the server declares all its tools upfront, the model receives the full catalogue in its system prompt, and it calls functions by name. 
This works, but it does not scale. A file management API that exposes create, read, update, delete, list, search, share, unshare, move, copy, rename, and permission management overwhelms the model before it has done anything. MCP forces " (em "total disclosure") " where hypermedia would offer " (em "progressive revelation") ".") + (p :class "text-stone-600" + "Blow proposes JSON-flavoured hypermedia as the fix — " (code "vnd.siren+json") ", custom content types, link relations in response payloads — so agents can discover capabilities by doing, not by reading a manual. Resources become state machines. Hyperlinks are state transitions. The agent explores the graph.") + (p :class "text-stone-600" + "This is the right diagnosis. The prescription is too weak.")) + + (~doc-section :title "II. The JSON hypermedia trap" :id "json-trap" + (p :class "text-stone-600" + "JSON hypermedia has been proposed, specified, and implemented many times. HAL, JSON:API, Siren, Hydra, UBER, Collection+JSON. Each adds a linking convention on top of JSON. Each generates roughly the same amount of adoption: enough for a conference talk, not enough for an ecosystem. The reason is structural, not cultural.") + (p :class "text-stone-600" + "JSON is a data serialization format. It represents maps, arrays, strings, numbers, booleans, and null. That is all it represents. To make it hypermedia you must layer conventions on top: " (em "this key means a link") ", " (em "this object means an action") ", " (em "this array means available transitions") ". The client must know the convention. The convention must be specified in a separate document. The separate document must be read before the response makes sense.") + (p :class "text-stone-600" + "This is exactly the problem Blow identifies with MCP — the client needs upfront knowledge — relocated from the system prompt to a media type specification. You have not eliminated the manual. 
You have moved it.") + (p :class "text-stone-600" + "HTML avoids this because presentation " (em "is") " semantics. A " (code "<form>") " with " (code "method=\"POST\"") " and " (code "action=\"/orders\"") " is self-interpreting to a browser. The browser does not need a separate specification explaining what forms do. The rendering itself conveys the interaction model. This is what makes HTML genuinely hypermedia and JSON-with-links merely data-with-metadata.") + (p :class "text-stone-600" + "But Blow is right that HTML is a terrible format for AI. The " (a :href "/sx/(etc.(essay.sx-and-ai))" :class "text-violet-600 hover:underline" "syntax tax") " is enormous. Attributes split across arbitrary quoting rules. Closing tags that must match opening tags. CSS class strings encoding visual semantics in an opaque blob. JavaScript embedded in event handlers with different escaping rules. An LLM reading an HTML page is doing archaeology, not interpretation.") + (p :class "text-stone-600" + "So: HTML is hypermedia but hostile to machines. JSON is machine-friendly but not hypermedia. The obvious question — is there a format that is both? — is the question nobody in this discourse seems to be asking.")) + + (~doc-section :title "III. 
The third option" :id "third-option" + (p :class "text-stone-600" + "S-expressions.") + (p :class "text-stone-600" + "Consider what an SX response looks like when the server sends a page fragment:") + (~doc-code :code (highlight "(div :class \"space-y-4\"\n (h2 \"Your orders\")\n (ul :class \"divide-y\"\n (li :class \"py-3\"\n (span :class \"font-medium\" \"Order #4281\")\n (span :class \"text-stone-500\" \"3 items · £42.00\")\n (div :class \"mt-2 flex gap-2\"\n (a :sx-get \"/orders/4281\"\n :sx-target \"#main\"\n :class \"text-violet-600\"\n \"View details\")\n (button :sx-post \"/orders/4281/reorder\"\n :sx-target \"#main\"\n :class \"text-violet-600\"\n \"Reorder\"))))\n (a :sx-get \"/orders?page=2\"\n :sx-target \"#main\"\n :sx-swap \"innerHTML\"\n :class \"text-violet-600\"\n \"Next page\"))" "lisp")) + (p :class "text-stone-600" + "This is simultaneously:") + (ul :class "space-y-2 text-stone-600" + (li (strong "Content") " — it describes what to display: headings, text, layout.") + (li (strong "Presentation") " — it specifies how to display it: classes, structure, hierarchy.") + (li (strong "Controls") " — it declares what the user " (em "can do next") ": view details, reorder, paginate. Each control carries its method (" (code "sx-get") ", " (code "sx-post") "), target, and swap strategy.") + (li (strong "Code") " — it is a valid program. An evaluator can parse it, walk it, extract the controls, understand the state transitions, and act on them.")) + (p :class "text-stone-600" + "An LLM reading this response does not need a Siren specification to understand the available actions. The actions are " (em "in the content") " — just as they are in HTML, but in a format with " (a :href "/sx/(etc.(essay.sx-and-ai))" :class "text-violet-600 hover:underline" "essentially zero syntax tax") ". No closing tags. No attribute quoting ambiguity. No JavaScript in event handlers. 
One syntactic form — " (code "(head args...)") " — for everything.") + (p :class "text-stone-600" + "This is not JSON with links bolted on. This is not HTML simplified. This is a representation where content and control are " (em "the same syntax") " — because s-expressions make no distinction between data and code. A " (code "div") " and an " (code "sx-get") " are both list elements. The AI reads them with the same parser, reasons about them with the same model, and generates them with the same grammar.")) + + (~doc-section :title "IV. Progressive discovery, natively" :id "progressive-discovery" + (p :class "text-stone-600" + "Blow's strongest argument is against MCP's total-disclosure model. A hypermedia API reveals capabilities incrementally: you see what you can do " (em "from where you are") ", not everything the system supports. This is how the web works for humans — you do not receive a manifest of every URL on a site before you can browse it.") + (p :class "text-stone-600" + "SX achieves this without any additional protocol machinery. Each response is a tree of content and controls. The controls present in " (em "this") " response are " (em "this") " resource's available transitions. Follow one, get a new tree with new controls. The state machine Blow describes is not layered on top of the data format — it " (em "is") " the data format.") + (p :class "text-stone-600" + "But SX goes further than any JSON hypermedia format can, because the controls are not just declarations — they are " (em "evaluable") ". 
Consider a response that includes conditional controls:") + (~doc-code :code (highlight "(div :class \"order-actions\"\n (when can-cancel\n (button :sx-post \"/orders/4281/cancel\"\n :sx-confirm \"Cancel this order?\"\n :class \"text-red-600\"\n \"Cancel order\"))\n (when can-refund\n (button :sx-post \"/orders/4281/refund\"\n :sx-target \"#status\"\n \"Request refund\"))\n (when (and shipped tracking-url)\n (a :href tracking-url\n :class \"text-violet-600\"\n \"Track shipment\")))" "lisp")) + (p :class "text-stone-600" + "The available actions depend on server-evaluated state. An order that has shipped shows a tracking link. An order that can be cancelled shows a cancel button. The client — human or AI — sees only the actions that apply " (em "right now") ". This is not progressive discovery bolted onto a data format. It is the server authoring a state machine in the response itself, using the same language as the content.") + (p :class "text-stone-600" + "A JSON hypermedia format would need a separate " (code "actions") " array with method/href/type metadata, plus a schema for each action's payload, plus documentation for what each action does. SX needs none of this. The button is its own documentation. Its label says what it does. Its attributes say how. An AI reading " (code "(button :sx-post \"/orders/4281/cancel\" \"Cancel order\")") " knows everything it needs to act.")) + + (~doc-section :title "V. The component advantage" :id "component-advantage" + (p :class "text-stone-600" + "Blow does not discuss the composition problem, but it is where his proposal breaks down hardest. JSON hypermedia formats specify individual resources. They do not specify how resources compose into interfaces. 
A Siren entity has properties, actions, and links — but no concept of a reusable UI fragment that accepts parameters and renders children.") + (p :class "text-stone-600" + "SX has components:") + (~doc-code :code (highlight "(defcomp ~order-card (&key order &rest actions)\n (div :class \"rounded border p-4\"\n (div :class \"flex justify-between\"\n (span :class \"font-medium\"\n (str \"Order #\" (get order \"id\")))\n (span :class \"text-stone-500\"\n (get order \"status\")))\n (p :class \"text-sm text-stone-600 mt-1\"\n (str (get order \"item-count\") \" items · \"\n (get order \"total\")))\n (div :class \"mt-3 flex gap-2\" actions)))" "lisp")) + (p :class "text-stone-600" + "This is a hypermedia control in the fullest sense — it takes data and actions as parameters and renders a self-contained interactive unit. An AI generating a page can compose it:") + (~doc-code :code (highlight "(map (fn (order)\n (~order-card :order order\n (when (get order \"can-reorder\")\n (button :sx-post (str \"/orders/\" (get order \"id\") \"/reorder\")\n :sx-target \"#main\"\n \"Reorder\"))))\n orders)" "lisp")) + (p :class "text-stone-600" + "No JSON hypermedia format supports this. They cannot, because JSON has no function abstraction. You can parameterize data — templates, schemas — but you cannot express " (em "a reusable piece of interactive UI") " in JSON. You always need a separate rendering layer that interprets the JSON and produces something visual. SX does not have this separation. The component definition " (em "is") " the rendering.")) + + (~doc-section :title "VI. SX URLs as evaluable affordances" :id "evaluable-affordances" + (p :class "text-stone-600" + "Blow describes REST resources as state machines where hyperlinks represent allowed transitions. 
This maps cleanly to SX's URL system, which goes further: URLs are not opaque strings but " (a :href "/sx/(applications.(sx-urls))" :class "text-violet-600 hover:underline" "evaluable expressions") ".") + (~doc-code :code (highlight ";; Opaque URL (conventional)\n\"/orders/4281/details\"\n\n;; SX URL (evaluable)\n\"/sx/(etc.(essay.hypermedia-age-of-ai))\"\n\n;; The URL is a program:\n;; 1. Call the 'etc' section function\n;; 2. Which calls the 'essay' page function\n;; 3. With slug \"hypermedia-age-of-ai\"\n;; 4. Returns the component AST to render" "lisp")) + (p :class "text-stone-600" + "The URL itself is a composition of functions. An AI examining the URL can understand the content hierarchy — this is an essay, in the 'etc' section, about a specific topic. It can generate new URLs by composing known functions: " (code "(etc.(essay.new-slug))") " follows the same pattern. The addressing scheme is not a convention imposed from outside. It is the language applied to navigation.") + (p :class "text-stone-600" + "This dissolves the distinction between \"following a link\" and \"calling a function\" that bedevils every attempt to make JSON APIs hypermedia. In SX, following a link " (em "is") " calling a function. The URL evaluates to a component tree. The component tree renders to interactive content. The interactive content contains more URLs. The cycle is closed without any protocol machinery beyond HTTP.")) + + (~doc-section :title "VII. The AI agent that reads the page" :id "ai-agent" + (p :class "text-stone-600" + "Here is the scenario Blow is really imagining: an LLM agent that interacts with web services not through a bespoke API layer but by reading responses and following controls, the way a human reads a page and clicks links. MCP makes this possible through function calling. 
Blow argues hypermedia would make it better through progressive discovery.") + (p :class "text-stone-600" + "SX makes it natural.") + (p :class "text-stone-600" + "An SX response is a tree. The AI parses it (trivially — parentheses balance). It extracts all " (code "sx-get") " and " (code "sx-post") " attributes to build a list of available actions. It reads labels and context to understand what each action does. It decides which action to take. It issues the request. It receives a new tree. The loop repeats.") + (p :class "text-stone-600" + "But unlike HTML, the AI does not need to strip out presentational noise to find the semantic signal. There is no " (code "<div>") " wrapping a " (code "<div>") " wrapping a " (code "<div>") " before the actual content. The tree structure " (em "is") " the semantic structure. Every node is either content or control, and the distinction is syntactically obvious: controls have " (code "sx-") " attributes, content does not.") + (p :class "text-stone-600" + "And unlike JSON hypermedia, the AI does not need to understand a media type specification to interpret the controls. The controls are HTML-like elements with method and target attributes — a pattern every LLM already understands from training on web content. SX inherits the discoverability of HTML (controls are self-describing) without the noise (no closing tags, no attribute soup, no CSS class archaeology).") + (p :class "text-stone-600" + "The " (a :href "/sx/(etc.(essay.sx-and-ai))" :class "text-violet-600 hover:underline" "spec fits in a context window") ". The complete SX language — evaluator, parser, renderer, all primitives — is roughly 3,000 lines. An AI agent that loads the spec into its context can not only read SX responses but " (em "generate") " them. It can produce new pages, new components, new interactions — because the language it reads and the language it writes are the same, and both fit in memory at once.")) + + (~doc-section :title "VIII. What MCP gets wrong" :id "mcp" + (p :class "text-stone-600" + "MCP's design reflects a fundamental confusion: it treats capability as a catalogue rather than an affordance. The server lists every tool. The model reads the list. The model calls a tool. The response is data. The model must maintain its own mental model of the server's state to know which tools are appropriate next.") + (p :class "text-stone-600" + "This is the exact inversion of the hypermedia principle. In a hypermedia system, the server tells you what you can do " (em "in each response") ". You do not need a mental model of the server because the server renders its own state into the controls it offers you. 
Following controls is safe because the server would not offer a control that is not valid right now.") + (p :class "text-stone-600" + "SX pages embody this principle. Each response contains exactly the controls that are available from the current state. The server evaluates the conditions — " (code "can-cancel") ", " (code "can-refund") ", " (code "shipped") " — and renders only the applicable controls. The client, human or AI, does not need to reason about what is possible. It just reads what is offered.") + (p :class "text-stone-600" + "This is why the htmx camp's insistence on HTML hypermedia is not mere nostalgia. The principle is correct: the server should author the interaction. Where htmx falls short is in choosing a format that is optimized for human visual consumption at the expense of machine interpretability. SX resolves this by using a format that is equally transparent to both — because s-expressions are the " (em "simplest possible") " structured representation, and simplicity is readable by anything.")) + + (~doc-section :title "IX. Beyond the format wars" :id "beyond" + (p :class "text-stone-600" + "The deeper issue with the \"JSON vs HTML for hypermedia\" debate is the assumption that content and control are separate concerns that a format must somehow reunite. HTML reunites them through rendering semantics. JSON formats try to reunite them through metadata conventions. Both accept the premise that there are two things — data and interaction — and the question is how to ship them together.") + (p :class "text-stone-600" + "S-expressions reject the premise. In a homoiconic language, data and code are the same thing. A list that describes a button is also an instruction to render a button. A list that describes a link is also a navigable reference. The component that renders an order card " (em "is") " the order card. 
There is no gap between the representation and the thing represented because the representation is executable.") + (p :class "text-stone-600" + "This is what Blow is reaching for when he says resources should be state machines with hyperlinks as transitions. He is describing a system where the response does not merely " (em "describe") " the available transitions but " (em "enacts") " them — where clicking a link is not interpreting a JSON object but following a control that the server authored into the response. HTML does this for humans. SX does this for both humans and machines. And SX does it with a grammar so minimal that generating it is nearly trivial for any system that can produce structured text.") + (p :class "text-stone-600" + "The age of AI does not need a new hypermedia format. It needs to notice that the oldest structured notation in computing — McCarthy's parenthesized lists from 1958 — already solves the problem. Content, control, code, in one syntax, readable by anything, generable by anything, evaluable anywhere. The true hypermedium was " (a :href "/sx/(etc.(essay.self-defining-medium))" :class "text-violet-600 hover:underline" "always already here") ".")))) diff --git a/sx/sx/nav-data.sx b/sx/sx/nav-data.sx index 8f4d7a5..24ade0c 100644 --- a/sx/sx/nav-data.sx +++ b/sx/sx/nav-data.sx @@ -99,7 +99,9 @@ (dict :label "The Art Chain" :href "/sx/(etc.(essay.the-art-chain))" :summary "On making, self-making, and the chain of artifacts that produces itself. Ars, techne, content addressing, and why the spec is the art.") (dict :label "The True Hypermedium" :href "/sx/(etc.(essay.self-defining-medium))" - :summary "The true hypermedium must define itself with itself. On ontological uniformity, the metacircular web, and why address and content should be the same stuff."))) + :summary "The true hypermedium must define itself with itself. 
On ontological uniformity, the metacircular web, and why address and content should be the same stuff.") + (dict :label "Hypermedia in the Age of AI" :href "/sx/(etc.(essay.hypermedia-age-of-ai))" + :summary "JSON hypermedia, MCP, and why s-expressions are the format both humans and AI agents actually need."))) (define philosophy-nav-items (list (dict :label "The SX Manifesto" :href "/sx/(etc.(philosophy.sx-manifesto))" diff --git a/sx/sx/page-functions.sx b/sx/sx/page-functions.sx index 07271bb..4cee90f 100644 --- a/sx/sx/page-functions.sx +++ b/sx/sx/page-functions.sx @@ -472,6 +472,7 @@ "hegelian-synthesis" '(~essays/hegelian-synthesis/essay-hegelian-synthesis) "the-art-chain" '(~essays/the-art-chain/essay-the-art-chain) "self-defining-medium" '(~essays/self-defining-medium/essay-self-defining-medium) + "hypermedia-age-of-ai" '(~essays/hypermedia-age-of-ai/essay-hypermedia-age-of-ai) :else '(~essays/index/essays-index-content))))) ;; Philosophy (under etc)