OpenTelemetry in SX — loop briefing
Goal: self-hosting observability for the SX host — traces/spans/metrics in pure SX, a
live SVG waterfall dashboard (reactive island), and OTLP-JSON export for interop with
real backends (Jaeger/Grafana). Reference shape: nektro/zig-tracer src/otel.zig (the OTLP span
struct + HTTP emit) — that's just the export step here.
The key insight — a TRACE is a COMPOSITION. A span has {name, start, end, parent, attrs},
so a trace is a tree of spans — the same shape as an object's :body composition. So reuse the
existing fold machinery in lib/host/compose.sx (render-fold) and lib/host/execute.sx
(execute-fold): a span is a timed effect; a waterfall is a render-fold over the span tree;
OTLP export is an export-fold; metrics are an aggregate-fold. Don't reinvent — fold.
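To make the "trace is a composition" idea concrete, here is a minimal sketch of one tree walk serving two folds. It is written in Python rather than SX, purely as an illustration. The names `Span`, `fold_spans`, `waterfall_rects` and `durations` are hypothetical and are not the render-fold / execute-fold API in lib/host. Spans are assumed to already be linked into a tree through a `children` list.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    name: str
    t0: int                      # start time, arbitrary integer units
    t1: int                      # end time, same units
    children: list["Span"] = field(default_factory=list)


def fold_spans(span, visit, depth=0):
    """Pre-order walk: call visit(span, depth) on every node, collect the results."""
    out = [visit(span, depth)]
    for child in span.children:
        out.extend(fold_spans(child, visit, depth + 1))
    return out


def waterfall_rects(root, px_per_unit=1.0, row_h=20):
    """Render-style fold: one rect per span, x from start offset, y from depth."""
    def rect(span, depth):
        return {
            "name": span.name,
            "x": (span.t0 - root.t0) * px_per_unit,
            "width": (span.t1 - span.t0) * px_per_unit,
            "y": depth * row_h,
        }
    return fold_spans(root, rect)


def durations(root):
    """Aggregate-style fold: the same walk, collecting durations for metrics."""
    return fold_spans(root, lambda span, _depth: span.t1 - span.t0)


root = Span("GET /", 100, 200, [
    Span("db", 110, 150, [Span("parse", 120, 130)]),
    Span("render", 150, 190),
])
for r in waterfall_rects(root):
    print(r)
```

Only the `visit` function changes between the two folds, which is the reason to reuse the existing fold walk instead of writing a tracer-specific traversal.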
Base: this worktree is branched off loops/host (has the composition machinery + Parts A/C:
type-block grammar + type-def editor). You are on branch loops/otel in
/root/rose-ash-loops/otel.
Rules
- Test-first. Write the failing test, then implement to green.
- Fast tests via the warm server: bash lib/host/warm-conf.sh run <suite> (starts a warm
persistent server; run alone = full conformance; eval "<expr>" for a REPL probe). New suite →
add it to the runner the same way lib/host/tests/*.sx are wired.
- Do NOT deploy to the live container. blog.rose-ash.com is bind-mounted from
/root/rose-ash-loops/host (a different worktree). Build + test only; integration/deploy happens
when this branch is merged. (If you want a live smoke, ask; don't recreate the shared
container.)
- .sx editing: prefer sx_write_file (validates on parse); if the sx-tree WRITE tools raise a
yojson-null error in this worktree, fall back to the Write tool + sx_validate.
- Commit each increment to loops/otel with a short factual message. Never push to main.
- Cheap by construction: spans go in a bounded in-memory ring buffer, NOT the durable KV
(persisting every span would hammer persist like the old relations/relate re-saturation bug).
Sample + export on demand.
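The "cheap by construction" rule amounts to a fixed-capacity buffer that silently drops the oldest span. A minimal sketch follows, in Python rather than SX. `SpanRing` and its methods are hypothetical stand-ins for otel/record!, otel/recent and a cap setter, not the actual implementation.

```python
from collections import deque


class SpanRing:
    """Bounded in-memory span store: appending past the cap drops the oldest."""

    def __init__(self, cap=1000):
        # deque(maxlen=...) discards from the left when full, so a write stays O(1)
        self._buf = deque(maxlen=cap)

    def record(self, span):
        self._buf.append(span)

    def recent(self, n=None):
        """Return the newest n spans in oldest-to-newest order (all if n is None)."""
        items = list(self._buf)
        if n is None:
            return items
        return items[max(0, len(items) - n):]

    def set_cap(self, cap):
        # Rebuilding with a smaller maxlen keeps only the newest `cap` entries
        self._buf = deque(self._buf, maxlen=cap)


ring = SpanRing(cap=3)
for i in range(5):
    ring.record({"name": f"span-{i}"})
print([s["name"] for s in ring.recent()])
```

Nothing here touches durable storage. Memory use is bounded by the cap regardless of request volume, and export reads from `recent` on demand.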
Roadmap — do ONE unchecked [ ] per iteration, test, commit, tick the box.
- P1: span model + API. lib/host/otel.sx: a span dict
{:trace :span :parent :name :t0 :t1 :attrs :events}; otel/with-span name attrs thunk (records
t0/t1, pushes/pops a dynamic parent stack so nesting builds the tree); a bounded ring buffer
(otel/record!, otel/recent, cap ~1000, drop-oldest); otel/current-span / otel/current-trace.
Tests: nested with-span builds parent links; ring caps at N.
- P2: monotonic clock. Find/confirm a time prim on the OCaml host (the warm-conf profiler +
response cache already measure time; grep lib/host + the OCaml bridge). Wrap as otel/now-ns.
Tests: monotonic non-decreasing, non-negative, a with-span has t1 >= t0.
- P3: auto-instrument the handlers. Wrap route handlers at the host/make-app / router seam
(see lib/host/server.sx) so every HTTP request becomes a trace: a root span per request named
by method+route, with {:http.method :http.route :http.status} attrs. Tests: a request through
the app produces one trace with the right span name + status attr.
- P4: render-fold → SVG waterfall. A trace → an inline <svg> timeline: one <rect> per span,
x ∝ (t0 − trace.t0), width ∝ duration, y ∝ depth, a label. Reuse the compose-fold walk shape.
Tests: N spans → N rects; nested spans get increasing y.
- P5: metrics (aggregate-fold). Fold recent spans → per-route counters (request count)
+ latency histogram (p50/p95/p99 from durations). Tests: known spans → expected counts +
percentiles.
- P6: live dashboard. GET /otel: a reactive island (signals + an SSE stream of new traces)
that renders the waterfall of the latest trace + the metrics strip, updating live without
reload. Reuse the reactive runtime (sx/sx/reactive-runtime.sx, web/) + Dream SSE/streaming
already in lib/host. Tests: the island SSRs; the SSE endpoint emits a span event; the page
lists recent traces.
- P7: OTLP-JSON export. Serialize spans to the OTLP/JSON schema (resourceSpans → scopeSpans →
spans with traceId/spanId/parentSpanId/name/startTimeUnixNano/endTimeUnixNano/attributes).
otel/export-otlp traces → the JSON; POST to an OTLP HTTP collector via an injected transport
(so it's testable without a live collector). Tests: OTLP shape matches the spec for a known
trace; the transport receives the payload.
- P8: context propagation + errors. Parse/emit the W3C traceparent header so a trace spans
services (fed with the host's inter-service calls); mark error spans (:status :error + an
event). Tests: traceparent round-trips; an error thunk yields an error span.
Progress log (newest first)
- 2026-07-01: P1 done. lib/host/otel.sx: span dict + otel/with-span (dynamic parent stack
builds the trace tree), monotonic id/clock placeholders (P2 replaces now-ns), bounded ring
buffer (record!/recent/set-cap!, drop-oldest), current-span/current-trace, reset!. Suite
lib/host/tests/otel.sx wired into conformance: 18/18 (nested parent links, attrs, ring caps at
N, drops oldest).
- (append one dated line per iteration)