rose-ash/lib/gitea
giles 8ed44f7770 lib/gitea: fix the fetch-pack-over-HTTP hang — native parse fast path
The sx-forge native-loop blocker: clone! of the live giles/rose-ash
never returned over gitea/http-app. Root cause was NOT the transport —
pack-line-parse ran every pack line through the interpreted spec parser
(~6.6KB/s on the CEK machine; a full-repo pack = hours), and a non-hex
byte in a pkt length header parsed negative (index-of -1), walking the
scan index backwards forever.

- gitea/parse-obj: use the host reader (open-input-string + read,
  ~3700x faster, value-identical) when the host provides it; hosts
  without string ports keep sx-parse. Feature-detected at load.
- pkt-sections-loop: (< n 4) guard — malformed lengths error instead
  of hanging.
- push-cmd!: haves = every advertised remote ref held locally, so a
  NEW branch pushes only its delta, not the whole repo closure.
- tests/wire.sx: malformed-len errors, truncated-pkt clamps, parse-obj
  = sx-parse equivalence (blob/commit + cid). 83/83.
- tests/wire-http.sh + wire-http-client.sx: end-to-end over REAL
  http-listen/http-request on :8943 — ls-remote/clone/push-new-branch/
  fresh-clone-verify/delete. The coverage gap that hid all this.

Proven vs the live forge (in sx-gitea-1): full 4468-file clone in 77s
(was: hang), commit, push heads/sx-smoke-test ok, branch advertised on
sx.sx-web.org. Conformance 620/620.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-04 00:30:59 +00:00

sx-gitea — a federated git forge in plain SX

A git forge built by composing the x-on-sx subsystems: every phase wires one more substrate onto the forge. No third-party dependencies — the whole stack is SX on the OCaml kernel.

Run the suite: bash lib/gitea/conformance.sh (per-suite scores in scoreboard.md). Suites are independent sx_server sessions; heavyweight substrates (Smalltalk/content, Scheme/flow, APL/feed, Haskell/search) load only for the suites that need them.

Composition map

| Phase | Module | Built on |
|---|---|---|
| 1 repo | repo.sx | sx-git (lib/git, native-CID object store), persist kv |
| 2 access | access.sx | acl (datalog): repo role groups, collaborators, org teams; bearer tokens |
| 3 wire | wire.sx | git-style smart HTTP: pkt-line framing, upload/receive-pack, CID-verified packs; client (clone!/fetch!/push!) drives any dream app fn |
| 4 issues | issues.sx | content (Smalltalk): Markdown bodies as block documents; relations (datalog): derived issue graph |
| 5 pr | pr.sx | sx-git merge-base diffs + 3-way merge; flow (Scheme): durable open→approval→merge lifecycle; merge queue |
| 6 activity | activity.sx | feed (APL): timelines/dashboard; events (flow): durable at-least-once notifications |
| 7 search | search.sx | search (Haskell): tf-idf ranked code/issue/PR search, batched evaluations |
| 8 fed | fed.sx | ForgeFed: AP actors, trust-gated inbox with provenance + materialized federated issues/PRs, mirrors over the wire client, cursor-based delivery |
| web | web.sx | dream: routes, auth gating (401/403/404-hides-private), route-pack registry |

Architectural rules of thumb

  • The kv store is the source of truth. Owners, repo records, issues, PRs, collaborators, teams, tokens, follows, trust, mirrors — all plain dicts under gitea/... keys on one persist backend per forge. Deleting a repo is a prefix purge (no ghost state on recreate).
  • Derived, not maintained. The acl database and the relations graph are derived from kv state and rebuilt when the derived facts change (cached in the forge handle) — deletions can never dangle.
  • Instrument in the runtime. Activity logging wraps the mutation verbs by redefinition (gitea/base-*! + wrapper), so every caller emits activity with zero call-site edits.
  • Everything is testable without sockets. A forge is a value over a persist/mem-backend; gitea/app is a pure request→response fn; the wire client federates two in-memory forges directly.
  • Trust is re-checked, never cached. Federation operations (inbox, mirror sync, delivery) consult the trust set at use time.
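
The first two rules can be illustrated with a small sketch. It is Python rather than SX, and every name in it (`MemBackend`, `purge_prefix`, `derive_acl`, the key layout under `gitea/repo/...`) is hypothetical, chosen only to show the shape: records live as plain dicts under prefixed keys, deleting a repo removes everything under its prefix, and the access facts are recomputed from what remains rather than edited in place.

```python
class MemBackend:
    """Hypothetical in-memory stand-in for a persist backend."""

    def __init__(self):
        self.data = {}

    def put(self, key, value):
        self.data[key] = value

    def keys(self, prefix):
        return sorted(k for k in self.data if k.startswith(prefix))

    def purge_prefix(self, prefix):
        # Remove every record stored under the prefix.
        for key in self.keys(prefix):
            del self.data[key]


def derive_acl(kv):
    """Recompute (user, role, repo) facts from kv state; nothing is patched in place."""
    facts = set()
    for key in kv.keys("gitea/repo/"):
        parts = key.split("/")
        # Assumed layout: gitea/repo/<owner>/<name>/collab/<user>
        if len(parts) == 6 and parts[4] == "collab":
            owner, name, user = parts[2], parts[3], parts[5]
            facts.add((user, kv.data[key]["role"], f"{owner}/{name}"))
    return facts


kv = MemBackend()
kv.put("gitea/repo/alice/demo/meta", {"private": True})
kv.put("gitea/repo/alice/demo/collab/bob", {"role": "write"})
kv.put("gitea/repo/alice/other/collab/carol", {"role": "read"})

before = derive_acl(kv)
kv.purge_prefix("gitea/repo/alice/demo/")  # delete the repo: one prefix purge
after = derive_acl(kv)                     # re-derive; bob's grant is simply gone
print(sorted(before))
print(sorted(after))
```

Because the derived set is rebuilt from the store, there is no separate "remove bob's grant" step to forget; the trailing slash on the purge prefix keeps a sibling repo such as `alice/demo2` untouched.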

Per-repo git stores

Each repo's objects/refs live in their own git/repo-named namespace forge/<owner>/<name> — identical content still shares CIDs, but repos cannot see each other's objects. All ref moves go through ref-cas!; concurrent pushes surface as stale/non-fast-forward per-ref statuses.
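
A compare-and-swap ref update can be sketched as follows. This is an illustrative Python model, not the SX `ref-cas!` itself: the function name `ref_cas`, its argument order, and the exact status strings are assumptions, and the ancestry check is passed in by the caller because the real walk would live in the object store.

```python
def ref_cas(refs, name, expected, new, is_ancestor=None):
    """Move refs[name] from `expected` to `new`, or report why not.

    `expected` is the value the caller last saw (None for a ref that
    should not exist yet). Returns a per-ref status string.
    """
    current = refs.get(name)
    if current != expected:
        return "stale"                 # someone else moved the ref first
    if is_ancestor is not None and current is not None:
        if not is_ancestor(current, new):
            return "non-fast-forward"  # new tip does not contain the old one
    refs[name] = new
    return "ok"


# Toy history: c1 <- c2 <- c3, plus an unrelated commit x1.
parents = {"c1": None, "c2": "c1", "c3": "c2", "x1": None}


def is_ancestor(old, new):
    # Walk first parents from `new` until `old` is found or history ends.
    while new is not None:
        if new == old:
            return True
        new = parents[new]
    return False


refs = {"heads/main": "c1"}
first = ref_cas(refs, "heads/main", "c1", "c2", is_ancestor)   # wins the race
second = ref_cas(refs, "heads/main", "c1", "c3", is_ancestor)  # still expects c1
third = ref_cas(refs, "heads/main", "c2", "x1", is_ancestor)   # unrelated history
print(first, second, third, refs)
```

The losing pusher gets a status for that one ref and the ref itself is left where the winner put it, which is the behaviour the paragraph above describes for concurrent pushes.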

Known limits (deliberate, documented)

  • Wire packs carry one object per pkt line (~64KB); side-band chunking is a future extension (gitea/pkt-fits? reports whether an object fits). SHA-1/packfile byte compat for stock git clients lives in lib/git/{export,import}.sx and is not yet wired into the HTTP endpoints.
  • Inbox activities are trust-gated but not signature-verified.
  • Reopening a PR restarts its lifecycle flow (a cancelled flow cannot resume); reviews survive.
  • Issue web close/reopen does not emit activity (no actor at the core call sites for issue-close!).
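
The pkt-line size limit in the first bullet comes from the framing itself. The sketch below assumes the standard git pkt-line layout (a 4-hex-digit length that counts its own four bytes, `0000` as a flush packet, 65520 bytes maximum per line); whether sx-gitea uses exactly these constants is an assumption, and `pkt_fits`, `pkt_encode`, and `pkt_decode` are hypothetical Python names, not the SX functions. The decoder rejects non-hex or too-small lengths and clamps truncated packets, so the scan index can only move forward.

```python
MAX_PKT = 65520   # git's documented maximum pkt-line size, header included
HEADER = 4        # four ASCII hex digits
FLUSH = b"0000"
HEX_DIGITS = b"0123456789abcdefABCDEF"


def pkt_fits(payload: bytes) -> bool:
    return len(payload) + HEADER <= MAX_PKT


def pkt_encode(payload: bytes) -> bytes:
    if not pkt_fits(payload):
        raise ValueError("payload too large for one pkt line")
    return b"%04x" % (len(payload) + HEADER) + payload


def pkt_decode(data: bytes):
    """Split a byte stream into sections of payloads, one section per flush."""
    sections, current, i = [], [], 0
    while i < len(data):
        header = data[i:i + HEADER]
        # Strict check: int(..., 16) alone would accept signs and whitespace.
        if len(header) != HEADER or any(c not in HEX_DIGITS for c in header):
            raise ValueError(f"malformed pkt length at offset {i}")
        n = int(header.decode("ascii"), 16)
        if n == 0:                    # flush packet closes the current section
            sections.append(current)
            current = []
            i += HEADER
            continue
        if n < HEADER:                # 1..3 cannot even cover the header
            raise ValueError(f"pkt length {n} too small at offset {i}")
        end = min(i + n, len(data))   # clamp a truncated final packet
        current.append(data[i + HEADER:end])
        i = end                       # always greater than the old i
    if current:
        sections.append(current)
    return sections


stream = pkt_encode(b"hello\n") + pkt_encode(b"world") + FLUSH + pkt_encode(b"x")
decoded = pkt_decode(stream)
print(decoded)
```

Under these assumptions a single object larger than 65516 bytes cannot be framed in one line, which is why larger objects need the side-band chunking listed above as a future extension.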