Files
rose-ash/plans/identity-on-sx.md
giles 37b7d1635c identity: PKCE S256 (RFC 7636 §4.2) — now the erlang binary substrate is fixed
oauth.sx routes the PKCE check through pkce_ok: an S256 challenge carried as
{s256, Hash} compares crypto:hash(sha256, Verifier) =:= Hash; a bare
challenge stays plain (§4.1), so both methods coexist with no change to
existing flows (the bare path is the old =:= behaviour). Raw sha256 digests
are compared (base64url is wire encoding, omitted). New tests/pkce.sx (6,
incl. S256 through PAR). Verified pkce 6/6; substrate fix is in the
preceding commit. 239 total.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 14:12:10 +00:00

315 lines
21 KiB
Markdown

# identity-on-sx: OAuth2, sessions & membership on Erlang
> **DRAFT outline.** The identity core `acl-on-sx` assumes already exists. `acl`
> answers "may X do Y"; identity answers "who is X, and how did they prove it."
> Depends on `persist-on-sx` (grant/audit ledger). Pairs with `acl-on-sx`.
rose-ash's `account` domain is the OAuth2 authorization server every other app is
a client of: silent SSO, per-app first-party cookies, grant verification,
membership. Sessions and grants are **long-lived, concurrent, individually
addressable, and expire on their own** — that is the actor model. Erlang's
processes + mailboxes map cleanly: a session is a process, token issue/refresh/
revoke are messages, expiry is a process timeout, and SSO is one process answering
many apps.
End-state: an Erlang-on-SX layer with the OAuth2 authorization-code + silent
(`prompt=none`) flows as message protocols, a session/grant registry, token
lifecycle (issue/refresh/revoke/introspect), and membership state — all auditable
through the event log, all authorization questions delegated to `acl-on-sx`.
## Status (rolling)
`bash lib/identity/conformance.sh`**239/239** (4 phases + 16 ext) — slow (~10min, run in background; internal timeout 1200)
## Ground rules
- **Scope:** only `lib/identity/**` and `plans/identity-on-sx.md`. May **import**
from `lib/erlang/`, and (once they exist) `lib/persist/` + `lib/acl/`. Do not edit
substrates.
- **Architecture:** a session/grant is a process holding its own state; the
registry routes messages by subject/client id. Tokens are opaque + introspected,
not self-validating (revocation must be real). Authorization decisions are NOT
made here — `identity` proves identity, `acl` decides permission.
- **Security:** revocation is immediate (kill the process / tombstone the grant);
no decision relies on a token that outlived its grant. Negative answers are
explicit, never "absence of a yes."
- **Commits:** one feature per commit. Progress log + tick boxes.
## Architecture sketch
```
Auth request Token / session
(authorize client scope subject) {:access :refresh :expires :grant}
│ ▲
▼ │
lib/identity/oauth.sx lib/identity/token.sx
— authz-code + prompt=none flows — issue / refresh / revoke / introspect
— as Erlang message protocols — opaque tokens, grant-backed
│ ▲
▼ │
lib/identity/session.sx lib/identity/registry.sx
— session = process, expiry=timeout — route by subject/client; SSO fan-out
│ │
▼ ▼
lib/identity/api.sx ── (identity/login) (identity/grant?) (identity/revoke) ──┐
│ │
└──────── grant + audit events → persist ; permission? → acl ──────────┘
```
## Phase 1 — Sessions + tokens
- [x] `session.sx` — session process, create/lookup/expire
- [x] `token.sx` — issue/introspect/revoke (opaque, grant-backed)
- [x] `registry.sx` — route by subject/client
- [x] `api.sx` + tests + scoreboard + conformance.sh
## Phase 2 — OAuth2 flows
- [x] authorization-code flow as a message protocol
- [x] refresh + rotation; revocation cascades to issued tokens
- [x] tests: full code exchange, refresh, revoke-then-use (must fail)
## Phase 3 — Silent SSO + membership
- [x] `prompt=none` cross-app login (one session, many clients)
- [x] membership state + per-app grant projection
- [x] grant verification delegated cache (mirror Redis-cache pattern)
## Phase 4 — Audit + federation
- [x] every issue/refresh/revoke is a `persist` event; `(identity/audit subject)`
- [x] federated identity (peer-asserted subject) — advisory, trust-gated stub
- [x] tests: audit completeness, cross-instance subject mapping
## Extensions (base roadmap complete; deepen the engine)
- [x] PKCE S256 method (RFC 7636 §4.2) — substrate fixed; `{s256, sha256(verifier)}` challenge
- [x] access-token TTL / `expires_in` — logical-clock expiry, introspect honours it
- [x] scope as a set + scope narrowing on refresh (RFC 6749 §6)
- [x] client registry: public vs confidential clients, client authentication (RFC 6749 §2)
- [x] client-credentials grant (RFC 6749 §4.4) + device grant (RFC 8628)
- [x] acl-on-sx delegation: identity-gates-before-acl boundary (401 vs 403), stub decider (live Datalog bridge is cross-substrate)
- [~] OAuth `state`/OIDC `nonce` — low value in this server-centric model (client-side echo); skipped
- [x] pushed authorization requests (PAR, RFC 9126): single-use request_uri → consent
- [x] dynamic client registration (RFC 7591): server-generated client_id + secret
- [x] "apps with access": `grants_for(Subject)` / `identity:grants` (per-subject active grants)
- [x] "disconnect app": `revoke_app(Subject, Client)` — revoke all of a subject's grants for a client
- [x] unify `api.sx` over membership + audit (one facade, audited login/logout)
- [x] subject-wide session management: `sessions(Subject)` + `logout_all` (log out everywhere)
- [x] token exchange (RFC 8693): downscope a token into a new independent token
- [x] RFC 7662 full introspection metadata (`introspect_full`: sub/client_id/scope/exp/iat/token_type)
## Progress log
- 2026-06-07 — PKCE S256 UNBLOCKED + implemented (RFC 7636 §4.2). Root-caused
the substrate blocker to ONE bug in `lib/erlang/transpile.sx`
`er-eval-binary-segment`: a string literal in a binary (`<<"abc">>`) was
evaluated as a single integer segment, emitting one `0` byte — so every
binary literal became `{:bytes (0)}` (hence binary `=:=` "always equal" and
crypto:hash input-independent). Fix (user-authorized cross-scope edit on
architecture; loops/erlang should adopt as owner): the integer branch now
expands a string value to per-character bytes (`<<"abc">>``<<97,98,99>>`).
Verified: `byte_size(<<"abc">>)`=3, binary `=:=` correct, sha256 distinct.
Then `oauth.sx`: PKCE check routed through `pkce_ok``{s256, H}` challenge
compares `crypto:hash(sha256, Verifier) =:= H`; bare challenge stays `plain`
(RFC §4.1), so both coexist with zero change to existing flows. New
tests/pkce.sx (6, incl. S256-through-PAR). +6 → 239. Done directly on
architecture (where fixed-erlang + merged-identity coexist); loops/identity
trails until refreshed.
- 2026-06-07 — "disconnect app" (ext): `identity_tokens:revoke_app(Subject,
Client)` revokes every grant a subject holds for one client at once (audited
one revoke per grant), exposed at the facade as `identity:revoke_app`. The
action counterpart to the `grants` view — completes the account-security
view+action pairs: sessions/logout_all, grants/revoke_app, history. Other
subjects' same-client grants are untouched. +4 → account 11, 233/233.
- 2026-06-07 — "apps with access" (ext): `identity_tokens:grants_for(Subject)`
lists a subject's ACTIVE grants as `[{Client, Scope}]` (revoked excluded),
exposed through the facade as `identity:grants(Subject)`. Completes the
per-subject account-security trio: sessions (where), grants (which apps),
history (what happened). New tests/account.sx (7). 222→229. NOTE: conformance
is now slow (~10 min, 22 suites); run it in the background — internal
sx_server timeout raised to 1200s. The suite is at its monolithic-runtime
ceiling; further test growth should consider splitting the harness.
- 2026-06-07 — dynamic client registration (ext, RFC 7591): `register_dynamic`
generates a client_id + secret server-side (make_ref each) and registers the
client, returning {ok, ClientId, Secret} — self-service onboarding distinct
from the manual register_client. A dynamic confidential client can then use
client_credentials; a dynamic public client stays unauthorized_client. New
tests/dynreg.sx (5). 217→222.
- 2026-06-07 — PAR (ext, RFC 9126): `push_authorization_request` lodges the
authorization params under a single-use `request_uri`; `authorize_pushed`
redeems it into the normal consent flow. Pushed requests reuse the pending
store (`{pushed, Rec}` keyed by the request_uri ref — distinct from consent
req_ids, no collision), so no new loop state. The pushed binding (client +
redirect + PKCE) is enforced at exchange. New tests/par.sx (7). 210→217.
- 2026-06-07 — full introspection (ext, RFC 7662 §2.2): `introspect_full`
returns {active, Subject, Client, Scope, Exp, Iat, bearer} for live tokens,
{inactive} otherwise — deepening the opaque-token/live-lookup model the
whole design rests on. Access tokens now carry `Iat` (clock-at-issue);
exp = iat + ttl. Simple `introspect` unchanged. New tests/introspect.sx (9).
201→210. NOTE: conformance now needs an explicit long timeout (>120s, 19
suites) — run with `timeout 580`.
- 2026-06-07 — token exchange (ext, RFC 8693 §2.1): `oauth.sx` gains
`token_exchange(SubjectToken, RequestedScope)` — a valid access token is
downscoped into a NEW independent grant for the same subject (subset only,
else invalid_scope; inactive subject token → invalid_grant). The new token's
lifecycle is independent (revoking either leaves the other active);
exchanges chain. Least-privilege handoff to downstream services. New
tests/exchange.sx (8). 193→201.
- 2026-06-07 — subject-wide session management (ext): `api.sx` gains
`sessions(Subject)` (enumerate) and `logout_all(Subject)` ("log out
everywhere") — revokes + deregisters every session a subject holds,
auditing a logout per session, leaving other subjects untouched. Builds on
registry.sessions_for. New tests/session_mgmt.sx (8). 185→193.
- 2026-06-07 — `delegation.sx` (ext): the identity→acl boundary made concrete.
`check` introspects the token first: inactive → `{error, unauthenticated}`
(401, acl never consulted); active → constructs {Subject, Scope, Action,
Resource} and hands off to acl, which returns permit/deny (`forbidden` =
403). 401 strictly precedes 403 (a revoked token with no scope is still
unauthenticated). acl-on-sx (Datalog) is a different SX guest language —
wired at the integration layer — so the decider here is a labelled stub
(permits when Action ∈ Scope); swap the pid, boundary unchanged. New
tests/delegation.sx (8). 177→185. **Extensions backlog clear.**
- 2026-06-07 — unified facade (ext): `api.sx` coordinator now owns an audit
ledger + a membership registry alongside its token table (started with the
ledger) and session registry. login/logout are audited; new ops
`history`/`enroll`/`member_status`/`member_project` expose the audit +
membership axes through the one `identity` door. identity proves who +
reports membership; acl still decides permission. Existing api behaviour
unchanged (10/10). New tests/facade.sx (9). 168→177.
- 2026-06-07 — `device.sx` (ext, RFC 8628): device authorization grant for
input-constrained devices. authorize → {device_code, user_code}; the human
approve/deny out-of-band by user_code; the device polls by device_code
through the §3.5 status machine (authorization_pending → access_denied /
{ok,Token}). Device code is single-use once a token issues; guarded
transitions (approve-after-deny rejected). Tokens grant-backed. Device-code
expiry + slow_down deferred (no wall clock). New tests/device.sx (10). 158→168.
- 2026-06-07 — client-credentials grant (ext, RFC 6749 §4.4): `oauth.sx` now
owns a client registry (loop/6); `register_client` + `client_credentials`.
A confidential client authenticates and gets a token acting on its own
behalf (subject = the client), no refresh token (§4.4.3). A public client is
`unauthorized_client`; any auth failure (unknown client OR wrong secret) is
`invalid_client` — no client-existence oracle (§5.2). `identity-load-oauth!`
now pulls its deps (token/session/registry/clients). New tests/grants.sx (9).
149→158.
- 2026-06-07 — `clients.sx` (ext): OAuth client registry (RFC 6749 §2). public
vs confidential clients; confidential clients MUST present the right secret
(wrong → invalid_client), public clients are identified but not
authenticated; redirect_uris are allow-listed with exact-match
`valid_redirect` (§3.1.2.2 + Security BCP). Standalone module (no oauth
wiring yet — that's a follow-up). New tests/clients.sx (11). 138→149.
- 2026-06-07 — access-token expiry (ext): logical clock in the token registry
(`advance`/`now`; no wall clock in substrate). Grants carry a Ttl; each
access token carries an Expires (Now-at-issue + Ttl, or infinity); introspect
returns inactive once `Now` reaches it. Refresh mints a fresh short-lived
access token (new Expires) — short access tokens, long refresh tokens. issue/4
+ issue_grant/4 default to infinity, so all prior tests unchanged. New
tests/expiry.sx (8). token loop/6. 130→138.
- 2026-06-07 — scope narrowing (ext): each access token now carries its own
EFFECTIVE scope (<= the grant's max). `refresh/3` requests a narrower scope;
the request must be a subset of the grant scope (RFC 6749 §6) else
`{error, invalid_scope}` and the refresh token is NOT consumed (client may
retry, §5.2). `refresh/2` keeps full scope; scope stays opaque (atom or list)
for issue, so all prior atom-scope tests pass unchanged. token 18→24, 130/130.
Also filed Blocker: PKCE S256 needs SHA256+binary compare, both broken in the
erlang substrate (binary `=:=` always true; crypto:hash ignores binary
content) — deferred, plain method stays.
- 2026-06-07 — `federation.sx`: trust-gated, advisory federated identity.
A peer assertion is accepted only from an explicitly trusted peer
(else `{error, untrusted}`) and is flagged `{peer_asserted, Peer}`, never
promoted to local authority — acl decides what it may do. Cross-instance
subject mapping namespaces remote subjects by peer (`{federated, Peer,
Remote}`) so two peers' "alice" never collide, with optional explicit
aliasing. Added an audit-completeness test (mixed transition stream → no
event dropped). New tests/federation.sx (12). **Phase 4 complete — all four
phases done.** +13 → 124/124.
- 2026-06-07 — `audit.sx`: append-only grant audit ledger (an Erlang
process). `token.sx` gains `start/1(Audit)` and emits issue/refresh/revoke
events (incl. reuse-triggered revoke); `start/0` stays unaudited (no
regression — token.sx has no compile-time dep on the audit module, just
sends to a pid). Ledger queryable per subject — `audit`/`actions`/`count`/
`all`, chronological. In-memory event stream (persist-backing is a future
Erlang↔persist bridge, out of scope per loop allowance). New
tests/audit.sx (10). +10 → 111/111.
- 2026-06-07 — `cache.sx`: delegated grant-verification cache (Redis-cache
pattern) wrapping the token registry. introspect memoised; generation
invalidation keeps revocation real — any revoke/refresh bumps a generation
counter so every cached positive instantly becomes a miss and re-validates
against the live registry. A revoked token never reads valid from cache.
stats() exposes hits/misses. New tests/cache.sx (9). **Phase 3 complete.**
+9 → 101/101.
- 2026-06-07 — `membership.sx`: coop membership as a guarded state machine
(none→pending→active→lapsed⇄active, any→revoked terminal); invalid
transitions are explicit `{error, CurrentStatus}`. `project(Subject, App)`
renders the one canonical state into a per-app claim
({member,Tier,App}/{pending,App}/{lapsed,App}/{denied,App}/{non_member,App})
— identity reports what; acl decides whether. New tests/membership.sx (17).
+17 → 92/92.
- 2026-06-07 — silent SSO (`prompt=none`, OIDC §3.1.2.1): `oauth.sx` now owns
a session registry; `establish` creates a subject session, `silent_authorize`
asks "does this subject have a live session?" → mints a code (skipping
consent) bound to client+redirect+PKCE, else `login_required`. Same machine,
fast-path — one session, many clients; `end_session` closes the path.
New `tests/sso.sx` (10). +10 → 75/75.
- 2026-06-07 — `oauth.sx` refresh wiring + e2e: exchange now issues an
access+refresh pair (RFC 6749 §4.1.4/§5.1) via token.sx issue_grant; added
the refresh grant (§6) delegating to token rotation. End-to-end tests:
code-exchange→refresh→introspect, refresh-reuse rejected, and
revoke-then-refresh blocked by cascade. **Phase 2 complete.** +3 → oauth 17,
65/65.
- 2026-06-07 — `token.sx` grant-centric rewrite: refresh-token rotation
(RFC 6749 §6) + cascading revocation. The grant {Subject,Client,Scope,
Status} is the cascade unit; access + refresh tokens reference it.
`issue_grant` → {ok, Access, Refresh}; `refresh` supersedes the old
refresh + mints a new pair; reusing a superseded refresh token revokes
the whole family (RFC 6819 §5.2.2.3), killing the live descendant.
`revoke` of ANY token (access or refresh) cascades to the grant. All
prior issue/introspect/revoke behaviour preserved. +9 → token 18, 62/62.
- 2026-06-07 — `oauth.sx`: OAuth2 authorization-code flow as a message
protocol (RFC 6749 §4.1) + PKCE (RFC 7636, plain). State machine on one
authz-server process: authorize → {consent_required} → consent →
{code} → exchange → {ok, Token}. Exchange enforces single-use codes
(§10.5; removed on first attempt, replay → invalid_grant), client_id +
redirect_uri binding (§4.1.3), and PKCE verifier match. Issued tokens are
grant-backed so revocation stays real. +14 → 53/53.
- 2026-06-06 — `api.sx`: service facade. `identity:start()` spawns one
coordinator owning the token table + session registry; exposes
login/verify/revoke/logout/session_status. Coordinator is the sessions'
owner, so an expired session deregisters itself (timeout-driven, no
sweep). `verify` answers IDENTITY only ({active, Subject, Client, Scope});
permission is acl's job — explicit delegation boundary. **Phase 1 complete.**
+10 → 39/39.
- 2026-06-06 — `registry.sx`: directory process routing sessions by id and
by (subject, client). Answers the SSO probe `lookup(Subject, Client)` and
the fan-out `sessions_for(Subject)` (one subject, many clients). Routes
only — holds no grant state. Integration-tested end-to-end: register a live
session, route to it, confirm it answers active. +9 → 29/29.
- 2026-06-06 — `token.sx`: opaque grant-backed tokens. Token = `make_ref`
(carries no info); the token table is a process; `introspect` is a live
lookup every time so revocation is real (RFC 7009) — a revoked token reads
`{inactive}` on the next introspection, no validity window. Reply shapes
follow RFC 7662 §2.2 (`{active,...}` / `{inactive}`, never says why). +9 → 20/20.
- 2026-06-06 — `session.sx`: session-as-Erlang-process. create/lookup/touch/
explicit-expire/revoke as messages; idle-timeout self-expiry via
`receive ... after Ttl` notifying the owner then tombstoning. Tombstones
answer lookups with `{error, expired|revoked}` — never a silent dead
mailbox. Established the conformance harness (`conformance.sh`, scoreboard,
`tests/session.sx`). 11/11.
## Blockers
- 2026-06-07 — **RESOLVED.** Both symptoms below were ONE bug —
`er-eval-binary-segment` in `lib/erlang/transpile.sx` emitting a single `0`
byte for a string literal in a binary, so every `<<"...">>` was `{:bytes (0)}`
(equal to each other, and hashing the same). Fixed (string → per-character
bytes); binary `=:=` and `crypto:hash` now correct, and PKCE S256 is
implemented. See Progress log 2026-06-07. (Historical detail below.)
- 2026-06-07 — **PKCE S256 blocked: erlang binary bugs.** Two substrate bugs
in `lib/erlang` make a correct/secure S256 impossible (S256 needs
`BASE64URL(SHA256(verifier))` compared against the stored challenge):
1. **Binary `=:=` always true.** `<<"v1">> =:= <<"v2">>` → `true`;
`<<"abc">> =:= <<"abd">>` → `true`. So a hash comparison can't reject a
wrong verifier.
2. **`crypto:hash` ignores binary-literal content.**
`crypto:hash(sha256, <<"v1">>)` and `crypto:hash(sha256, <<"v2">>)` return
the *identical* 32-byte digest (`6e 34 0b 9c …`), which is also ≠ the
correct SX-level `(crypto-sha256 "abc")` (`ba 78 16 bf …`). The binary
payload isn't reaching the hash. (Atom input → badarg→nil, separate issue.)
Minimal repro (epoch protocol, after loading lib/erlang/runtime.sx):
`(erlang-eval-ast "case <<\"a\">> =:= <<\"b\">> of true -> bug; false -> ok end")`
→ `bug`. Not in scope to fix (lib/erlang is a substrate). PKCE `plain`
remains correct and in use; S256 deferred until the binary path is fixed.