Prism filed a per-recipient-flush-concurrency FR pointing at
serial-per-flush. Investigation surfaced the actual culprit:
`scheduleFlush` was using a 15 s backoff on **both** the success and
failure paths, so envelopes enqueued *during* an in-flight flush
sat ~15 s behind the next drain — visible as "10 s of silence then
25-frame burst" on the receiving side under sustained sender output.
Two fixes:
1. `scheduleFlush` now uses 0 ms delay when `flushOnce` delivered
≥1 envelope and more is queued (network healthy → drain
remainder immediately). 15 s reserved for the actual failure
case where every attempt this round failed. `flushOnce` returns
`{ delivered, remaining } | null` so concurrent-flush early
returns don't double-schedule.
2. `flushOnce` groups the outgoing queue by `recipientAddress` and
drains buckets via `Promise.all`. Per-peer order preserved
(sequential within a bucket); a slow POST to recipient A no
longer head-of-line-blocks frames bound for B.
`Inbox.tick` public shape unchanged. `OutgoingQueueStore`
implementations see the same per-entry list/remove/bumpAttempts/
size contract; only cross-recipient interleaving changes.
Tests cover (1) 25-envelope burst behind a 100 ms slow PUT drains
within 1 s, and (2) carol's PUT lands within 150 ms even when bob's
PUT stalls 200 ms.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
V4.8.3 shipped client-side cross-channel dedup hook
(`Inbox.acceptBridgeFrame`), but recipients that didn't migrate to
the new wiring still observed the same envelope twice — once via
WS bridge push, again ~30 s later via inbox-poll. Prism re-verified
the FR after 4.8.3 and asked for a relay-side enforcement so app
code doesn't have to ack-via-DELETE on every bridge frame.
V4.8.4 adds an in-memory `BridgeDeliveryLog` (default 60 s grace,
8192-per-address cap) that records every successful WS / SSE /
long-poll push of `(address, msgId)`. The `/v1/inbox/:addr/fetch`
route filters out blobs in the log's grace window so a recipient
running both a bridge and the 30 s poll cadence sees exactly one
delivery. Cursor advances over the full fetched window so a poll
that straddles a suppressed blob doesn't stall.
The standalone server auto-wires the log between
`createBridgeRoutes` and `createInboxRoutes`. Custom mounts thread
the same instance through `bridgeDeliveryLog` on both factories.
Tests cover WS-then-poll, SSE-then-poll, and a negative control
(non-bridge-pushed blob still comes through inbox-fetch).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two follow-ups to the V4.8.2 duplicate-fan-out fixes Prism filed.
1. `Inbox.acceptBridgeFrame(blob)` + shared 4096-entry msgId LRU.
The relay durably stores blobs and pushes them to every active
delivery channel; without a cross-channel ack the bridge frame
ran first and the next inbox-poll re-dispatched the same blob
~30 s later, tripping on consumed prekeys. Bridge consumers now
plumb pushed frames through `acceptBridgeFrame`, which shares
the dedup gate + ack path with `pollOnce`. Whichever channel
delivers first wins; the other acks-and-skips. Inbox records
the msgId before the ack so a parallel poll can't observe an
in-flight ack window.
2. `Shade.aliasSession(oldLabel, newLabel)`. First-contact forces
the receiver to label the new session by the relay's sender
fingerprint hint (`fp:<senderfp>`); the post-decrypt plaintext
typically announces the peer's real address. Aliasing moves
session, trusted identity, peer-verification, and identity-
version under the canonical label. Holds the per-peer mutex on
both labels (lexicographic order) so concurrent crypto ops can't
observe a half-moved state. Refuses to overwrite an existing
session at the new label.
Wire change: `IncomingMessage.expiresAt?` now surfaces the relay's
expiry so receivers can pass bridge frames straight to
`acceptBridgeFrame` without inventing a TTL.
Tests cover bridge-then-poll, poll-then-bridge, aliasSession happy
path, refuse-to-overwrite, and same-label no-op.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two interlocking robustness fixes for the duplicate-fan-out / first-contact
class of failures Prism reported.
1. `Shade.receive(from, env)` now queues its `manager.decrypt` step
per `from` so concurrent dispatches can't race the SessionManager
ratchet or the StorageProvider (sqlite "database is locked", IDB
transaction conflicts). User message handlers run *outside* the
queue so streams + file-RPC's nested `shade.receive` calls don't
self-deadlock.
2. Bridge WS + SSE handlers now run a per-connection bounded msgId
LRU as defense-in-depth against any flushTo re-entry (event-storm,
future refactor). Pending-flush chains are wrapped in `.catch(() =>
{})` so a transient `ws.send` rejection no longer poisons the
connection's flush loop.
Tests: storming `inbox.blob_stored` 10× per PUT yields exactly one WS/
SSE frame; 8 concurrent `bob.receive('alice', envelope)` calls keep
the ratchet intact and never surface "database is locked".
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Plumbing fix only — both createPrekeyRoutes and createInboxRoutes
already accepted disableRateLimit; standalone.ts just didn't read
the env. Now SHADE_DISABLE_RATE_LIMIT=1 turns off IP rate-limits on
every prekey + inbox route, with a WARN log on startup so operators
see it.
Single-tenant deployments only — multi-tenant relays must leave it
unset. Documented in docs/DEPLOYMENT.md.
Reported by Prism: ~6 pair attempts/hour from a single dev IP +
the sidecar's register call tripped the 5/hour REGISTER_LIMIT every
dev iteration.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two unblocking changes for first-contact flows.
Sender attribution: relay captures shortHash(senderSigningKey) at
PUT time (after signature verification, no new trust surface) and
surfaces it on bridge push (IncomingMessage.from) + inbox-fetch
(FetchedBlob.from) + DecryptHandler raw arg. Apps receiving a prekey
envelope from a never-before-seen peer can now bootstrap X3DH via
shade.receive('fp:<hex>', env) — pre-4.8 the wire envelope didn't
authenticate the sender and there was no out-of-band hint to use.
Idempotent ALTER TABLE migrations for SQLite + Postgres add a
sender_fp TEXT column; legacy rows surface as from=undefined
(inter-version compat).
Inbox.start() race: pre-4.8 start() called register() fire-and-forget
AND schedulePoll(0) synchronously, so the first poll on a fresh
address often beat the register HTTP RTT and got SHADE_NOT_FOUND.
start() now defers; register() success kicks schedulePoll(0). Manual
tick() is unaffected (deliberate user action, no gating).
Both reported by Prism. Tests cover all five acceptance criteria
from the sender-attribution request (PUT capture, bridge surface,
fetch surface, inter-version compat, end-to-end pair smoke) plus
the three from the race-fix request.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the bridge-connection-lifecycle signal that closes Prism's
~45s revoke window down to one server→client round-trip (~50ms).
Server (`@shade/inbox-server`):
- `inbox.peer_connected` / `inbox.peer_disconnected` events on the
0↔1 boundary across WS + SSE bridges. Long-poll deliberately not
tracked (every poll boundary would flap; push transports are also
the only ones where instant revoke matters).
- `PresenceTracker` collapses two parallel bridges (e.g. WS + SSE
during fallback handover) into one connect/disconnect pair.
- `GET /v1/bridge/presence` SSE endpoint: signed query with
`kind: 'presence'`, `watched: string[]`; on open streams a
per-address snapshot, then change frames filtered server-side.
MAX_WATCHED_ADDRESSES = 64. Subscribing does not itself count as
a peer-bridge connection.
- `createBridgeRoutes` now returns `{ app, websocket, presence }`.
Client (`@shade/transport-bridge`):
- `PresenceBridge.subscribe({ watch, onPresenceChange })` →
`{ addPeer, removePeer, watching, unsubscribe }`. addPeer/removePeer
mutate via reconnect with a fresh signed query.
- `signPresenceQuery` helper for non-PresenceBridge consumers.
Tests cover all four acceptance criteria from the Prism request:
server-event smoke, online→offline subscription, address scoping
(carol invisible to a [alice]-only sub), reconnect, plus an
addPeer/removePeer regression.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Browsers' Window.fetch is a WebIDL bound operation; storing it as
this.fetchImpl / this.fetchFn and calling via the instance receiver
threw "Illegal invocation" on the first request. Bind once at
construction in InboxClient, LongPollBridge, and SseBridge. Reported
by Prism (multi-device E2EE terminal), blocking every browser
consumer of the v4.6 transport stack on inbox.start() / bridge.connect().
WsBridge unaffected (uses WebSocket). Node/Bun fetch tolerates a free
receiver, so the bug never surfaced server-side — added regression
tests that install a strict-receiver globalThis.fetch to catch the
issue without an actual browser harness.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lands the broadcast-channel primitive Prism asked for in
Docs/shade-feature-request-sender-keys.md. The crypto in
@shade/core/sender-keys.ts was already in place; this release wires
it up as a first-class app-facing API, adds the persistence schema
across all six storage backends (memory, sqlite, indexeddb +
encrypted variants), introduces wire type 0x21 in @shade/proto,
and ships Prism's three acceptance tests verbatim.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the foundations Prism's web client (and any future browser-based
Shade app) needs: at-rest-encrypted IndexedDB storage that mirrors the
SQLite backend byte-for-byte at the AAD/nonce level, browser-safe
subpath imports so Vite/webpack/esbuild stop hitting bun:sqlite, and
KeyManager support for argon2id and N-factor composite unlock.
@shade/storage-encrypted
- EncryptedIndexedDBStorage (subpath: /idb) — full StorageProvider
using one object store per _enc table; reuses aeadSeal/aeadOpen +
row-codec sealers so a row sealed under the SQLite or Postgres
backend decrypts under IDB given the same KeyManager.
bumpPeerIdentityVersion is atomic under one IDB transaction.
- KeyManager argon2id source — memory-hard KDF for low-entropy
secrets (PINs). Backed by @noble/hashes/argon2 (already a transitive
dep). DEFAULT_ARGON2ID exported (m=64 MiB, t=3, p=1).
- KeyManager composite source — HKDF-combine N sub-sources into one
master. Every source mandatory; order significant by design;
composite-of-composite rejected; optional info string for app-level
domain separation.
- Subpath exports (/crypto, /sqlite, /postgres, /idb) plus a `browser`
condition on the default import that resolves to a barrel
excluding the Bun- and Postgres-specific entries. Browser bundles
no longer pull bun:sqlite transitively.
Tests
- 73 tests in shade-storage-encrypted (was 31). New coverage:
argon2id determinism + reject paths, composite same-factors → same
master, wrong-PIN/passphrase/order-swap → different master, info
domain separation, all 28 StorageProvider methods on
EncryptedIndexedDBStorage, fingerprint-mismatch rejection, and
cross-impl roundtrip with EncryptedSQLiteStorage proving the AAD/
nonce derivation is implementation-agnostic.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Expose the local device's 32-byte Ed25519 identity public key on Shade
so apps can hand it to their own backend at enrollment time for
signature verification, key pinning or per-device safety-number
computation. Closes the gap that forced consumers to ship placeholder
random bytes their backend could store but never verify against.
- @shade/sdk Shade.identityPublicKey: Promise<Uint8Array> — getter
mirrors the existing fingerprint accessor. Throws pre-init,
reflects the current key after rotate(), retired key preserved in
retired-identities storage per existing grace-period contract.
Private key remains unreachable.
- Test in shade-sdk/tests/sdk.test.ts: round-trip match against the
underlying storage's signingPublicKey, plus value updates after
rotate().
- Lockstep version bump 4.3.0 → 4.4.0 across all 25 packages.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Ship an official IndexedDB-backed StorageProvider so browser-based Shade
consumers persist identity, prekeys, sessions, retired identities,
peer-verification state and stream-resume rows across tab refresh and
browser restart. Closes the gap that forced browser apps onto
storage:"memory" (regenerated identity each load, orphaned device
records server-side).
- New package @shade/storage-indexeddb (4.3.0): full StorageProvider
conformance, schema v1, idb-backed; bumpPeerIdentityVersion is wrapped
in a single readwrite IDB transaction (atomic, vs SQLite's
read-then-upsert race).
- @shade/sdk resolveStorage() accepts { type: 'indexeddb', dbName? } via
dynamic import (lazy, optional dep — same pattern as
@shade/storage-postgres). Named StorageSpec type now reused by
ResolvedConfig.
- Tests: 16 new tests in shade-storage-indexeddb (StorageProvider
surface + peer-verifications + full E2EE conversation surviving a
simulated tab reload). Run on fake-indexeddb.
- Lockstep version bump 4.2.1 → 4.3.0 across all 25 packages.
- Publish scripts updated to include the new package.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>