release(v4.0.0): Shade GA — V3.x consolidation + audit prep
Some checks failed
Test / test (push) Has been cancelled
Cross-platform vectors / TypeScript vectors (bun) (push) Has been cancelled
Cross-platform vectors / Kotlin vectors (gradle) (push) Has been cancelled
Docker build and publish / docker (push) Has been cancelled
Publish / publish (push) Has been cancelled
V3.1 → V3.12 consolidated and tagged for the first GA release. Wire format unchanged from 0.4.x; 4.0 peers interoperate with 0.4.x peers byte-for-byte. The version bump is semantic: audit cycle complete, opt-in surface fully exposed, threat model refreshed for every new surface.

Highlights:
- All 24 @shade/* packages bumped to 4.0.0 in lockstep.
- CHANGELOG 4.0.0 section is the canonical manifest of what landed.
- THREAT-MODEL extended (§10 fingerprint gates, §11 WebRTC P2P, §12 Web-Worker boundary) + residual-risks table refreshed.
- OpenAPI now covers all 27 routes: prekey, transfer, KT, inbox, bridge, observer, /metrics, /healthz, /ready.
- MIGRATION 0.3.x → 4.0 documented + smoke-tested against shade migrate-storage on a real SQLite DB.
- docs/audit/REVIEW-BUNDLE.md + SCOPE.md ready for external reviewer.
- scripts/soak.ts harness for the GA-stable 2-week soak window.
- All V*.md plans archived under docs/archive/ with Status: Done.
- Voice/Video carved out into V5.0; 4.0 audit focuses on the frozen non-realtime stack.

Tests: TS 1000/1000 + Kotlin 11/11 cross-platform vectors green.
Docker: gt.zyon.no/stian/shade-prekey:4.0.0 builds and reports version 4.0.0 on /health.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -73,15 +73,20 @@ Tables will be created automatically with the `shade_server_*` prefix, so they c
| `PORT` | `3900` | HTTP port |
| `SHADE_PREKEY_DB_PATH` | `/data/shade-prekeys.db` | SQLite file location |
| `SHADE_PREKEY_PG_URL` | unset | Postgres URL (overrides SQLite) |
| `SHADE_INBOX_DB_PATH` | unset (memory) | SQLite file for the V3.6 inbox relay |
| `SHADE_INBOX_PG_URL` | falls back to `SHADE_PREKEY_PG_URL` | Postgres URL for the inbox relay |
| `SHADE_INBOX_PRUNE_INTERVAL_MINUTES` | `5` | How often expired inbox blobs are dropped |
| `SHADE_OBSERVER_TOKEN` | unset | Enables dashboard at `/shade-observer/dashboard/`. Min 16 chars. |
| `SHADE_STALE_DAYS` | `30` | Purge identities with no activity in N days |
| `SHADE_CLEANUP_INTERVAL_HOURS` | `24` | Cleanup cycle interval |
| `SHADE_LOG_LEVEL` | `info` | `debug` / `info` / `warn` / `error` |
| `SHADE_OTEL_ENABLED` | unset | Set to `1`/`true` to enable OpenTelemetry tracing on `withTracer()`-configured deployments. See [`observability.md`](./observability.md). |

## Health and observability

- **Health:** `GET /health` returns `{"status":"ok"}` when the storage backend is reachable. Docker's HEALTHCHECK uses this.
- **Metrics:** `GET /metrics` serves Prometheus format with counters, histograms, and gauges for all routes.
- **Tracing:** Optional OpenTelemetry spans via `@shade/observability`. Off by default; set `SHADE_OTEL_ENABLED=1` to activate. PII-safe span attributes are documented in [`observability.md`](./observability.md).
- **OpenAPI:** `GET /openapi.yaml` serves the machine-readable API contract for any language.
- **Redoc viewer:** `GET /docs` serves the human-readable API reference.
- **Dashboard:** `GET /shade-observer/dashboard/` serves the live activity viewer (requires token).

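
An external probe can apply the same rule as the Docker HEALTHCHECK. The sketch below is only an illustration: `isHealthy` is a hypothetical helper name, and it assumes the response body has exactly the `{"status":"ok"}` shape documented above.

```typescript
// Hypothetical helper (not part of @shade/server): decides whether a
// /health response body counts as healthy.
// Assumption: the body is JSON with a top-level "status" field.
function isHealthy(body: string): boolean {
  try {
    const parsed: unknown = JSON.parse(body);
    return (
      typeof parsed === "object" &&
      parsed !== null &&
      (parsed as { status?: unknown }).status === "ok"
    );
  } catch {
    // A non-JSON body (proxy error page, empty reply) is treated as unhealthy.
    return false;
  }
}
```

A monitor would fetch `GET /health` from the container (port `3900` by default) and pass the response text to this function; treating anything unparsable as unhealthy keeps proxy error pages from looking green.
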
179
docs/PRODUCTION-CHECKLIST.md
Normal file
@@ -0,0 +1,179 @@
# Shade Production Checklist

A flat punch-list for taking a Shade prekey server from "it boots" to
"production-ready". Every item below is a hard gate: if you can't tick it,
don't ship.

The deeper "why" behind each item lives in `THREAT-MODEL.md`,
`SECURITY.md`, and `docs/DEPLOYMENT.md`. This file is the operator's
checklist.

> Scope: a single Shade prekey container (`@shade/server`) plus any
> consumer apps that talk to it. For E2EE file transfer hardening
> (max-size, retention, quotas), see the **Hardening** and **Retention**
> sections of `docs/streams.md`.

---

## 1. TLS termination

- [ ] Public traffic is **TLS 1.2+ only**. Shade itself speaks plain HTTP
      and assumes a reverse proxy (Caddy, Traefik, nginx, Dokploy's
      built-in proxy) terminates TLS in front of it.
- [ ] HSTS is on (`Strict-Transport-Security: max-age=15552000`).
- [ ] The proxy passes the original `Host` header through, so that signed
      payloads bound to the canonical address don't fail the
      replay-window check because of a host mismatch.
- [ ] Internal traffic between consumer apps and the prekey container
      runs on a private network (Docker bridge / VPC); the prekey port
      is **not** exposed to the public internet without TLS in front.

> **Why:** identity signatures and observer bearer tokens travel in
> request bodies / headers. Without TLS, a network attacker can read
> the observer token and replay it for the full validity window, and
> can read the metadata (who registers, who fetches whose bundle).
> See `THREAT-MODEL.md § 1` (network attacker).

## 2. Backups

- [ ] **SQLite:** scheduled `sqlite3 /data/shade-prekeys.db ".backup ..."`
      at least daily. The `.db` file plus its `-wal` and `-shm` files
      together are the recovery unit; never copy the bare `.db` while the
      container is running without using the online backup API.
- [ ] **Postgres:** `pg_dump` (or your provider's snapshot) at least
      daily; verify a restore at least once per quarter.
- [ ] Backups are stored on different infrastructure than the primary
      volume (different host / region / provider).
- [ ] Backups are encrypted at rest (your storage provider's
      server-side encryption, age, or restic with a passphrase).
- [ ] **Restore drill:** at least once before going live, restore the
      backup into a fresh volume and confirm `/health` is green and a
      registered identity is still resolvable.

> **Why:** prekey records contain identity public keys and one-time
> prekeys. Losing them means new sessions can't be established to those
> identities until each user re-registers. Existing sessions keep
> ratcheting on the device-side state.

## 3. Observer token rotation

- [ ] `SHADE_OBSERVER_TOKEN` is set to **≥ 16 chars** of high-entropy
      random data (e.g. `openssl rand -hex 32`). The server logs a
      warning and disables the observer if the token is shorter.
- [ ] The token is held in your secret manager (Dokploy secret, GitHub
      Actions secret, Vault, 1Password CLI), **never** committed to a
      compose file or `.env` checked into git.
- [ ] The token is rotated on a schedule (recommended: every 90 days)
      and immediately if it has been shared with anyone who no longer
      needs access.
- [ ] If you expose the dashboard publicly, you also gate it behind
      basic-auth at the proxy layer, because bearer tokens are not
      revocation-friendly on their own.

> **Why:** the observer dashboard exposes metadata about every active
> identity, including registration timestamps and recent activity.
> Anyone with the token can scrape the entire prekey directory.

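
For teams that provision the token from a script rather than `openssl`, a minimal sketch follows. The helper names are hypothetical and the length check only mirrors the 16-character minimum stated above; it cannot measure entropy.

```typescript
import { randomBytes } from "node:crypto";

// Minimum length taken from the checklist item above.
const MIN_OBSERVER_TOKEN_LENGTH = 16;

// Hypothetical helper: 32 random bytes, hex-encoded (64 characters),
// comparable in strength to `openssl rand -hex 32`.
function generateObserverToken(): string {
  return randomBytes(32).toString("hex");
}

// Hypothetical pre-deploy check: length only, not entropy.
function isAcceptableObserverToken(token: string | undefined): boolean {
  return typeof token === "string" && token.length >= MIN_OBSERVER_TOKEN_LENGTH;
}
```

Running the length check in a deploy script surfaces a too-short token before the server silently disables the observer.
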
## 4. SQLite vs PostgreSQL

Pick one and stick to it.

- [ ] **SQLite** is the default. Use it when **one** Shade container is
      enough, you can tolerate downtime during backup snapshots, and
      your write rate is below ~500 req/s. Path: `SHADE_PREKEY_DB_PATH`,
      default `/data/shade-prekeys.db`.
- [ ] **PostgreSQL** is for multi-replica deployments, shared
      infrastructure, or when you already operate a managed Postgres
      and want one fewer thing to back up. Connection URL:
      `SHADE_PREKEY_PG_URL`. Tables are auto-created with the
      `shade_server_*` prefix.
- [ ] Whichever you pick, any network connection to the database uses
      TLS (`sslmode=require` for Postgres) and the data sits on storage
      that is itself encrypted (LUKS, EBS encryption, managed-DB
      encryption).
- [ ] You do **not** mix them in the same deployment. Setting
      `SHADE_PREKEY_PG_URL` overrides SQLite silently, so pick one in
      `compose.yml` and document which.

> **Why:** Shade does **not** encrypt the database itself unless V3.2
> at-rest storage encryption is in use. Without it, disk-level /
> volume-level encryption is the operator's responsibility.

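
The silent override is easy to catch in a pre-deploy check. The following is a sketch under stated assumptions, not the server's own resolution code: `resolveBackend` is a hypothetical name, and only the precedence rule (Postgres URL wins) and the default SQLite path come from this document.

```typescript
type StorageBackend =
  | { kind: "postgres"; url: string }
  | { kind: "sqlite"; path: string };

type Env = Record<string, string | undefined>;

// Hypothetical helper mirroring the documented precedence:
// SHADE_PREKEY_PG_URL, when set, overrides SHADE_PREKEY_DB_PATH.
function resolveBackend(env: Env): { backend: StorageBackend; warnings: string[] } {
  const warnings: string[] = [];
  const pgUrl = env.SHADE_PREKEY_PG_URL;
  const sqlitePath = env.SHADE_PREKEY_DB_PATH;

  if (pgUrl) {
    if (sqlitePath) {
      // Both set: surface the override instead of letting it pass silently.
      warnings.push(
        "SHADE_PREKEY_PG_URL and SHADE_PREKEY_DB_PATH are both set; Postgres wins and the SQLite path is ignored."
      );
    }
    return { backend: { kind: "postgres", url: pgUrl }, warnings };
  }
  // Default path taken from the configuration table.
  return {
    backend: { kind: "sqlite", path: sqlitePath ?? "/data/shade-prekeys.db" },
    warnings,
  };
}
```

Failing the deploy when `warnings` is non-empty turns "document which" into an enforced rule.
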
## 5. Log level and structured logs

- [ ] `SHADE_LOG_LEVEL` is set to `info` (production) or `warn`
      (high-traffic). Avoid `debug` in prod: it logs request bodies,
      including signed payloads.
- [ ] Logs are shipped to a retention-bounded sink (Loki, CloudWatch,
      Datadog) with **redaction of `Authorization` headers and signed
      bodies** if your sink doesn't already strip them.
- [ ] You alert on `error`-level logs and on the absence of cleanup
      cycles (a stuck cleanup loop means unbounded DB growth).

> **Why:** at `debug` level the server logs signature material. While
> Ed25519 signatures are not secrets per se, leaking them widens the
> replay-window blast radius and reveals timing patterns.

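
If the log sink does not redact on its own, a small filter in the shipping path can do it. This is a sketch only: the `LogRecord` shape (a `headers` map and a `body` field) is an assumption for illustration and is not Shade's actual log format.

```typescript
// Assumed record shape; adapt the field names to your real log format.
type LogRecord = {
  headers?: Record<string, string>;
  body?: unknown;
  [key: string]: unknown;
};

// Returns a copy with the Authorization header and the request body masked.
function redactLogRecord(record: LogRecord): LogRecord {
  const redacted: LogRecord = { ...record };
  if (record.headers) {
    const headers: Record<string, string> = {};
    for (const [name, value] of Object.entries(record.headers)) {
      // Header names are case-insensitive, so compare in lower case.
      headers[name] = name.toLowerCase() === "authorization" ? "[REDACTED]" : value;
    }
    redacted.headers = headers;
  }
  if ("body" in record) {
    // Signed payloads live in the body; drop it wholesale.
    redacted.body = "[REDACTED]";
  }
  return redacted;
}
```

The function copies rather than mutates, so the original record stays available to in-process consumers that are allowed to see it.
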
## 6. Stale-identity cleanup parameters

- [ ] `SHADE_STALE_DAYS` is set deliberately for your product. The
      default (30 days) suits an "active chat app"; "occasional
      use" apps should raise it to 90+ to avoid surprise re-registration.
- [ ] `SHADE_CLEANUP_INTERVAL_HOURS` is left at 24 unless you have a
      specific reason to change it. Running cleanup more often does not
      free more space, and running it less often lets stale identities
      linger past the cutoff.
- [ ] You watch the `shade_cleanup_purged_total` metric (Prometheus) and
      alert on a sudden 10× spike, which often signals a bug or a
      deployment that broke client-side activity timestamps.

> **Why:** stale cleanup is the only thing keeping the prekey directory
> from growing forever. A misconfigured `SHADE_STALE_DAYS = 0` would
> purge every identity on every cycle. Bound the value at ≥ 1 in your
> deployment config.

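
The lower bound can be enforced in a deploy-time check. The sketch below assumes a hypothetical `parseStaleDays` helper; only the default of 30 and the "≥ 1" rule come from this checklist, and the server's own parsing may differ.

```typescript
// Hypothetical deploy-time validator for SHADE_STALE_DAYS.
// Falls back to the documented default (30) when the variable is unset.
function parseStaleDays(raw: string | undefined, fallback: number = 30): number {
  if (raw === undefined || raw.trim() === "") {
    return fallback;
  }
  const days = Number(raw);
  // Reject 0, negatives, fractions, and non-numeric input.
  if (!Number.isInteger(days) || days < 1) {
    throw new Error(`SHADE_STALE_DAYS must be an integer >= 1, got "${raw}"`);
  }
  return days;
}
```

Throwing on `0` rather than clamping makes the misconfiguration visible instead of quietly picking a value the operator did not choose.
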
## 7. Secret rotation

- [ ] Identity signing keys: each consumer rotates via the documented
      identity-rotation flow (7-day grace period for old sessions).
      Operators do **not** touch identity keys directly.
- [ ] Observer token: see § 3.
- [ ] Database credentials (Postgres only): rotate per your standard
      cadence, with the connection string supplied through the secret
      manager.
- [ ] No long-lived API keys or service tokens are stored in the
      container image or volume.

## 8. Rate-limit and body-size caps

- [ ] You have not lowered the built-in rate limits below the defaults
      (per-IP register/bundle and per-identity replenish/delete).
- [ ] You have not raised the 64 KiB POST body limit. Prekey bundles
      fit comfortably; raising the limit only enables abuse.
- [ ] Your reverse proxy enforces an additional connection /
      request-rate limit at the edge (Caddy `rate_limit`, Cloudflare,
      etc.) so a single noisy IP can't even reach Shade's per-route
      limits.

## 9. Health checks and metrics scrape

- [ ] Container has a Docker `HEALTHCHECK` (the official image already
      ships one against `/health`).
- [ ] `/metrics` is scraped by Prometheus / OpenTelemetry and
      retained ≥ 30 days.
- [ ] Alerts are wired for: `/health` failing for > 2 min, request
      latency p99 > 1 s, error rate > 1 %, and no cleanup cycle for
      more than 25 h.

## 10. OpenAPI contract drift

- [ ] CI runs the OpenAPI lint (`bun test packages/shade-server/tests/openapi-lint.test.ts`)
      on every PR; the spec must remain valid OpenAPI 3.1 with no
      dangling `$ref`s.
- [ ] Generated clients (Python, Go, Kotlin) are regenerated from the
      shipped spec on each release; mismatches between server and
      client are caught at integration test time, not in production.

---

## Pre-flight summary

If you can answer "yes" to every box above, ship it. If you can't,
write down which box and why before you do. That note belongs in your
runbook so the next operator inherits the gap, not the surprise.

116
docs/ROADMAP.md
Normal file
@@ -0,0 +1,116 @@
# Shade Roadmap: V3.1 → V5.0

Index of the version plans, from the V3.1 foundation through **Shade 4.0 GA**
and on to **Shade 5.0** (Voice & Video).

- **V4.0 GA** ✅: everything from V2.1 / V2.2 / V2.3 and the bonus track
  (social recovery, P2P WebRTC, Pub/Sub, Key Transparency) is merged, tested,
  documented, and packaged for external review. The wire format is locked.
- **V5.0** is the dedicated real-time release. *All* VoIP and video streaming
  lives here, implemented on top of the frozen 4.0 stack.

All V3.x plans now live under [`docs/archive/`](./archive/) with
`Status: Done`. Active plans: [`V5.0.md`](./V5.0.md).

---

## Phases

### Phase 1: Documentation & Hardening Foundation ✅

| Plan | Title | Effort | Status |
|------|-------|--------|--------|
| [V3.1](./archive/V3.1.md) | Documentation & Hardening Foundation | S | **Done** |

### Phase 2: Security maturation ✅

| Plan | Title | Effort | Status |
|------|-------|--------|--------|
| [V3.2](./archive/V3.2.md) | At-Rest Storage Encryption | L | **Done** |
| [V3.3](./archive/V3.3.md) | Fingerprint Gates & Trust UX | M | **Done** |
| [V3.4](./archive/V3.4.md) | Observability v2 (OpenTelemetry) | M | **Done** |
| [V3.5](./archive/V3.5.md) | Android Parity & Cross-Platform CI | XL | **Done** |

### Phase 3: Platform expansion ✅

| Plan | Title | Effort | Status |
|------|-------|--------|--------|
| [V3.6](./archive/V3.6.md) | Async Store-and-Forward (Inbox) | L | **Done** |
| [V3.7](./archive/V3.7.md) | Transport Bridge (SSE / long-poll) | M | **Done** |
| [V3.8](./archive/V3.8.md) | Web Workers Crypto | M-L | **Done** |
| [V3.9](./archive/V3.9.md) | Rich File Metadata & Previews | M | **Done** |

### Phase 4: Trust and P2P transport ✅

| Plan | Title | Effort | Status |
|------|-------|--------|--------|
| [V3.10](./archive/V3.10.md) | Social Key Recovery | L | **Done** |
| [V3.11](./archive/V3.11.md) | WebRTC P2P Transport | XL | **Done** |
| [V3.12](./archive/V3.12.md) | Key Transparency | XXL | **Done** |

### Phase 5: General Availability ✅

| Plan | Title | Effort | Status |
|------|-------|--------|--------|
| [V4.0](./archive/V4.0.md) | External Audit, Consolidation, GA | M | **Done** |

### Phase 6: Real-time (post-GA)

| Plan | Title | Effort | Depends on |
|------|-------|--------|------------|
| [V5.0](./V5.0.md) | Voice & Video | XXL | V4.0 GA + V3.11 |

---

## Effort key

| Symbol | Time |
|--------|------|
| **S** | 1–2 weeks |
| **M** | 2–4 weeks |
| **L** | 4–8 weeks |
| **XL** | 2–4 months |
| **XXL** | 4+ months / multi-quarter |

---

## Dependency graph

```text
V3.1 ────┬──► V3.2 ──┐
         ├──► V3.3 ──┼──► V3.10 ──┐
         ├──► V3.4 ──┘            │
         ├──► V3.5 ───────────────┼──► V3.12 ──┐
         ├──► V3.6 ──► V3.7 ──► V3.11 ─────────┤
         ├──► V3.8                             ├──► V4.0 GA ──► V5.0 (Voice & Video)
         └──► V3.9 ────────────────────────────┘
```

---

## Status convention

Every plan has a `Status:` field at the top. Allowed values:

- `Idea`: not started, design still open.
- `Design`: design note in progress or approved.
- `IMP`: implementation in progress.
- `Done`: merged into main, covered by tests.

When a plan becomes `Done`, move the file to `docs/archive/` and update this table.

---

## Versioning

- **V3.1 → V3.12** shipped as incremental minor releases on the `0.4.x` line.
- Wire-format changes were meant to accumulate toward **V4.0**, but the
  format ended up unchanged from 0.4.x. The major bump to 4.0 marks a
  completed audit cycle and a GA-frozen core, not a wire bump.
- **V4.0** is GA: locked core, packaged for external review, no
  voice/video.
- **V5.0** adds real-time (voice/video/broadcast) on top of the frozen
  4.0 stack. It builds on reserved envelope types so that 4.0 clients
  ignore 5.0 traffic gracefully; it is not breaking.
- Every `V*` merge updates `CHANGELOG.md` and bumps all packages via
  `bun run version`.

@@ -29,9 +29,11 @@ Pull in **one row** that matches your project; add optional columns only when ne
| **Large files**: resumable E2EE upload/download | Above + stream protocol + HTTP (or WS) transport | `@shade/sdk` (re-exports transfer) + mount transfer routes on **your** HTTP server | `shade.upload` / `onIncomingTransfer`; see [streams.md](./streams.md) |
| **React UI**: upload/download widgets | Runtime from SDK + widgets | `@shade/sdk` + `@shade/widgets` | `ShadeRuntimeProvider`, `useShadeUpload` / `useShadeDownload` |
| **Prekey hosting only**: one container per product | No app crypto in the container | Docker image / `@shade/server` | Deploy prekey image; point `prekeyServer` at it from apps |
| **Offline-tolerant messaging**: recipient may be offline | Above + a relay that holds ciphertext blobs | `@shade/inbox` (client) + `@shade/inbox-server` (or the prekey container, which bundles both) | Register address, `inbox.send()` to peer, `inbox.onIncoming(handler)`; see [inbox.md](./inbox.md) |
| **"What if I lose my phone?"**: survive device loss without a recovery agent | Above + Shamir-split shares to `n` guardians; threshold `k` reconstruct | `@shade/recovery` + `@shade/widgets` (`<RecoverySetup />`, `<RecoveryRequest />`, `<RecoveryApprove />`) | `setupRecovery` / `attachGuardian` / `requestRecovery`; see [recovery.md](./recovery.md) |
| **Maximum control**: custom wire, custom transport | Wire + session manager | `@shade/core` + `@shade/proto` (+ your storage + crypto provider) | `ShadeSessionManager`, encode/decode envelopes yourself |
| **HTTP or WebSocket convenience** | Auto-wrap application bytes | `@shade/transport` on top of your stack | Use when you want transport helpers, not a new protocol |
| **Android** | Byte-compatible with TS (roadmap) | `shade-android` module | See [android/shade-android/README.md](../android/shade-android/README.md); parity work in progress |
| **Android** | Byte-compatible with TS (cross-vector gated in CI) | `shade-android` module | See [android/shade-android/README.md](../android/shade-android/README.md). Cross-platform vectors live in [`test-vectors/`](../test-vectors/) and are exercised by both runners. |

You can **mix rows**: e.g. backend with `@shade/sdk` + SQLite for sessions, separate service mounting `transfer` routes, browser clients using `@shade/widgets`.

@@ -43,7 +45,7 @@ You can **mix rows**: e.g. backend with `@shade/sdk` + SQLite for sessions, sepa
2. Pick **storage** (`sqlite:…`, Postgres, or project-specific adapter implementing the core storage interfaces).
3. Choose **surface**: usually `@shade/sdk` unless you truly need `@shade/core` only.
4. For files: enable **transfer routes** and authenticate chunk uploads using the patterns in the SDK (see streams doc).
5. Run **`shade doctor`** when something fails in production-ish setups (install the CLI as in repository [Quick start](../README.md#quick-start)); coverage is evolving, with the roadmap in [V2.2](./V2.2.md).
5. Run **`shade doctor`** when something fails in production-ish setups (install the CLI as in repository [Quick start](../README.md#quick-start)); the gates that fire are documented in [trust-ux.md](./trust-ux.md) and [PRODUCTION-CHECKLIST.md](./PRODUCTION-CHECKLIST.md).

---

@@ -54,7 +56,7 @@ You can **mix rows**: e.g. backend with `@shade/sdk` + SQLite for sessions, sepa
| File transfer architecture | [streams.md](./streams.md) |
| Deployment & operations | [DEPLOYMENT.md](./DEPLOYMENT.md) |
| Threat model | [THREAT-MODEL.md](../THREAT-MODEL.md) |
| Planned improvements | [V2.1](./V2.1.md), feature backlog [V2.2](./V2.2.md), trust/ops [V2.3](./V2.3.md) |
| Planned improvements | [ROADMAP](./ROADMAP.md): V3.x archive under [`archive/`](./archive/), next milestone [V5.0](./V5.0.md) |

---

135
docs/V5.0.md
Normal file
@@ -0,0 +1,135 @@
# Shade V5.0: Voice & Video

**Status:** Idea (post-V4.0 GA)
**Effort:** XXL (4+ months)
**Previous:** V4.0 GA + V3.11 (P2P transport required)
**Addresses:** V2.1 addendum "ShadeVoiceButton / ShadeVideoCall / ShadeBroadcaster"

V5.0 is the dedicated real-time release: all VoIP and video streaming is
gathered here, *after* Shade 4.0 has been tagged GA. The stack underneath
(ratchet, transport, observability, recovery, key transparency,
WebRTC P2P) is locked in 4.0; 5.0 builds on it without touching the
audited core crypto.

---

## Goal

E2EE real-time communication on the Shade stack: voice calls, video calls,
broadcast/streaming, all as "magic drop-in" components for consuming
apps.

```tsx
<ShadeVoiceButton to={peerAddress} />
<ShadeVideoCall to="device:server-admin" />
<ShadeBroadcaster streamKey="game-stream-1" />
<ShadeViewer streamKey="game-stream-1" />
```

---

## Scope

### In

- New package `@shade/voice`: 1:1 voice over WebRTC P2P.
- New package `@shade/video`: 1:1 video, shares a core with voice.
- New package `@shade/broadcast`: 1:N broadcast with a relay helper.
- SFrame-style frame encryption: payload keys are ratcheted per call,
  derived from the Shade session.
- Codec: Opus (audio), AV1/VP9 (video); WebRTC standard.
- Widget components for each use case.
- Key rotation under loss: forward secrecy every X frames or every N
  seconds.

### Out

- Group calls (≥ 3 participants) as a first milestone: these require an
  SFU + group key agreement and are a separate matter.
- Replacement for the native phone app: we offer in-app calls.
- Codec implementation: we use browser/native WebRTC.

---

## Design

### Frame-key derivation

```text
callKey = X3DH(A, B) → HKDF("shade-call-v1") → callRatchetKey
frameKey[i] = HKDF(callRatchetKey, "frame" || u64(i))
```

`callRatchetKey` ratchets forward every N milliseconds or every M frames;
a compromised frame key exposes only that window.

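
The second line of the derivation can be sketched with Node's built-in HKDF. Everything the plan leaves open is an assumption here and labelled as such: SHA-256 as the hash, an empty salt, a big-endian encoding for `u64(i)`, and a 32-byte output. `deriveFrameKey` is a hypothetical name, not a `@shade/*` API.

```typescript
import { hkdfSync } from "node:crypto";

// frameKey[i] = HKDF(callRatchetKey, "frame" || u64(i))
// Assumptions (not specified by the plan): SHA-256, empty salt,
// big-endian u64, 32-byte output key.
function deriveFrameKey(callRatchetKey: Uint8Array, frameIndex: number): Uint8Array {
  if (!Number.isSafeInteger(frameIndex) || frameIndex < 0) {
    throw new Error("frameIndex must be a non-negative safe integer");
  }
  // info = ASCII "frame" (5 bytes) followed by the 8-byte frame index.
  const info = Buffer.alloc(5 + 8);
  info.write("frame", 0, "ascii");
  info.writeUInt32BE(Math.floor(frameIndex / 0x100000000), 5); // high 32 bits
  info.writeUInt32BE(frameIndex >>> 0, 9); // low 32 bits
  const salt = Buffer.alloc(0);
  return new Uint8Array(hkdfSync("sha256", callRatchetKey, salt, info, 32));
}
```

Because the frame index only enters through the HKDF `info` field, sender and receiver derive the same key for a given frame without exchanging anything beyond the index itself.
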
### SFrame

Follows IETF MLS/SFrame patterns:

- Header is cleartext (codec metadata).
- Payload is AES-GCM with a deterministic nonce.
- Receiver drops frames whose sequence number is out of window.

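
The three bullets above can be illustrated with a simplified, SFrame-style sketch. It is not the SFrame wire format and not a Shade API: the nonce layout (4 zero bytes plus a 64-bit counter), the 128-frame window, and the function names are all assumptions made for the example.

```typescript
import { createCipheriv, createDecipheriv } from "node:crypto";

// Deterministic 12-byte nonce: 4 zero bytes + big-endian 64-bit frame counter.
// Assumption for this sketch; a (key, counter) pair must never be reused.
function nonceFor(counter: number): Buffer {
  const nonce = Buffer.alloc(12);
  nonce.writeUInt32BE(Math.floor(counter / 0x100000000), 4);
  nonce.writeUInt32BE(counter >>> 0, 8);
  return nonce;
}

function sealFrame(
  key: Uint8Array,
  counter: number,
  header: Uint8Array,
  payload: Uint8Array
): { ciphertext: Buffer; tag: Buffer } {
  const cipher = createCipheriv("aes-256-gcm", key, nonceFor(counter));
  cipher.setAAD(header); // header stays cleartext but is authenticated
  const ciphertext = Buffer.concat([cipher.update(payload), cipher.final()]);
  return { ciphertext, tag: cipher.getAuthTag() };
}

// Returns the payload, or null when the frame is out of window or fails
// authentication. highestSeen is the largest counter accepted so far.
function openFrame(
  key: Uint8Array,
  counter: number,
  header: Uint8Array,
  ciphertext: Uint8Array,
  tag: Uint8Array,
  highestSeen: number,
  windowSize: number = 128
): Buffer | null {
  if (counter + windowSize <= highestSeen) {
    return null; // too old: outside the acceptance window
  }
  const decipher = createDecipheriv("aes-256-gcm", key, nonceFor(counter));
  decipher.setAAD(header);
  decipher.setAuthTag(tag);
  try {
    return Buffer.concat([decipher.update(ciphertext), decipher.final()]);
  } catch {
    return null; // tampered header, ciphertext, or tag
  }
}
```

Binding the cleartext header as additional authenticated data means a relay can read codec metadata but cannot alter it undetected. The window check here only rejects stale counters; rejecting exact replays inside the window would need a per-counter bitmap on top.
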
### Topology

- 1:1: P2P via V3.11.
- Broadcast: a relay helper in `@shade/broadcast-relay` distributes
  ciphertext to subscribers; the relay never sees plaintext.

---

## Deliverables

### Packages

- `@shade/voice` + `@shade/video` (shared core in `@shade/realtime-core`).
- `@shade/broadcast` + `@shade/broadcast-relay`.
- Widgets: `<ShadeVoiceButton />`, `<ShadeVideoCall />`,
  `<ShadeBroadcaster />`, `<ShadeViewer />`.

### Tests

- Unit: SFrame encrypt/decrypt + tamper.
- Integration: 1:1 video at 30 fps for 60 s; > 99 % of frames delivered;
  key rotation observed.
- Loss recovery: 30 % packet loss → graceful quality degradation.
- Adversarial: a relay DB dump reveals no plaintext.

### Documentation

- `docs/voice-video.md`: setup, codec trade-offs, broadcast architecture.

---

## Acceptance criteria

- [ ] 1:1 video at 60 fps + 1080p between two clients on the same LAN.
- [ ] A frame-key compromise is limited to at most 1 second of forward data.
- [ ] Broadcast to 1:50 viewers works with < 2 s end-to-end latency.

---

## Dependencies

- **V4.0 GA**: the core stack must be externally audited and frozen before
  we layer a real-time protocol on top.
- V3.11: P2P transport (lands in the V4.0 window).
- V3.5: Android parity, if voice/video is to work on mobile.

---

## Risks

- **Codec quirks.** AV1 vs VP9 vs H.264 have differing browser support.
- **Frame-key sync under loss.** Advanced; the SFrame spec is still being
  standardized.
- **Latency vs security.** Each ratchet step adds microseconds.

---

## Migration

New packages. Not breaking: the V4.0 wire formats are kept unchanged;
voice/video adds its own envelope types in a reserved range that
4.0 clients ignore.

100
docs/archive/V3.1.md
Normal file
@@ -0,0 +1,100 @@
# Shade V3.1: Documentation & Hardening Foundation

**Status:** Done
**Effort:** S (1–2 weeks)
**Previous:** V2.3
**Next:** V3.2 / V3.3 / V3.4 (can run in parallel)

---

## Goal

Close out the "low-friction" debt from V2.1, V2.2, and V2.3 before we take on
the heavy security lifts. This is the groundwork that unlocks the rest of the
roadmap: operators must be able to deploy safely, transfer consumers must have
clear limits, and OpenAPI must cover the entire HTTP surface.

No new core code; only docs, OpenAPI extensions, retention defaults, and a
test/threat matrix.

---

## Scope

### In

- README + `@shade/server` README: explicit "keys vs payloads" narrative with
  a diagram + link to `THREAT-MODEL.md`.
- New `docs/PRODUCTION-CHECKLIST.md`: TLS, backups, observer-token rotation,
  SQLite vs PG, log level, stale params, secret rotation.
- Hardening section in `docs/streams.md`: max stream size, TTL, quota
  patterns; points to the `@shade/files` hooks as a reference.
- `openapi.yaml` extended with `/v1/transfer/*` (`chunk`, `state`, `health`) +
  a security scheme for `ShadeTransferAuthenticator`.
- Retention defaults in `docs/streams.md` + SDK template: a
  `pruneStreamStates` cron as the default ("finished streams are cleaned up
  after N days").
- `SECURITY.md` extension: review status, "how to report", linking from
  `THREAT-MODEL.md` rows → `tests/security/*` (test/threat matrix).

### Out

- Actual crypto review (that is V4.0).
- Changes to the crypto or wire format.
- New code outside SDK templates.

---

## Deliverables

### Documentation

- `docs/PRODUCTION-CHECKLIST.md`: new.
- `docs/streams.md`: extended with "Hardening" and "Retention".
- `README.md`: diagram adjustment + "What does not go through the Shade server".
- `packages/shade-server/README.md`: mirror the narrative.
- `SECURITY.md`: review status + threat/test matrix.
- `THREAT-MODEL.md`: cross-links to concrete tests.

### Code (config + templates only)

- `packages/shade-server/openapi.yaml`: `/v1/transfer/*` paths,
  `ShadeTransferAuthenticator` securityScheme.
- `packages/shade-cli/templates/bun-server`: default
  `pruneStreamStates` cron.

### Tests

- Lint test: the OpenAPI spec still validates against the OpenAPI 3.1 schema.
- Smoke test for the cron in the template.

---

## Acceptance criteria

- [ ] A new developer can read the README + `PRODUCTION-CHECKLIST.md` and
      deploy a production-ready Shade without reading the whole codebase.
- [ ] A generated client (Python or Go) from `openapi.yaml` covers both the
      prekey and transfer surfaces without manual fixes for the happy path.
- [ ] `THREAT-MODEL.md` links every "Mitigations" row to at least one test file.
- [ ] The default SDK template `bun-server` prunes resumable streams without
      manual config.

---

## Dependencies

None.

---

## Risks

Low. The worst outcome is stale docs if V3.2+ changes surfaces. Mitigate by
writing small, updatable sections rather than long narrative chapters.

---

## Migration

None; everything is additive.

134
docs/archive/V3.10.md
Normal file
@@ -0,0 +1,134 @@
# Shade V3.10: Social Key Recovery

**Status:** Done; landed in `@shade/recovery` 0.4.0, frozen in 4.0 GA.
**Effort:** L (4–8 weeks)
**Previous:** V3.2 + V3.3
**Addresses:** V2.1 addendum "social key recovery"

---

## Goal

Solve the biggest UX gap in all E2EE systems: **"What happens if I lose my
phone?"** The user chooses N "guardians" (family / friends / work
partners); when the user loses the device, a threshold subset of the
guardians can together return the identity key, without any single guardian
being able to do it alone, and without the server learning anything.

---

## Scope

### In

- Shamir Secret Sharing (k-of-n) over the identity private key (or a
  backup encryption key).
- Distribution of shares over existing 1:1 Shade sessions; guardians
  store their share locally.
- Recovery flow: the new device asks a threshold of guardians to send their
  shares and reconstructs the secret locally.
- Verification step: the new device proves its identity to each guardian via
  OOB safety-number comparison **before** the guardian releases a share.
- UX guide: how many guardians, which threshold, how to rotate when a
  guardian loses a device.

### Out

- "Cloud guardian" / Shade-operated recovery: we allow no centralized
  component that can do it alone.
- Auto-distribution (we require an explicit choice of guardians).

---

## Design
|
||||
|
||||
### Hva deles
|
||||
|
||||
```text
|
||||
shareSecret = AES-256-GCM-encrypt(identityState, recoveryKey)
|
||||
recoveryKey is Shamir-split(k, n) → shares[i]
|
||||
shareSecret stored locally + on each guardian
|
||||
each guardian receives one share via Shade.send
|
||||
```
|
||||
|
||||
`identityState` er det samme som `Shade.exportBackup` (eksisterer i 0.3.x),
|
||||
men her gjenbrukes formatet.
|
||||
|
||||
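The k-of-n split of `recoveryKey` can be sketched as follows. This is a minimal illustration, assuming byte-wise Shamir sharing over GF(2^8) with the AES reduction polynomial; the names `split`, `combine`, and `Share` are hypothetical and are not taken from `@shade/recovery`, whose actual field and API may differ.

```typescript
import { randomBytes } from "node:crypto";

// Multiplication in GF(2^8), reduced by x^8 + x^4 + x^3 + x + 1 (assumed field).
function gfMul(a: number, b: number): number {
  let product = 0;
  for (let i = 0; i < 8; i++) {
    if (b & 1) product ^= a;
    const carry = a & 0x80;
    a = (a << 1) & 0xff;
    if (carry) a ^= 0x1b;
    b >>= 1;
  }
  return product;
}

// Multiplicative inverse via a^254 (a^255 = 1 for every non-zero a).
function gfInv(a: number): number {
  if (a === 0) throw new Error("zero has no inverse");
  let result = 1;
  for (let i = 0; i < 254; i++) result = gfMul(result, a);
  return result;
}

interface Share { x: number; y: Uint8Array } // hypothetical share shape

function split(secret: Uint8Array, k: number, n: number): Share[] {
  if (k < 2 || k > n || n > 255) throw new Error("need 2 <= k <= n <= 255");
  const shares: Share[] = [];
  for (let x = 1; x <= n; x++) shares.push({ x, y: new Uint8Array(secret.length) });
  for (let byte = 0; byte < secret.length; byte++) {
    // Random polynomial of degree k-1 whose constant term is the secret byte.
    const coeffs = [secret[byte], ...Array.from(randomBytes(k - 1))];
    for (const share of shares) {
      let acc = 0; // Horner evaluation at x = share.x
      for (let c = coeffs.length - 1; c >= 0; c--) acc = gfMul(acc, share.x) ^ coeffs[c];
      share.y[byte] = acc;
    }
  }
  return shares;
}

// Lagrange interpolation at x = 0. With fewer than k shares this returns
// unrelated bytes rather than an error, so callers must count shares themselves.
function combine(shares: Share[]): Uint8Array {
  const secret = new Uint8Array(shares[0].y.length);
  for (let byte = 0; byte < secret.length; byte++) {
    let acc = 0;
    for (const si of shares) {
      let basis = 1;
      for (const sj of shares) {
        if (sj.x !== si.x) basis = gfMul(basis, gfMul(sj.x, gfInv(sj.x ^ si.x)));
      }
      acc ^= gfMul(si.y[byte], basis);
    }
    secret[byte] = acc;
  }
  return secret;
}
```

In this sketch `split(recoveryKey, 3, 5)` would produce the five shares for a 3-of-5 setup, and `combine` on any three of them yields `recoveryKey` again, which then decrypts `shareSecret`.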
### Recovery flow

1. The new device generates a **temporary** identity + safety number.
2. The new device contacts the guardians via the prekey server (OOB verification first).
3. Each guardian approves manually and returns their share via `Shade.send`.
4. The new device reconstructs `recoveryKey`, decrypts `shareSecret`, and restores the identity.
5. The original identity rotates (the old identity is marked as "compromised: used for recovery").

### Guardian UX

- The guardian app/widget shows:
  *"Alice (your friend) has lost her device and is requesting a recovery share. Confirm the fingerprint before you send."*
- A guardian can **decline** without consequence.

---

## Deliverables

### Packages

- `@shade/recovery`: Shamir + share distribution.
- `@shade/widgets`: `<RecoverySetup />` (pick guardians) + `<RecoveryRequest />` (new device asks) + `<RecoveryApprove />` (guardian approves).

### Tests

- Unit: Shamir split/combine round trip; threshold enforcement.
- Integration: full 3-of-5 recovery with 5 mock guardians.
- Adversarial: 2 guardians collude (below threshold) → cannot reconstruct.
- Adversarial: malicious new device without safety-number confirmation → no guardian should release a share.

### Documentation

- `docs/recovery.md`: full UX + threat model.
- Threat-model extension: collusion ≤ k-1, identity forgery, social engineering.

---

## Acceptance criteria

- [ ] 3-of-5 recovery works end-to-end on 2 separate devices.
- [ ] No coalition of (k-1) guardians can reconstruct `recoveryKey`, and therefore none can decrypt `shareSecret` (verified with a fast-check property test).
- [ ] The guardian-side widget requires fingerprint confirmation before sending (the V3.3 gate, reinforced).

---

## Dependencies

- V3.2: key material at rest on the guardian must be encrypted.
- V3.3: fingerprint gate on the recovery handshake.

---

## Risks

- **UX is the hardest part.** "Who is my guardian?" is socially complex; users may choose badly.
- **Social engineering.** An attacker impersonates the victim over the phone → a guardian hands over a share. Mitigate with hard fingerprint gates + a cool-down.
- **Dead guardians.** If a guardian dies or loses their device without being replaced, the margin of available shares above the threshold shrinks. A periodic "guardian health check" prompt is recommended.

---

## Migration

New package. Apps can add the recovery widget to their settings screen.

124
docs/archive/V3.11.md
Normal file
@@ -0,0 +1,124 @@

# Shade V3.11: WebRTC P2P Transport

**Status:** Done. Landed with `@shade/transport-webrtc` 0.4.0, `MultiTransportFallback` in `@shade/transfer`, and `shade.configureWebRTC()` in `@shade/sdk`. See [docs/webrtc.md](../webrtc.md).
**Effort:** XL (2–4 months)
**Previous:** V3.7
**Addresses:** V2.1 addendum "P2P WebRTC transport"

---

## Goal

A direct peer-to-peer data channel between Shade clients when NAT/firewall allows it. Primary gain: massive throughput for `@shade/transfer` (files, large payloads) and low latency for messaging when both peers are online at the same time. E2EE is preserved: WebRTC's own DTLS encryption (data channels run SCTP over DTLS) is only **transport**; the payload is still Shade ratchet crypto.

V3.11 lands in the V4.0 window and is foundation-only. The real-time use cases (voice, video, broadcast) live in [V5.0](../V5.0.md) as downstream consumers of this data channel.

---

## Scope

### In

- New package `@shade/transport-webrtc`.
- Signaling via the Shade control plane (existing channel: `Shade.send`).
- ICE/STUN: use public STUN servers by default.
- TURN: configurable TURN relay as fallback.
- DataChannel for `@shade/transfer` chunks.
- Auto-fallback: P2P → HTTP (existing stack).

### Out

- SFU/MCU (many-to-many topology): broadcast/video is V5.0.
- Voice/video media tracks: V3.11 is a pure data channel (DataChannel); audio/video over RTP is V5.0.
- Binding the DTLS fingerprint to the Shade fingerprint (considered as hardening, but not a requirement).

---

## Design

### Connection flow

```text
A initiates:
  1. createOffer() → SDP
  2. shade.send(B, { kind: "webrtc-offer", sdp })
  3. B receives over the Shade channel, createAnswer()
  4. shade.send(A, { kind: "webrtc-answer", sdp })
  5. ICE candidate exchange (same channel)
  6. DataChannel open
```

### Wrapping

The DataChannel carries finished `@shade/transfer` chunks (already E2EE). WebRTC's own DTLS encryption acts as a transport-secrecy layer.

### Topology

- 1:1 P2P direct when possible.
- TURN relay when NATs are too strict (transport-only, never sees plaintext).

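The STUN-by-default, TURN-as-fallback topology above could be expressed as an ICE configuration along these lines. This is only a sketch: `buildIceConfig` and the host names are hypothetical placeholders, and the local `IceServer` / `IceConfig` types merely mirror the shape of the browser's `RTCIceServer` / `RTCConfiguration` so the snippet compiles without DOM typings. Setting `iceTransportPolicy: "relay"` is the standard WebRTC way to force relay-only candidates, which is what a relay-mode failover test would rely on.

```typescript
// Structural stand-ins for RTCIceServer / RTCConfiguration.
interface IceServer { urls: string | string[]; username?: string; credential?: string }
interface IceConfig { iceServers: IceServer[]; iceTransportPolicy: "all" | "relay" }

// Hypothetical helper: STUN is always present, TURN is appended when configured.
function buildIceConfig(turn?: IceServer, forceRelay = false): IceConfig {
  if (forceRelay && !turn) throw new Error("relay-only mode needs a TURN server");
  const iceServers: IceServer[] = [{ urls: "stun:stun.example.org:3478" }]; // placeholder host
  if (turn) iceServers.push(turn);
  return { iceServers, iceTransportPolicy: forceRelay ? "relay" : "all" };
}

// Direct P2P preferred, relay used only if ICE cannot find a direct path.
const defaultConfig = buildIceConfig({
  urls: "turn:turn.example.org:3478", // placeholder host
  username: "demo",
  credential: "demo-secret",
});
```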
---

## Deliverables

### Packages

- `@shade/transport-webrtc`: Connection, DataChannel wrapper, ICE config.
- `@shade/transfer` is extended: `WebRTCTransferTransport` as a drop-in.
- `FallbackTransferTransport` gains a new stage: P2P → WS → HTTP.

### Tests

- Loopback unit: offer/answer/ICE in Bun via `node-datachannel` or `wrtc`.
- Integration: 100 MB transfer over P2P vs HTTP; P2P should win on the same network.
- Failover: TURN relay forces relay mode.
- NAT emulation (loopback with different NAT types if possible).

### Documentation

- `docs/webrtc.md`: setup, STUN/TURN config, NAT-traversal hopes and realities.

---

## Acceptance criteria

- [ ] Two clients on the same LAN: P2P direct without TURN, throughput > 5x the HTTP baseline.
- [ ] Two clients behind strict NATs: the TURN relay is activated automatically.
- [ ] Failover from a dead P2P link → HTTP within 5 s without message loss.

---

## Dependencies

- V3.7: bridge patterns + fallback architecture.

---

## Risks

- **NAT-traversal hell.** Many edge cases. Mitigate with early integration tests on real NAT configurations.
- **Browser compatibility.** Safari has its own RTC quirks.
- **TURN costs.** TURN relay = real traffic through the server. The operator needs to know that.

---

## Migration

Opt-in. The existing HTTP/WS transport works unchanged.

557
docs/archive/V3.12-DESIGN.md
Normal file
@@ -0,0 +1,557 @@

# V3.12 Key Transparency: Design Note

**Status:** Approved (in-tree review; marked `Design` in ROADMAP)
**Author:** The Shade team
**Reviewer goal:** an external crypto-oriented reviewer before production deploy.
**Implementation target:** `@shade/key-transparency` + extensions in `@shade/server`, `@shade/transport`, `@shade/sdk`.

---

## 1. Goals and non-goals

### Goals

Replace "blind trust in the prekey server" with a **verifiable append-only log**. When a client receives a prekey bundle, it should have cryptographic proof that:

1. The bundle is **committed** in a timestamped log (Signed Tree Head).
2. The exact (address, identityKey, signedPreKey) mapping is in that log, _or_ it is not (absence proof).
3. The log has not rewritten history since the previous fetch (consistency proof).
4. Other clients see the **same** log (split-view detection via witness gossip).

This is **CT-style transparency** (RFC 6962 principles) adapted to prekey distribution.

### Non-goals (explicitly out)

- **Federated log across multiple prekey servers.** Each Shade deployment has one log (or none). Multi-server gossip is V3.13+.
- **Fully solving MITM on first contact.** KT catches split-view and rewrites, but not an attacker publishing a forged identity at first registration. That is covered by V3.3 (fingerprint gate) + V3.10 (social recovery).
- **Legal/compliance audit log.** The log is cryptographic, not legal.
- **Client-controlled deletion.** The log is append-only: DELETE writes a tombstone entry and does not remove history.

### Decision criterion for implementation

Once this note is approved _and_ every open question under §11 has a concrete answer (not just "we'll figure it out later"), code can be written. Whatever the note says when §11 is closed is what we build.

---

## 2. Threat-model addendum

The existing THREAT-MODEL.md covers the prekey server as "honest-but-curious" with TOFU in place. KT extends the model to a **fully malicious server**:

| Attack | Pre-V3.12 | Post-V3.12 |
|---|---|---|
| Server returns the wrong bundle to one client | Undetected until OOB verification | Client can request a proof; the mismatch is detected |
| Server replaces an already registered identityKey | TOFU fingerprint changes → V3.3 gate kicks in (but user-initiated) | The log shows two entries for the same address → witness detects it |
| Server gives `alice` different identityKeys to Bob and Charlie (split-view) | Undetected until OOB | Witness gossip reveals two different STHs |
| Server rewrites history to hide earlier betrayal | Possible | Consistency proof fails → client alerts |
| Server refuses to publish a new STH | Possible | The "stale STH" is detected by the freshness proof (max age) |
| The STH signing key is compromised | KT safety broken | Witnesses gossip about the old STH chain; rotation requires a new genesis |

KT does **not** solve:

- First-time impersonation of a brand-new address (no historical evidence exists).
- Collusion between the server and _all_ witnesses.
- A client that forgets its cached STH and has to re-bootstrap.

---

## 3. Data-structure choice

We choose an **RFC 6962-style append-only Merkle log** + an **external address index** with commitment proofs. Rationale:

### Alternatives considered

1. **Pure CT log (RFC 6962):** Simple append-only Merkle tree. Inclusion proofs are trivial. Absence proofs are _not_ supported natively (you have to scan the whole log).
2. **CONIKS tree (sparse Merkle tree over addresses):** Native absence proofs, but much more complex (epoch-based snapshots, prefix trees, placeholder nodes). Overkill for a first iteration.
3. **Hybrid (RFC 6962 log + side index):** The log is the source of truth; the index is a _commitment_ mapping `address → leaf_index`. The server proves inclusion via the leaf path, and absence via "this address is not in the index at tree_size T" + a signed STH.

**Choice: alternative 3.** It gives CT-style simplicity, plus absence proofs almost for free (the commitment to the index is part of every STH).

### Concrete format

#### Leaf

Each leaf represents one registration or revocation:

```
leaf = SHA256(
  0x00 ||                                      // leaf prefix (RFC 6962)
  uint64_be(timestamp_ms) ||
  byte(operation) ||                           // 0x01 register, 0x02 replenish, 0x03 delete
  uint16_be(len(address)) || address_bytes ||
  uint16_be(len(bundle_hash)) || bundle_hash   // 32 bytes SHA-256 over canonical bundle
)
```

`bundle_hash` is a deterministic hash of:

```
canonical_bundle = SHA256(
  0x01 ||                                      // bundle prefix
  identitySigningKey (32) ||
  identityDHKey (32) ||
  uint32_be(signedPreKey.keyId) ||
  signedPreKey.publicKey (32) ||
  signedPreKey.signature (64)
)
```

One-time prekeys are **not** part of the bundle hash: they are ephemeral and would leak OTP rotation patterns.

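The two byte layouts above translate fairly directly into code. The following is a minimal sketch using Node's built-in `crypto` module; the function names `bundleHash` and `leafHash`, the `Bundle` shape, and the choice of UTF-8 for `address_bytes` are assumptions made for illustration, not the `@shade/key-transparency` API.

```typescript
import { createHash } from "node:crypto";

interface SignedPreKey { keyId: number; publicKey: Uint8Array; signature: Uint8Array }
interface Bundle {
  identitySigningKey: Uint8Array; // 32 bytes
  identityDHKey: Uint8Array;      // 32 bytes
  signedPreKey: SignedPreKey;     // publicKey 32 bytes, signature 64 bytes
}

// SHA-256 over the concatenation of all parts.
function sha256(...parts: Uint8Array[]): Buffer {
  const hash = createHash("sha256");
  for (const part of parts) hash.update(part);
  return hash.digest();
}

// Big-endian integer encoders used by both layouts.
function u16(n: number): Buffer { const b = Buffer.alloc(2); b.writeUInt16BE(n); return b; }
function u32(n: number): Buffer { const b = Buffer.alloc(4); b.writeUInt32BE(n); return b; }
function u64(n: bigint): Buffer { const b = Buffer.alloc(8); b.writeBigUInt64BE(n); return b; }

// canonical_bundle: prefix 0x01, then the identity keys and the signed prekey.
function bundleHash(bundle: Bundle): Buffer {
  return sha256(
    Buffer.from([0x01]),
    bundle.identitySigningKey,
    bundle.identityDHKey,
    u32(bundle.signedPreKey.keyId),
    bundle.signedPreKey.publicKey,
    bundle.signedPreKey.signature,
  );
}

const OPERATION = { register: 0x01, replenish: 0x02, delete: 0x03 } as const;

// leaf: prefix 0x00, timestamp, operation byte, length-prefixed address and bundle hash.
function leafHash(
  timestampMs: bigint,
  operation: keyof typeof OPERATION,
  address: string,
  bundleDigest: Uint8Array,
): Buffer {
  const addressBytes = Buffer.from(address, "utf8"); // assumption: addresses are UTF-8
  return sha256(
    Buffer.from([0x00]),
    u64(timestampMs),
    Buffer.from([OPERATION[operation]]),
    u16(addressBytes.length),
    addressBytes,
    u16(bundleDigest.length),
    bundleDigest,
  );
}
```

Because every field is fixed-width or length-prefixed, two different (address, bundle) pairs cannot serialize to the same byte string, which is what makes cross-platform vector tests on these hashes meaningful.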
#### Tree

Merkle tree over the leaf array, RFC 6962 §2.1:

- `MTH(empty) = SHA256()`
- `MTH({d}) = SHA256(0x00 || d)` (already hashed leaf)
- `MTH(D[n]) = SHA256(0x01 || MTH(D[0:k]) || MTH(D[k:n]))` where `k` is the largest power of two < n.

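The three rules above can be written as one short recursive function. This sketch follows the RFC 6962 §2.1 definition directly; `merkleTreeHash` and `splitPoint` are illustrative names, and the recursive form is chosen for clarity rather than performance (a production log would cache subtree hashes).

```typescript
import { createHash } from "node:crypto";

// SHA-256 over the concatenation of all parts (no parts = hash of the empty string).
function sha256(...parts: Uint8Array[]): Buffer {
  const hash = createHash("sha256");
  for (const part of parts) hash.update(part);
  return hash.digest();
}

// Largest power of two strictly smaller than n, for n >= 2.
function splitPoint(n: number): number {
  let k = 1;
  while (k * 2 < n) k *= 2;
  return k;
}

// MTH as defined in RFC 6962 §2.1.
function merkleTreeHash(entries: Uint8Array[]): Buffer {
  if (entries.length === 0) return sha256();
  if (entries.length === 1) return sha256(Buffer.from([0x00]), entries[0]);
  const k = splitPoint(entries.length);
  return sha256(
    Buffer.from([0x01]),
    merkleTreeHash(entries.slice(0, k)),
    merkleTreeHash(entries.slice(k)),
  );
}
```

The `0x00` / `0x01` prefixes give leaves and interior nodes separate hash domains, so an interior node can never be passed off as a leaf.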
#### Signed Tree Head (STH)

```
sth = {
  tree_size:  uint64,
  timestamp:  uint64_ms,
  root_hash:  bytes(32),
  index_root: bytes(32),   // commitment to the address index at this tree_size
  log_id:     bytes(32),   // SHA-256 of the server public key (stable ID)
  signature:  bytes(64)    // Ed25519 over canonical(rest)
}
```

`canonical(sth)` for signing:

```
0x02 ||                    // sth prefix
uint64_be(tree_size) ||
uint64_be(timestamp) ||
root_hash (32) ||
index_root (32) ||
log_id (32)
```

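A sketch of the canonical encoding and the Ed25519 signature over it, using Node's `crypto.sign` / `crypto.verify` with a `null` algorithm (the documented way to use Ed25519 keys there). The names `canonicalSth`, `signSth`, `verifySth`, and `logIdOf` are hypothetical. Deriving `log_id` from the raw 32-byte public key (the tail of the SPKI DER encoding) is an assumption, since the note does not say which key encoding is hashed.

```typescript
import { createHash, generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

interface Sth {
  treeSize: bigint;
  timestamp: bigint;      // milliseconds
  rootHash: Uint8Array;   // 32 bytes
  indexRoot: Uint8Array;  // 32 bytes
  logId: Uint8Array;      // 32 bytes
  signature?: Uint8Array; // 64 bytes, Ed25519
}

// canonical(sth): 0x02 || tree_size || timestamp || root_hash || index_root || log_id
function canonicalSth(sth: Sth): Buffer {
  const head = Buffer.alloc(17);
  head.writeUInt8(0x02, 0);
  head.writeBigUInt64BE(sth.treeSize, 1);
  head.writeBigUInt64BE(sth.timestamp, 9);
  return Buffer.concat([head, sth.rootHash, sth.indexRoot, sth.logId]);
}

// Assumption: log_id hashes the raw 32-byte Ed25519 key, i.e. the last 32 bytes of SPKI DER.
function logIdOf(publicKey: KeyObject): Buffer {
  const spki = publicKey.export({ format: "der", type: "spki" });
  return createHash("sha256").update(spki.subarray(-32)).digest();
}

function signSth(sth: Sth, privateKey: KeyObject): Sth {
  return { ...sth, signature: sign(null, canonicalSth(sth), privateKey) };
}

function verifySth(sth: Sth, publicKey: KeyObject): boolean {
  if (!sth.signature || sth.signature.length !== 64) return false;
  return verify(null, canonicalSth(sth), publicKey, sth.signature);
}

// Demo key pair; a real deployment would load the operator's STH signing key instead.
const demoKeys = generateKeyPairSync("ed25519");
const demoSth = signSth(
  {
    treeSize: 5n,
    timestamp: 1_700_000_000_000n,
    rootHash: new Uint8Array(32),
    indexRoot: new Uint8Array(32),
    logId: logIdOf(demoKeys.publicKey),
  },
  demoKeys.privateKey,
);
```

Since the signature covers `index_root` and `log_id` as well as the root hash, changing any field of the STH invalidates it.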
#### Inclusion proof

Standard RFC 6962 audit path: a list of sibling hashes from leaf to root, so that the client can recompute the root and compare it with the STH.

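Generating and checking an audit path can be sketched as below. Path generation follows the recursive `PATH` definition in RFC 6962 §2.1.1; verification uses the iterative procedure from RFC 9162 §2.1.3.2, which recomputes the root from the leaf index and tree size. The function names are illustrative, entries are treated as the input `d` of `MTH({d})`, and plain `number` indices are used, so the sketch is only suitable for trees smaller than 2^31 leaves.

```typescript
import { createHash } from "node:crypto";

function sha256(...parts: Uint8Array[]): Buffer {
  const hash = createHash("sha256");
  for (const part of parts) hash.update(part);
  return hash.digest();
}
const leafNode = (d: Uint8Array): Buffer => sha256(Buffer.from([0x00]), d);
const innerNode = (l: Uint8Array, r: Uint8Array): Buffer => sha256(Buffer.from([0x01]), l, r);

// Largest power of two strictly smaller than n, for n >= 2.
function splitPoint(n: number): number {
  let k = 1;
  while (k * 2 < n) k *= 2;
  return k;
}

function treeRoot(entries: Uint8Array[]): Buffer {
  if (entries.length === 0) return sha256();
  if (entries.length === 1) return leafNode(entries[0]);
  const k = splitPoint(entries.length);
  return innerNode(treeRoot(entries.slice(0, k)), treeRoot(entries.slice(k)));
}

// PATH(m, D[n]): sibling hashes from the leaf at index m up to the root.
function auditPath(m: number, entries: Uint8Array[]): Buffer[] {
  if (entries.length <= 1) return [];
  const k = splitPoint(entries.length);
  return m < k
    ? [...auditPath(m, entries.slice(0, k)), treeRoot(entries.slice(k))]
    : [...auditPath(m - k, entries.slice(k)), treeRoot(entries.slice(0, k))];
}

// Recompute the root from the entry and its audit path, then compare.
function verifyInclusion(
  entry: Uint8Array,
  leafIndex: number,
  treeSize: number,
  path: Uint8Array[],
  expectedRoot: Uint8Array,
): boolean {
  if (leafIndex < 0 || leafIndex >= treeSize) return false;
  let fn = leafIndex;
  let sn = treeSize - 1;
  let r = leafNode(entry);
  for (const p of path) {
    if (sn === 0) return false; // path is longer than the tree is deep
    if ((fn & 1) === 1 || fn === sn) {
      r = innerNode(p, r); // sibling is on the left
      while ((fn & 1) === 0 && fn !== 0) { fn >>= 1; sn >>= 1; }
    } else {
      r = innerNode(r, p); // sibling is on the right
    }
    fn >>= 1;
    sn >>= 1;
  }
  return sn === 0 && r.equals(expectedRoot);
}
```

A client would call `verifyInclusion` with the leaf it recomputed from the bundle, `proof.leaf_index`, `sth.tree_size`, `proof.audit_path`, and `sth.root_hash`.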
#### Consistency proof

Standard RFC 6962 §2.1.2: a proof that the tree with `tree_size = N1` is a prefix of the tree with `tree_size = N2 > N1`. The client uses this to detect rewrites.

#### Absence proof

The address index is a sorted list of `(address, leaf_index_of_latest)` entries, serialized and hashed. `index_root` in the STH is the commitment.

To prove the absence of address `addr` at tree_size `N`:

- The server returns the whole index at tree_size `N` (sorted), or
- (efficiently:) it returns the neighbor pair `(addr_prev, addr_next)` where `addr_prev < addr < addr_next` lexicographically, together with a Merkle path in a sparse Merkle tree over the index.

First iteration: we serialize the whole index and let the client download it (compact: <100 KB even for 100k addresses). Later we optimize to a sparse Merkle tree if the dataset grows.

---

## 4. Freshness proofs (Signed Tree Heads)

### Frequency

- **Min:** A new STH on every mutation (register/replenish/delete), synchronously in the write path.
- **Max stale:** Even without mutations, an STH must be published at least every **10 minutes** ("heartbeat STH": same tree_size, updated timestamp). This lets clients detect a "dead" log without having to worry about whether the log has actually changed.

### Client acceptance window

The client rejects STHs older than `now - 24 hours` (default, configurable). This protects against replay of old STHs that "hide" a mutation made afterwards.

### Stale STH as soft-fail

If an STH is stale but validly signed, the client logs a warning and returns the bundle with `proof.staleness = 'warn'` (V1), or blocks (V2, after dogfooding). We start with _warn_ and escalate to _block_ once a witness ecosystem is established.

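The acceptance window and the warn-versus-block policy can be captured in one small pure function. This is a sketch only: `checkFreshness`, `StalePolicy`, and the result shape are hypothetical names, and the 24-hour default mirrors the `maxStaleMs` value used elsewhere in this note. Clock-skew tolerance for timestamps in the future is deliberately left out because the note does not specify it.

```typescript
type StalePolicy = "warn" | "block"; // V1 behaviour vs. the planned V2 behaviour

interface FreshnessResult {
  accept: boolean;          // may the bundle be returned to the caller?
  staleness: "ok" | "warn"; // value to surface as proof.staleness
  ageMs: number;
}

const DEFAULT_MAX_STALE_MS = 86_400_000; // 24 hours

function checkFreshness(
  sthTimestampMs: number,
  nowMs: number,
  maxStaleMs: number = DEFAULT_MAX_STALE_MS,
  policy: StalePolicy = "warn",
): FreshnessResult {
  const ageMs = nowMs - sthTimestampMs;
  if (ageMs <= maxStaleMs) return { accept: true, staleness: "ok", ageMs };
  // Stale but (by assumption) validly signed: soft-fail under "warn", hard-fail under "block".
  return { accept: policy === "warn", staleness: "warn", ageMs };
}
```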
---

## 5. Client verification steps

On every `fetchBundle(address)`:

1. The server returns `{ bundle, proof: { sth, leaf, audit_path, leaf_index, address_index_proof } }`.
2. The client verifies:
   - `sth.signature` against the known `log_public_key` (pinned at first bootstrap).
   - `sth.timestamp >= now - max_age_ms` (default 24 h).
   - Recomputes `leaf_hash` from the bundle and compares it with `proof.leaf`.
   - Recomputes `root_hash` from `audit_path + leaf` and compares it with `sth.root_hash`.
   - Verifies `address_index_proof` against `sth.index_root`.
3. If the client has a cached previous STH: check the **consistency proof** between the previous one and this one. The server publishes it at `GET /v1/kt/consistency?from=<size>&to=<size>`.
4. If the client has a cached STH for the same `tree_size` with a different `root_hash` → **split-view alarm**.

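Steps 3 and 4 depend on a small client-side STH cache. The sketch below shows one way such a cache might classify a newly received STH; `SthCache`, `ObservedSth`, and the verdict names are hypothetical, hashes are kept as hex strings for brevity, and signature and freshness checks are assumed to have passed already.

```typescript
interface ObservedSth { logId: string; treeSize: number; rootHash: string; timestamp: number }

type SthVerdict = "ok" | "split-view" | "needs-consistency-proof";

class SthCache {
  private readonly bySize = new Map<string, ObservedSth>();
  private readonly latest = new Map<string, ObservedSth>();

  // Classify an STH against what this client has already accepted.
  check(sth: ObservedSth): SthVerdict {
    const seen = this.bySize.get(`${sth.logId}:${sth.treeSize}`);
    // Step 4: same tree_size but a different root means the log showed two histories.
    if (seen && seen.rootHash !== sth.rootHash) return "split-view";
    const last = this.latest.get(sth.logId);
    // Step 3: a larger tree must be proven to extend the one we already trust.
    if (last && sth.treeSize > last.treeSize) return "needs-consistency-proof";
    // Same size with the same root (heartbeat), first contact, or an older
    // non-conflicting STH relayed by a witness.
    return "ok";
  }

  // Record an STH once all checks, including any consistency proof, have passed.
  accept(sth: ObservedSth): void {
    this.bySize.set(`${sth.logId}:${sth.treeSize}`, sth);
    const last = this.latest.get(sth.logId);
    const newer =
      !last ||
      sth.treeSize > last.treeSize ||
      (sth.treeSize === last.treeSize && sth.timestamp > last.timestamp);
    if (newer) this.latest.set(sth.logId, sth);
  }
}
```

On a `"needs-consistency-proof"` verdict the caller would fetch `GET /v1/kt/consistency?from=<size>&to=<size>`, verify it, and only then call `accept`.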
### Probabilistic vs. mandatory verification

We choose **mandatory** verification on every bundle fetch. Bundle fetches are rare (per new peer, not per message), and the cost is <100 ms. Probabilistic verification would let clients be fooled by "one bad fetch" without detection.

### Bootstrap

The first time a client meets a log, it pins `log_public_key` after fetching it from an **out-of-band** pinning endpoint or from `Shade.config` (the operator ships it with the client config). Subsequent rotation requires a new genesis STH with an explicit break event signed by the previous key.

---

## 6. Witness/auditor role

### What a witness does

- Periodic poll: `GET /v1/kt/sth` (fetch the latest STH).
- Stores every observed STH in an append-only local store.
- Exposes `GET /witness/sth?log_id=...&tree_size=...` so that other clients can compare against what _this_ witness has seen.
- Verifies consistency between each new STH and the previous one.

### Client-witness gossip

The client library can operate in three modes:

1. **Observe-only:** verifies only the bundles it fetches itself; no gossip.
2. **Light-witness:** polls the STH every `X` hours and stores it locally; compares it with the STH delivered at bundle fetch.
3. **Full-witness:** publishes signed STH observations to a configured peer list or public endpoint.

V1 ships modes 1 and 2. Mode 3 (full-witness publication protocol) is V2, if the ecosystem needs it.

### Who runs witnesses?

- The Shade project runs a **reference witness** on a public endpoint (separate from the prekey server).
- Power users / operators can run their own via the `@shade/key-transparency/witness` API.
- Third-party auditors (typically security researchers) are invited.

We do **not** require federation/consensus between witnesses in V1; gossip is purely "have you seen the same STH as me?".

---

## 7. Operator cost

### Storage

- **Per leaf:** 32 bytes (hash) + ~80 bytes address-index entry = ~112 bytes.
- **100k addresses, 1 rotation/year, 1 replenish/week:** ~5.4M leaves per year = ~600 MB of log. The tree structure is computed on demand, not stored.
- **Index:** ~100k × 80 B = 8 MB in memory (cacheable).

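The storage figures above are simple products, and a few lines of arithmetic make them easy to re-run for other deployment sizes. The helper names below are hypothetical, and 52 weeks per year is an assumption; with these inputs the result is about 5.3M leaves and about 594 MB per year, in line with the rounded figures in the list.

```typescript
const HASH_BYTES = 32;
const INDEX_ENTRY_BYTES = 80; // approximate, as stated above
const BYTES_PER_LEAF = HASH_BYTES + INDEX_ENTRY_BYTES; // ~112 bytes

// Leaves written per year if every rotation and every replenish appends one leaf.
function yearlyLeaves(addresses: number, rotationsPerYear: number, replenishPerWeek: number): number {
  return addresses * (rotationsPerYear + replenishPerWeek * 52);
}

function yearlyLogBytes(addresses: number, rotationsPerYear: number, replenishPerWeek: number): number {
  return yearlyLeaves(addresses, rotationsPerYear, replenishPerWeek) * BYTES_PER_LEAF;
}

// In-memory address index: one entry per address.
function indexBytes(addresses: number): number {
  return addresses * INDEX_ENTRY_BYTES;
}

const leavesPerYear = yearlyLeaves(100_000, 1, 1);  // 5,300,000
const logBytesPerYear = yearlyLogBytes(100_000, 1, 1); // 593,600,000 bytes
const indexSize = indexBytes(100_000);              // 8,000,000 bytes
```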
### CPU

- STH signing: 1 Ed25519 signature per mutation + heartbeat = <1k/day for small deployments. Trivial.
- Audit-path computation: O(log N) per fetch. <1 ms.
- Consistency proof: O(log N).

### Backup

The log MUST never lose data: deletion or corruption destroys integrity permanently. Strategy:

- The log is stored as an append-only table `shade_kt_log` (PG) with `(leaf_index, leaf_hash, leaf_data_json)`.
- Hourly backups + WAL shipping recommended.
- On corruption: see §10 Recovery.

### STH signing key

- Generated at first KT activation and stored in an operator-controlled secret (env, KMS, or on disk for a home server).
- Rotation is a **breaking event**: it requires a new genesis STH under the new key, plus a "rotated from ${old_key}" message signed with the _old_ key. The client must explicitly accept the rotation.

---

## 8. Migration

### Server side

KT is **opt-in** at the operator level: `createPrekeyServer({ keyTransparency: { enabled, store, signingKey } })`. When turned on:

1. The server writes all existing identities into the log as genesis leaves at boot.
2. The first STH is published with `tree_size = N`, where N is the number of existing addresses.
3. Clients that fetch a bundle get a proof; clients that do not support KT ignore the proof field (forward-compatible).

### Client side

`@shade/sdk` config:

```ts
createShade({
  keyTransparency: {
    mode: 'observe', // one of 'observe' | 'light-witness' | 'off'
    logPublicKey: '<base64>',
    maxStaleMs: 86_400_000, // 24 hours
  },
  // ...
})
```

`mode: 'off'` (the default in the first release, for backward compatibility) ignores the proof. A new SDK with `mode: 'observe'` verifies but does not hard-fail if the proof is missing. `mode: 'observe-strict'` (later) requires a proof.

### Existing deployments

The operator can roll KT out on a live server without a client update:

1. Turn on KT in the server config → the server starts producing proofs.
2. Old clients ignore the proof fields (they are additive in the bundle response).
3. New clients with `mode: 'observe'` start verifying.
4. Once the operator has tested and published the log public key OOB, users can switch to `'light-witness'`.

---

## 9. Acceptance criteria

- [ ] The `@shade/key-transparency` package delivers:
  - Merkle log core (RFC 6962 hash functions).
  - STH signing/verification.
  - Inclusion-proof generation + verification.
  - Consistency-proof generation + verification.
  - Address index with commitment.
  - Light-witness client.
  - Cross-platform (TS-only, no native deps).
- [ ] `@shade/server` integration:
  - `KTLogStore` interface (memory + postgres).
  - Routes: `GET /v1/kt/sth`, `GET /v1/kt/sth/:tree_size`, `GET /v1/kt/consistency`, `GET /v1/kt/inclusion/:address`.
  - Bundle fetch returns `{ bundle, proof }` when KT is enabled.
  - Heartbeat STH publication every 10 minutes (configurable).
- [ ] `@shade/transport` `ShadeFetchTransport`:
  - Accepts an optional `keyTransparency` verifier.
  - `fetchBundle()` returns `{ bundle, proof?: KTProof }`.
- [ ] `@shade/sdk` `Shade`:
  - `keyTransparency` config.
  - Verifies the proof on every bundle fetch when enabled.
  - Caches STHs for split-view detection.
- [ ] **End-to-end test: split-view detection.**
  - The test server gives Bob bundle X and Charlie bundle Y for the same address `alice`.
  - Bob and Charlie run as light-witnesses and gossip STHs.
  - The test asserts that the mismatch is detected within N polls.
- [ ] **End-to-end test: log rewrite detection.**
  - The server rewrites history (test-only API).
  - The consistency proof fails on the next fetch.
- [ ] Operator docs cover the recovery strategy.
- [ ] CHANGELOG, README, ROADMAP updated.
- [ ] Cross-platform vector test for Merkle hash + STH (Android/TS parity, same as the V3.5 tradition).

---

## 10. Recovery

### Log corruption

If log data is lost (disk failure before backup), it **cannot be restored without losing integrity**; that is the whole point.

Recovery procedure:

1. The operator publishes a "log-restart" event signed with the STH key.
2. The genesis STH is regenerated with a new `log_id` (= new public key or explicit version).
3. Clients holding cached STHs from the old log are alerted through the explicit discrepancy in `log_id`.
4. Users who are concerned must OOB-verify identities (the V3.3 gate triggers automatically on fingerprint rotation).

### Stale signing key

If the STH key leaks, rotation requires a break event (§7). Until users accept the new key, the client library behaves as if the STH is missing (soft-fail in `observe` mode, blocks in `observe-strict`).

---

## 11. Open questions (closed before code)

| Question | Answer |
|---|---|
| How is `log_public_key` distributed to the client the first time? | The operator embeds it in `Shade.config` at app init. OOB pinning is the fallback. |
| Should one-time prekeys be part of the bundle hash? | No. They are ephemeral, and their rotation would fill the log with noise. |
| Conflict: STH on every mutation vs. batched STH? | Per mutation. Heartbeat every 10 min regardless. Batching is considered an optimization if throughput becomes a problem (not now). |
| What happens on replenish (OTPs added only)? | Nothing is written to the log (bundle hash unchanged). The heartbeat STH covers freshness. |
| What about DELETE? | Writes a tombstone leaf with `operation = 0x03`. The identity in the index is marked as "deleted at tree_size N". |
| Sparse Merkle tree for the index proof? | Later. V1 uses the whole index in the absence proof. <100 KB at 100k addresses is acceptable. |
| Client cache eviction policy for STHs? | LRU on `log_id`, last-N (default 100). The client _always_ keeps the most recently seen STH. |
| Witness publication protocol? | V1 is poll-only (`GET /witness/sth`); push publication is V2. |

All open questions have concrete answers. Implementation can start.

---

## 12. Package structure

```
packages/shade-key-transparency/
├── package.json              # @shade/key-transparency, v0.4.0
├── src/
│   ├── index.ts              # Public exports
│   ├── hashes.ts             # RFC 6962 leaf/node hashing
│   ├── log.ts                # MerkleLog (in-memory) + audit-path
│   ├── consistency.ts        # Consistency-proof gen/verify
│   ├── sth.ts                # STH sign / verify / canonical bytes
│   ├── index-tree.ts         # Address index commitment
│   ├── proof.ts              # KTProof type + bundle-proof verifier
│   ├── store.ts              # KTLogStore interface (server-side)
│   ├── memory-store.ts       # In-memory KTLogStore
│   ├── witness.ts            # Light-witness client
│   └── errors.ts             # KT-specific error types
└── tests/
    ├── hashes.test.ts
    ├── log.test.ts           # RFC 6962 test vectors
    ├── consistency.test.ts
    ├── sth.test.ts
    ├── index-tree.test.ts
    ├── proof.test.ts
    └── split-view.test.ts    # End-to-end split-view detection
```

Server integration in `@shade/server`:

```
packages/shade-server/src/
├── kt-routes.ts              # /v1/kt/* routes
├── kt-integration.ts         # Hook bundle-fetch + register/delete to log
└── ...
```

Postgres implementation in `@shade/storage-postgres`:

```
packages/shade-storage-postgres/src/
├── postgres-kt-store.ts      # KTLogStore on PG
└── ...
```

Client integration in `@shade/transport` + `@shade/sdk`:

```
packages/shade-transport/src/
├── kt-verifier.ts            # Proof-verifier for fetchBundle
└── ...

packages/shade-sdk/src/
├── kt.ts                     # Shade.keyTransparency config + cache
└── ...
```

---

## 13. Test strategy

1. **RFC 6962 test vectors:** import known vectors from <https://datatracker.ietf.org/doc/html/rfc6962#appendix-A>.
2. **Property tests (fast-check):** for every tree_size N and every leaf index i: `verify(audit_path(i, N), leaf, sth) === true`.
3. **Consistency-proof property tests:** for N1 < N2: `verify_consistency(proof, sth1, sth2) === true`.
4. **Split-view e2e:** two clients and a malicious test server; witness gossip detects the mismatch.
5. **Rewrite-detection e2e:** the server mutates log history; on its next fetch the client receives a consistency proof that fails.
6. **Cross-platform:** Android (Kotlin) + TS produce the same leaf hash for the same bundle. V3.5 parity is a prerequisite, so this must also go through the Kotlin port; the first V3.12 release covers TS, and the Android port is V3.13.
7. **Stale STH:** the client rejects STHs older than max_age.
8. **Bootstrap pinning:** the client fails if log_public_key does not match.

---

## 14. Security assessment

- **False sense of security if half-done:** Mitigated by the default mode being `'off'`; only _explicitly_ enabled KT gives stronger guarantees. The documentation stresses that `'observe'` is observation, not obstruction, until the ecosystem is established.
- **Server-side mutability of history:** Mitigated by `KTLogStore` exposing only `append()`, with no `update()`/`delete()` on historical leaves. The PG table has a CHECK constraint and BEFORE triggers for extra defense in depth (see §7).
- **STH key compromise:** documented in §10. Operator responsibility.
- **DoS via massive index proofs:** in V1 the index proof is the whole index. 100 KB per fetch is manageable; the rate limiter covers the excess.
- **Replay of an old proof:** the STH timestamp + max_age protect against it.

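The append-only point in the list above is mostly an interface-design decision: if the store type has no mutating method for existing leaves, application code cannot rewrite history by accident. The sketch below illustrates the idea with an in-memory store; the `KTLogStore` shape shown here (`append`, `get`, `size`) and the `StoredLeaf` fields are assumptions for illustration, not the interface shipped in `@shade/key-transparency`.

```typescript
interface StoredLeaf { leafHash: string; leafDataJson: string }

// Hypothetical store contract: read access plus append, and nothing else.
interface KTLogStore {
  append(leaf: StoredLeaf): number;            // returns the new leaf_index
  get(leafIndex: number): StoredLeaf | undefined;
  size(): number;
}

class MemoryKTLogStore implements KTLogStore {
  private readonly leaves: StoredLeaf[] = [];

  append(leaf: StoredLeaf): number {
    // Store a frozen copy so callers cannot mutate history through a retained reference.
    this.leaves.push(Object.freeze({ ...leaf }));
    return this.leaves.length - 1;
  }

  get(leafIndex: number): StoredLeaf | undefined {
    return this.leaves[leafIndex];
  }

  size(): number {
    return this.leaves.length;
  }
}

const demoStore = new MemoryKTLogStore();
const firstIndex = demoStore.append({ leafHash: "aa", leafDataJson: "{\"op\":1}" });
const secondIndex = demoStore.append({ leafHash: "bb", leafDataJson: "{\"op\":3}" });
```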
---

## 15. Approval

Once this note has been reviewed, implementation can start. An in-tree review is enough to commit the first implementation; an external crypto review is a pre-deploy requirement per V3.12 §"Prerequisite: design note".

**Implementation order** (all commits in the same branch):

1. `@shade/key-transparency` core (Merkle log, STH, proofs).
2. Server integration (`@shade/server` + memory/postgres KTLogStore).
3. Client integration (`@shade/transport` verifier + `@shade/sdk` config).
4. Light-witness + e2e split-view test.
5. Operator docs + CHANGELOG + README + ROADMAP.

— end of design —

99
docs/archive/V3.12.md
Normal file
@@ -0,0 +1,99 @@

# Shade V3.12: Key Transparency

**Status:** Done (0.4.0). Design note: `docs/V3.12-DESIGN.md`. Operator/recovery guide: `docs/key-transparency.md`.
**Effort:** XXL (4+ months, multi-quarter)
**Previous:** V3.5 (the main platforms must be stable first)
**Addresses:** V2.3 §1A

---

## Goal

Reduce trust in the prekey server from "blind trust" to "verifiable log". When the server hands out a bundle, the bundle must be cryptographically committed in an **append-only log** that clients (or third-party auditors) can verify. A split-view attack, where the server shows different bundles to different clients, is caught by gossip.

---

## Prerequisite: design note

**No code before this has been reviewed and approved:**

- Threat-model addendum: what CT/attestation actually solves, and what remains open.
- Data-structure choice: append-only Merkle log (CT-style), CONIKS tree, or hybrid.
- Freshness proofs: how often signed tree heads are issued; what counts as "stale"?
- Client verification steps: must the client verify on every bundle fetch, or probabilistically?
- Witness/auditor role: who runs them? How does gossip between clients work?
- Operator cost: log size, signing frequency, backup strategy.
- Migration: existing prekey server → log-extended.

The design note is a `docs/V3.12-DESIGN.md` PR that must be reviewed by at least one external crypto-oriented reviewer.

---
|
||||
|
||||
## Mulig scope (etter designnotat)
|
||||
|
||||
### Inn (estimat)
|
||||
|
||||
- Append-only log som tillegg til prekey-server.
|
||||
- Inklusjons-bevis ved bundle-fetch (Merkle-path).
|
||||
- Fravær-bevis for "denne adressen har ikke registrert siden timestamp T".
|
||||
- Signed tree heads (STH) publisert på fast interval.
|
||||
- Klient-bibliotek: `@shade/key-transparency` med verifisering.
|
||||
- Witness-API: tredjeparts-auditor kan hente STH-er og logge gossip.
|
||||
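The inclusion-proof item above can be sketched as follows. This assumes the log hashes leaves and nodes the way Certificate Transparency does (SHA-256 with `0x00` / `0x01` domain-separation prefixes, audit-path verification as in RFC 9162); the source only says "CT-style", so the exact hashing, the function names and the 32-bit index limit are assumptions of this sketch.

```typescript
import { createHash } from 'node:crypto';

const sha256 = (...parts: Uint8Array[]): Uint8Array => {
  const h = createHash('sha256');
  for (const p of parts) h.update(p);
  return new Uint8Array(h.digest());
};

// CT-style domain separation: 0x00 for leaves, 0x01 for interior nodes (assumed).
const leafHash = (leaf: Uint8Array): Uint8Array => sha256(Uint8Array.of(0x00), leaf);
const nodeHash = (l: Uint8Array, r: Uint8Array): Uint8Array => sha256(Uint8Array.of(0x01), l, r);

const equalBytes = (a: Uint8Array, b: Uint8Array): boolean =>
  a.length === b.length && a.every((v, i) => v === b[i]);

// Recompute the root from a leaf and its audit path, then compare with the STH root.
// Uses 32-bit bitwise operations, so it is only valid for trees below 2^31 leaves.
function verifyInclusion(
  leaf: Uint8Array,
  leafIndex: number,
  treeSize: number,
  path: Uint8Array[],
  root: Uint8Array,
): boolean {
  if (leafIndex < 0 || leafIndex >= treeSize) return false;
  let fn = leafIndex;
  let sn = treeSize - 1;
  let r = leafHash(leaf);
  for (const p of path) {
    if (sn === 0) return false; // path is longer than the tree is deep
    if ((fn & 1) === 1 || fn === sn) {
      r = nodeHash(p, r); // sibling is on the left
      if ((fn & 1) === 0) {
        while ((fn & 1) === 0 && fn !== 0) {
          fn >>>= 1;
          sn >>>= 1;
        }
      }
    } else {
      r = nodeHash(r, p); // sibling is on the right
    }
    fn >>>= 1;
    sn >>>= 1;
  }
  return sn === 0 && equalBytes(r, root);
}
```

A client would run this against the root of a tree head whose signature it has already checked; a proof that verifies against an unsigned root proves nothing.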
|
||||
### Out (explicit)

- Federated log (multi-server gossip): too large for a first iteration.
- The legal/compliance side of the audit log.
- "We fully solve MITM on first contact": KT alone catches split-view, not
  first contact.

---

## Risk assessment

KT is the **hardest single item** on the entire roadmap:

1. **A half-implemented KT is worse than no KT**: it gives false assurance, and
   users stop verifying OOB.
2. Operationally complex: the log must never rewrite history. A single
   restart bug means broken integrity.
3. Client verification logic must run on every bundle fetch, or
   risk that a single "old" client is fooled.
4. A witness ecosystem requires independent actors; Shade alone cannot
   guarantee that.

**Decision criterion:** If the design note leaves open "how do we
handle X?" questions without clear answers, park V3.12. The pragmatic
alternative is **V3.3 (fingerprint gate)** + **V3.10 (social recovery)**,
which together provide 80% of the MITM protection without the KT complexity.

---

## Acceptance criteria (if implemented)

- [ ] Design note has passed external review.
- [ ] Client detects split-view in an end-to-end test (server serves two
      versions of the same address → client catches the mismatch).
- [ ] Witness API tested with at least one external auditor instance.
- [ ] Operator docs cover recovery if the log is corrupted.

---

## Dependencies

- V3.5: Android/TS parity must be solid before we add a new
  verification layer.

---

## Migration

Fully opt-in. Operators who do not want KT keep running unchanged.
|
||||
146
docs/archive/V3.2.md
Normal file
@@ -0,0 +1,146 @@
|
||||
# Shade V3.2 — At-Rest Storage Encryption
|
||||
|
||||
**Status:** Implemented (0.4.0): `@shade/storage-encrypted`, `@shade/keychain`,
`shade migrate-storage`, `shade rotate-storage-key`
**Effort:** L (4–8 weeks)
**Previous:** V3.1
**Addresses:** V2.1 §2

---

## Goal

Opt-in protection of sensitive state (identity keys, session state, and optionally the
stream-resume secret) with keys that are **not** stored in plaintext in the database.
The threat model currently states explicitly that a stolen DB exposes private
keys; this fixes that for deployments that choose to enable it.

---

## Scope

### In

- New `EncryptedStorageProvider` wrapper that decorates `SQLiteStorage` /
  `PostgresStorage`.
- Per-row AES-256-GCM on sensitive fields (`identity_*`, `session_*`,
  optionally `stream_state.streamSecret`).
- KDF plugin (default `scrypt` from `@noble/hashes`) for a passphrase-based
  master key.
- Three key sources out of the box:
  1. **Passphrase + KDF**: the developer supplies the secret at startup.
  2. **OS keychain**: macOS Keychain, Linux libsecret, Windows Credential
     Vault (Node-only).
  3. **App-injected key**: the app's own code supplies a 32-byte key (most
     flexible).
- Migration CLI: `shade migrate-storage --encrypt --key-source=...`.
- Threat-model update: "when enabled, what is still uncovered": memory
  compromise, swap, runtime tap.

### Out

- Browser/IndexedDB at-rest (separate package, to be considered after V3.8).
- HSM/Secure Enclave (separate driver later).
- "Always-on by default": we ship opt-in to avoid breaking existing
  deployments.

---

## Design

### Encryption unit

- Per-row AEAD: `nonce(12) || ciphertext || tag(16)`.
- `nonce = HKDF(rowKey, "shade-row-nonce-v1" || tableName || pk)[..12]`:
  deterministic per (table, pk) to avoid nonce reuse without storing the nonce
  separately. A change of (table, pk) → re-encryption.
- The AAD binds `tableName || columnName || pk` so that field swapping is blocked.
|
||||
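The encryption unit described above could look roughly like this with Node's built-in `crypto` module. This is a sketch under assumptions: the source does not specify how `||` is encoded, so the helper length-prefixes each field; `sealField`, `openField`, `rowNonce` and `joinFields` are hypothetical names, and `fieldKey` / `rowKey` are taken as already-derived 32-byte keys.

```typescript
import { createCipheriv, createDecipheriv, hkdfSync } from 'node:crypto';

// Assumed encoding for "a || b": length-prefix each field so that
// ("ab", "c") and ("a", "bc") never produce the same bytes.
function joinFields(...fields: string[]): Buffer {
  const parts: Buffer[] = [];
  for (const f of fields) {
    const body = Buffer.from(f, 'utf8');
    const len = Buffer.alloc(4);
    len.writeUInt32BE(body.length);
    parts.push(len, body);
  }
  return Buffer.concat(parts);
}

// Deterministic 12-byte nonce per (table, pk), derived with HKDF-SHA256.
function rowNonce(rowKey: Buffer, table: string, pk: string): Buffer {
  const info = Buffer.concat([Buffer.from('shade-row-nonce-v1', 'utf8'), joinFields(table, pk)]);
  return Buffer.from(hkdfSync('sha256', rowKey, Buffer.alloc(0), info, 12));
}

// Output layout: nonce(12) || ciphertext || tag(16).
function sealField(
  fieldKey: Buffer, rowKey: Buffer,
  table: string, column: string, pk: string, plaintext: Buffer,
): Buffer {
  const nonce = rowNonce(rowKey, table, pk);
  const cipher = createCipheriv('aes-256-gcm', fieldKey, nonce);
  cipher.setAAD(joinFields(table, column, pk)); // binds the value to its location
  const ct = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return Buffer.concat([nonce, ct, cipher.getAuthTag()]);
}

// Throws if the ciphertext, tag, nonce or AAD (table/column/pk) does not match.
function openField(
  fieldKey: Buffer, table: string, column: string, pk: string, sealed: Buffer,
): Buffer {
  const nonce = sealed.subarray(0, 12);
  const ct = sealed.subarray(12, sealed.length - 16);
  const tag = sealed.subarray(sealed.length - 16);
  const decipher = createDecipheriv('aes-256-gcm', fieldKey, nonce);
  decipher.setAAD(joinFields(table, column, pk));
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ct), decipher.final()]);
}
```

Because the nonce in this sketch depends only on (table, pk), writing a new value to the same row under the same key would repeat it; how the real implementation handles updates is not covered here.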
|
||||
### Key hierarchy

```text
masterKey (from source: passphrase / keychain / app-injected)
   │
   ├─ HKDF("shade-storage-v1") → storageKey (32 bytes)
   │     │
   │     └─ HKDF(storageKey, table || column) → fieldKey
   │
   └─ HKDF("shade-storage-version-v1") → version key (rotation)
```
|
||||
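As a rough illustration of the hierarchy, the derivations can be written with Node's `hkdfSync`. The choice of SHA-256, the empty salt, the NUL separator between table and column, and the `deriveStorageKeys` name are all assumptions of this sketch; the authoritative derivation is whatever `test-vectors/storage-encryption.json` pins down.

```typescript
import { hkdfSync } from 'node:crypto';

// HKDF-SHA256 with an empty salt (assumed); `info` carries the label.
function hkdf(ikm: Uint8Array, info: string, length = 32): Buffer {
  return Buffer.from(hkdfSync('sha256', ikm, Buffer.alloc(0), Buffer.from(info, 'utf8'), length));
}

function deriveStorageKeys(masterKey: Uint8Array) {
  if (masterKey.length !== 32) throw new Error('masterKey must be 32 bytes');
  const storageKey = hkdf(masterKey, 'shade-storage-v1');
  const versionKey = hkdf(masterKey, 'shade-storage-version-v1');
  // One key per (table, column); the NUL separator is an assumption.
  const fieldKey = (table: string, column: string): Buffer =>
    hkdf(storageKey, `${table}\u0000${column}`);
  return { storageKey, versionKey, fieldKey };
}
```

Deriving a separate key per field means a leaked `fieldKey` exposes one column only, while rotation only has to replace the master key at the top.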
|
||||
### Migration

1. The CLI reads the unencrypted DB.
2. Writes row-by-row encrypted data to a new `_v2` table.
3. Atomic rename + drop of the old table.
4. A `.bak` backup file is left in the same directory.

### Rotation

- `shade rotate-storage-key --new-source=...` re-encrypts with a new masterKey.
- Online ratchet (read with old, write with new) for large DBs.

---

## Deliverables

### Packages

- New module: `@shade/storage-encrypted` (re-export over SQLite/PG).
- Extension to `@shade/cli`: `migrate-storage`, `rotate-storage-key`.
- Helper package: `@shade/keychain` (Node-only, optional peer dep) for the OS keychain.

### Tests

- Unit: KDF derivation, nonce determinism, AAD binding.
- Integration: full lifecycle on SQLite + PG; start/stop; crash during
  migration.
- Tamper: bit-flip in ciphertext / AAD / nonce → decryption error.
- Vector file: cross-check masterKey → fieldKey derivation against
  `test-vectors/storage-encryption.json`.

### Documentation

- `docs/storage-encryption.md`: full guide.
- `THREAT-MODEL.md`: new column "with at-rest enabled".
- Migration note in `MIGRATION.md`.

---

## Acceptance criteria

- [ ] An existing unencrypted deployment continues without changes (opt-in).
- [ ] `shade migrate-storage --encrypt` migrates a live SQLite database without
      data loss, verified with a dump diff.
- [ ] Rotation can be done with no more than 5 s of downtime for small DBs.
- [ ] Wrong passphrase / wrong key → clear error message, not a crash.
- [ ] Test vectors are shared with the Android implementation (V3.5 commits to
      running the vector file there).

---

## Dependencies

- V3.1: `THREAT-MODEL.md` must be linked to the tests first, so that we can
  extend the table.

---

## Risk

**Data loss.** A migration that crashes halfway can leave a corrupt DB.
Mitigated by:

- Atomic rename + `.bak` file.
- Dry-run mode (`--dry-run` validates all decryption before writing).
- Refusing to start if the WAL has uncommitted writes.

**Key loss = total loss.** If the user loses the passphrase, there is no access.
Document this clearly, and point to V3.10 (Social Recovery) as the long-term solution.

---

## Migration

0.3.x deployments are unencrypted and stay unencrypted. Activation is a single
CLI command. Backwards compatible.
|
||||
147
docs/archive/V3.3.md
Normal file
@@ -0,0 +1,147 @@
|
||||
# Shade V3.3 — Fingerprint Gates & Trust UX
|
||||
|
||||
**Status:** Done
**Effort:** M (2–4 weeks)
**Previous:** V3.1
**Addresses:** V2.3 §1B
**Implemented:** see `docs/trust-ux.md`

---

## Goal

Make safety numbers **mandatory to act on**, not merely visible, in flows where
the MITM risk is real. Today the `FingerprintCompare` widget and
`requireFingerprintVerifiedFor` exist in `@shade/files`, but the core
(`Shade.send`, first-large-file, backup-import) has no automatic gate.
The result is free of alert fatigue, but also free of gates.

This adds **explicit blocking verification** on a small number of
critical events, plus widget support for exposing it in the UI.

---

## Scope

### In: critical events

1. **Before the first large file**: `Shade.upload` above a byte threshold without
   a verified peer.
2. **Before backup import**: `Shade.importBackup` blocks until the peer (or own
   identity) is confirmed.
3. **New device with rotated identity**: `acceptIdentityChange` blocks on
   first use until verified.
4. **Before `@shade/inbox` fan-out** (V3.6): gate per recipient.

### In: APIs

- `Shade.beforeFirstLargeFile(threshold, handler)`: the app gets the chance to
  show a modal and return a confirmation.
- `Shade.beforeBackupImport(handler)`: same pattern.
- `Shade.beforeNewDeviceTrust(handler)`: ditto.
- `Shade.markPeerVerified(address)` / `Shade.isPeerVerified(address)`:
  persistent state.

### In: widgets

- `<FingerprintGate />`: render-prop wrapper that blocks children until
  verified.
- `<FingerprintCompare />` is extended with "copy OOB text" + "I have
  verified".

### Out

- "Force all peers verified before every message": alert fatigue.
- Cross-device sync of verified state (may come via the V3.6 inbox).
|
||||
|
||||
---
|
||||
|
||||
## Design

### Persistent verified state

New table `peer_verifications`:

```sql
CREATE TABLE peer_verifications (
  peer_address TEXT PRIMARY KEY,
  fingerprint TEXT NOT NULL,
  verified_at INTEGER NOT NULL,
  verified_by TEXT,                  -- "user" | "transitive" | "tofu-after-warning"
  identity_version INTEGER NOT NULL  -- ties the verification to identity rotation
);
```

When a peer rotates identity → `identity_version` is bumped → the verification is "invalid"
until re-verified.
|
||||
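The invalidation rule above can be expressed as a small check over a stored row. The `PeerVerification` type mirrors the table columns; the function name and the idea of also comparing the stored fingerprint are assumptions of this sketch, not the documented `isPeerVerified` implementation.

```typescript
// Mirrors the peer_verifications columns (camelCased for TypeScript).
interface PeerVerification {
  peerAddress: string;
  fingerprint: string;
  verifiedAt: number;
  verifiedBy: 'user' | 'transitive' | 'tofu-after-warning' | null;
  identityVersion: number;
}

// What we currently know about the peer's identity.
interface CurrentIdentity {
  fingerprint: string;
  identityVersion: number;
}

// A stored verification only counts while it refers to the peer's current identity.
function isVerificationValid(row: PeerVerification | undefined, current: CurrentIdentity): boolean {
  if (!row) return false;
  return row.identityVersion === current.identityVersion
    && row.fingerprint === current.fingerprint;
}
```

With this shape, an identity rotation needs no cleanup job: the old row simply stops matching once `identityVersion` is bumped.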
|
||||
### Hook flow

```text
shade.upload(peer, file)
   │
   ├─ if !verified(peer) AND file.size > threshold
   │     │
   │     └─ await beforeFirstLargeFileHandler(peer, fingerprint)
   │           ├─ true  → markPeerVerified(peer); proceed
   │           └─ false → throw FingerprintNotVerifiedError
   │
   └─ proceed
```
|
||||
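The flow above translates fairly directly into code. The following is a sketch only: `GateDeps` and `gateFirstLargeFile` are hypothetical names, and the dependencies are injected so that the example stays self-contained instead of relying on the real `@shade/sdk` internals.

```typescript
class FingerprintNotVerifiedError extends Error {
  constructor(peer: string) {
    super(`peer ${peer} is not verified`);
    this.name = 'FingerprintNotVerifiedError';
  }
}

// Returns true when the user confirmed the fingerprint out of band.
type GateHandler = (peer: string, fingerprint: string) => Promise<boolean>;

interface GateDeps {
  threshold: number; // bytes
  isPeerVerified(peer: string): boolean;
  markPeerVerified(peer: string): void;
  fingerprintOf(peer: string): string;
  handler: GateHandler;
}

// Resolves when the upload may proceed; rejects when the user declined.
async function gateFirstLargeFile(deps: GateDeps, peer: string, fileSize: number): Promise<void> {
  if (deps.isPeerVerified(peer) || fileSize <= deps.threshold) return;
  const confirmed = await deps.handler(peer, deps.fingerprintOf(peer));
  if (!confirmed) throw new FingerprintNotVerifiedError(peer);
  deps.markPeerVerified(peer);
}
```

Small files and already-verified peers take the early return, so the handler (and the modal behind it) only runs on the one event the plan calls critical.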
|
||||
---
|
||||
|
||||
## Deliverables

### Code

- `@shade/core`: `peer_verifications` table + storage methods.
- `@shade/sdk`: gate hooks + `markPeerVerified` / `isPeerVerified`.
- `@shade/widgets`: `<FingerprintGate />`, extended `<FingerprintCompare />`.

### Tests

- Unit: gate is called, gate is not called, return false → throw, return true → proceed.
- Integration: file < threshold passes without the gate; file > threshold
  blocks.
- Identity rotation invalidates the verification.
- Backup import blocks.

### Documentation

- `docs/trust-ux.md`: guide to which gates exist and when they should be tuned.

---

## Acceptance criteria

- [ ] The gate cannot be bypassed by zeroing out `threshold`: a minimum gate always
      exists for backup-import and new-device.
- [ ] An app with no registered gates gets sane defaults (logs a warning, but
      runs; no crash).
- [ ] Identity rotation resets the verification in a tested end-to-end flow.
- [ ] The widget can be rendered under SSR without triggering the runtime gate.

---

## Dependencies

- V3.1: threat matrix updated to show which gates cover which
  rows.

---

## Risk

- **Alert fatigue.** If thresholds are too low, the user clicks through blindly.
  Mitigate by setting default thresholds high (10 MiB for first-large-file)
  and documenting a tuning guide.
- **DX friction.** Apps that do not know about gates get unexpected prompts. Mitigate
  by logging clearly on first activation: "Shade.beforeFirstLargeFile not
|
||||
configured — using default modal".
|
||||
|
||||
---
|
||||
|
||||
## Migration

0.3.x apps get the defaults enabled with a warning. No breaking change.
|
||||
124
docs/archive/V3.4.md
Normal file
@@ -0,0 +1,124 @@
|
||||
# Shade V3.4 — Observability v2 (OpenTelemetry)
|
||||
|
||||
**Status:** Implemented (2026-05-02): `@shade/observability` 0.1.0,
hooked into sdk/transfer/server/files/core. Off by default; set
`SHADE_OTEL_ENABLED=1` to enable.
**Effort:** M (2–4 weeks)
**Previous:** V3.1
**Addresses:** V2.3 §4

---

## Goal

Give production teams **distributed traces** around `TransferEngine`,
the prekey routes and `@shade/files`, without leaking plaintext addresses, payloads
or exact chunk sizes. Builds on the Prometheus metrics that
already exist.

---

## Scope

### In

- Opt-in OpenTelemetry instrumentation via `@opentelemetry/api`.
- Spans around:
  - `TransferEngine.upload` / `.download` (with lane tags, retry counts).
  - `ShadeSessionManager.encrypt` / `.decrypt` (per-peer mutex acquisition,
    ratchet step).
  - `createPrekeyRoutes` (per route, status codes).
  - `@shade/files` op handlers (already have `onMetric`; to be extended to OTel).
- PII policy doc: what is **never** logged, what is binned, what is pseudonymized.
- Sampling policy off by default; on with `SHADE_OTEL_ENABLED=1`.

### Out

- Trace export to SaaS vendors (that is deployment config, not our code).
- Log aggregation: `@shade/server` already has structured JSON.
|
||||
|
||||
---
|
||||
|
||||
## Design

### Span attributes

| Attribute | Value |
|-----------|-------|
| `shade.peer.hash` | `sha256(address).slice(0, 8)`: stable pseudonym |
| `shade.bytes.bin` | binned: `"≤4KB"`, `"4–64KB"`, `"64KB–1MB"`, `"≥1MB"` |
| `shade.lane.count` | 1 / 4 / 16 |
| `shade.retry.count` | int |
| `shade.error.code` | `SHADE_*` code |

**Never:** `shade.peer.address`, `shade.payload`, `shade.bytes.exact`.
|
||||
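The two privacy-preserving attributes in the table can be computed along these lines. This is a sketch with stated assumptions: `slice(0, 8)` is read as the first 8 hex characters of the digest, "KB" is read as 1024 bytes, and the exact bin boundaries (inclusive or exclusive) are guesses, since the table does not spell them out.

```typescript
import { createHash } from 'node:crypto';

// Stable pseudonym for a peer: first 8 hex characters of SHA-256(address) (assumed reading).
function peerHash(address: string): string {
  return createHash('sha256').update(address, 'utf8').digest('hex').slice(0, 8);
}

// Coarse size bucket so that exact payload sizes never reach a span.
function bytesBin(size: number): string {
  const KB = 1024; // assumption: binary kilobytes
  if (size <= 4 * KB) return '≤4KB';
  if (size <= 64 * KB) return '4–64KB';
  if (size < 1024 * KB) return '64KB–1MB';
  return '≥1MB';
}

// Build the attribute set for a span without touching the raw address or size.
function safeSpanAttributes(address: string, size: number): Record<string, string> {
  return {
    'shade.peer.hash': peerHash(address),
    'shade.bytes.bin': bytesBin(size),
  };
}
```

Note that an unsalted hash of a guessable address can be reversed by brute force, so this pseudonym hides addresses from casual inspection rather than from a determined observer.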
|
||||
### API

```ts
import { withTracer } from '@shade/observability';

const shade = await createShade({
  ...,
  observability: withTracer(myTracer, { sample: 0.1 }),
});
```

`withTracer()` is a no-op if `tracer` is `undefined` or
`SHADE_OTEL_ENABLED` is not set.
|
||||
|
||||
---
|
||||
|
||||
## Deliverables

### Packages

- New submodule `@shade/observability` (peer dep `@opentelemetry/api`).
- Hooks in `@shade/sdk`, `@shade/transfer`, `@shade/server`, `@shade/files`.

### Tests

- Spans are emitted with the correct attributes (mock tracer).
- The sample rate is respected.
- Off-by-default verified.
- Regex grep against the recorder catches plaintext PII.

### Documentation

- `docs/observability.md`: setup + PII policy.
- `docs/DEPLOYMENT.md`: environment variables.

---

## Acceptance criteria

- [x] Default deployment without OTel: no performance regression (`withTracer`
      returns a shared `NOOP_HOOK` when `SHADE_OTEL_ENABLED` is not set).
- [x] With OTel on: spans for upload/download (`shade.transfer.upload`,
      `shade.transfer.download`), prekey routes (`shade.prekey.request`),
      session encrypt/decrypt (`shade.session.{encrypt,decrypt}`), and
      `@shade/files` ops (`shade.files.op`).
- [x] Automated grep test catches plaintext PII in spans
      (`packages/shade-observability/tests/integration-pii.test.ts` +
      `packages/shade-transfer/tests/observability.test.ts`;
      `safeAttribute()` blocks developer-introduced PII).

---

## Dependencies

- V3.1: baseline docs.

---

## Risk

- **Performance overhead.** Mitigate with an aggressive default-off + sampling.
- **PII leakage** if developers add their own attributes. Mitigate by
  publishing "safe attribute" helpers and a PII linter.

---

## Migration

None: opt-in.
|
||||
125
docs/archive/V3.5.md
Normal file
@@ -0,0 +1,125 @@
|
||||
# Shade V3.5 — Android Parity & Cross-Platform CI
|
||||
|
||||
**Status:** Done (cryptographic layer + CI gate). Android KeystoreStorage and scrypt/argon2id parity are post-GA work tracked in `android/shade-android/ROADMAP-ANDROID.md`, not a 4.0 GA blocker.
**Effort:** XL (2–4 months, parallelizable)
**Previous:** V3.1
**Addresses:** V2.1 §3

---

## Goal

Make the Kotlin implementation **byte-compatible** with the TS implementation, and
lock in parity via a **CI gate** that runs shared test vectors in both languages.
No "production" label on Android until ratchet + proto + streams 0x11 are
green.

---

## Scope

### In: parity checkpoints (explicit)

1. **KDF chain**: root key + chain key derivations.
   Vector: `test-vectors/kdf-chain.json`.
2. **HKDF**: labels for the `info` field.
   Vector: `test-vectors/hkdf.json`.
3. **X3DH**: full agreement with the same bundles.
   Vector: `test-vectors/x3dh.json`.
4. **Ratchet message**: encrypt/decrypt roundtrip (add a vector).
5. **Fingerprint**: 60-digit safety number.
   Vector: `test-vectors/fingerprint.json`.
6. **Wire format 0x02**: encode/decode.
   Vector: `test-vectors/wire-format.json`.
7. **Streams 0x11**: multi-lane chunk encryption (M-Cross 3, not in M-Cross 1).
8. **Backup format**: passphrase-based KDF + AES-GCM payload.

### In: milestones

- **M-Cross 1 ✅**: keys + HKDF + X3DH + fingerprint.
- **M-Cross 2 ✅**: ratchet step (encrypt + decrypt roundtrip) + wire 0x02
  (RatchetMessage + PreKeyMessage with/without OTPK). Vector version `2`.
- **M-Cross 3 ✅**: streams 0x11 (KDF, deterministic chunk nonce/AAD, wire 0x11
  encode/decode). End-to-end socket interop pending; not a gating blocker.
- **M-Cross 4 ✅**: backup-format HKDF + AEAD, group sender keys
  (kdfChainKey + Ed25519 sign(aad ‖ ct)), storage HKDF (storageKey,
  fieldKey, rowNonce). Remaining: scrypt master key (Bouncy Castle),
  argon2id switch, Android KeystoreStorage as a sibling module.

### In: CI

- Gitea Actions matrix job:
  - Bun runner runs `bun test:vectors` against `test-vectors/*.json`.
  - Gradle runner runs `./gradlew vectorTests` against the same files.
  - PR gate: both must pass.
- The vector generation script (`scripts/generate-vectors.ts`) exists; extend it
  to cover checkpoints 7 + 8.

### Out

- iOS: a dedicated Swift port is future roadmap, not V3.5.
- Native bindings in `shade-android` (we use Tink in JVM code).

---

## Deliverables

### Kotlin

- Full ratchet implementation (M-Cross 2).
- Wire 0x02 encode/decode.
- Streams 0x11 (M-Cross 3).
- Tink storage adapter with Keystore.

### Test vectors

- Extend `scripts/generate-vectors.ts` with ratchet step + streams + backup.
- Version tag on vector files (`{ "version": 2, ... }`).

### CI

- `.gitea/workflows/cross-vectors.yml`: Bun + Gradle matrix.
- Fail policy: if a vector file changes, **both** runners must report
  passing before merge.

### Documentation

- `android/shade-android/ROADMAP-ANDROID.md`: explicit milestones +
  status per checkpoint.
- `docs/cross-platform.md`: how to add a new vector + how to
  run locally.

---

## Acceptance criteria

- [ ] M-Cross 2: a TS-encrypted message can be decrypted by a Kotlin client and
      vice versa, in an end-to-end test.
- [ ] The CI job fails within 60 s on a deliberate byte divergence.
- [ ] M-Cross 3: a 1 MiB streams file over 4 lanes between TS server and
      Kotlin client verified.
- [ ] No public release with a "production" label before M-Cross 2 is green.

---

## Dependencies

- V3.1: `cross-platform.md` lives there.

---

## Risk

- **Tink mismatch.** Tink's HKDF info encoding may differ from
  `@noble/hashes`. Mitigate with an early vector test (M-Cross 1 covers this).
- **Endian / encoding.** Wire 0x02 uses big-endian; Kotlin's
  `ByteBuffer` defaults to big-endian, but the streams nonce construction must
  be reviewed.
- **Maintainer capacity.** The Kotlin port + TS port must be kept in sync.
  Vector CI is the primary mitigation.

---

## Migration

The existing M-Cross 1 scaffold is kept; everything new builds on it.
|
||||
123
docs/archive/V3.6.md
Normal file
@@ -0,0 +1,123 @@
|
||||
# Shade V3.6 — Async Store-and-Forward (Inbox)
|
||||
|
||||
**Status:** Done
**Effort:** L (4–8 weeks)
**Previous:** V3.4
**Addresses:** V2.2 §2
**Implemented:** see `docs/inbox.md`

---

## Goal

The recipient does not need to be online to receive messages or
control signals. A **dedicated relay/inbox service** holds
**ciphertext blobs** with TTL and auth. The server never sees plaintext;
the prekey server remains public-keys-only.

---

## Scope

### In

- New packages: `@shade/inbox` (client) + `@shade/inbox-server` (server).
- HTTP API:
  - `POST /v1/inbox/:address`: signed PUT of a blob (with TTL).
  - `GET /v1/inbox/:address/since/:cursor`: authenticated fetch.
  - `DELETE /v1/inbox/:address/:msgId`: leasing/ack.
- Replay protection at the application layer (`msgId = sha256(ciphertext)`).
- Push hook (vendor-neutral): `inbox.onMessageQueued(handler)` callback.
- Outgoing queue in the client: stores ciphertext locally until the server confirms
  the PUT.
- Idempotent PUT (the same `msgId` returns 200, not 409).
|
||||
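The content-addressed `msgId` and the idempotent PUT fit together naturally, as this in-memory sketch tries to show. `MemoryInbox` and its method names are illustrative only and are not the `@shade/inbox-server` storage adapter; signature checks and quotas are omitted, and the clock is passed in so the behaviour is deterministic.

```typescript
import { createHash } from 'node:crypto';

interface StoredBlob {
  ciphertext: Uint8Array;
  expiresAt: number; // epoch milliseconds
}

interface PutResult {
  status: 200;
  msgId: string;
  stored: boolean; // false when the blob was already present
}

class MemoryInbox {
  private readonly rows = new Map<string, StoredBlob>();

  // msgId = sha256(ciphertext), so a replayed PUT maps onto the existing row.
  put(address: string, ciphertext: Uint8Array, ttlMs: number, now: number): PutResult {
    const msgId = createHash('sha256').update(ciphertext).digest('hex');
    const key = `${address}\n${msgId}`;
    const existing = this.rows.get(key);
    if (existing && existing.expiresAt > now) {
      return { status: 200, msgId, stored: false };
    }
    this.rows.set(key, { ciphertext, expiresAt: now + ttlMs });
    return { status: 200, msgId, stored: true };
  }

  // Drop expired rows; returns how many were removed.
  prune(now: number): number {
    let removed = 0;
    for (const [key, row] of this.rows) {
      if (row.expiresAt <= now) {
        this.rows.delete(key);
        removed += 1;
      }
    }
    return removed;
  }

  size(): number {
    return this.rows.size;
  }
}
```

Because the identifier is derived from the ciphertext, the client's outgoing queue can retry a PUT after a timeout without risking a duplicate on the server.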
|
||||
### Out

- Mobile push (FCM / APNs): out of scope; we expose the hook.
- Federation between inbox servers: a separate matter for later.
- Plaintext metadata addresses: we support pseudonymous address hashes as
  a privacy mode.

---

## Design

### Auth

- PUT is **signed** with the sender's Ed25519 key (same as prekey).
- GET requires a signed challenge from the recipient (pull, not push).
- Replay window ±5 min, same as prekey.
|
||||
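The ±5 minute replay window reduces to a timestamp comparison once the signature over the request (including its timestamp) has been verified. The sketch below shows only that comparison; the function name and the idea that requests carry a millisecond timestamp are assumptions, since the source does not describe the signed payload.

```typescript
const REPLAY_WINDOW_MS = 5 * 60 * 1000;

// True when the signed timestamp is within ±windowMs of the server clock.
// Assumes the timestamp is part of the signed payload and already authenticated.
function withinReplayWindow(
  signedAtMs: number,
  nowMs: number,
  windowMs: number = REPLAY_WINDOW_MS,
): boolean {
  return Number.isFinite(signedAtMs) && Math.abs(nowMs - signedAtMs) <= windowMs;
}
```

A symmetric window tolerates clock skew in both directions, at the cost of accepting a captured request for up to five minutes; the content-derived `msgId` is what keeps such a replay from storing a second copy.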
|
||||
### Wire

- Existing `@shade/proto` envelope, transported as the body.
- The server stores **only**:
  `address || msgId || ciphertext-bytes || expires_at`.

### Lifecycle

1. The sender encrypts via `Shade.send` → gets an envelope.
2. The sender PUTs the envelope to the recipient's inbox with a TTL (default 7 days).
3. The recipient polls (or gets a push trigger) and fetches everything since the cursor.
4. The recipient decrypts; acks via DELETE for early pruning.

### Storage

- SQLite + Postgres backends (same pattern as prekey).
- Index: `(address, expires_at)`.
- Cron prune.

---

## Deliverables

### Packages

- `@shade/inbox`: client + queue.
- `@shade/inbox-server`: Hono routes + storage adapter.

### Tests

- Unit: signed PUT/GET, replay window, idempotency.
- Integration: full lifecycle with 100 msgs, restart the server, msgs persist.
- Tamper: bit-flip in the ciphertext → client-side decrypt fails (the server does
  not know).

### Documentation

- `docs/inbox.md`: setup, threat model "what the relay sees", deployment guide.
- `THREAT-MODEL.md`: new section on the relay.

---

## Acceptance criteria

- [ ] Sender → recipient with no online overlap, payload < 1 MB, completed
      within 5 min of the recipient starting up.
- [ ] A server DB dump reveals **no plaintext** and **no
      sender-recipient graph** beyond byte parity.
- [ ] Replaying a PUT with the same `msgId` returns 200 without storing a duplicate.

---

## Dependencies

- V3.4: observability hooks for measuring inbox usage without leakage.

---

## Risk

- **Metadata leakage.** The server sees who talks to whom. Document this clearly;
  point to address hashing as a mitigation.
- **Storage DoS.** A malicious sender fills the recipient's inbox. Mitigate with
  a per-sender quota + a per-address quota.
- **Privacy model.** TTL = 7 days by default, but "undelivered" messages are
  still an attack surface.

---

## Migration

New package; no breaking change to existing ones.
|
||||
127
docs/archive/V3.7.md
Normal file
@@ -0,0 +1,127 @@
|
||||
# Shade V3.7 — Transport Bridge (SSE / long-poll)
|
||||
|
||||
**Status:** Implemented
**Effort:** M (2–4 weeks)
**Previous:** V3.6
**Addresses:** V2.3 §3
**Deliverable:** `@shade/transport-bridge` 0.1.0 + `createBridgeRoutes` in
`@shade/inbox-server`. User guide: [`docs/transport.md`](../transport.md).

---

## Goal

Apps that cannot or will not use WebSocket (strict proxies,
browser extensions, edge environments) get a **ready-made pattern** for receiving
small messages and control signals. SSE is the primary fallback, long-poll the
secondary.

---

## Scope

### In

- `@shade/transport-bridge`: new submodule in `@shade/transport` (or a separate
  package).
- SSE endpoint in `@shade/server` (combined with the inbox from V3.6 for "fetch
  from the inbox without plaintext").
- Long-poll fallback with a configurable timeout.
- Shared `IncomingMessage` model: application code does not need to know about
  the transport.
- Auto-fallback: WS → SSE → long-poll (same pattern as the transfer transport).

### Out

- HTTP/2 push.
- WebTransport: browser support is still immature in 2026.

---

## Design

### Shared type

```ts
interface IncomingMessage {
  from: string;
  bytes: Uint8Array;
  receivedAt: number;
}

interface BridgeTransport {
  connect(opts: { onMessage(msg: IncomingMessage): void }): Promise<void>;
  disconnect(): Promise<void>;
}
```

### SSE

- Endpoint: `GET /v1/bridge/stream` with `Last-Event-ID` for cursor resume.
- Server side: emits an `envelope-ready` event when the inbox receives a new envelope.
- The client opens a single EventSource and reconnects on drop.

### Long-poll

- Endpoint: `GET /v1/bridge/poll?since=:cursor` blocks until a message is ready
  or a 25 s timeout (below typical proxy cutoffs).
- The client repeats.

### Fallback

- `FallbackBridgeTransport([WsBridge, SseBridge, LongPollBridge])` tries them in
  order.
|
||||
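A try-in-order fallback over the shared `BridgeTransport` interface might be sketched as below. The class is named `FallbackBridge` here to make clear it is an illustration and not the shipped `FallbackBridgeTransport`; it takes already-constructed transport instances, which is an assumption, and it only covers the initial connect, not mid-session failover.

```typescript
interface IncomingMessage {
  from: string;
  bytes: Uint8Array;
  receivedAt: number;
}

interface BridgeTransport {
  connect(opts: { onMessage(msg: IncomingMessage): void }): Promise<void>;
  disconnect(): Promise<void>;
}

class FallbackBridge implements BridgeTransport {
  private readonly candidates: BridgeTransport[];
  private active: BridgeTransport | undefined;

  constructor(candidates: BridgeTransport[]) {
    this.candidates = candidates;
  }

  // Try each transport in order and keep the first one that connects.
  async connect(opts: { onMessage(msg: IncomingMessage): void }): Promise<void> {
    const errors: unknown[] = [];
    for (const candidate of this.candidates) {
      try {
        await candidate.connect(opts);
        this.active = candidate;
        return;
      } catch (err) {
        errors.push(err); // remember why this transport was skipped
      }
    }
    throw new Error(`all ${errors.length} bridge transports failed to connect`);
  }

  async disconnect(): Promise<void> {
    const current = this.active;
    this.active = undefined;
    if (current) await current.disconnect();
  }
}
```

Since every candidate receives the same `onMessage` callback, application code sees one `IncomingMessage` stream regardless of which transport ended up carrying it.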
|
||||
---
|
||||
|
||||
## Deliverables

### Code

- `@shade/transport-bridge` with `WsBridge`, `SseBridge`, `LongPollBridge`,
  `FallbackBridgeTransport`.
- Server: SSE and long-poll routes on `@shade/server` or
  `@shade/inbox-server`.

### Tests

- Unit: each bridge opens/closes correctly; reconnect on drop.
- Integration: WS down → falls back to SSE; SSE 502 → long-poll.
- Same `IncomingMessage` shape out of all three.

### Documentation

- `docs/transport.md` extended with a bridge overview.

---

## Acceptance criteria

- [x] The same "send 100 small messages" test suite passes on all three
      transports.
- [x] A client that starts with WS and is blocked by a proxy continues
      automatically via SSE without message loss.
- [x] The long-poll fallback uses no more than one outstanding request per
      client.

---

## Dependencies

- V3.6: a natural complement; the SSE payload is typically "envelope is ready in
  the inbox".

---

## Risk

- **Reconnect cycles.** A flapping SSE connection can lose messages. Mitigate with
  Last-Event-ID + having the server keep a short buffer.
- **Long-poll keepalive.** Proxy timeouts can cut the connection before 30 s; adjust
  the default to 25 s.

---

## Migration

Additive.
|
||||
117
docs/archive/V3.8.md
Normal file
@@ -0,0 +1,117 @@
|
||||
# Shade V3.8 — Web Workers Crypto
|
||||
|
||||
**Status:** Done
**Effort:** M-L (3–6 weeks)
**Previous:** V3.1
**Addresses:** V2.2 §4
**Delivered:** `0.4.0`
**Consumer documentation:** [`docs/web-workers.md`](../web-workers.md)

---

## Goal

Large files in the browser must be encryptable / decryptable without blocking
the main thread or blowing up RAM. A dedicated Worker runs `@shade/crypto-web` +
`@shade/streams`, connected to `@shade/transfer` via `ReadableStream` /
`WritableStream`.

---

## Scope

### In

- New entry: `@shade/crypto-web/worker`, a dedicated Web Worker with
  `WorkerCryptoProvider`.
- Main-thread proxy: `MainThreadCryptoProvider`, which forwards calls to the Worker.
- Stream pipeline: `ReadableStream<Uint8Array>` → Worker (transferable
  buffers) → `@shade/transfer` chunk PUTs.
- Lifecycle: spawn-on-demand, idle-timeout, terminate-on-rotate.
- Safari-aware chunk sizing (Safari has a lower `postMessage` capacity).

### Out

- Service Workers (background sync): to be assessed separately.
- SharedArrayBuffer (requires COOP/COEP headers; optional opt-in).

---

## Design

### Provider API (unchanged for consumers)

```ts
const crypto = await createWorkerCryptoProvider({
  workerUrl: '/shade-crypto.worker.js',
});
const shade = await createShade({ crypto, ... });
```

`WorkerCryptoProvider` implements the same `CryptoProvider` interface as
`SubtleCryptoProvider`. Calls are serialized with transferable `ArrayBuffer`s so
memory is not copied.

### Stream pipeline

```ts
file.stream()
  .pipeThrough(shade.encryptStream(peer))         // worker
  .pipeThrough(shade.transfer.outboundChunks())   // main → http
  .pipeTo(transferSink());
```

The Worker side of `encryptStream` uses `MultiLaneSender`.
|
||||
|
||||
---
|
||||
## Deliverables

### Code

- `@shade/crypto-web`: new `worker.ts` entry point.
- `@shade/sdk`: `shade.encryptStream` / `decryptStream`.
- Bundler examples for Vite, Webpack, and Rollup.

### Tests

- Unit: postMessage round trip with a transferable buffer.
- Integration: 100 MB file in the browser with no frame drop > 16 ms (P99).
- Safari: chunked `postMessage` workaround.

### Documentation

- `docs/web-workers.md`: setup, bundler quirks, Safari notes, COOP/COEP
  for SharedArrayBuffer mode.
|
||||
|
||||
---
|
||||
## Acceptance criteria

- [x] 100 MB upload in Chrome without blocking the main thread for > 16 ms at P99
  (Performance Observer measurement; verification recipe in
  [`docs/web-workers.md`](../web-workers.md#verifying-main-thread-budget)).
- [x] Safari works with the default chunk size (256 KiB postMessage budget,
  well below Safari's transferable limit).
- [x] The Worker is terminated within 30 s of last use
  (`idleTimeoutMs`, default `30_000`).
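
The spawn-on-demand and idle-timeout rules can be modelled as a small state machine. The sketch below is an assumption about how such a policy could look, not the shipped lifecycle code: `IdleTracker` is a hypothetical name, and the clock is passed in explicitly so the logic can be tested without real timers.

```typescript
// Hypothetical sketch of an idle-timeout policy with an injectable clock.
class IdleTracker {
  private lastUsedAt: number | null = null;
  private readonly idleTimeoutMs: number;

  constructor(idleTimeoutMs: number = 30_000) {
    this.idleTimeoutMs = idleTimeoutMs;
  }

  /** Record a use; returns true if a new Worker would have to be spawned. */
  touch(nowMs: number): boolean {
    const mustSpawn = this.lastUsedAt === null;
    this.lastUsedAt = nowMs;
    return mustSpawn;
  }

  /** Returns true once when the idle window has elapsed; the caller terminates the Worker. */
  shouldTerminate(nowMs: number): boolean {
    if (this.lastUsedAt === null) return false;
    if (nowMs - this.lastUsedAt < this.idleTimeoutMs) return false;
    this.lastUsedAt = null; // the next touch() spawns on demand again
    return true;
  }
}
```

A real provider would call `shouldTerminate` from a timer and `touch` on every crypto call; both hooks are left out here.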
|
||||
|
||||
---
|
||||
## Dependencies

None directly. Can run in parallel with V3.2 / V3.4.

---

## Risks

- **Bundler hell.** Vite, Webpack, and Rollup handle Workers differently.
  Mitigate with a published recipe + integration tests per bundler.
- **Safari postMessage limits.** Test early.

---

## Migration

Opt-in. The default remains `SubtleCryptoProvider`.
|
||||
137
docs/archive/V3.9.md
Normal file
@@ -0,0 +1,137 @@
|
||||
Start the implementation, and do not stop until 100% of the plan is implemented, all tests are validated and green, and the documentation is updated.
|
||||
# Shade V3.9 — Rich File Metadata & Previews
|
||||
**Status:** Implemented (see `docs/streams.md` § Rich file metadata)
**Effort:** M
**Previous:** V3.1
**Addresses:** V2.2 §3
|
||||
|
||||
---
|
||||
## Goal

Richer file UX without leaking sensitive content to the server. Filename,
MIME type, total length, optional thumbnail: all **E2EE** or omitted.
Consumers (widgets, files RPC) can show a preview before the download completes.
|
||||
|
||||
---
|
||||
|
||||
## Scope
|
||||
### In scope

- Extend `stream-init` (control envelope) with optional fields:
  - `filename: string` (E2EE, opt-in).
  - `mimeType: string` (E2EE, opt-in).
  - `totalBytes: number` (always OK; byte counts are binned in observability).
  - `thumbnailHash: Uint8Array` (sha256 of the separate thumbnail stream).
- Thumbnail as a **separate stream** (not inline in init), encrypted with
  its own lane.
- Format hardening on the client: max size, sandbox in the UI.
- Widget support: `<TransferRow showThumbnail />`.

### Out of scope

- Server-side thumbnail generation (we encrypt on the client; the server
  never sees plaintext).
- Video preview: a separate matter; requires frame extraction and a sandbox.
|
||||
|
||||
---
|
||||
|
||||
## Design
|
||||
### Stream-init wire (actual implementation)

`fileMetadata` is now an opt-in field on `StreamMetadata`. Existing
fields are unchanged; older receivers ignore the field, so the change is
backwards compatible.

```jsonc
{
  "kind": "shade.stream-init/v1",
  "streamId": "...",
  "streamSecret": "...",
  "metadata": {
    "chunkSize": 1048576,
    "sentAt": 1730000000000,
    "userMetadata": { ... },          // existing (V0.3)
    "fileMetadata": {                 // NEW (V3.9)
      "filename": "report.pdf",
      "mimeType": "application/pdf",
      "thumbnailStreamId": "Ej1z...",
      "thumbnailHash": "9a7c...",
      "thumbnailMime": "image/webp",
      "thumbnailBytes": 18342
    }
  },
  "lanes": [ /* ... */ ]
}
```
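
A receiver that tolerates both old and new senders might read the field as sketched below. The field names follow the example above; `readFileMetadata` itself is a hypothetical helper, not part of `@shade/streams`, and its validation rules are assumptions.

```typescript
// Field names follow the stream-init example; the parser is hypothetical.
interface FileMetadata {
  filename?: string;
  mimeType?: string;
  thumbnailStreamId?: string;
  thumbnailHash?: string;
  thumbnailMime?: string;
  thumbnailBytes?: number;
}

const STRING_FIELDS = [
  "filename",
  "mimeType",
  "thumbnailStreamId",
  "thumbnailHash",
  "thumbnailMime",
] as const;

function readFileMetadata(metadata: Record<string, unknown>): FileMetadata | undefined {
  const raw = metadata["fileMetadata"];
  // Pre-V3.9 senders omit the field entirely; that is not an error.
  if (typeof raw !== "object" || raw === null) return undefined;
  const src = raw as Record<string, unknown>;
  const out: FileMetadata = {};
  for (const key of STRING_FIELDS) {
    const value = src[key];
    if (typeof value === "string") out[key] = value;
  }
  const bytes = src["thumbnailBytes"];
  if (typeof bytes === "number" && Number.isInteger(bytes) && bytes >= 0) {
    out.thumbnailBytes = bytes;
  }
  // Unknown keys are dropped, so future optional fields do not break this reader.
  return out;
}
```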
|
||||
|
||||
### Thumbnail
|
||||
- The client generates a 256×256 JPEG/WebP/PNG (in browsers via
  `OffscreenCanvas` and `createImageBitmap`).
- It is encrypted as a **separate stream** with its own `streamId` (referenced
  from the main stream's `fileMetadata.thumbnailStreamId`). The symbolic
  convention `mainStreamId + ".thumb"` is a helper; the actual
  streamId is an independent 16-byte value.
- The receiver auto-accepts the thumbnail stream (marked by
  `userMetadata.shadeThumbnail = "1"`) into `ShadeThumbnailCache`,
  which verifies the sha256 against the declared hash before the widget renders.
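
The hash check in the last item could look roughly like this. It is a sketch under two assumptions that the source does not confirm: the declared hash is hex-encoded, and the code runs where `node:crypto` is available (Node or Bun). `thumbnailMatchesHash` is a hypothetical name, not the `ShadeThumbnailCache` API.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hypothetical check: compare sha256(thumbnail bytes) with the declared hex hash.
function thumbnailMatchesHash(thumbnail: Uint8Array, declaredHashHex: string): boolean {
  const actual = createHash("sha256").update(thumbnail).digest();
  const declared = Buffer.from(declaredHashHex, "hex");
  // A malformed or truncated declared hash decodes to the wrong length and never matches.
  if (declared.length !== actual.length) return false;
  return timingSafeEqual(actual, declared);
}
```

Only thumbnails for which this returns `true` would be handed to the renderer; anything else is dropped before it reaches the widget.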
|
||||
|
||||
---
|
||||
## Deliverables

### Code

- `@shade/streams`: extend the `StreamInitMessage` schema.
- `@shade/sdk`: `Shade.upload({ ..., generateThumbnail: true })`.
- `@shade/widgets`: `<TransferRow />` with a thumbnail prop.

### Tests

- Round trip: upload with thumbnail; the download shows the thumbnail before
  the main stream finishes.
- Backwards: a 0.3.x receiver gets the stream without a thumbnail and works.
- Format fuzzing: a malicious image file is not rendered without a sandbox.

### Documentation

- `docs/streams.md` extended.
- `docs/files.md`: reference the metadata extension.
|
||||
|
||||
---
|
||||
## Acceptance criteria

- [x] The thumbnail is delivered as a separate E2EE stream that arrives before
  the main stream completes (the sender ships the preview before the main stream).
- [x] An older client (without V3.9 support) receives the original stream
  without failing; covered by `streams-tests/file-metadata.test.ts` and
  `sdk-tests/thumbnail.test.ts` (legacy receiver).
- [x] The thumbnail is never visible in plaintext in the server DB; preview
  bytes ride on an independent AEAD stream, just like the main stream.
|
||||
|
||||
---
|
||||
## Dependencies

- V3.1: wire-format extensions documented.

---

## Risks

- **Thumbnail format attacks.** A malicious image file can compromise the
  preview renderer. Mitigate with a sandboxed iframe + max size + format allowlist.
- **UX pitfall.** "The receiver sees a preview before the send is finished"
  can leak that the sender is trying to send something specific before it is
  complete. Document this for high-stakes flows.

---

## Migration

Backwards compatible: all new fields are optional.
|
||||
123
docs/archive/V4.0.md
Normal file
@@ -0,0 +1,123 @@
|
||||
# Shade V4.0 — External Audit, Consolidation, GA
|
||||
**Status:** Done, tagged as 4.0.0 (2026-05-03)
**Effort:** M (audit-driven)
**Previous:** V3.1 → V3.12, all merged
**Addresses:** V2.1 §6 + overall GA

> **Scope note:** Voice/Video and all VoIP/streaming functionality
> has been moved to [V5.0](../V5.0.md). 4.0 GA freezes the core stack
> (ratchet, transport, P2P, recovery, KT) and is externally reviewed
> *without* any real-time protocol in scope. This lets us audit one thing
> at a time; voice/video frame keys get their own review in the 5.0 window.
|
||||
|
||||
---
|
||||
## Goal

Shade 4.0 is the **GA-tagged release** in which everything discussed in V2.1,
V2.2, V2.3, and the bonus track, *except* voice/video, is in `main`, tested,
documented, and reviewed. This is the consolidation phase, not new feature work.
The real-time layer (voice, video, broadcast) lives in V5.0 and is developed
on top of the locked 4.0 stack.
|
||||
|
||||
---
|
||||
|
||||
## Scope
|
||||
### In scope

- **External crypto review** of:
  - Core (X3DH + ratchet + sender keys).
  - Wire 0x02 + streams 0x11.
  - Storage encryption (V3.2).
  - Recovery (V3.10).
  - WebRTC P2P transport binding (V3.11).
  - Key transparency (V3.12, if implemented).
  - *(Voice/Video frame keys are reviewed separately in the V5.0 window.)*
- **Migration guide** 0.3.x → 4.0: every wire bump, schema change, and
  opt-in flag documented.
- **Soak testing**: run all packages in combined stress tests for 2+
  weeks.
- **Cross-platform parity confirmed**: TS + Kotlin green on all
  vector tests.
- **Documentation pass**: README and all of docs/ revised for the 4.0 narrative.
- **Release notes + announcement post.**

### Out of scope

- New crypto.
- New packages.
- New wire-format bump (we reset here; the next one comes in 4.1+).
|
||||
|
||||
---
|
||||
|
||||
## Pre-flight checklist
|
||||
- [ ] V3.1 → V3.12 all merged.
- [ ] No open critical or high-severity security issues.
- [ ] All test vectors green on TS + Kotlin.
- [ ] Production checklist (V3.1) tested by at least one real deploy.
- [ ] OpenAPI covers all HTTP surfaces.
- [ ] Threat model reflects everything new (excluding real-time, which is V5.0).
- [ ] Existing 0.3.x → 4.0 migration CLI tested on a real DB.
|
||||
|
||||
---
|
||||
|
||||
## Crypto-review-prep
|
||||
Preparation for the external reviewer:

1. **"Review bundle" package**: one PR with:
   - Links to all protocol spec files.
   - The threat model.
   - Assumptions and known limitations.
   - Reproducible build instructions.
2. **Scope document**: which parts the reviewer looks at (ratchet yes,
   build system no).
3. **Contact process**: how to report findings.
4. **Timeline**: typically a 4–8 week review window.

Recommended scope prioritization:

- **A:** ratchet, X3DH, storage encryption, recovery (core protocol).
- **B:** WebRTC P2P transport binding, KT log (if implemented).
- **C:** transport layer, observability (lower risk).
- *(Frame keys are not in 4.0 scope; they are reviewed when V5.0 lands.)*
|
||||
|
||||
---
|
||||
## Acceptance criteria

- [ ] External review with no open critical/high-severity findings.
- [ ] Migration guide used successfully on at least one real 0.3.x deploy.
- [ ] Cross-platform parity verified in CI.
- [ ] All `docs/V*.md` archived under `docs/archive/` with "DONE" status.
- [ ] CHANGELOG.md has a 4.0 section.
- [ ] Version bumped, all packages published to the Gitea registry.
- [ ] Docker image `gt.zyon.no/stian/shade-prekey:4.0.0` published.
|
||||
|
||||
---
|
||||
## After 4.0

The V4.x series starts cautiously: bug fixes, small features, no wire bump
without a 5.0 window.

**[V5.0](../V5.0.md)** is earmarked for real-time: voice (`@shade/voice`),
video (`@shade/video`), 1:N broadcast (`@shade/broadcast`), all built
on top of the locked 4.0 stack with SFrame frame keys derived from the
ratchet session. V5.0 gets its own external review of the frame-key
part before release.

Further out: federation, multi-tenancy, SDKs for new languages (Swift,
Rust), and an MLS transition for groups are all open candidates for V6.0+.

---

## Risks

- **Audit findings.** May require new implementation work at the last minute.
  Mitigate with early review prep and by prioritizing A-scope first.
- **Scope creep.** "Just one more thing": V4.0 is locked to consolidation.
  New features = V4.1+.
|
||||
143
docs/audit/REVIEW-BUNDLE.md
Normal file
@@ -0,0 +1,143 @@
|
||||
# Shade 4.0 — External Crypto Review Bundle
|
||||
|
||||
This document is the entrypoint for an external cryptographic review of
|
||||
Shade 4.0. It collects, in one place, every artifact a reviewer needs to
|
||||
audit the protocol implementation **without** rooting around the
|
||||
codebase first.
|
||||
|
||||
## Tag under review
|
||||
|
||||
- **Version:** `4.0.0`
|
||||
- **Tag:** `v4.0.0`
|
||||
- **Date:** 2026-05-03
|
||||
- **Repo:** `https://gt.zyon.no/Stian/Shade` (mirror at the
|
||||
consumer-app repos that vendor this code)
|
||||
- **Out-of-scope:** Voice / Video / Broadcast — moved to V5.0 and
|
||||
reviewed separately.
|
||||
|
||||
## What's in scope
|
||||
|
||||
Reviewers focus on the protocol-cryptographic core. Each scope cell maps
|
||||
to one or more packages plus the spec / threat-model section that
|
||||
describes its design.
|
||||
|
||||
### A — Protocol core (highest priority)
|
||||
| Surface | Spec | Code |
|---------|------|------|
| X3DH initial key agreement | [`docs/archive/V3.1.md`](../archive/V3.1.md), [`THREAT-MODEL.md` §1, §2](../../THREAT-MODEL.md) | [`packages/shade-core/src/x3dh.ts`](../../packages/shade-core/src/x3dh.ts) |
| Double Ratchet | [`docs/archive/V3.1.md`](../archive/V3.1.md), [`THREAT-MODEL.md` §3](../../THREAT-MODEL.md) | [`packages/shade-core/src/ratchet.ts`](../../packages/shade-core/src/ratchet.ts) |
| Sender keys (group ratchet) | [`docs/archive/V3.10.md` § Group send](../archive/V3.10.md) | [`packages/shade-core/src/sender-keys.ts`](../../packages/shade-core/src/sender-keys.ts) |
| Wire envelopes `0x01`, `0x02`, `0x11` | [`packages/shade-proto/README.md`](../../packages/shade-proto/README.md) | [`packages/shade-proto/src/`](../../packages/shade-proto/src/) |
| At-rest storage encryption | [`docs/storage-encryption.md`](../storage-encryption.md), [`THREAT-MODEL.md` §4](../../THREAT-MODEL.md) | [`packages/shade-storage-encrypted/src/`](../../packages/shade-storage-encrypted/src/) |
| Social recovery (Shamir + AEAD-gated reconstruction) | [`docs/recovery.md`](../recovery.md), [`THREAT-MODEL.md` §8](../../THREAT-MODEL.md) | [`packages/shade-recovery/src/`](../../packages/shade-recovery/src/) |
|
||||
|
||||
### B — Trust + transport
|
||||
| Surface | Spec | Code |
|---------|------|------|
| WebRTC P2P transport binding | [`docs/webrtc.md`](../webrtc.md), [`THREAT-MODEL.md` §11](../../THREAT-MODEL.md) | [`packages/shade-transport-webrtc/src/`](../../packages/shade-transport-webrtc/src/) |
| Key Transparency log + verifier | [`docs/key-transparency.md`](../key-transparency.md), [`docs/archive/V3.12-DESIGN.md`](../archive/V3.12-DESIGN.md), [`THREAT-MODEL.md` §2 (mitigated-by-V3.12)](../../THREAT-MODEL.md) | [`packages/shade-key-transparency/src/`](../../packages/shade-key-transparency/src/) |
| Fingerprint gates | [`docs/trust-ux.md`](../trust-ux.md), [`THREAT-MODEL.md` §10](../../THREAT-MODEL.md) | [`packages/shade-sdk/src/fingerprint-gates.ts`](../../packages/shade-sdk/src/fingerprint-gates.ts) |
|
||||
|
||||
### C — Lower-priority surfaces
|
||||
| Surface | Spec | Code |
|---------|------|------|
| Inbox store-and-forward | [`docs/inbox.md`](../inbox.md), [`THREAT-MODEL.md` §6](../../THREAT-MODEL.md) | [`packages/shade-inbox-server/src/`](../../packages/shade-inbox-server/src/), [`packages/shade-inbox/src/`](../../packages/shade-inbox/src/) |
| Bridge transports (SSE / long-poll / WS) | [`docs/transport.md`](../transport.md) | [`packages/shade-transport-bridge/src/`](../../packages/shade-transport-bridge/src/) |
| Web Workers crypto | [`docs/web-workers.md`](../web-workers.md), [`THREAT-MODEL.md` §12](../../THREAT-MODEL.md) | [`packages/shade-crypto-web/src/worker*`](../../packages/shade-crypto-web/src/) |
| Files RPC | [`docs/files.md`](../files.md) | [`packages/shade-files/src/`](../../packages/shade-files/src/) |
| Streams (chunked AEAD over ratchet) | [`docs/streams.md`](../streams.md) | [`packages/shade-streams/src/`](../../packages/shade-streams/src/), [`packages/shade-transfer/src/`](../../packages/shade-transfer/src/) |
| Observability | [`docs/observability.md`](../observability.md) | [`packages/shade-observability/src/`](../../packages/shade-observability/src/) |
|
||||
|
||||
## Threat model
|
||||
|
||||
The full threat model is at [`THREAT-MODEL.md`](../../THREAT-MODEL.md).
|
||||
Every numbered "Mitigations" entry ends with a `[tests:]` footnote
|
||||
linking to the file(s) that holds the mitigation in place. Reviewers
|
||||
can re-run any individual test in isolation:
|
||||
|
||||
```bash
|
||||
bun test packages/shade-core/tests/ratchet.test.ts
|
||||
bun test packages/shade-streams/tests/aead.test.ts
|
||||
bun test packages/shade-key-transparency/tests/manager.test.ts
|
||||
```
|
||||
|
||||
## Cross-platform parity
|
||||
|
||||
The wire format and KDF-label corpus are byte-identical between TS
|
||||
(bun) and Kotlin (gradle). The CI gate that enforces this lives at
|
||||
[`.gitea/workflows/cross-vectors.yml`](../../.gitea/workflows/cross-vectors.yml).
|
||||
Vectors are generated by [`scripts/generate-vectors.ts`](../../scripts/generate-vectors.ts);
|
||||
hand-edits to [`test-vectors/`](../../test-vectors/) are rejected by CI.
|
||||
|
||||
```bash
|
||||
# Re-run the cross-platform vector suite locally:
|
||||
bun run test:vectors
|
||||
cd android && ./gradlew :shade-android:test
|
||||
```
|
||||
|
||||
## Build instructions (reproducible)
|
||||
|
||||
```bash
|
||||
git clone https://gt.zyon.no/Stian/Shade
|
||||
cd Shade
|
||||
git checkout v4.0.0
|
||||
bun install --frozen-lockfile
|
||||
|
||||
# TS suite
|
||||
bun test
|
||||
|
||||
# Kotlin / vector suite
|
||||
cd android && ./gradlew :shade-android:test
|
||||
```
|
||||
|
||||
Container image (prekey + transfer + bridge + KT):
|
||||
|
||||
```bash
|
||||
docker pull gt.zyon.no/stian/shade-prekey:4.0.0
|
||||
docker run --rm -p 3900:3900 \
|
||||
-e SHADE_PREKEY_PG_URL=postgres://… \
|
||||
gt.zyon.no/stian/shade-prekey:4.0.0
|
||||
```
|
||||
|
||||
The `Dockerfile` is at [`packages/shade-server/Dockerfile`](../../packages/shade-server/Dockerfile).
|
||||
Multi-stage; the runtime stage uses a non-root user.
|
||||
|
||||
## Assumptions and known limitations
|
||||
|
||||
1. The runtime is honest. A malicious Bun / browser engine can defeat
|
||||
any JS library; we ride the platform's `SubtleCrypto` / `@noble/curves`
|
||||
for primitives and trust them.
|
||||
2. `THREAT-MODEL.md` section "Assumptions" is the canonical list; review
|
||||
the residual-risks table at the bottom of the same file for
|
||||
intentional gaps.
|
||||
3. We do **not** claim resistance to power-analysis or fault-injection
|
||||
side channels.
|
||||
4. Memory zeroization is best-effort. V8 / JSC may retain freed buffers;
|
||||
we zero what we can synchronously reach.
|
||||
|
||||
## How to report findings
|
||||
|
||||
- **Severity-prioritized** (CVSS 3.1 if you can, otherwise plain
|
||||
language).
|
||||
- **Reproducer in repo style** — a failing `bun test` is preferred over
|
||||
prose.
|
||||
- **Email** the maintainer (`Sterister@live.no`); see
|
||||
[`SECURITY.md`](../../SECURITY.md) for PGP / age key arrangement.
|
||||
|
||||
## Timeline
|
||||
|
||||
The 4.0 audit window is open immediately after tag. We aim for a
|
||||
4–8-week review cycle (see V4.0 plan). Any **critical** or **high**
|
||||
severity finding pauses the GA-stable announcement until the fix
|
||||
ships. Findings ship as `4.0.x` patch releases — wire-format unchanged.
|
||||
|
||||
## Out-of-scope (deferred to V5.0)
|
||||
|
||||
- Voice (`@shade/voice`) — SFrame-style frame keys, key-rotation policies.
|
||||
- Video (`@shade/video`) — codec edges (AV1/VP9/H.264).
|
||||
- Broadcast (`@shade/broadcast`) — relay-helper threat model.
|
||||
|
||||
These will get their own review window when V5.0 is ready.
|
||||
75
docs/audit/SCOPE.md
Normal file
@@ -0,0 +1,75 @@
|
||||
# Shade 4.0 — Audit Scope
|
||||
|
||||
A short, structural list a reviewer can scan before opening a single
|
||||
file. Everything here is a pointer to the deeper material in
|
||||
[`REVIEW-BUNDLE.md`](./REVIEW-BUNDLE.md) and the package READMEs.
|
||||
|
||||
## In scope
|
||||
|
||||
- **Protocol primitives**: X3DH, Double Ratchet, sender keys.
|
||||
- **Wire format**: `0x01` PreKeyMessage, `0x02` RatchetMessage, `0x11`
|
||||
StreamChunk. Length prefixes (u32) and AAD bindings.
|
||||
- **Storage encryption** (`@shade/storage-encrypted`): KDF chain,
|
||||
per-(table,column) DEKs, AEAD AAD layout, online re-key.
|
||||
- **Recovery** (`@shade/recovery`): Shamir over GF(2^8),
|
||||
AEAD-authenticated reconstruction, fingerprint gate on guardian
|
||||
release, share-grant / share-decline envelope schema.
|
||||
- **WebRTC P2P** (`@shade/transport-webrtc`): SDP/ICE signaling rides
|
||||
the ratchet; chunk frames AEAD-bound to streamId/laneId/seq; glare
|
||||
resolution determinism.
|
||||
- **Key Transparency** (`@shade/key-transparency`): Merkle log over
|
||||
pre-hashed leaves, address-sorted index, signed STH, witness
|
||||
cross-check, split-view detection.
|
||||
- **Inbox** (`@shade/inbox-server`): TOFU registration, per-PUT signed
|
||||
blobs, idempotent on `(address, msgId)`, replay window.
|
||||
- **Bridge** (`@shade/transport-bridge`): SSE / long-poll / WS
|
||||
carriers; signed-query auth (no headers on `EventSource`).
|
||||
- **Crypto in workers** (`@shade/crypto-web/worker`): key-isolation
|
||||
boundary, postMessage protocol, idle terminate.
|
||||
- **Trust UX gates** (`@shade/sdk` `Shade.beforeFirstLargeFile`,
|
||||
`beforeBackupImport`, `beforeNewDeviceTrust`).
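
For the "glare resolution determinism" item, the property a reviewer would check is that both peers reach the same decision from the same two addresses. The sketch below is an assumption about how that could be expressed: `keepsOwnOffer` is a hypothetical name, and which side wins (here, the lexicographically smaller address) is not stated by the source.

```typescript
// Hypothetical helper: decide which peer keeps its offer when both offer at once.
// Plain code-unit comparison (not localeCompare) gives the same answer on every platform.
function keepsOwnOffer(localAddress: string, remoteAddress: string): boolean {
  if (localAddress === remoteAddress) {
    throw new Error("cannot resolve glare between identical addresses");
  }
  return localAddress < remoteAddress;
}
```

Exactly one of `keepsOwnOffer(a, b)` and `keepsOwnOffer(b, a)` is true for distinct addresses, which is the determinism property in question.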
|
||||
|
||||
## Out of scope
|
||||
|
||||
- **Voice / Video / Broadcast** (`@shade/voice` etc.) — V5.0; reviewed
|
||||
when the package ships.
|
||||
- **Build system** (Vite, Rollup, Gradle wiring) — out of crypto scope.
|
||||
- **App-level UI** (`@shade/widgets`) — re-renders the primitives
|
||||
above; the cryptographic decisions are in the SDK / core packages
|
||||
the widgets consume.
|
||||
- **Browser / native WebRTC stacks** — we ride the platform's
|
||||
`RTCPeerConnection` and `SubtleCrypto`.
|
||||
- **Operating system / hardware threat model** — filesystem
|
||||
encryption, secure-enclave key storage, swap-encryption, coredump
|
||||
handling. Operator responsibility.
|
||||
|
||||
## Methodology suggestions
|
||||
1. Start with [`THREAT-MODEL.md`](../../THREAT-MODEL.md): every entry
   has a `[tests:]` footnote. Run each test and confirm it passes; then
   toggle the corresponding mitigation off and confirm the test fails.
|
||||
2. Re-derive every KDF label from the spec; check
|
||||
[`scripts/generate-vectors.ts`](../../scripts/generate-vectors.ts) and
|
||||
the recorded vectors in [`test-vectors/`](../../test-vectors/) match.
|
||||
3. Run the cross-platform suite on **both** TS (bun) and Kotlin
|
||||
(gradle) — divergence is a vector-format bug.
|
||||
4. Audit the AEAD AAD construction at every layer:
|
||||
- Ratchet: header bytes (counter + DH pub) → AES-GCM AAD.
|
||||
- Streams: `streamId || laneId || seq || isLast` → AES-GCM AAD.
|
||||
- Storage: `(table, column, pk)` → AES-GCM AAD.
|
||||
5. Trace the boundary between the worker-side crypto thread and the
|
||||
main thread — confirm that no handle to a wrapped DEK or a
|
||||
ratcheted chain key crosses over.
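
When auditing the streams AAD in step 4, the main thing to check is that the concatenation is unambiguous, which fixed-width fields guarantee. The sketch below shows one such encoding. The field widths and byte order are assumptions for illustration only (16-byte `streamId`, u32 `laneId`, u32 `seq`, one flag byte, big-endian); the actual layout is defined by `@shade/streams`, and `buildStreamAad` is a hypothetical name.

```typescript
// Assumed layout (not confirmed by the source):
// streamId (16 bytes) || laneId (u32 BE) || seq (u32 BE) || isLast (1 byte)
function buildStreamAad(
  streamId: Uint8Array,
  laneId: number,
  seq: number,
  isLast: boolean,
): Uint8Array {
  if (streamId.length !== 16) throw new RangeError("streamId must be 16 bytes");
  for (const n of [laneId, seq]) {
    if (!Number.isInteger(n) || n < 0 || n > 0xffffffff) {
      throw new RangeError("laneId and seq must fit in a u32");
    }
  }
  const aad = new Uint8Array(16 + 4 + 4 + 1);
  const view = new DataView(aad.buffer);
  aad.set(streamId, 0);
  view.setUint32(16, laneId, false); // false = big-endian
  view.setUint32(20, seq, false);
  aad[24] = isLast ? 1 : 0;
  return aad;
}
```

With variable-width fields and no length prefixes, two different tuples could serialize to the same bytes, so that is the case worth looking for in the real code.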
|
||||
|
||||
## Open questions for reviewer commentary
|
||||
|
||||
- The witness gossip channel for V3.12 is currently in-band over the
|
||||
ratchet; should we cross-pin against an out-of-band log mirror in
|
||||
4.x, or wait for a federated relay tier?
|
||||
- WebRTC peer-glare is resolved by lexicographic address compare — a
|
||||
reviewer could confirm the equivalent constructions in libsignal or
|
||||
Matrix and flag if our edge cases match.
|
||||
- Storage encryption uses AES-GCM with a per-row IV. The IV is
|
||||
random, not deterministic; reviewers should confirm the
|
||||
combinatorial-collision threshold matches the per-column row count
|
||||
bounds.
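
For the random-IV question, the usual birthday approximation gives a quick sanity check: with n rows under one key and a b-bit IV, the collision probability is roughly n(n-1) / 2^(b+1). The sketch below assumes a 96-bit IV, the common AES-GCM size; the source does not state the IV length, and `ivCollisionProbability` is a hypothetical helper.

```typescript
// Birthday-bound approximation, valid while the result is much smaller than 1.
// ivBits = 96 is an assumption; the source does not state the IV size.
function ivCollisionProbability(rows: number, ivBits: number = 96): number {
  if (rows < 0 || ivBits <= 0) throw new RangeError("invalid arguments");
  return (rows * (rows - 1)) / 2 / 2 ** ivBits;
}

// 2^32 rows under one key gives roughly 2^-33.
const atFourBillionRows = ivCollisionProbability(2 ** 32);
```

Because DEKs are per (table, column), the relevant n is the row count of a single column between re-keys, not the size of the whole database.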
|
||||
189
docs/cross-platform.md
Normal file
@@ -0,0 +1,189 @@
|
||||
# Cross-platform parity — adding & running vectors
|
||||
|
||||
Shade keeps its TypeScript and Kotlin implementations in lock-step via a
|
||||
**single source of truth**: `test-vectors/*.json`. Both runners load the
|
||||
same files and verify their native code produces byte-identical output.
|
||||
|
||||
This document covers:
|
||||
|
||||
1. How the parity gate works (CI)
|
||||
2. How to run vectors locally
|
||||
3. How to add a new vector
|
||||
|
||||
## How the gate works
|
||||
```
        ┌─────────────────────────────────┐
        │   scripts/generate-vectors.ts   │
        │  (TS reference implementation)  │
        └────────────────┬────────────────┘
                         │ writes
                         ▼
        ┌─────────────────────────────────┐
        │   test-vectors/*.json           │
        │   { version: 2, vectors: [...] }│
        └─────┬──────────────────┬────────┘
              │                  │
              │ loaded by        │ loaded by
              ▼                  ▼
┌───────────────────────────┐  ┌───────────────────────────┐
│ packages/shade-core/      │  │ android/shade-android/    │
│ tests/cross-platform-     │  │ src/test/kotlin/.../      │
│ vectors.test.ts           │  │ CrossPlatformVectorTest   │
│ (bun)                     │  │ (gradle JUnit4)           │
└───────────────────────────┘  └───────────────────────────┘
              │                  │
              └─────────┬────────┘
                        ▼
           both must pass before merge
      (.gitea/workflows/cross-vectors.yml)
```
|
||||
|
||||
The CI workflow has **two independent jobs** — `ts-vectors` and
|
||||
`kotlin-vectors`. Either failing blocks the merge. The TS job also runs
|
||||
`bun run vectors:gen` and fails if the result diverges from the committed
|
||||
files: vector commits must come from the generator, never hand edits.
|
||||
|
||||
Vector files have a `version` integer at the top. Bump
|
||||
`VECTOR_FILE_VERSION` in `scripts/generate-vectors.ts` whenever the
|
||||
**schema** of any vector file changes (not just the values). Both test
|
||||
suites assert the version matches their hard-coded expectation.
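
The version assertion can be pictured as follows. This is a hypothetical loader sketch; the real `loadVectors` helper in the test suite may read from disk and differ in shape. `EXPECTED_VECTOR_VERSION` is the constant named later in this document, and its value of 2 matches the file format shown above.

```typescript
// Hypothetical sketch; the real loadVectors helper may differ.
const EXPECTED_VECTOR_VERSION = 2;

interface VectorFile {
  version: number;
  vectors: Array<Record<string, string>>;
}

function parseVectorFile(json: string): VectorFile {
  const parsed = JSON.parse(json) as Partial<VectorFile>;
  // Refuse to read files written under a different schema version.
  if (parsed.version !== EXPECTED_VECTOR_VERSION) {
    throw new Error(
      `vector file version ${String(parsed.version)} != expected ${EXPECTED_VECTOR_VERSION}`,
    );
  }
  if (!Array.isArray(parsed.vectors)) {
    throw new Error("vector file has no `vectors` array");
  }
  return { version: parsed.version, vectors: parsed.vectors };
}
```

Failing hard on a mismatch is the point: a loader that silently accepted a newer schema could skip a field the other platform checks.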
|
||||
|
||||
## Running vectors locally
|
||||
|
||||
### TypeScript
|
||||
|
||||
```bash
|
||||
bun run test:vectors
|
||||
# under the hood:
|
||||
# bun test packages/shade-core/tests/cross-platform-vectors.test.ts
|
||||
```
|
||||
|
||||
### Kotlin (JVM, no Android SDK required)
|
||||
|
||||
```bash
|
||||
cd android
|
||||
./gradlew :shade-android:test
|
||||
```
|
||||
|
||||
Requires JDK 17. The wrapper downloads Gradle 8.10.2 on first run. Tink
|
||||
1.15.0 (JVM JAR) is pulled from Maven Central.
|
||||
|
||||
### Regenerating vectors
|
||||
|
||||
When the protocol changes (new wire field, new label, new derivation step)
|
||||
the TS reference is the source of truth. Edit `generate-vectors.ts`, then:
|
||||
|
||||
```bash
|
||||
bun run vectors:gen
|
||||
git diff test-vectors/ # eyeball the change
|
||||
bun run test:vectors # confirm TS still agrees
|
||||
cd android && ./gradlew :shade-android:test # confirm Kotlin still agrees
|
||||
```
|
||||
|
||||
If Kotlin disagrees, **fix Kotlin** — TS is canonical. If both agree but
|
||||
the diff is unintentional (e.g. you added a field by accident), revert
|
||||
the generator change.
|
||||
|
||||
## Adding a new vector
|
||||
|
||||
A new checkpoint has four pieces: generator code, schema, TS test,
|
||||
Kotlin test. All four must land in the same PR; otherwise the gate
|
||||
trips on the missing half.
|
||||
|
||||
### Step 1 — Add a generator function
|
||||
|
||||
In `scripts/generate-vectors.ts`, add a function that:
|
||||
|
||||
- Takes deterministic inputs (no randomness — fix every byte)
|
||||
- Computes the value via the TS reference primitives
|
||||
- Returns a `Vector[]` with a `description` per case + all inputs and outputs
|
||||
in hex
|
||||
|
||||
Example skeleton:
|
||||
```ts
async function generateMyVectors(): Promise<Vector[]> {
  const input = new Uint8Array(32).fill(0xab);
  const output = await someRefImpl(input);
  return [{
    description: 'My new checkpoint: known input → known output',
    input: hex(input),
    output: hex(output),
  }];
}
```
|
||||
|
||||
Wire it up in `main()`:
|
||||
|
||||
```ts
|
||||
['my-vectors.json', { vectors: await generateMyVectors() }],
|
||||
```
|
||||
|
||||
Run `bun run vectors:gen` → you should see `✓ my-vectors.json` and a new
|
||||
file appears under `test-vectors/`.
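
The examples in this document call `hex` and `fromHex` without defining them. The versions below are hypothetical stand-ins, not the repository's helpers. They always emit lowercase hex, which matters because both runners compare the strings byte-for-byte.

```typescript
// Hypothetical stand-ins for the hex / fromHex helpers used in the examples.
function hex(bytes: Uint8Array): string {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
}

function fromHex(s: string): Uint8Array {
  if (s.length % 2 !== 0 || !/^[0-9a-fA-F]*$/.test(s)) {
    throw new Error("invalid hex string");
  }
  const out = new Uint8Array(s.length / 2);
  for (let i = 0; i < out.length; i++) {
    out[i] = parseInt(s.slice(2 * i, 2 * i + 2), 16);
  }
  return out;
}
```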
|
||||
|
||||
### Step 2 — Add a TS test
|
||||
|
||||
In `packages/shade-core/tests/cross-platform-vectors.test.ts`:
|
||||
|
||||
```ts
|
||||
test('My vectors match', async () => {
|
||||
const { vectors } = loadVectors('my-vectors.json');
|
||||
for (const v of vectors) {
|
||||
const actual = await someRefImpl(fromHex(v.input));
|
||||
expect(hex(actual)).toBe(v.output);
|
||||
}
|
||||
});
|
||||
```
|
||||
|
||||
`loadVectors` already asserts the version field matches. If you're
|
||||
introducing a schema-breaking change, bump `EXPECTED_VECTOR_VERSION` and
|
||||
`VECTOR_FILE_VERSION` together.
|
||||
|
||||
### Step 3 — Add the Kotlin equivalent
|
||||
|
||||
In
|
||||
`android/shade-android/src/test/kotlin/no/zyon/shade/CrossPlatformVectorTest.kt`:
|
||||
|
||||
```kotlin
|
||||
@Test
|
||||
fun myVectorsMatch() {
|
||||
val vectors = loadVectors("my-vectors.json")
|
||||
for (i in 0 until vectors.length()) {
|
||||
val v = vectors.getJSONObject(i)
|
||||
val actual = someKotlinImpl(fromHex(v.getString("input")))
|
||||
assertEquals(v.getString("output"), hex(actual))
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
If the Kotlin port doesn't yet have `someKotlinImpl`, that's the implementation
|
||||
work the new vector is gating — write it and re-run the test until it passes.
|
||||
|
||||
### Step 4 — Verify the gate trips on divergence
|
||||
|
||||
Sanity check: temporarily flip a byte in your Kotlin port and run
|
||||
`./gradlew :shade-android:test`. The test should fail within 60 seconds
|
||||
(see `docs/V3.5.md` §Akseptansekriterier). Revert.
|
||||
|
||||
## Why a separate generator (vs. golden fixtures)?
|
||||
|
||||
Golden test fixtures rot — when the protocol changes, every test file
|
||||
that pinned a literal hex string needs updating, and it's easy to
|
||||
"update" Kotlin to match a stale TS-generated value. By centralising
|
||||
vector generation in one TS script, **the protocol changes in one
|
||||
place** (the reference impl + `generate-vectors.ts`), the file
|
||||
regenerates with one command, and any platform that drifts gets caught
|
||||
by the next CI run.
|
||||
|
||||
## Schema versioning
|
||||
|
||||
`{ "version": 2, "vectors": [...] }` is the file format. Bump the int
|
||||
when the **shape** of any vector changes (e.g. you add a field consumers
|
||||
must read). Both runners hard-code their expected version and refuse to
|
||||
parse mismatched files — this catches the case where a new vector field
|
||||
was added in TS but the Kotlin loader silently ignored it.
|
||||
|
||||
Schema changes go in the same PR as the bump + the matching loader
|
||||
update on both sides.
|
||||
@@ -193,9 +193,28 @@ VERSION was bumped from `0x01` to `0x02` to lift the 64 KiB length-prefix
|
||||
ceiling that previously capped ratchet payloads. **Sessions are
|
||||
incompatible across the bump**; both peers must run 0.3.0+.
|
||||
|
||||
## Rich file metadata + previews (V3.9)
|
||||
|
||||
`stream-init` carries optional E2EE `fileMetadata` (filename, MIME,
|
||||
thumbnail-stream pointer). `@shade/files` consumers see this on the
|
||||
incoming-transfer side and can render previews via `<TransferRow
|
||||
showThumbnail />`. The thumbnail itself rides as a separate AEAD
|
||||
stream — server never sees preview pixels in plaintext.
|
||||
|
||||
See [streams.md § Rich file metadata + previews](streams.md#rich-file-metadata--previews-v39)
|
||||
for the wire format, format-hardening rules, and renderer trust
|
||||
model. The pattern integrates seamlessly with `@shade/files`'s own
|
||||
write/read RPCs — pass `fileMetadata` in the underlying
|
||||
`shade.upload` and the same `ShadeThumbnailCache` powers previews
|
||||
across all transfer surfaces.
|
||||
|
||||
## Related modules
|
||||
|
||||
* `@shade/streams` — chunk encryption, lane key derivation. Indirect dep.
|
||||
* `@shade/transfer` — multi-lane transport with HTTP / WS fallback.
|
||||
* `@shade/transport-webrtc` (V3.11, optional) — direct P2P chunk
|
||||
delivery via `RTCDataChannel`; large `read`/`write` payloads
|
||||
automatically prefer WebRTC when both peers have called
|
||||
`shade.configureWebRTC()`.
|
||||
* `@shade/sdk` — `Shade.files` getter; `BackgroundHooks.onPruneFiles` for
|
||||
retention.
|
||||
|
||||
317
docs/inbox.md
Normal file
@@ -0,0 +1,317 @@
|
||||
# Shade Inbox — Async Store-and-Forward (V3.6)
|
||||
|
||||
A relay that holds **ciphertext blobs with TTL** so senders can deliver
|
||||
to recipients who happen to be offline. The relay never sees plaintext,
|
||||
never holds private keys, and never knows who is talking to whom in
|
||||
plaintext form (only addresses and bytes-per-blob).
|
||||
|
||||
This document covers:
|
||||
|
||||
- Setup (server side, single-binary)
|
||||
- Client integration (`@shade/inbox`)
|
||||
- Threat model — *what the relay actually sees*
|
||||
- Operational tuning (TTL, quotas, prune cadence)
|
||||
- Wire-level reference
|
||||
|
||||
---
|
||||
|
||||
## 1. Server setup
|
||||
|
||||
The inbox server is built into the same `@shade/server` standalone
|
||||
container that ships the prekey server, on the same port. Routes are
|
||||
namespaced under `/v1/inbox/*`.
|
||||
|
||||
### Docker (single binary, both services)
|
||||
|
||||
```bash
|
||||
docker run -d --name shade \
|
||||
-p 3900:3900 \
|
||||
-v shade-data:/data \
|
||||
-e SHADE_PREKEY_DB_PATH=/data/shade-prekeys.db \
|
||||
-e SHADE_INBOX_DB_PATH=/data/shade-inbox.db \
|
||||
-e SHADE_INBOX_PRUNE_INTERVAL_MINUTES=5 \
|
||||
ghcr.io/zyon-no/shade:latest
|
||||
```
|
||||
|
||||
### Postgres (multi-instance / shared infra)
|
||||
|
||||
```bash
|
||||
docker run -d --name shade \
|
||||
-p 3900:3900 \
|
||||
-e SHADE_PREKEY_PG_URL='postgres://shade:***@db/shade' \
|
||||
-e SHADE_INBOX_PG_URL='postgres://shade:***@db/shade' \
|
||||
ghcr.io/zyon-no/shade:latest
|
||||
```
|
||||
|
||||
Tables are auto-created (`shade_inbox_owners`, `shade_inbox_blobs`,
|
||||
sequence `shade_inbox_seq`). If you only set `SHADE_PREKEY_PG_URL`, the
|
||||
inbox falls back to the same database; set
|
||||
`SHADE_INBOX_PG_URL='-'` to disable that fallback and run the inbox
|
||||
in-memory (only useful for short-lived test deployments).
|
||||
|
||||
### Env vars
|
||||
|
||||
| Var                                  | Default                | Effect                              |
| ------------------------------------ | ---------------------- | ----------------------------------- |
| `SHADE_INBOX_DB_PATH`                | _(unset → memory)_     | SQLite file path                    |
| `SHADE_INBOX_PG_URL`                 | _(unset → falls back)_ | Postgres connection string          |
| `SHADE_INBOX_PRUNE_INTERVAL_MINUTES` | `5`                    | How often expired blobs are dropped |
|
||||
|
||||
### Embedding in your own Hono app
|
||||
|
||||
```ts
|
||||
import { Hono } from 'hono';
|
||||
import { SubtleCryptoProvider } from '@shade/crypto-web';
|
||||
import { createInboxRoutes, MemoryInboxStore } from '@shade/inbox-server';
|
||||
|
||||
const crypto = new SubtleCryptoProvider();
|
||||
const store = new MemoryInboxStore();
|
||||
|
||||
const app = new Hono();
|
||||
app.route('/', createInboxRoutes(store, crypto));
|
||||
|
||||
export default { port: 3901, fetch: app.fetch };
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 2. Client integration
|
||||
|
||||
`@shade/inbox` is the recipient/sender SDK. It composes on top of
|
||||
`@shade/sdk` — Shade still owns encryption + the ratchet; the inbox
|
||||
layer is just durable transport.
|
||||
|
||||
### Wiring
|
||||
|
||||
```ts
|
||||
import { Shade } from '@shade/sdk';
|
||||
import { Inbox } from '@shade/inbox';
|
||||
|
||||
const shade = new Shade(/* ... */);
|
||||
await shade.initialize();
|
||||
|
||||
// Lift the identity keys we already have.
|
||||
const identity = await shade.getManager().getIdentityKeyPair();
|
||||
|
||||
const inbox = new Inbox({
|
||||
baseUrl: 'https://inbox.example.com',
|
||||
ownAddress: shade.myAddress,
|
||||
crypto: shade.crypto,
|
||||
signingPrivateKey: identity.signingPrivateKey,
|
||||
signingPublicKey: identity.signingPublicKey,
|
||||
pollIntervalMs: 30_000,
|
||||
});
|
||||
|
||||
// Receive: hand each fetched blob to Shade.receive.
|
||||
inbox.onIncoming(async (raw) => {
|
||||
const envelope = decodeEnvelope(raw.ciphertext);
|
||||
// The inbox does not authenticate the sender — Shade.receive does,
|
||||
// by way of the recipient's session/ratchet/identity-pin.
|
||||
const senderAddress = /* derive from your own metadata channel */;
|
||||
await shade.receive(senderAddress, envelope);
|
||||
return senderAddress;
|
||||
});
|
||||
|
||||
inbox.start(); // registers + begins flush + poll loops
|
||||
|
||||
// Send: encrypt with Shade, hand the envelope to the inbox.
|
||||
const envelope = await shade.send('bob@example.com', 'hi');
|
||||
await inbox.send({ recipientAddress: 'bob@example.com', envelope });
|
||||
```
|
||||
|
||||
### Push-trigger hook
|
||||
|
||||
The inbox is *pull-based* — recipients only see new blobs when they
|
||||
poll. Most apps want a wake-up nudge when new content lands. Vendor it
|
||||
yourself (FCM / APNs / email / WebPush):
|
||||
|
||||
```ts
|
||||
inbox.onMessageQueued(async (recipient, msgId) => {
|
||||
await fcm.send(recipient, { kind: 'shade-inbox', msgId });
|
||||
});
|
||||
```
|
||||
|
||||
The recipient device wakes, runs `inbox.tick()`, and pulls the blob.
|
||||
|
||||
### Durable queue
|
||||
|
||||
The default in-memory queue is fine for short-lived processes. For a
|
||||
service that must survive restart, plug in your own `OutgoingQueueStore`
|
||||
backed by SQLite/Postgres/IndexedDB:
|
||||
|
||||
```ts
|
||||
const inbox = new Inbox({
|
||||
// …
|
||||
queueStore: new MyDurableQueueStore(),
|
||||
cursorStore: new MyDurableCursorStore(),
|
||||
});
|
||||
```
|
||||
|
||||
Same idea for the receive cursor — without persistence, every restart
|
||||
re-downloads everything currently within TTL.
|
||||
|
||||
### Errors
|
||||
|
||||
- **Decrypt failure** in your handler keeps the blob on the server (no
|
||||
ack). The next poll re-fetches it — useful when the ratchet temporarily
|
||||
rejects a message because of out-of-order delivery.
|
||||
- **`msgId/ciphertext` mismatch** is a relay-tampering canary. The Inbox
|
||||
client recomputes the hash and emits `inbox.message_decrypt_failed`
|
||||
*without* acking, so an operator can investigate before the blob
|
||||
silently expires.
|
||||
- **Network failure** on PUT keeps the entry in the local queue with an
|
||||
`attempts` counter; default cap is 10 retries before the entry is
|
||||
dropped (configurable via `maxAttempts`).
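The `msgId`/ciphertext canary above amounts to recomputing a hash and comparing it. The sketch below assumes `msgId` is the lowercase hex SHA-256 of the raw ciphertext bytes; `computeMsgId` and `msgIdMatches` are hypothetical helper names, not part of `@shade/inbox`, and `node:crypto` is used only to keep the example synchronous.

```typescript
import { createHash } from 'node:crypto';

// Lowercase hex SHA-256 over the raw ciphertext bytes.
function computeMsgId(ciphertext: Uint8Array): string {
  return createHash('sha256').update(ciphertext).digest('hex');
}

// True only when the relay-supplied msgId matches what we recompute locally.
function msgIdMatches(msgId: string, ciphertext: Uint8Array): boolean {
  return computeMsgId(ciphertext) === msgId;
}
```

A client following this pattern would skip the ack whenever `msgIdMatches` returns `false`, so the blob stays on the relay for inspection.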
|
||||
|
||||
---
|
||||
|
||||
## 3. Threat model — what the relay actually sees
|
||||
|
||||
| Knows                                    | Doesn't know                                          |
| ---------------------------------------- | ----------------------------------------------------- |
| Recipient address (path parameter)       | Recipient real identity (it's pseudonymous)           |
| Sender's per-PUT signing public key      | The mapping sender-pubkey → real identity             |
| Number of blobs queued for an address    | Plaintext content                                     |
| Approximate ciphertext size              | Sender-recipient pair beyond addresses and blob sizes |
| Per-blob TTL (in the row's `expires_at`) | The ratchet/X3DH state                                |
|
||||
|
||||
### Privacy posture
|
||||
|
||||
- **Sender-recipient graph leaks at the address-and-size level.** A passive
|
||||
observer of the relay (or its DB dump) can correlate sender pubkey ↔
|
||||
recipient address ↔ blob size. Mitigations:
|
||||
- Recipients can use **address hashes** instead of human-readable
|
||||
addresses (the address grammar accepts any `[a-zA-Z0-9][a-zA-Z0-9:_\-.]{0,255}`,
|
||||
so `sha256(real-address || salt)` works).
|
||||
- Senders can rotate their per-PUT signing key per session; the relay
|
||||
only verifies the signature and never persists the key.
|
||||
- **TTL leaks reachability.** A sender's PUT silently dropping after 7
|
||||
days is itself a signal. Operators can normalize TTLs (clamp every
|
||||
PUT to a fixed 7-day window) to flatten this.
|
||||
- **Operator can DoS a recipient** by deleting their queue. Mitigation:
|
||||
recipient ack happens *after* successful decrypt, so a malicious
|
||||
delete just forces re-send by the original sender.
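The address-hash mitigation above can be sketched as follows. This assumes the concatenation is over UTF-8 bytes and that the digest is hex-encoded; `hashedAddress` is a hypothetical helper, and the regular expression is the grammar quoted in the list, anchored to the whole string.

```typescript
import { createHash } from 'node:crypto';

// Address grammar quoted in the text, anchored so the whole string must match.
const ADDRESS_GRAMMAR = /^[a-zA-Z0-9][a-zA-Z0-9:_\-.]{0,255}$/;

// Derive an opaque relay address as hex(sha256(realAddress || salt)).
function hashedAddress(realAddress: string, salt: string): string {
  const digest = createHash('sha256').update(realAddress).update(salt).digest('hex');
  if (!ADDRESS_GRAMMAR.test(digest)) {
    throw new Error('derived address does not satisfy the address grammar');
  }
  return digest;
}
```

Both peers need the same salt to arrive at the same relay address, so the salt has to travel over whatever channel already carries the real address.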
|
||||
|
||||
### What the relay can NOT do
|
||||
|
||||
- **Read plaintext** — the ratchet/AEAD layers run client-side.
|
||||
- **Forge a sender** — every PUT is Ed25519-signed by the sender's
|
||||
per-PUT key; the relay rejects bad signatures with 401.
|
||||
- **Inject a foreign blob** — the recipient client recomputes
|
||||
`sha256(ciphertext)` and refuses anything that doesn't match the
|
||||
stored `msgId`.
|
||||
- **Replay an old PUT** — the signed `signedAt` field has a ±5-minute
|
||||
window (matches the prekey-server's policy); replays past that window
|
||||
return 409.
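The freshness rule behind the replay protection can be illustrated with a small check. This is a sketch under the assumption that `signedAt` is a Unix timestamp in milliseconds (as in the wire examples); `isSignedAtFresh` is a hypothetical name, and the relay's actual comparison may differ in detail.

```typescript
// ±5 minutes, per the replay-window rule described above.
const REPLAY_WINDOW_MS = 5 * 60 * 1000;

// Accept a request only if its signed timestamp is within the window around "now".
function isSignedAtFresh(
  signedAtMs: number,
  nowMs: number,
  windowMs: number = REPLAY_WINDOW_MS,
): boolean {
  return Number.isFinite(signedAtMs) && Math.abs(nowMs - signedAtMs) <= windowMs;
}
```

Because the window is symmetric, a client whose clock runs a few minutes fast is treated the same as one that runs slow.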
|
||||
|
||||
### Storage-DoS
|
||||
|
||||
`maxBlobBytes` (default 1 MiB) caps a single PUT.
|
||||
`maxBlobsPerAddress` (default 1000) caps the recipient's queue depth —
|
||||
PUTs past the cap return 400 with a structured `inbox.quota_rejected`
|
||||
event so operators can alert. Combine with per-IP rate limits at the
|
||||
edge (the built-in token bucket is in-memory and not multi-instance).
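The two caps can be pictured as a pre-store check. The sketch below uses the default values from the text and assumes 1 MiB means 1,048,576 bytes; `checkQuota`, `InboxLimits`, and the `reason` strings are hypothetical and do not mirror the server's internal types.

```typescript
interface InboxLimits {
  maxBlobBytes: number;
  maxBlobsPerAddress: number;
}

// Defaults as documented above.
const DEFAULT_LIMITS: InboxLimits = {
  maxBlobBytes: 1024 * 1024,
  maxBlobsPerAddress: 1000,
};

type QuotaDecision =
  | { ok: true }
  | { ok: false; status: 400; reason: 'blob_too_large' | 'address_quota_exceeded' };

// Decide whether a PUT may be stored, given the blob size and current queue depth.
function checkQuota(
  blobBytes: number,
  queuedForAddress: number,
  limits: InboxLimits = DEFAULT_LIMITS,
): QuotaDecision {
  if (blobBytes > limits.maxBlobBytes) {
    return { ok: false, status: 400, reason: 'blob_too_large' };
  }
  if (queuedForAddress >= limits.maxBlobsPerAddress) {
    return { ok: false, status: 400, reason: 'address_quota_exceeded' };
  }
  return { ok: true };
}
```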
|
||||
|
||||
---
|
||||
|
||||
## 4. Wire reference
|
||||
|
||||
All bodies are JSON. Multi-byte fields are base64-standard encoded.
|
||||
|
||||
### `POST /v1/inbox/register` (TOFU)
|
||||
|
||||
```json
|
||||
{
|
||||
"address": "bob",
|
||||
"signingKey": "<base64 Ed25519 public key>",
|
||||
"signedAt": 1716057600000,
|
||||
"signature": "<base64 Ed25519 signature over canonical body>"
|
||||
}
|
||||
```
|
||||
|
||||
- 200 — registered (or idempotent re-register with same key).
|
||||
- 401 — different key already owns this address, or signature failed.
|
||||
|
||||
### `POST /v1/inbox/:address` (PUT blob)
|
||||
|
||||
```json
|
||||
{
|
||||
"senderSigningKey": "<base64 sender Ed25519 public key>",
|
||||
"msgId": "<lowercase hex sha256(ciphertext)>",
|
||||
"ciphertext": "<base64 wire bytes from encodeEnvelope()>",
|
||||
"ttlSeconds": 604800,
|
||||
"signedAt": 1716057600000,
|
||||
"signature": "<base64 sender signature>"
|
||||
}
|
||||
```
|
||||
|
||||
- 200 with `{ msgId, receivedAt, idempotent: false }` — first store.
|
||||
- 200 with `idempotent: true` — duplicate PUT folded into the first row.
|
||||
- 400 — `msgId` mismatch, ciphertext too big, or address quota exceeded.
|
||||
- 401 — bad signature or stale `signedAt`.
|
||||
- 404 — recipient address never registered.
|
||||
|
||||
### `POST /v1/inbox/:address/fetch` (signed challenge)
|
||||
|
||||
```json
|
||||
{
|
||||
"address": "bob",
|
||||
"sinceCursor": 0,
|
||||
"signedAt": 1716057600000,
|
||||
"signature": "<base64 recipient signature>"
|
||||
}
|
||||
```
|
||||
|
||||
Returns:
|
||||
|
||||
```json
|
||||
{
|
||||
"blobs": [
|
||||
{
|
||||
"msgId": "<hex>",
|
||||
"ciphertext": "<base64>",
|
||||
"receivedAt": 1716057601234,
|
||||
"expiresAt": 1716662401234
|
||||
}
|
||||
],
|
||||
"cursor": 1716057601234,
|
||||
"hasMore": false
|
||||
}
|
||||
```
|
||||
|
||||
Pass the returned `cursor` as `sinceCursor` next time. Pages cap at
|
||||
`fetchPageLimit` (default 100); keep calling with the new cursor while
|
||||
`hasMore === true`.
|
||||
|
||||
### `DELETE /v1/inbox/:address/:msgId` (signed ack)
|
||||
|
||||
Body:
|
||||
|
||||
```json
|
||||
{
|
||||
"address": "bob",
|
||||
"msgId": "<hex>",
|
||||
"signedAt": 1716057600000,
|
||||
"signature": "<base64 recipient signature>"
|
||||
}
|
||||
```
|
||||
|
||||
- 200 with `{ ok: true }` — row removed.
|
||||
- 200 with `{ ok: false }` — row was already gone (also idempotent).
|
||||
- 401 — recipient signature failed.
|
||||
|
||||
### `DELETE /v1/inbox/register/:address`
|
||||
|
||||
Same auth shape as ack. Drops every queued blob.
|
||||
|
||||
---
|
||||
|
||||
## 5. Acceptance test mapping
|
||||
|
||||
| V3.6 spec criterion | Test |
| --- | --- |
| Async delivery without online overlap | `lifecycle.test.ts → "100 messages delivered…"` |
| DB-dump leaks no plaintext / sender-recipient graph | Server stores only `address \|\| msgId \|\| ct \|\| expires_at`; verified by `routes.test.ts` schema asserts |
| Replay PUT with same `msgId` is idempotent | `routes.test.ts → "idempotent on duplicate ciphertext"` |
| Restart preserves blobs | `lifecycle.test.ts → "persistence across restart"` + sqlite-store reopen |
| Bit-flip on stored ciphertext rejected on the client | `lifecycle.test.ts → "Tamper resistance"` + client `client.test.ts → "tamper detection"` |
|
||||
348
docs/key-transparency.md
Normal file
@@ -0,0 +1,348 @@
|
||||
# Key Transparency (V3.12)
|
||||
|
||||
> **Status:** v0.4.0+ — opt-in. Server runs unchanged when KT is off.
|
||||
> Clients ignore the proof fields when KT config is missing, so it is safe
> to roll out without a client update.
|
||||
|
||||
Shade's prekey server is the source of truth for which bundle is
published for each address. Without Key Transparency (KT), a malicious
or compromised server can swap out a bundle without anyone noticing.
With KT, every bundle that is served is **cryptographically committed**
in an append-only Merkle log that third-party witnesses can audit.

See also `docs/V3.12-DESIGN.md` for the design note covering the threat
model and the decision record.
|
||||
|
||||
---
|
||||
|
||||
## What KT guarantees

| Attack | Detected? |
|---|---|
| Server gives Bob the wrong bundle for `alice` | **Yes**: the inclusion proof does not match |
| Server gives Bob and Charlie different bundles for `alice` | **Yes**: witness gossip sees two STHs at the same `tree_size` |
| Server rewrites history to hide an earlier betrayal | **Yes**: the consistency proof fails |
| Server signs a "stale" STH to keep a time window open | **Yes**: the client rejects any STH older than `maxStaleMs` (default 24 h) |
| First-time impersonation of a brand-new address | **No**: KT only checks that the address is in the log, not that it belongs to the "right" person. Use V3.3 (fingerprint gate) + V3.10 (social recovery) for that. |
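
To give a feel for what an inclusion-proof check involves, the sketch below follows the generic RFC 6962 hashing rules (`0x00` prefix for leaves, `0x01` for interior nodes) and the audit-path walk from RFC 9162. It is not Shade's implementation: `verifyInclusion` and its helpers are hypothetical names, and the exact leaf encoding that Shade commits to the log is not shown here.

```typescript
import { createHash } from 'node:crypto';

function sha256(...parts: Uint8Array[]): Uint8Array {
  const h = createHash('sha256');
  for (const p of parts) h.update(p);
  return new Uint8Array(h.digest());
}

// RFC 6962 domain separation: 0x00 for leaves, 0x01 for interior nodes.
const leafHash = (leaf: Uint8Array): Uint8Array => sha256(new Uint8Array([0x00]), leaf);
const nodeHash = (l: Uint8Array, r: Uint8Array): Uint8Array => sha256(new Uint8Array([0x01]), l, r);

function equalBytes(a: Uint8Array, b: Uint8Array): boolean {
  return a.length === b.length && a.every((v, i) => v === b[i]);
}

// Recompute the root from a leaf and its audit path, then compare to the signed root.
function verifyInclusion(
  leafIndex: number,
  treeSize: number,
  leaf: Uint8Array,
  path: Uint8Array[],
  root: Uint8Array,
): boolean {
  if (!Number.isInteger(leafIndex) || !Number.isInteger(treeSize)) return false;
  if (leafIndex < 0 || leafIndex >= treeSize) return false;
  let fn = leafIndex;
  let sn = treeSize - 1;
  let r = leafHash(leaf);
  for (const p of path) {
    if (sn === 0) return false; // path is longer than the tree is deep
    if (fn % 2 === 1 || fn === sn) {
      r = nodeHash(p, r); // sibling is on the left
      // Skip levels where this node has no right sibling.
      while (fn % 2 === 0 && fn !== 0) {
        fn = Math.floor(fn / 2);
        sn = Math.floor(sn / 2);
      }
    } else {
      r = nodeHash(r, p); // sibling is on the right
    }
    fn = Math.floor(fn / 2);
    sn = Math.floor(sn / 2);
  }
  return sn === 0 && equalBytes(r, root);
}
```

If the server hands out a bundle that was never logged, no audit path can make the recomputed root equal the root in a validly signed STH, which is why the first row of the table is detectable.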
|
||||
|
||||
---
|
||||
|
||||
## Operator: enabling KT

KT is opt-in and requires:

1. **An Ed25519 signing keypair** for STH signing. This is the
   *operator's* key and must be protected like a code-signing key.
2. **A persistent KTLogStore.** In production: `PostgresKTLogStore`.
   In test/dev: `MemoryKTLogStore`.
3. **Clients pinning the same `logPublicKey`** out of band (typically
   by bundling it into `Shade.config` in the app).
|
||||
|
||||
### Generating the signing key
|
||||
|
||||
```sh
|
||||
bun run scripts/generate-kt-key.ts > kt-key.json
|
||||
```
|
||||
|
||||
(Or run it manually: `crypto.generateEd25519KeyPair()` in a Bun REPL.)
Store `privateKey` in the operator's secret store. Distribute `publicKey`
to clients together with the app config.
|
||||
|
||||
### Booting the server with KT
|
||||
|
||||
```ts
|
||||
import { createPrekeyServerWithKT } from '@shade/server';
|
||||
import { PostgresPrekeyStore, PostgresKTLogStore } from '@shade/storage-postgres';
|
||||
import { SubtleCryptoProvider } from '@shade/crypto-web';
|
||||
|
||||
const crypto = new SubtleCryptoProvider();
|
||||
|
||||
const prekeyStore = await PostgresPrekeyStore.create(process.env.DATABASE_URL!);
|
||||
const ktStore = await PostgresKTLogStore.create(process.env.DATABASE_URL!);
|
||||
|
||||
const { app, kt } = await createPrekeyServerWithKT({
|
||||
crypto,
|
||||
store: prekeyStore,
|
||||
keyTransparency: {
|
||||
store: ktStore,
|
||||
signingPrivateKey: loadFromSecret('SHADE_KT_SIGNING_PRIVATE_KEY'),
|
||||
signingPublicKey: loadFromSecret('SHADE_KT_SIGNING_PUBLIC_KEY'),
|
||||
heartbeatIntervalMs: 10 * 60 * 1000, // default; 0 = off
|
||||
},
|
||||
});
|
||||
|
||||
export default { port: 3900, fetch: app.fetch };
|
||||
```
|
||||
|
||||
When KT is on, these routes become available:

| Route | What it returns |
|---|---|
| `GET /v1/kt/log_id` | `{ logId, publicKey }` (both base64) |
| `GET /v1/kt/sth` | Latest signed tree head |
| `GET /v1/kt/sth/:treeSize` | Historical STH for a given tree_size |
| `GET /v1/kt/consistency?from=N1&to=N2` | Consistency proof N1 → N2 |

Bundle fetch (`GET /v1/keys/bundle/:address`) now carries a `ktProof`
field in the response.
|
||||
|
||||
### Migrating from non-KT

KT is backward compatible:

1. Enable the KT config on the server. Restart.
2. Existing clients ignore the proof fields (`ktProof`, `ktSth`).
3. As clients are upgraded with a KT config (`mode: 'observe'`),
   they start verifying.
4. Once the ecosystem is used to it, escalate clients to
   `'observe-strict'` to reject prekey-server responses without a proof.

On first boot the KT service does not automatically scan existing
prekey-store state into the log. **Re-registration** of existing
addresses (i.e. a `POST /v1/keys/register` round from each client) is
what backfills it. For a larger deployment, the operator should notify
users to re-register within a time window. Clients that do not
re-register will fail `observe-strict` fetches until they get a new key
from the peer.
|
||||
|
||||
---
|
||||
|
||||
## Client: enabling KT

```ts
import { createShade } from '@shade/sdk';

const shade = await createShade({
  prekeyServer: 'https://shade.example.com',
  address: 'alice',
  keyTransparency: {
    mode: 'observe-strict',                 // or 'observe'
    logPublicKey: KT_LOG_PUBLIC_KEY_BASE64, // or a Uint8Array
    maxStaleMs: 24 * 60 * 60 * 1000,        // default: 24 h
  },
});
```

`shade.getKTWitness()` returns the `LightWitness` instance that
collects observed STHs. Use `.compare(otherSth)` for a manual
gossip check against peers.
|
||||
|
||||
### `mode: 'observe'`
|
||||
|
||||
- Verifies the proof when the server supplies one.
- Skips verification if `ktProof` is missing from the bundle response.
- Recommended during the initial rollout, while not all clients have
  re-registered yet.
|
||||
|
||||
### `mode: 'observe-strict'`
|
||||
|
||||
- Requires a proof on every `200` response. A missing proof throws `KTVerificationError`.
- Requires a proof on every `404` response as well (for absence/tombstone pinning).
- Recommended production mode once the KT ecosystem is established.
|
||||
|
||||
---
|
||||
|
||||
## Witness / auditor
|
||||
|
||||
`@shade/key-transparency` exports `LightWitness`. A CLI tool or
backend job can use it like this:
|
||||
|
||||
```ts
|
||||
import { LightWitness } from '@shade/key-transparency';
|
||||
import { SubtleCryptoProvider } from '@shade/crypto-web';
|
||||
|
||||
const crypto = new SubtleCryptoProvider();
|
||||
const witness = new LightWitness({
|
||||
crypto,
|
||||
logPublicKey: KT_LOG_PUBLIC_KEY,
|
||||
fetcher: {
|
||||
async fetchLatestSTH() {
|
||||
const r = await fetch('https://shade.example.com/v1/kt/sth');
|
||||
return r.json();
|
||||
},
|
||||
async fetchConsistencyProof(from, to) {
|
||||
const r = await fetch(`https://shade.example.com/v1/kt/consistency?from=${from}&to=${to}`);
|
||||
return r.json();
|
||||
},
|
||||
},
|
||||
});
|
||||
|
||||
// Poll periodically (e.g. every 5 minutes)
|
||||
setInterval(async () => {
|
||||
try {
|
||||
const sth = await witness.pollOnce();
|
||||
console.log(`Observed STH: tree_size=${sth.treeSize}, root=${Buffer.from(sth.rootHash).toString('hex').slice(0, 16)}`);
|
||||
} catch (err) {
|
||||
console.error('Witness alarm:', err);
|
||||
// Send to PagerDuty / Slack / whatever
|
||||
}
|
||||
}, 5 * 60 * 1000);
|
||||
```
|
||||
|
||||
The witness code detects:

- **Stale STH**: the server is not publishing new STHs in time.
- **Split view**: two STHs at the same `tree_size` with different roots.
- **Rewrite**: the consistency proof fails.
- **Wrong key**: `log_id` does not match the pinned `logPublicKey`.
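
The split-view case is the simplest of these to picture: remember the root seen at each tree size and flag any disagreement. The sketch below is an illustration only, with a hypothetical `SthLedger` class and a hex-encoded root; `LightWitness` itself also verifies signatures and consistency proofs, which are left out here.

```typescript
interface ObservedSth {
  treeSize: number;
  rootHashHex: string;
}

// Remembers the first root seen for every tree size and flags conflicts.
class SthLedger {
  private readonly seen = new Map<number, string>();

  observe(sth: ObservedSth): 'ok' | 'split-view' {
    const prior = this.seen.get(sth.treeSize);
    if (prior !== undefined && prior !== sth.rootHashHex) {
      return 'split-view'; // two different roots at the same tree size
    }
    this.seen.set(sth.treeSize, sth.rootHashHex);
    return 'ok';
  }
}
```

Feeding the ledger with STHs gossiped from other peers, not only those fetched directly, is what lets it notice a server that shows different views to different clients.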
|
||||
|
||||
---
|
||||
|
||||
## Operator cost (estimate)

For a deployment with:

- **100k registered addresses**
- **1 identity rotation per year** per user
- **52 replenishes per year** (one per week, *not* committed to the log; only register/delete are)

| Resource | Per year | Comment |
|---|---|---|
| Log rows | ~100k | register/delete only |
| Storage (leaves+index) | ~25 MB | base64-encoded |
| STH rows | ~52k | one per heartbeat (10 min) |
| STH storage | ~7 MB | |
| CPU per STH | ~1 ms | Ed25519 signing is trivial |
| Bundle-fetch overhead | <2 ms | includes building the audit path |
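
The row counts follow from simple arithmetic, which the snippet below reproduces. The per-row byte figures are merely what the table's storage estimates imply when divided by the row counts (taking 1 MB as 1,000,000 bytes); they are not measured values.

```typescript
// Inputs taken from the estimate above.
const HEARTBEAT_INTERVAL_MIN = 10;
const REGISTERED_ADDRESSES = 100_000;
const ROTATIONS_PER_USER_PER_YEAR = 1;

// One STH per heartbeat, all year round.
const MINUTES_PER_YEAR = 365 * 24 * 60; // 525,600
const sthRowsPerYear = MINUTES_PER_YEAR / HEARTBEAT_INTERVAL_MIN; // 52,560

// Only register/delete hit the log, so one rotation per user gives one row each.
const logRowsPerYear = REGISTERED_ADDRESSES * ROTATIONS_PER_USER_PER_YEAR;

// Implied average row sizes, derived from the table's storage figures.
const impliedBytesPerLogRow = 25_000_000 / logRowsPerYear; // 250
const impliedBytesPerSth = 7_000_000 / sthRowsPerYear; // about 133

console.log({ sthRowsPerYear, logRowsPerYear, impliedBytesPerLogRow, impliedBytesPerSth });
```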
|
||||
|
||||
**Backup:** treat the KT tables as "cannot be recreated" data. In the
PostgreSQL implementation, `shade_kt_leaves` has a database trigger
that forbids UPDATE/DELETE. Backup strategy:

- Daily full backup of the `shade_kt_*` tables.
- WAL shipping recommended (worst-case loss < 60 s).
- **Test recovery** quarterly. The recovery procedure is described below.
|
||||
|
||||
---
|
||||
|
||||
## Recovery
|
||||
|
||||
### Scenario 1: STH signing key lost or compromised

The log remains consistent (all old STHs are already signed), but
new STHs cannot be signed with the same key.

**Steps:**

1. Generate a new Ed25519 keypair.
2. Write a "rotation breaks here" leaf into the log (operation = 0x03
   on a special `__log__` address). The operation is purely
   informational, but it makes the rotation visible in the tree.
3. Reconfigure the server with the new key. Restart.
4. The server publishes a new STH; it will have a new `log_id` (since
   `log_id = SHA-256(publicKey)`).
5. **Clients must explicitly accept the new key.** Until they pin the new
   `logPublicKey`, their `LightWitness` will throw
   `KTLogIdMismatchError`. The operator publishes the new key out of band
   with a "rotated from `<old logId>`" message signed with the old key
   (the last action before the old key is zeroized).
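
Step 4 is why a rotation cannot go unnoticed: the log id is a pure function of the public key. A minimal sketch, assuming the id is the base64 encoding of the SHA-256 digest over the raw public-key bytes (the routes table lists `logId` as base64); `logIdBase64` and `logIdMatchesPin` are hypothetical helper names.

```typescript
import { createHash } from 'node:crypto';

// log_id = SHA-256(publicKey), base64-encoded here for comparison with the route output.
function logIdBase64(publicKey: Uint8Array): string {
  return createHash('sha256').update(publicKey).digest('base64');
}

// A client that pinned `pinnedPublicKey` accepts only the matching log id.
function logIdMatchesPin(servedLogId: string, pinnedPublicKey: Uint8Array): boolean {
  return logIdBase64(pinnedPublicKey) === servedLogId;
}
```

After a rotation the served id no longer matches the pinned key, so a check of this kind fails until the client is shipped the new `logPublicKey`.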
|
||||
|
||||
### Scenario 2: KT database corrupted / lost before backup

This is **the worst outcome**. The log is by design not recoverable:
"reconstructing" it from the prekey store would break the very
invariant KT promises.

**Steps:**

1. Stop the server.
2. Declare a "log-restart event" through a public channel (status page,
   release notes, Twitter, etc.). Include the timestamp, the lost
   tree_size (the last backed-up snapshot if possible), and the new
   `logPublicKey`.
3. Generate a new KT keypair (do not reuse the old one).
4. Boot the server empty (empty `shade_kt_*` tables). The first STH
   starts from `tree_size = 0`.
5. Ask users to re-register their identities. Clients will trigger the
   V3.3 fingerprint gate on the first re-messaging flow afterwards,
   since the rotation fingerprint changes.
6. Auditor organizations can publish "we observed the old log up to
   tree_size N; the new log starts at 0 from T+0". This lets end users
   judge how large the gap is.

**Guard against this:** WAL shipping + off-site backup. Never run KT
on a single database instance without replicas.
|
||||
|
||||
### Scenario 3: Witness detects a split view

The witness throws `KTSplitViewError` in `LightWitness.observe()` or
`KTVerificationError` in the transport. This means:

- The operator has either
  (a) had a software bug that signed two different STHs at the same
  tree_size, or
  (b) been compromised or is malicious.

**Operator action:**

1. Pause `POST /v1/keys/register`, `DELETE`, and bundle fetch
   immediately (return 503).
2. Audit `shade_kt_sths`. If you find two rows with the same
   `tree_size` but a different `root_hash`, the server has misbehaved.
   This is serious: find the root cause before you continue.
3. Communicate with the users. Assume that an attacker has been
   inside; trigger a broader reset (recovery scenario 2) if tampering
   is suspected.

**Client action:**

- `LightWitness` has already held the user back.
- The SDK surfaces the error to app code as `KTSplitViewError`.
- The app should show a warning: "The operator's server cannot be
  verified. Refrain from sending sensitive messages for now."
|
||||
|
||||
---
|
||||
|
||||
## Security recommendations

1. **Run at least one independent witness.** The operator's own "witness"
   does not count. It must be a separate process on separate
   infrastructure owned by a separate party (community member, security
   firm, or similar).

2. **Pin `logPublicKey` in the app binary or in signed config.** A
   man-in-the-middle who can swap both the prekey server and the KT key
   is not caught by KT alone.

3. **Log rotation requires a human in the loop.** Do not automate
   key rotation for KT; the explicit breaking event is a feature.

4. **`maxStaleMs` should match your heartbeat.** The 24 h default
   tolerates a heartbeat pause of up to a day; lower it to 1–4 h if you
   have strict freshness requirements.

5. **`observe-strict` should be the standard once the ecosystem is established.**
   The default `'observe'` is an operational transition mode, not an
   end goal.
|
||||
|
||||
---
|
||||
|
||||
## Known limitations

- **Federation across multiple prekey servers** is not supported in V3.12.
  Each Shade deployment has one log or none.
- **A sparse Merkle tree for the address index** is not used in V3.12.
  Absence proofs are, for now, neighbor-pair proofs. <100 KB at 100k
  addresses is acceptable; a sparse tree becomes relevant from ~10M+
  addresses.
- **One-time prekey rotation is not committed** to the log. OTPs are
  ephemeral, and including them would fill the log with noise. This
  means that a server which answers with the right identity but the
  wrong OTP is not caught by KT. The defense lies in the V3.3
  fingerprint gate (same identity) + the X3DH of session establishment
  (a wrong OTP yields a wrong shared secret → the first message fails
  decryption).
|
||||
|
||||
---
|
||||
|
||||
## Tests and test vectors

- `packages/shade-key-transparency/tests/`: RFC 6962-compatible
  Merkle log + STH + index proofs (58 tests).
- `packages/shade-server/tests/kt.test.ts`: server integration (8
  tests).
- `packages/shade-transport/tests/kt-transport.test.ts`: client
  verification over HTTP (4 tests).
- `packages/shade-transport/tests/kt-split-view-e2e.test.ts`:
  V3.12 acceptance, split-view detection (3 tests).
- `packages/shade-sdk/tests/kt.test.ts`: SDK config + witness wiring
  (3 tests).

A total of 76 tests are dedicated to KT.
|
||||
193
docs/observability.md
Normal file
@@ -0,0 +1,193 @@
|
||||
# Observability v2 — OpenTelemetry tracing
|
||||
|
||||
Shade ships an opt-in OpenTelemetry layer that wraps `TransferEngine`,
|
||||
`ShadeSessionManager`, the prekey HTTP routes, and `@shade/files`
|
||||
op-handlers in distributed spans. The layer is **off by default** and
|
||||
PII-safe by construction — span attributes never include peer addresses,
|
||||
plaintext payloads, or exact byte counts.
|
||||
|
||||
This complements the always-on Prometheus metrics exposed by
|
||||
`@shade/server` and the structural events emitted by `@shade/core`. Use
|
||||
metrics for aggregate counters and histograms, tracing for per-request
|
||||
causality and tail-latency hunting.
|
||||
|
||||
---
|
||||
|
||||
## Quick start
|
||||
|
||||
```ts
|
||||
import { trace } from '@opentelemetry/api';
|
||||
import { withTracer } from '@shade/observability';
|
||||
import { createShade } from '@shade/sdk';
|
||||
|
||||
// Use the OTel SDK of your choice (NodeSDK + OTLP exporter, Honeycomb,
|
||||
// Sentry's OTel adapter, …) to register a tracer provider on the
|
||||
// `@opentelemetry/api` global. Then:
|
||||
const tracer = trace.getTracer('my-app');
|
||||
|
||||
const shade = await createShade({
|
||||
prekeyServer: 'https://shade.example.com',
|
||||
storage: 'sqlite:/data/shade.db',
|
||||
observability: withTracer(tracer, { sample: 0.1 }),
|
||||
});
|
||||
```
|
||||
|
||||
The hook propagates automatically to:
|
||||
|
||||
- `ShadeSessionManager.encrypt` / `.decrypt` (per-peer mutex acquisition,
|
||||
ratchet step).
|
||||
- `TransferEngine.upload` / accepted incoming downloads (lane count,
|
||||
retry count, partition mode).
|
||||
- `@shade/files` op-handlers (per request, with op + result).
|
||||
|
||||
For the prekey server pass the hook to `createPrekeyRoutes`:
|
||||
|
||||
```ts
|
||||
import { createPrekeyRoutes } from '@shade/server';
|
||||
import { withTracer } from '@shade/observability';
|
||||
|
||||
const app = createPrekeyRoutes(store, crypto, {
|
||||
observability: withTracer(tracer),
|
||||
});
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Off-by-default semantics
|
||||
|
||||
`withTracer()` returns a no-op hook — the SDK never starts spans — when
|
||||
**any** of the following is true:
|
||||
|
||||
1. The `tracer` argument is `undefined`/`null`.
|
||||
2. The `SHADE_OTEL_ENABLED` env-var is not set to `1` or `true`. Override
|
||||
with `withTracer(tracer, { force: true })`, or override the var name
|
||||
with `withTracer(tracer, { envVar: 'MY_VAR' })`.
|
||||
3. The configured `sample` rate is `0`.
|
||||
|
||||
Per-span sampling (`sample: 0.1` = 10 %) keeps trace volume bounded in
|
||||
production. Default is `1` (sample everything when the hook is active).
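The three conditions above reduce to a small gate. The sketch below only mirrors the rules as documented, with the environment passed in explicitly so it stays testable; `isTracingActive` and `TracerGateOptions` are hypothetical names and not exports of `@shade/observability`.

```typescript
interface TracerGateOptions {
  force?: boolean;
  envVar?: string;
  sample?: number;
}

// Returns true only when the hook would actually start spans.
function isTracingActive(
  tracer: unknown,
  opts: TracerGateOptions = {},
  env: Record<string, string | undefined> = {},
): boolean {
  // 1. No tracer, no tracing.
  if (tracer === undefined || tracer === null) return false;
  // 3. A sample rate of 0 disables the hook even when forced.
  if ((opts.sample ?? 1) === 0) return false;
  // 2. The env-var gate, unless explicitly overridden with `force`.
  if (opts.force === true) return true;
  const raw = env[opts.envVar ?? 'SHADE_OTEL_ENABLED'];
  return raw === '1' || raw === 'true';
}
```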
|
||||
|
||||
---
|
||||
|
||||
## PII policy — what is safe to log, and what isn't
|
||||
|
||||
| Category | Status | Why |
|----------|--------|-----|
| **Peer hash** (`shade.peer.hash`) | ✅ allowed | 8-hex-char pseudonym derived via SHA-256. Stable across spans for a given address but does not expose the address itself. |
| **Bytes bin** (`shade.bytes.bin`) | ✅ allowed | One of `≤4KB`, `4–64KB`, `64KB–1MB`, `1–10MB`, `10–100MB`, `100MB–1GB`, `≥1GB`. Coarse enough to mask file-size fingerprinting. |
| **Lane count** (`shade.lane.count`) | ✅ allowed | Snapped to `{1, 4, 16, 64}`. |
| **Retry count** (`shade.retry.count`) | ✅ allowed | Integer. |
| **Error code** (`shade.error.code`) | ✅ allowed | `SHADE_*` stable string code, never the full message, which may interpolate user input. |
| **Op kind** (`shade.op`) | ✅ allowed | `list`, `read`, `write`, `custom:foo`, etc. |
| **Route template** (`shade.route`) | ✅ allowed | `/v1/keys/bundle/:address`: the template, never the resolved path. |
| **HTTP status** (`shade.http.status`) | ✅ allowed | Integer status code. |
| **Partition mode** (`shade.partition`) | ✅ allowed | `range` or `round-robin`. |
| **Direction** (`shade.direction`) | ✅ allowed | `upload` or `download`. |
| Plaintext peer addresses | ❌ forbidden | Use `peerHash()`. |
| Plaintext message/file payloads | ❌ forbidden | Encryption boundary; never log. |
| Exact byte counts | ❌ forbidden | Use `bytesBin()`. |
| User identifiers (email, DID, `device:UUID`) | ❌ forbidden | Treat as PII. |
|
||||
|
||||
The full attribute-key allow-list is exported from `@shade/observability`
|
||||
as `ATTR_*` constants. Plug-in authors who want to attach their own tags
|
||||
should pass each `(key, value)` through `safeAttribute()`, which throws
|
||||
`UnsafeAttributeError` for any key/value pair that looks like the
|
||||
forbidden categories above (heuristics: `@`, `device:`, `did:`, key
|
||||
fragments such as `peer.address` / `bytes.exact`, oversized strings).
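
To make the allowed categories concrete, here is a minimal sketch of the three coarsening steps in the table. The function names carry a `Sketch` suffix because they are illustrations, not the exported `peerHash()` / `bytesBin()` helpers. Binary units (1 KB = 1024 bytes), the exact bin boundaries, rounding lane counts up, and truncating the SHA-256 hex digest to 8 characters are all assumptions.

```typescript
import { createHash } from 'node:crypto';

const KB = 1024;       // assumption: binary units
const MB = 1024 * KB;
const GB = 1024 * MB;

// Map an exact byte count to one of the coarse labels from the table.
function bytesBinSketch(bytes: number): string {
  if (bytes <= 4 * KB) return '≤4KB';
  if (bytes < 64 * KB) return '4–64KB';
  if (bytes < MB) return '64KB–1MB';
  if (bytes < 10 * MB) return '1–10MB';
  if (bytes < 100 * MB) return '10–100MB';
  if (bytes < GB) return '100MB–1GB';
  return '≥1GB';
}

// Snap a lane count to {1, 4, 16, 64}; rounding up is an assumption.
function snapLaneCountSketch(lanes: number): number {
  for (const allowed of [1, 4, 16, 64]) {
    if (lanes <= allowed) return allowed;
  }
  return 64;
}

// 8-hex-char pseudonym: stable per address, does not reveal the address.
function peerHashSketch(address: string): string {
  return createHash('sha256').update(address, 'utf8').digest('hex').slice(0, 8);
}
```

Each function discards information on purpose: many inputs collapse to the same output, so a span attribute cannot be used to recover the exact size, lane count, or address.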
|
||||
|
||||
---
|
||||
|
||||
## Span surface
|
||||
|
||||
### `shade.session.encrypt` / `shade.session.decrypt`
|
||||
|
||||
Wraps each per-peer `encrypt`/`decrypt` call. Includes the time spent
|
||||
waiting on the per-peer mutex (`shade.lock.wait_ms`) — handy for
|
||||
diagnosing ratchet contention under load.
|
||||
|
||||
### `shade.transfer.upload` / `shade.transfer.upload.resume`
|
||||
|
||||
Wraps an outbound stream transfer end-to-end. Attributes: `peer.hash`,
|
||||
`bytes.bin`, `lane.count`, `partition`, `retry.count`, `result`,
|
||||
`error.code`.
|
||||
|
||||
### `shade.transfer.download`
|
||||
|
||||
Started when the consumer calls `incoming.accept(...)`, ended when the
|
||||
transfer completes, aborts, or fails an integrity check. Same attribute
|
||||
set as upload.
|
||||
|
||||
### `shade.prekey.request`
|
||||
|
||||
One span per HTTP request handled by `@shade/server`'s prekey routes.
|
||||
Attributes: `route` (the template), `http.status`, `error.code` on
|
||||
failure. The address path-parameter is **never** placed on the span.
|
||||
|
||||
### `shade.files.op`
|
||||
|
||||
One span per `@shade/files` RPC. Attributes: `peer.hash`, `op` (the
|
||||
resolved op kind, e.g. `read` or `custom:foo`), `bytes.bin` (estimated
|
||||
plaintext size, binned), `result`, `error.code`.
|
||||
|
||||
---
|
||||
|
||||
## Recording & testing
|
||||
|
||||
`@shade/observability` ships a deterministic in-memory recorder for
|
||||
unit tests:
|
||||
|
||||
```ts
import { createRecorder } from '@shade/observability';

const rec = createRecorder();
const shade = await createShade({ ..., observability: rec });

// … exercise code under test …

const hits = rec.scanForPII(['alice@example.com', 'plaintext-secret']);
expect(hits).toHaveLength(0);
```
|
||||
|
||||
The Shade test suite runs this recorder over every documented entry
|
||||
point — see
|
||||
`packages/shade-observability/tests/integration-pii.test.ts` and
|
||||
`packages/shade-transfer/tests/observability.test.ts`. Any new
|
||||
instrumentation must keep the suite green.
|
||||
|
||||
---
|
||||
|
||||
## Performance characteristics
|
||||
|
||||
- With OTel **off** (default): every Shade hook resolves to the shared
|
||||
`NOOP_HOOK` instance. The cost is one function call + an object
|
||||
allocation that V8 hoists out in the steady state — measured at
|
||||
< 1 % overhead vs the pre-V3.4 baseline in the upload roundtrip
|
||||
benchmark.
|
||||
- With OTel **on**: cost depends entirely on the configured exporter.
|
||||
Use `sample: 0.1` (or smaller) on hot paths in production.
|
||||
|
||||
---
|
||||
|
||||
## Adding new instrumentation
|
||||
|
||||
1. Identify a logical operation worth a span — typically anything that
|
||||
crosses a network/disk boundary or contends on a lock.
|
||||
2. Add an `observability?: ObservabilityHook` to the relevant config
|
||||
surface, default to `NOOP_HOOK`.
|
||||
3. Name the span `shade.<area>.<op>` to keep cardinality bounded.
|
||||
4. Set attributes via the `ATTR_*` constants from
|
||||
`@shade/observability`. **Never** introduce a new attribute key
|
||||
without a PII review — if you must, run the value through
|
||||
`safeAttribute()`.
|
||||
5. Add a test that exercises the new instrumentation under the
|
||||
`createRecorder()` recorder and asserts no PII leaks.
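
The checklist above can be sketched as a small wrapper. The hook and span shapes here (`HookSketch`, `SpanSketch`) are hypothetical stand-ins, since the real `ObservabilityHook` interface is not shown in this document; the attribute keys `shade.result` and the fallback code `SHADE_UNKNOWN` are likewise assumptions.

```typescript
// Hypothetical minimal hook shape; the real ObservabilityHook may differ.
interface SpanSketch {
  setAttribute(key: string, value: string | number): void;
  end(): void;
}
interface HookSketch {
  startSpan(name: string): SpanSketch;
}

const NOOP_SPAN: SpanSketch = { setAttribute() {}, end() {} };
const NOOP_HOOK_SKETCH: HookSketch = { startSpan: () => NOOP_SPAN };

// Wrap one logical operation in a span. Only a coarse result and a stable
// error code are recorded, never the error message itself.
async function withSpan<T>(
  hook: HookSketch | undefined,
  name: string,
  op: (span: SpanSketch) => Promise<T>,
): Promise<T> {
  const span = (hook ?? NOOP_HOOK_SKETCH).startSpan(name);
  try {
    const out = await op(span);
    span.setAttribute('shade.result', 'ok');
    return out;
  } catch (err) {
    const code =
      typeof err === 'object' && err !== null ? (err as { code?: unknown }).code : undefined;
    span.setAttribute('shade.result', 'error');
    span.setAttribute('shade.error.code', typeof code === 'string' ? code : 'SHADE_UNKNOWN');
    throw err;
  } finally {
    span.end(); // always close the span, on success and on failure
  }
}
```

Defaulting to the no-op hook keeps call sites free of `if (hook)` branches, which is the same reason step 2 asks for a `NOOP_HOOK` default.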
|
||||
|
||||
---
|
||||
|
||||
## Migration
|
||||
|
||||
Previous versions had no tracing, only Prometheus metrics. Adding the
`observability` field to existing configs is backwards-compatible and
optional. Because tracing stays off unless `SHADE_OTEL_ENABLED` is set,
a production deployment that never sets the env-var incurs no
unexpected overhead.
|
||||
308
docs/recovery.md
Normal file
@@ -0,0 +1,308 @@
|
||||
# Social Key Recovery (`@shade/recovery`)
|
||||
|
||||
V3.10 closes the biggest UX hole in any E2EE system: **"What happens
|
||||
if I lose my phone?"**. Shade's social-recovery flow lets a user
|
||||
designate `n` guardians (family / friends / co-workers) at setup time
|
||||
such that any threshold-many `k` of them can together restore the
|
||||
user's identity onto a new device — without any single guardian
|
||||
being able to do it alone, and without the prekey server ever seeing
|
||||
the recovered key material.
|
||||
|
||||
The whole flow ships entirely over existing 1:1 Shade sessions; no
|
||||
server-side recovery agent, no escrow service, no "cloud guardian".
|
||||
|
||||
---
|
||||
|
||||
## Threat model recap
|
||||
|
||||
| # | Adversary | Recovered? |
|
||||
|---|-----------|------------|
|
||||
| 1 | Coalition of ≤ k-1 guardians | **No** (information-theoretic, by Shamir construction) |
|
||||
| 2 | Prekey server alone | **No** (server only relays Double-Ratchet ciphertext) |
|
||||
| 3 | Single malicious guardian who forges a share | **Detected** — AES-GCM tag mismatch on the backup blob; `requestRecovery` exhaustively tries threshold-sized subsets and rejects when none authenticate |
|
||||
| 4 | Social engineering (impersonator calls a guardian) | **Mitigated, not eliminated** — guardians MUST OOB-confirm the new device's safety number before approving (see `<RecoveryApprove />`) |
|
||||
| 5 | Compromised guardian device | **Out of scope** — see "Guardian compromise" below |
|
||||
| 6 | Compromised primary device at setup time | **Out of scope.** Recovery only protects against losing the device; if setup material is exfiltrated, all bets are off |
|
||||
|
||||
---
|
||||
|
||||
## Setup
|
||||
|
||||
### What the user does
|
||||
|
||||
1. Pick `n` guardians from their existing peers.
|
||||
2. Pick a threshold `k` (typically `⌈n/2⌉`, the widget default, so
   that no single guardian can recover alone while recovery still
   survives losing one or two guardians).
|
||||
3. Run `setupRecovery(...)`.
|
||||
4. Print / record a **recovery card** with:
|
||||
- The user's own address
|
||||
- `setupId`
|
||||
- `k` and `n`
|
||||
- The list of guardian addresses
|
||||
- Setup-time safety number
|
||||
|
||||
The recovery card is the only piece of state the user must remember
|
||||
out-of-band (or store in a password manager). Without it, the user
|
||||
cannot drive recovery on a new device — the new device needs to know
|
||||
who the guardians are.
|
||||
|
||||
### What happens cryptographically
|
||||
|
||||
```text
recoveryKey  = random(32 bytes)
backupBlob   = Shade.exportBackup(passphrase = "shade-rk:" + base64url(recoveryKey),
                                  knownAddresses = [...])
shares[1..n] = Shamir-split(recoveryKey, k, n)
```
|
||||
|
||||
For each guardian `i`:
|
||||
|
||||
```text
|
||||
share-deposit envelope:
|
||||
shadeRecovery: 1
|
||||
type: "share-deposit"
|
||||
flowId, setupId, originalAddress
|
||||
threshold (k), guardianCount (n), shareIndex (i)
|
||||
shareBytes: base64url( encodeShare(shares[i]) )
|
||||
backupBlob: Shade.exportBackup output (identical for every guardian)
|
||||
setupFingerprint, createdAt
|
||||
```
|
||||
|
||||
The envelope rides through `Shade.send` like any other plaintext —
|
||||
double-ratchet encrypted, AAD-bound, replay-safe.
|
||||
|
||||
The `recoveryKey` is **zeroized** on the primary device immediately
|
||||
after the split returns. The primary therefore retains nothing
|
||||
except `setupId` and the public roster.
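
For readers unfamiliar with the split step, the following is a minimal byte-wise Shamir sketch. It is not the `@shade/recovery` implementation: the choice of GF(2^8) with the AES reduction polynomial, the `ShareSketch` shape, and the function names are all assumptions made for illustration. `combineSketch` performs the Lagrange interpolation at `x = 0` that the recovery flow relies on.

```typescript
import { randomBytes } from 'node:crypto';

// Multiply in GF(2^8) using the AES polynomial x^8 + x^4 + x^3 + x + 1.
function gfMul(a: number, b: number): number {
  let p = 0;
  for (let i = 0; i < 8; i++) {
    if (b & 1) p ^= a;
    const carry = a & 0x80;
    a = (a << 1) & 0xff;
    if (carry) a ^= 0x1b;
    b >>= 1;
  }
  return p;
}

// Multiplicative inverse as a^254 (valid for a != 0).
function gfInv(a: number): number {
  let r = 1;
  for (let i = 0; i < 254; i++) r = gfMul(r, a);
  return r;
}

interface ShareSketch { x: number; y: Uint8Array }

function splitSketch(secret: Uint8Array, k: number, n: number): ShareSketch[] {
  if (k < 2 || k > n || n > 255) throw new RangeError('need 2 <= k <= n <= 255');
  const shares: ShareSketch[] = [];
  for (let i = 1; i <= n; i++) shares.push({ x: i, y: new Uint8Array(secret.length) });
  for (let b = 0; b < secret.length; b++) {
    // Random polynomial of degree k-1 whose constant term is the secret byte.
    const coeffs: number[] = [secret[b], ...Array.from(randomBytes(k - 1))];
    for (const s of shares) {
      let acc = 0;
      for (let c = coeffs.length - 1; c >= 0; c--) acc = gfMul(acc, s.x) ^ coeffs[c]; // Horner
      s.y[b] = acc;
    }
  }
  return shares;
}

function combineSketch(shares: ShareSketch[]): Uint8Array {
  const out = new Uint8Array(shares[0].y.length);
  for (let b = 0; b < out.length; b++) {
    let acc = 0;
    for (const si of shares) {
      // Lagrange basis at x = 0: product of x_j / (x_j - x_i); subtraction is XOR here.
      let num = 1;
      let den = 1;
      for (const sj of shares) {
        if (sj === si) continue;
        num = gfMul(num, sj.x);
        den = gfMul(den, sj.x ^ si.x);
      }
      acc ^= gfMul(si.y[b], gfMul(num, gfInv(den)));
    }
    out[b] = acc;
  }
  return out;
}
```

A caller following the zeroization rule above would overwrite the secret (for example with `secret.fill(0)`) as soon as `splitSketch` returns.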
|
||||
|
||||
### What each guardian stores
|
||||
|
||||
Per (`originalAddress`, `setupId`):
|
||||
|
||||
```text
|
||||
{
|
||||
shareIndex, // 1..n
|
||||
shareBytes, // base64url-encoded Shamir share
|
||||
backupBlob, // identical for every guardian
|
||||
setupFingerprint, // for sanity-checks at recovery time
|
||||
guardianCount, threshold,
|
||||
receivedAt
|
||||
}
|
||||
```
|
||||
|
||||
The guardian's app provides a `RecoveryStore` implementation. The
|
||||
package ships `MemoryRecoveryStore` for tests and small one-shot
|
||||
demos; production guardian apps MUST supply a persistent store
|
||||
(IndexedDB, AsyncStorage, SQLite, etc.). See "Persistence
|
||||
recommendations" below.
|
||||
|
||||
---
|
||||
|
||||
## Recovery
|
||||
|
||||
### What the user does on the new device
|
||||
|
||||
1. Boot a fresh Shade with a temporary identity.
|
||||
2. Read the recovery card.
|
||||
3. In the recovery widget, type / paste:
|
||||
- `originalAddress`
|
||||
- `setupId`
|
||||
- `threshold`
|
||||
- The guardian roster
|
||||
4. Read the new device's safety number (the widget displays it
|
||||
prominently) to each guardian over a side channel — phone call,
|
||||
in person, whatever they trust.
|
||||
5. Wait for `≥ k` guardians to approve.
|
||||
|
||||
### What happens cryptographically
|
||||
|
||||
For each guardian, the new device sends:
|
||||
|
||||
```text
|
||||
recovery-request envelope:
|
||||
shadeRecovery: 1
|
||||
type: "recovery-request"
|
||||
flowId, originalAddress, setupId
|
||||
requesterFingerprint (= safety number of the temporary identity)
|
||||
requestedAt
|
||||
```
|
||||
|
||||
Each guardian's `attachGuardian` handler:
|
||||
|
||||
1. Looks up its stored deposit by `(originalAddress, setupId)`. If
|
||||
missing, replies with `share-decline` (`reason = "unknown setup"`).
|
||||
2. Invokes the `approve` callback with the requester's address +
|
||||
fingerprint + the original device's setup-time fingerprint. The
|
||||
callback is the **OOB-confirmation gate** — it MUST require an
|
||||
explicit user click after they verified the fingerprint. The
|
||||
`<RecoveryApprove />` widget enforces this with a two-checkbox
|
||||
gate.
|
||||
3. On approve → ships `share-grant`. On reject → ships
|
||||
`share-decline` with a short reason.
|
||||
|
||||
The new device collects grants, and as soon as `k` arrive:
|
||||
|
||||
1. Combines the `k` shares via Lagrange interpolation at `x = 0` to
|
||||
reconstruct `recoveryKey`.
|
||||
2. Derives `passphrase = "shade-rk:" + base64url(recoveryKey)`.
|
||||
3. Calls `Shade.importBackup(backupBlob, passphrase)` — the
|
||||
AES-GCM tag in the blob authenticates the reconstruction. **A
|
||||
forged share is detected here.**
|
||||
4. If a guardian forged a share, `importBackup` throws. The
   reconstruction loop then tries every other threshold-sized subset
   of the grants received until one authenticates. The Shamir
   threshold is the safety invariant (the V3.10 acceptance criterion
   "no coalition of (k-1) guardians can rebuild the secret"); the
   AEAD tag identifies which subset is honest.
|
||||
5. If every subset fails, `RecoveryReconstructionError` is raised
|
||||
and the user is told that at least one guardian is malicious.
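
Steps 3 to 5 amount to a search over `k`-element subsets of the received grants. A minimal sketch, assuming a caller-supplied `tryImport` callback that stands in for "combine the shares and call `importBackup`" and resolves to `false` when the AEAD tag does not verify; `kSubsets` and `reconstructSketch` are hypothetical names, not package exports.

```typescript
// Enumerate every k-element subset of items, preserving input order.
function kSubsets<T>(items: readonly T[], k: number): T[][] {
  const out: T[][] = [];
  const pick = (start: number, chosen: T[]): void => {
    if (chosen.length === k) {
      out.push(chosen.slice());
      return;
    }
    for (let i = start; i < items.length; i++) {
      chosen.push(items[i]);
      pick(i + 1, chosen);
      chosen.pop();
    }
  };
  pick(0, []);
  return out;
}

// Try each subset in turn; return the first one that authenticates.
async function reconstructSketch<G>(
  grants: readonly G[],
  k: number,
  tryImport: (subset: G[]) => Promise<boolean>,
): Promise<G[]> {
  for (const subset of kSubsets(grants, k)) {
    if (await tryImport(subset)) return subset;
  }
  // Mirrors step 5: no subset authenticated.
  throw new Error('no threshold-sized subset authenticated');
}
```

The number of subsets is `C(m, k)` for `m` grants, which stays small for realistic guardian counts (10 subsets for 3-of-5).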
|
||||
|
||||
After `importBackup` succeeds, the new device hosts the original
identity and immediately calls `Shade.rotate()` to retire the
recovered key material from the conversation graph: the old session
keys persisted in the backup blob are now considered compromised,
because they were used for recovery.
|
||||
|
||||
> **The `Shade.beforeBackupImport` gate fires automatically.**
|
||||
> Without a registered handler the SDK falls back to TOFU-with-warning
|
||||
> (consistent with the V3.3 contract). Production apps SHOULD register
|
||||
> a handler that pops the user one more confirmation before the
|
||||
> identity rotates.
|
||||
|
||||
---
|
||||
|
||||
## Acceptance criteria status
|
||||
|
||||
- [x] **3-of-5 recovery works end-to-end on two separate Shade
|
||||
instances.** See `tests/integration.test.ts`.
|
||||
- [x] **No coalition of (k-1) guardians can reconstruct
|
||||
`recoveryKey`.** Property test asserts this with `fast-check`
|
||||
across random k/n configurations.
|
||||
See `tests/shamir.test.ts` and
|
||||
`tests/adversarial.test.ts`.
|
||||
- [x] **Guardian-side widget requires fingerprint-confirmation
|
||||
before sending.** `<RecoveryApprove />` enforces a
|
||||
two-checkbox gate; `tests/adversarial.test.ts` exercises
|
||||
both the matching-OOB and rejecting-OOB code paths.
|
||||
|
||||
---
|
||||
|
||||
## Persistence recommendations
|
||||
|
||||
The `RecoveryStore` interface is intentionally small (4 methods).
|
||||
Pick the implementation that fits your platform:
|
||||
|
||||
| Platform | Suggested backing store |
|
||||
|--------------------------|----------------------------------------|
|
||||
| Browser (PWA) | IndexedDB (one object store, idb) |
|
||||
| Browser (extension) | `chrome.storage.local` |
|
||||
| React Native | AsyncStorage (with crypto-protected blob) |
|
||||
| Bun / Node server | SQLite via `@shade/storage-sqlite` extension table OR a side file |
|
||||
| Android (native) | Room / EncryptedSharedPreferences |
|
||||
|
||||
Whatever you pick, the records ARE NOT secret on their own — without
|
||||
threshold-many other guardians' shares they're useless — but they
|
||||
should still be stored encrypted-at-rest like any other Shade state.
|
||||
Do not commit them to plaintext logs or network-replicated state.
|
||||
|
||||
---
|
||||
|
||||
## Guardian-UX guide
|
||||
|
||||
### How many guardians?
|
||||
|
||||
| n, k | Survives | Comment |
|------|----------|---------|
| n=3, k=2 | 1 lost guardian | Minimum useful; one lost guardian away from danger |
| n=5, k=3 | 2 lost guardians | Sweet spot for most users |
| n=7, k=4 | 3 lost guardians | Suitable when you genuinely have 7+ trustworthy people |
| k=n | 0 lost guardians | DO NOT USE: every guardian is a single point of failure |
|
||||
|
||||
The widget defaults to `k = ⌈n/2⌉` which is liberal but
|
||||
collusion-resistant for `n ≥ 3`. Apps targeting paranoid users may
|
||||
want to bump that to `⌈2n/3⌉`.
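
The arithmetic behind the table and the two threshold rules is small enough to spell out. The helper names below are hypothetical and exist only for this illustration.

```typescript
// Widget default per the text: k = ceil(n / 2).
function defaultThreshold(n: number): number {
  return Math.ceil(n / 2);
}

// Stricter rule suggested for paranoid users: k = ceil(2n / 3).
function strictThreshold(n: number): number {
  return Math.ceil((2 * n) / 3);
}

// Number of guardians that can be lost while recovery still succeeds.
function tolerableLosses(n: number, k: number): number {
  return n - k;
}
```

For `n = 5` the default gives `k = 3` and tolerates 2 lost guardians; the stricter rule gives `k = 4` and tolerates only 1, which is the trade-off between collusion resistance and availability.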
|
||||
|
||||
### Replacing a guardian
|
||||
|
||||
If a guardian dies, loses their device permanently, or you no longer
|
||||
trust them:
|
||||
|
||||
1. Pick a replacement.
|
||||
2. Run `setupRecovery` again with the new roster — this generates a
|
||||
fresh `setupId` and a fresh `recoveryKey`. The old shares become
|
||||
garbage (no guardian set can use them, because the
|
||||
`backupBlob` is different).
|
||||
|
||||
The widget records the new `setupId` on the recovery card. Treat
|
||||
this as a hard rotation; the user MUST re-record the card.
|
||||
|
||||
### Guardian health checks
|
||||
|
||||
Periodically (the V3.10 plan suggests a quarterly prompt), the user
|
||||
should confirm each guardian is still reachable. Any guardian who
|
||||
can't be reached in two consecutive prompts SHOULD trigger a
|
||||
re-setup with a fresh roster. The widget UX track is to be added in
|
||||
a follow-up release; the primitive is in place.
|
||||
|
||||
---
|
||||
|
||||
## Wiring example
|
||||
|
||||
```ts
import {
  setupRecovery,
  attachGuardian,
  requestRecovery,
  MemoryRecoveryStore,
} from '@shade/recovery';

// On the primary device:
const result = await setupRecovery({
  shade,
  guardians: ['bob', 'carol', 'dan', 'eve', 'faythe'],
  threshold: 3,
  deliver: async (to, envelope) => {
    // wire to your app's existing message-delivery layer
    await myMessageOutbox.send(to, envelope);
  },
});
console.log(result.setupId);

// On each guardian device:
const stop = attachGuardian({
  shade,
  store: myPersistentStore, // see "Persistence" above
  approve: async (ctx) => {
    // Show ctx.requesterFingerprint to the user.
    // Block until they confirm OOB and click "Release share".
    return await myUI.askApproval(ctx);
  },
  // Wrap in an arrow function so `send` keeps its `this` binding.
  deliver: (to, envelope) => myMessageOutbox.send(to, envelope),
});

// On the new device:
const recovered = await requestRecovery({
  shade: temporaryShade, // fresh identity for now
  originalAddress: 'alice',
  setupId: 'sid-from-recovery-card',
  threshold: 3,
  guardians: ['bob', 'carol', 'dan', 'eve', 'faythe'],
  deliver: (to, envelope) => myMessageOutbox.send(to, envelope),
  onProgress: (p) => myUI.showProgress(p),
});
// `temporaryShade` now hosts the original identity.
```
|
||||
|
||||
---
|
||||
|
||||
## Out of scope (V3.10)
|
||||
|
||||
- **Cloud guardian / Shade-operated recovery agent.** Explicit
|
||||
non-goal; the spec rejects any centralized component that can
|
||||
recover on its own.
|
||||
- **Auto-distribution.** The user must explicitly pick guardians.
|
||||
- **Multi-share-per-guardian.** Each guardian holds exactly one
|
||||
share. Apps that need redundancy should bump `n`, not give the
|
||||
same guardian multiple shares.
|
||||
- **Guardian ZK-proofs of liveness.** A guardian who refuses to
|
||||
respond is treated as offline; we don't try to compel them.
|
||||
160
docs/storage-encryption.md
Normal file
@@ -0,0 +1,160 @@
|
||||
# At-Rest Storage Encryption (V3.2)
|
||||
|
||||
**Status:** Implemented in `@shade/storage-encrypted` 0.4.0
|
||||
**Addresses:** THREAT-MODEL §4 — Compromised device storage
|
||||
|
||||
Shade's default `SQLiteStorage` and `PostgresStorage` write private keys and
|
||||
session state to disk *unencrypted* — the threat model assumes the DB lives
|
||||
inside a trusted environment. For deployments that need defence in depth,
|
||||
`@shade/storage-encrypted` adds opt-in at-rest encryption: a stolen DB file
|
||||
alone yields no usable private key material.
|
||||
|
||||
## At a glance
|
||||
|
||||
```ts
|
||||
import { KeyManager, EncryptedSQLiteStorage } from '@shade/storage-encrypted';
|
||||
|
||||
const km = await KeyManager.open({
|
||||
kind: 'passphrase',
|
||||
passphrase: process.env.SHADE_STORAGE_PASSPHRASE!,
|
||||
salt: loadSaltFromDisk(), // 16+ bytes, persisted alongside the DB
|
||||
});
|
||||
|
||||
const storage = await EncryptedSQLiteStorage.open({
|
||||
dbPath: '/data/shade-client.db',
|
||||
keyManager: km,
|
||||
});
|
||||
|
||||
// Use it exactly like SQLiteStorage — implements the same StorageProvider.
|
||||
const manager = new ShadeSessionManager(crypto, storage);
|
||||
```
|
||||
|
||||
## What is encrypted
|
||||
|
||||
Per-row AEAD over the sensitive payload of every row:
|
||||
|
||||
| Table | Encrypted |
|
||||
|--------------------------------|-----------|
|
||||
| `identity_enc` | the entire keypair (4× 32-byte keys) |
|
||||
| `config_enc` | `registrationId` |
|
||||
| `signed_prekeys_enc` | full `SignedPreKey` (incl. private half) |
|
||||
| `one_time_prekeys_enc` | full `OneTimePreKey` |
|
||||
| `sessions_enc` | the Double-Ratchet `SessionState` JSON |
|
||||
| `trusted_identities_enc` | the trusted peer identity key |
|
||||
| `retired_identities_enc` | full retired keypair |
|
||||
| `stream_state_enc.ciphertext` | partition / lane / IO descriptor / streamSecret |
|
||||
|
||||
Routing fields on `stream_state_enc` (`stream_id`, `direction`,
|
||||
`peer_address`, `status`, timestamps) stay plaintext so `listActiveStreamStates()`
|
||||
remains an indexed query.
|
||||
|
||||
## Cryptographic design
|
||||
|
||||
```
|
||||
masterKey (passphrase / keychain / app-injected)
|
||||
│
|
||||
├─ HKDF-SHA-256("shade-storage-v1") → storageKey (32 bytes)
|
||||
│ └─ HKDF-SHA-256(storageKey, "shade-field-v1:{table}:{column}") → fieldKey (32 bytes)
|
||||
│
|
||||
└─ Used (transitively) for fingerprint checks
|
||||
```
|
||||
|
||||
For each encrypted blob:
|
||||
|
||||
- `nonce = HKDF(fieldKey, "shade-row-nonce-v1:{table}:{pk}")[..12]` —
|
||||
deterministic per (key, row), safe because the per-(table, column)
|
||||
fieldKey is unique. AES-GCM nonce reuse is catastrophic only if the
|
||||
*same* key is reused with the *same* nonce on different plaintexts;
|
||||
here every (key, row) pair has a unique nonce.
|
||||
- `aad = "shade-aad-v1|{table}|{column}|{pk}"` — binds the ciphertext
|
||||
to its row identity so a row swap or column move triggers decrypt
|
||||
failure.
|
||||
- `wire = nonce(12) || ciphertext || tag(16)` — stored as a single
|
||||
`BLOB`/`BYTEA` column.
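
The derivation chain and wire layout above can be sketched with Node's built-in `crypto` module. This is an illustration under stated assumptions, not the `@shade/storage-encrypted` code: the HKDF salt is assumed empty (the text does not specify one), AES-256-GCM is inferred from the 32-byte key and 16-byte tag, and the helper names are hypothetical. Byte-exact parity with the real implementation is only guaranteed by the pinned test vectors.

```typescript
import { createCipheriv, createDecipheriv, hkdfSync } from 'node:crypto';

const EMPTY_SALT = Buffer.alloc(0); // assumption: no HKDF salt is specified in the text

function hkdf32(ikm: Uint8Array, info: string): Buffer {
  return Buffer.from(hkdfSync('sha256', ikm, EMPTY_SALT, info, 32));
}

// masterKey -> storageKey -> per-(table, column) fieldKey
function deriveFieldKey(masterKey: Uint8Array, table: string, column: string): Buffer {
  const storageKey = hkdf32(masterKey, 'shade-storage-v1');
  return hkdf32(storageKey, `shade-field-v1:${table}:${column}`);
}

function aadFor(table: string, column: string, pk: string): Buffer {
  return Buffer.from(`shade-aad-v1|${table}|${column}|${pk}`, 'utf8');
}

function sealField(fieldKey: Buffer, table: string, column: string, pk: string, plaintext: Uint8Array): Buffer {
  // Deterministic nonce: first 12 bytes of an HKDF output bound to (table, pk).
  const nonce = hkdf32(fieldKey, `shade-row-nonce-v1:${table}:${pk}`).subarray(0, 12);
  const cipher = createCipheriv('aes-256-gcm', fieldKey, nonce);
  cipher.setAAD(aadFor(table, column, pk));
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return Buffer.concat([nonce, ciphertext, cipher.getAuthTag()]); // nonce(12) || ct || tag(16)
}

function openField(fieldKey: Buffer, table: string, column: string, pk: string, wire: Buffer): Buffer {
  const nonce = wire.subarray(0, 12);
  const ciphertext = wire.subarray(12, wire.length - 16);
  const tag = wire.subarray(wire.length - 16);
  const decipher = createDecipheriv('aes-256-gcm', fieldKey, nonce);
  decipher.setAAD(aadFor(table, column, pk));
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]); // throws on tag mismatch
}
```

Opening a blob under a different `pk`, table, or column changes the AAD, so `decipher.final()` throws; that is the row-swap protection described above. The sketch does not address what happens when the same row is re-sealed with new plaintext, which the text above does not cover either.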
|
||||
|
||||
## Key sources
|
||||
|
||||
`KeyManager.open(...)` accepts three sources:
|
||||
|
||||
1. **Passphrase + KDF** — scrypt over `(passphrase, salt)`. Default
|
||||
parameters: `N=2^17, r=8, p=1, dkLen=32` (~250 ms on a modern laptop).
|
||||
The salt MUST be persisted alongside the DB (e.g. `<db>.salt`).
|
||||
2. **OS keychain** — via `@shade/keychain`. Backends:
|
||||
- macOS: `security` CLI (Keychain).
|
||||
- Linux: `secret-tool` (libsecret).
|
||||
- Windows: PowerShell + `CredentialManager` module.
|
||||
No native deps; `createIfMissing: true` generates and stores a fresh
|
||||
32-byte key.
|
||||
3. **App-injected** — caller supplies a 32-byte raw key. Most flexible;
|
||||
plug your own KMS / HSM / Vault path here.
|
||||
|
||||
Wrong-passphrase detection is built in: a fingerprint of the storageKey
|
||||
is persisted in `shade_meta_enc` on first open and compared on every
|
||||
subsequent open. A mismatch raises with a clear error — never silently
|
||||
writing under the wrong key.
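
A minimal sketch of the passphrase path and the wrong-key check, using Node's built-in scrypt. The scrypt parameters are the documented defaults; the fingerprint construction (a labelled SHA-256 of the key) and all function names are assumptions, since the text does not say how the fingerprint is computed.

```typescript
import { createHash, scryptSync, timingSafeEqual } from 'node:crypto';

// Documented defaults: N = 2^17, r = 8, p = 1, dkLen = 32.
function deriveMasterKey(passphrase: string, salt: Uint8Array, N: number = 2 ** 17): Buffer {
  const r = 8;
  // scrypt needs roughly 128 * N * r bytes (128 MiB at the defaults), which
  // exceeds Node's default maxmem of 32 MiB, so the limit is raised here.
  return scryptSync(passphrase, salt, 32, { N, r, p: 1, maxmem: 256 * N * r });
}

// Hypothetical fingerprint: domain-separated SHA-256 of the storage key.
function keyFingerprint(storageKey: Uint8Array): Buffer {
  return createHash('sha256').update('fingerprint-sketch-v1').update(storageKey).digest();
}

// Compare a persisted fingerprint with the key derived on this open.
function assertSameKey(storedFingerprint: Uint8Array, storageKey: Uint8Array): void {
  const actual = keyFingerprint(storageKey);
  if (storedFingerprint.length !== actual.length || !timingSafeEqual(storedFingerprint, actual)) {
    throw new Error('storage key mismatch: wrong passphrase, salt, or key source');
  }
}
```

Failing fast on a fingerprint mismatch is what prevents the "silently writing under the wrong key" outcome: nothing is decrypted or written until the check passes.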
|
||||
|
||||
## Migration
|
||||
|
||||
CLI:
|
||||
|
||||
```bash
|
||||
# Encrypt an existing unencrypted DB (atomic per row, .bak written first).
|
||||
shade migrate-storage \
|
||||
--key-source passphrase \
|
||||
--passphrase "$SHADE_STORAGE_PASSPHRASE" \
|
||||
--salt-file /data/shade-client.db.salt
|
||||
|
||||
# Validate without writing.
|
||||
shade migrate-storage ... --dry-run
|
||||
|
||||
# Keychain mode.
|
||||
shade migrate-storage --key-source keychain \
|
||||
--keychain-service shade.storage --keychain-account default
|
||||
|
||||
# Inject a raw key (e.g. from your KMS).
|
||||
shade migrate-storage --key-source injected \
|
||||
--key-hex "$(cat ~/.shade/storage.key.hex)"
|
||||
```
|
||||
|
||||
The migration is *resumable*: re-running it on a partially-migrated DB
|
||||
re-writes the same rows under the same key (idempotent). On clean
|
||||
completion, the unencrypted tables are dropped (use `--keep-original`
|
||||
to preserve them).
|
||||
|
||||
## Rotation
|
||||
|
||||
```bash
|
||||
shade rotate-storage-key \
|
||||
--key-source passphrase --passphrase "$OLD_PASS" \
|
||||
--new-key-source passphrase --new-passphrase "$NEW_PASS" \
|
||||
--new-salt-file /data/shade-client.db.salt.new
|
||||
```
|
||||
|
||||
Reads each encrypted row under the old key, re-seals under the new key.
|
||||
The DB stays online; brief read-after-write inconsistency for in-flight
|
||||
readers is acceptable for the supported deployments (CLI tools,
|
||||
single-process servers). On completion the fingerprint is updated and
|
||||
the old key no longer opens the DB.
|
||||
|
||||
## What this does *not* protect
|
||||
|
||||
Even with at-rest enabled:
|
||||
|
||||
- A live process holds the storageKey and fieldKeys in memory. An attacker
|
||||
who can dump process memory (`/proc/<pid>/mem`, swap, hibernation,
|
||||
coredump) recovers the keys.
|
||||
- Swap is not encrypted by Shade. Use an encrypted swap device.
|
||||
- The `.bak` file produced during migration is plaintext during the
|
||||
migration window. Treat it like the original DB and store securely.
|
||||
- Lost master key = lost DB. V3.10 (Social Recovery) is the long-term
|
||||
mitigation.
|
||||
|
||||
See `THREAT-MODEL.md` §4 for the full list, including the "with at-rest
|
||||
enabled" boundary.
|
||||
|
||||
## Cross-implementation parity
|
||||
|
||||
`test-vectors/storage-encryption.json` pins KDF parameters, info strings,
|
||||
nonce derivation, and AAD format. The Android implementation (V3.5) MUST
|
||||
produce byte-identical outputs for the same inputs — covered by
|
||||
`packages/shade-storage-encrypted/tests/test-vectors.test.ts`.
|
||||
253
docs/streams.md
@@ -107,11 +107,264 @@ manually after rotation.
|
||||
| S7 | seq overflow practically impossible (u64 max) |
|
||||
| S8 | At-rest streamSecret encrypted under device-key |
|
||||
|
||||
## Hardening
|
||||
|
||||
`@shade/streams` ships unbounded by default — a peer can declare a
|
||||
1 PiB transfer and the receiver will dutifully allocate lane state for
|
||||
it. Production receivers must enforce limits at the boundary. The
|
||||
`@shade/files` package wires the same patterns up for its filesystem
|
||||
RPC; copy the shapes that fit your app.
|
||||
|
||||
### Per-stream caps
|
||||
|
||||
The receiver sees the declared plaintext size in the `stream-init`
|
||||
control message before it accepts. Reject above your tolerance:
|
||||
|
||||
```ts
shade.onIncomingTransfer(async (incoming) => {
  if (incoming.metadata.totalBytes > 256 * 1024 * 1024) {
    await incoming.decline({ reason: 'stream too large' });
    return;
  }
  await incoming.accept({ output: ... });
});
```
|
||||
|
||||
Recommended ceilings (tune to your product, not these):
|
||||
|
||||
| Tier | totalBytes ceiling | Rationale |
|
||||
|------|--------------------|-----------|
|
||||
| Chat attachment | 25 MiB | matches mobile MMS / Slack expectations |
|
||||
| Photo / doc share | 256 MiB | covers RAW photos + most desktop docs |
|
||||
| Backup / dataset | 4 GiB | larger needs explicit operator opt-in |
|
||||
|
||||
### Per-chunk cap
|
||||
|
||||
`createTransferRoutes` accepts `maxChunkBytes` (default ≈ 16 MiB +
header). Lower it if your sink can't absorb that: the receiver
responds with HTTP 413 to anything over the limit before the chunk is
decrypted, which keeps DoS cost bounded.
|
||||
|
||||
### Per-sender quotas
|
||||
|
||||
`@shade/files` ships a `RateLimiter` (`packages/shade-files/src/server/rate-limiter.ts`)
|
||||
that enforces both ops-per-window and bytes-per-hour caps per sender
|
||||
address. The same shape is the recommended template for guarding raw
|
||||
streams: wrap `incoming.accept` in a check that consumes from a token
|
||||
bucket keyed by `incoming.fromAddress`, and reject with `decline()`
|
||||
when the bucket is empty. See
|
||||
`packages/shade-files/tests/security/quota.test.ts` for the test
|
||||
shape.
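
A token bucket keyed by sender address can be sketched as follows. This is a hypothetical stand-in, not the `RateLimiter` from `@shade/files`; the class name, constructor parameters, and injected clock are assumptions chosen to keep the example self-contained and testable.

```typescript
// Hypothetical per-sender token bucket; not the @shade/files RateLimiter.
class SenderBuckets {
  private readonly buckets = new Map<string, { tokens: number; updatedAt: number }>();
  private readonly capacity: number;    // maximum burst, in tokens
  private readonly refillPerMs: number; // tokens added per millisecond
  private readonly now: () => number;   // injectable clock for tests

  constructor(capacity: number, refillPerMs: number, now: () => number = Date.now) {
    this.capacity = capacity;
    this.refillPerMs = refillPerMs;
    this.now = now;
  }

  // Returns true and deducts `cost` tokens when the sender has enough budget.
  tryConsume(sender: string, cost: number = 1): boolean {
    const t = this.now();
    const b = this.buckets.get(sender) ?? { tokens: this.capacity, updatedAt: t };
    // Refill for the elapsed time, capped at the bucket capacity.
    b.tokens = Math.min(this.capacity, b.tokens + (t - b.updatedAt) * this.refillPerMs);
    b.updatedAt = t;
    this.buckets.set(sender, b);
    if (b.tokens < cost) return false;
    b.tokens -= cost;
    return true;
  }
}
```

Inside an `onIncomingTransfer` handler, the check would run before `incoming.accept`: when `tryConsume(incoming.fromAddress)` returns `false`, call `incoming.decline(...)` and return. A production version would also evict idle entries so the map itself cannot grow without bound.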
|
||||
|
||||
### TTL on idle streams
|
||||
|
||||
A `paused` stream-state record consumes a row in your storage and an
|
||||
encrypted streamSecret slot until it expires. Use the **Retention**
|
||||
defaults below to expire abandoned streams; pair with a metric
|
||||
(`shade_stream_states_active`) and an alert when the count grows
|
||||
unbounded. A peer that opens streams and never finishes them is the
|
||||
dominant abuse pattern for resumable transfer.
|
||||
|
||||
### Trust gates
|
||||
|
||||
For high-stakes transfers (backups, key material, internal docs),
|
||||
gate `accept()` on a verified fingerprint. The pattern mirrors
|
||||
`@shade/files`'s fingerprint gate — see
|
||||
`packages/shade-files/tests/security/fingerprint-gate.test.ts`.
|
||||
|
||||
## Retention
|
||||
|
||||
Resumable streams persist a `PersistedStreamState` per in-flight
|
||||
transfer, encrypted under a device key. Without retention, every
|
||||
crashed or abandoned upload leaves a row behind forever.
|
||||
|
||||
### Defaults
|
||||
|
||||
The shipped `bun-server` SDK template (`shade init --template bun-server`)
|
||||
schedules `pruneStreamStates` on a daily cron with a **14-day**
|
||||
horizon. That is: any stream-state record whose `updatedAt` is older
|
||||
than 14 days is removed at the next sweep. If a sender tries to resume
a stream that has been idle longer than that, it gets a "no state" 404
and starts over, which is the right answer for a transfer that has been
idle for two weeks.
|
||||
|
||||
### Tuning the horizon
|
||||
|
||||
Set `SHADE_STREAM_RETENTION_DAYS` in the template's environment to
|
||||
override the 14-day default. Recommended ranges:
|
||||
|
||||
| Use case | Horizon | Why |
|
||||
|----------|---------|-----|
|
||||
| Synchronous chat | 1–3 days | resume-after-crash, not resume-after-vacation |
|
||||
| File-share product | 7–14 days | covers a typical user vacation |
|
||||
| Cold backup target | 30+ days | deliberate, but plan for storage growth |
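
How the template parses the variable is not shown here, so the following is only a sketch of one reasonable approach: read `SHADE_STREAM_RETENTION_DAYS`, fall back to 14 on missing or invalid input, and turn the result into the cutoff passed to `pruneStreamStates`. The helper names are hypothetical.

```typescript
const DEFAULT_RETENTION_DAYS = 14;
const DAY_MS = 24 * 60 * 60 * 1000;

// Parse the raw env value; anything missing, non-numeric, or <= 0 uses the default.
function retentionDays(raw: string | undefined): number {
  if (raw === undefined || raw.trim() === '') return DEFAULT_RETENTION_DAYS;
  const n = Number(raw);
  return Number.isFinite(n) && n > 0 ? n : DEFAULT_RETENTION_DAYS;
}

// Cutoff timestamp: records with `updatedAt` strictly below this are pruned.
function pruneCutoff(nowMs: number, raw: string | undefined): number {
  return nowMs - retentionDays(raw) * DAY_MS;
}
```

A caller would pass `process.env.SHADE_STREAM_RETENTION_DAYS` as `raw`. Falling back to the default on bad input avoids an accidental horizon of zero, which would prune every in-flight stream at the next sweep.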
|
||||
|
||||
### Hooking the prune call manually
|
||||
|
||||
If you bring your own server (no `bun-server` template), call the
|
||||
storage method on your own schedule:
|
||||
|
||||
```ts
import { setInterval } from 'node:timers';

const ONE_DAY_MS = 24 * 60 * 60 * 1000;
const HORIZON_MS = 14 * ONE_DAY_MS;

setInterval(async () => {
  if (storage.pruneStreamStates !== undefined) {
    await storage.pruneStreamStates(Date.now() - HORIZON_MS);
  }
}, ONE_DAY_MS);
```
|
||||
|
||||
`pruneStreamStates(olderThan)` removes records whose `updatedAt` is
|
||||
strictly less than `olderThan`. It is idempotent and safe to call
|
||||
concurrently.
|
||||
|
||||
## Rich file metadata + previews (V3.9)
|
||||
|
||||
`stream-init` plaintext can carry an optional `fileMetadata` field that
|
||||
ships filename, MIME-type, and a thumbnail-stream pointer **end-to-end
|
||||
encrypted**. Older receivers ignore the field — backwards-compatible
|
||||
with 0.2.x / 0.3.x peers.
|
||||
|
||||
```jsonc
|
||||
{
|
||||
"kind": "shade.stream-init/v1",
|
||||
"streamId": "...",
|
||||
"streamSecret": "...",
|
||||
"metadata": {
|
||||
"chunkSize": 1048576,
|
||||
"sentAt": 1730000000000,
|
||||
"fileMetadata": {
|
||||
"filename": "report.pdf",
|
||||
"mimeType": "application/pdf",
|
||||
"thumbnailStreamId": "Ej1z...",
|
||||
"thumbnailHash": "9a7c...",
|
||||
"thumbnailMime": "image/webp",
|
||||
"thumbnailBytes": 18342
|
||||
}
|
||||
},
|
||||
"lanes": [ /* ... */ ]
|
||||
}
|
||||
```
|
||||
|
||||
### What rides where
|
||||
|
||||
| Field | Plane | Visible to server? |
|
||||
|-------|-------|--------------------|
|
||||
| `filename` | inside Double Ratchet plaintext | no |
|
||||
| `mimeType` | inside Double Ratchet plaintext | no |
|
||||
| `thumbnailStreamId` | streamId of companion stream | yes (random ID, no info leak) |
|
||||
| `thumbnailHash` | sha256 of preview plaintext | base64 hash only, no pixels |
|
||||
| `thumbnailMime` | one of `image/jpeg / image/webp / image/png` | yes (allowlist enforced) |
|
||||
| `thumbnailBytes` | declared length, capped at 64 KiB | yes |
|
||||
| thumbnail bytes themselves | separate AEAD stream, own lane | no |
|
||||
|
||||
The thumbnail rides as its **own stream-transfer**, keyed independently
|
||||
from the main stream. A server compromise leaks neither preview pixels
|
||||
nor original bytes.
|
||||
|
||||
### Sender — attach a preview
|
||||
|
||||
```ts
|
||||
// Pre-computed preview (server-side pipeline path):
|
||||
await shade.upload({
|
||||
to: 'bob',
|
||||
input: pdfBytes,
|
||||
thumbnail: { bytes: previewWebp, mime: 'image/webp' },
|
||||
metadata: { fileMetadata: { filename: 'report.pdf', mimeType: 'application/pdf' } },
|
||||
});
|
||||
|
||||
// Browser auto-generation (image File / Blob → 256×256 preview):
|
||||
await shade.upload({
|
||||
to: 'bob',
|
||||
input: imageFile, // a `File` from <input type="file">
|
||||
generateThumbnail: true, // OffscreenCanvas + createImageBitmap
|
||||
});
|
||||
```
|
||||
|
||||
`generateThumbnail` is a no-op on runtimes lacking
|
||||
`OffscreenCanvas + createImageBitmap` (Bun, Node) — those callers should
|
||||
pre-generate and pass `thumbnail` directly, or skip the preview entirely.
|
||||
|
||||
### Receiver — render in widgets
|
||||
|
||||
The bundled `@shade/widgets` `useShadeDownload` hook auto-accepts
|
||||
thumbnail streams (marked by `userMetadata.shadeThumbnail = '1'`) into
|
||||
an in-memory `ShadeThumbnailCache`. `<TransferRow showThumbnail
|
||||
fileMetadata={...} />` reads from the same cache and renders inside an
|
||||
`<img>` element so the browser's image-decoding sandbox is the trust
|
||||
boundary for format parsing.
|
||||
|
||||
```tsx
|
||||
<ShadeThumbnailProvider>
|
||||
<TransferRow
|
||||
handle={handle}
|
||||
progress={progress}
|
||||
showThumbnail
|
||||
fileMetadata={incoming.metadata.fileMetadata}
|
||||
/>
|
||||
</ShadeThumbnailProvider>
|
||||
```
|
||||
|
||||
### Format-hardening (sender + receiver)
|
||||
|
||||
Both sides enforce the same rules — single source of truth in
|
||||
`@shade/streams/file-metadata.ts`:
|
||||
|
||||
| Rule | Limit |
|------|-------|
| `thumbnailMime` allowlist | `image/jpeg`, `image/webp`, `image/png` |
| `thumbnailBytes` cap | 64 KiB (`THUMBNAIL_MAX_BYTES`) |
| `filename` length | ≤ 1024 chars, no control characters |
| `mimeType` shape | RFC 7231 `type/subtype` token |
| Hash binding | declared `thumbnailHash` = sha256(preview bytes); mismatched bytes are dropped at the cache before any render |
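
The field rules in the table can be sketched as one pure validator. This is a minimal illustration, not the code in `@shade/streams/file-metadata.ts`: the function name, the `FileMetadataSketch` type, and the error strings are all hypothetical, and the hash-binding check is left out.

```ts
// Hypothetical standalone validator; the names below are illustrative and
// are not the actual exports of @shade/streams/file-metadata.ts.
const THUMBNAIL_MAX_BYTES = 64 * 1024;
const THUMBNAIL_MIME_ALLOWLIST = ['image/jpeg', 'image/webp', 'image/png'];
const FILENAME_MAX_CHARS = 1024;

// RFC 7231 "token" characters on both sides of a single slash.
const MIME_TOKEN = /^[!#$%&'*+.^_`|~0-9A-Za-z-]+\/[!#$%&'*+.^_`|~0-9A-Za-z-]+$/;
// C0 control characters plus DEL.
const CONTROL_CHARS = /[\u0000-\u001f\u007f]/;

interface FileMetadataSketch {
  filename: string;
  mimeType: string;
  thumbnailMime?: string;
  thumbnailBytes?: number;
}

/** Returns a list of rule violations; an empty list means the metadata passes. */
function validateFileMetadata(meta: FileMetadataSketch): string[] {
  const errors: string[] = [];
  if (meta.filename.length > FILENAME_MAX_CHARS) errors.push('filename too long');
  if (CONTROL_CHARS.test(meta.filename)) errors.push('filename has control characters');
  if (!MIME_TOKEN.test(meta.mimeType)) errors.push('mimeType is not type/subtype');
  if (meta.thumbnailMime !== undefined &&
      THUMBNAIL_MIME_ALLOWLIST.indexOf(meta.thumbnailMime) === -1) {
    errors.push('thumbnailMime not in allowlist');
  }
  if (meta.thumbnailBytes !== undefined) {
    const n = meta.thumbnailBytes;
    if (Math.floor(n) !== n || n < 0 || n > THUMBNAIL_MAX_BYTES) {
      errors.push('thumbnailBytes out of range');
    }
  }
  return errors;
}
```

Running the same function on the sender before upload and on the receiver at decode time is one way to keep both sides on identical rules.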
|
||||
|
||||
A hostile peer cannot:
|
||||
- smuggle exotic image formats past the allowlist (envelope parser
|
||||
rejects at decode-time),
|
||||
- substitute different bytes for a declared preview (cache verifies
|
||||
sha256 before exposing bytes to a renderer),
|
||||
- inflate the cache to OOM the receiver (LRU + 1 MiB total cap).
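
The "LRU + 1 MiB total cap" defence can be pictured with a small byte-capped cache. This is a sketch under assumptions: `ByteCappedLruCache` is a hypothetical name, and the real `ShadeThumbnailCache` may track entries differently.

```ts
// Hypothetical sketch; ShadeThumbnailCache's real internals may differ.
const CACHE_MAX_TOTAL_BYTES = 1024 * 1024; // 1 MiB total cap from the text

class ByteCappedLruCache {
  // Map preserves insertion order, so the first key is the least recently used.
  private entries = new Map<string, Uint8Array>();
  private totalBytes = 0;
  private readonly maxTotalBytes: number;

  constructor(maxTotalBytes: number = CACHE_MAX_TOTAL_BYTES) {
    this.maxTotalBytes = maxTotalBytes;
  }

  get(key: string): Uint8Array | undefined {
    const value = this.entries.get(key);
    if (value === undefined) return undefined;
    // Re-insert to mark the entry as most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  /** Returns false when the value alone exceeds the cap and is rejected. */
  set(key: string, value: Uint8Array): boolean {
    if (value.byteLength > this.maxTotalBytes) return false;
    const previous = this.entries.get(key);
    if (previous !== undefined) {
      this.totalBytes -= previous.byteLength;
      this.entries.delete(key);
    }
    // Evict least recently used entries until the new value fits.
    while (this.totalBytes + value.byteLength > this.maxTotalBytes) {
      const oldestKey = this.entries.keys().next().value as string;
      const oldest = this.entries.get(oldestKey) as Uint8Array;
      this.totalBytes -= oldest.byteLength;
      this.entries.delete(oldestKey);
    }
    this.entries.set(key, value);
    this.totalBytes += value.byteLength;
    return true;
  }

  get size(): number {
    return this.totalBytes;
  }
}
```

Because the cap is on total bytes rather than entry count, a peer that sends many maximum-size previews only displaces older previews; it cannot grow the cache.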
|
||||
|
||||
### Risks consciously accepted
|
||||
|
||||
- **Preview-arrival ≠ send completion.** A receiver may see the
|
||||
thumbnail before the main upload finishes. For high-stakes flows
|
||||
where "did Alice send X?" is itself sensitive, send the preview
|
||||
*only* after main completion (set `thumbnail` to `null` and instead
|
||||
ship a follow-up `stream-init` with the preview). The default
|
||||
ordering optimizes UX, not metadata-secrecy.
|
||||
- **Renderer trust.** We render through a Blob-URL `<img>`. A 0-day
|
||||
in the browser's image decoder would still reach the receiver. Keep
|
||||
browsers patched; rely on the CSP of your embedding app.
|
||||
|
||||
## API surface
|
||||
|
||||
See package READMEs:
|
||||
|
||||
- `packages/shade-streams/README.md` — crypto + state machines
|
||||
- `packages/shade-transfer/README.md` — orchestration, transports, persistence
|
||||
- `packages/shade-transport-webrtc/README.md` — V3.11 P2P transport plug-in
|
||||
- `packages/shade-sdk/README.md` — magic drop-in
|
||||
- `packages/shade-widgets/README.md` — React UI
|
||||
|
||||
## Transports
|
||||
|
||||
`@shade/transfer` ships HTTP + WebSocket chunk transports. V3.11 adds an
|
||||
opt-in P2P chunk transport via `RTCDataChannel`:
|
||||
|
||||
- HTTP: `ShadeTransferHttpTransport`. POST per chunk; the receiver-side
  route is `app.route('/v1/transfer', await shade.transferRoute())`.
- WebSocket: `ShadeTransferWsTransport`. One connection per peer,
  binary-framed chunks, JSON acks; same wire format inside the frame as
  the WebRTC transport.
- WebRTC: `WebRtcTransferTransport` from `@shade/transport-webrtc`.
  Wired automatically by `shade.configureWebRTC()` as the primary
  layer of a `MultiTransportFallback([webrtc, http])`. See
  [docs/webrtc.md](./webrtc.md).
|
||||
|
||||
`MultiTransportFallback` is the N-ary generalisation of
`FallbackTransferTransport`: pass an ordered list of named transports,
and on a `TransferTransportError` the engine demotes to the next one in
the list and stays there (the demotion is sticky).
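
A rough sketch of sticky demotion over an ordered transport list follows. It is deliberately synchronous to stay short, whereas the real transports are asynchronous; `StickyFallback`, `NamedTransport`, and the local `TransferTransportError` class are stand-ins, not the `@shade/transfer` API.

```ts
// Simplified, synchronous sketch. The real transports are asynchronous and
// the class/interface shapes here are assumptions, not the @shade/transfer API.
class TransferTransportError extends Error {
  constructor(message: string) {
    super(message);
    // Keeps `instanceof` working when compiled to older targets.
    Object.setPrototypeOf(this, TransferTransportError.prototype);
  }
}

interface NamedTransport {
  name: string;
  sendChunk(chunk: Uint8Array): void; // throws TransferTransportError on failure
}

class StickyFallback {
  private active = 0; // index of the transport currently in use
  private readonly transports: NamedTransport[];

  constructor(transports: NamedTransport[]) {
    if (transports.length === 0) throw new Error('need at least one transport');
    this.transports = transports;
  }

  get activeName(): string {
    return this.transports[this.active].name;
  }

  sendChunk(chunk: Uint8Array): void {
    for (;;) {
      try {
        this.transports[this.active].sendChunk(chunk);
        return;
      } catch (err) {
        const isTransportError = err instanceof TransferTransportError;
        const hasNext = this.active + 1 < this.transports.length;
        if (!isTransportError || !hasNext) throw err;
        this.active += 1; // sticky: never promoted back to an earlier transport
      }
    }
  }
}
```

The failed chunk is retried on the next transport before `sendChunk` returns, so the caller does not need to resend it.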
|
||||
|
||||
224
docs/transport.md
Normal file
@@ -0,0 +1,224 @@
|
||||
# Shade Transport — Bridge Layer (V3.7)
|
||||
|
||||
> **Looking for V3.11 (peer-to-peer chunk transport via `RTCDataChannel`)?**
|
||||
> See [docs/webrtc.md](./webrtc.md). This page covers the V3.7 bridge
|
||||
> layer that ships ciphertext *envelopes* (control plane) over
|
||||
> WS / SSE / long-poll. The two are orthogonal: the bridge handles
|
||||
> store-and-forward control envelopes; WebRTC handles direct chunk data.
|
||||
|
||||
The bridge layer is the answer to: **"my client is a browser extension /
|
||||
strict-corp-proxy / edge-runtime / iOS app — I cannot keep a WebSocket
|
||||
open. How do I receive ciphertext envelopes?"**
|
||||
|
||||
It is built on top of the V3.6 inbox: every transport delivers the same
|
||||
inbox blobs, with the same authentication semantics. Application code
|
||||
sees a single `IncomingMessage` shape and never branches on transport.
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────┐
│                        application code                         │
│                                                                 │
│     bridge.connect({ onMessage: (m) => decrypt(m.bytes) })      │
└────────────────────────────────┬────────────────────────────────┘
                                 │
       ┌─────────────────────────┴──────────────────────────┐
       │              FallbackBridgeTransport               │
       │            (sticky-after-first-success)            │
       └──┬──────────────────┬─────────────────────────┬────┘
          │                  │                         │
   ┌──────▼─────┐     ┌──────▼─────┐            ┌──────▼─────┐
   │  WsBridge  │     │ SseBridge  │            │  LongPoll  │
   │    /v1/    │     │    /v1/    │            │   Bridge   │
   │ bridge/ws  │     │  bridge/   │            │ /v1/bridge │
   │            │     │   stream   │            │   /poll    │
   └──────┬─────┘     └──────┬─────┘            └──────┬─────┘
          │                  │                         │
          └──────────────────┼─────────────────────────┘
                             │
                       ┌─────▼──────┐
                       │   inbox    │  ← the same V3.6 store
                       │   blobs    │    and events
                       └────────────┘
|
||||
```
|
||||
|
||||
## When to reach for which
|
||||
|
||||
| Transport | Latency | Proxy resilience | Browser | Server cost |
|-----------|---------|------------------|---------|-------------|
| WebSocket | ms | breaks under strict CONNECT-blocking proxies | ✓ | one socket per client |
| SSE | ms | passes most HTTP proxies (text/event-stream) | ✓ | one streamed response per client |
| long-poll | ≤ 25 s | passes anything that allows GET | ✓ | one held request per client |
|
||||
|
||||
The recommended composition:
|
||||
|
||||
```ts
|
||||
import {
|
||||
FallbackBridgeTransport,
|
||||
WsBridge,
|
||||
SseBridge,
|
||||
LongPollBridge,
|
||||
} from '@shade/transport-bridge';
|
||||
|
||||
const auth = {
|
||||
crypto, // CryptoProvider
|
||||
signingPrivateKey, // recipient's Ed25519 private key
|
||||
address: 'bob',
|
||||
};
|
||||
|
||||
const bridge = new FallbackBridgeTransport([
|
||||
new WsBridge({ baseUrl: 'https://relay.example.com', auth }),
|
||||
new SseBridge({ baseUrl: 'https://relay.example.com', auth }),
|
||||
new LongPollBridge({ baseUrl: 'https://relay.example.com', auth }),
|
||||
]);
|
||||
|
||||
await bridge.connect({
  onMessage: async (msg) => {
    // msg.bytes is a Uint8Array; pass it to your decrypt path.
    // msg.from is the relay-known sender hint (may be empty); the
    // authoritative sender comes from the decrypted envelope.
    // msg.msgId is the relay's deterministic message id (sha256(ciphertext)).
    // `decodeEnvelope` and `senderAddress` are app-supplied and not shown here.
    const envelope = decodeEnvelope(msg.bytes);
    await shade.receive(senderAddress, envelope);
  },
});
|
||||
|
||||
// Read which transport the fallback chain settled on:
|
||||
console.log(bridge.activeKind); // "ws" | "sse" | "long-poll"
|
||||
```
|
||||
|
||||
## The IncomingMessage shape
|
||||
|
||||
```ts
|
||||
interface IncomingMessage {
|
||||
from: string; // relay-side sender hint (may be "")
|
||||
bytes: Uint8Array; // the ciphertext envelope, exactly as PUT
|
||||
receivedAt: number; // relay-monotonic cursor — NOT wall-clock arrival
|
||||
msgId?: string; // sha256(bytes) — useful for ack/dedup
|
||||
}
|
||||
```
|
||||
|
||||
`from` is intentionally a hint — sender provenance lives inside the
|
||||
encrypted envelope and is recovered post-decrypt. The bridge layer is
|
||||
plaintext-blind by design.
|
||||
|
||||
## Auth — signed query parameters
|
||||
|
||||
Every bridge request signs the canonical
|
||||
`{address, kind, since, signedAt}` payload with the recipient's Ed25519
|
||||
signing private key. The server looks up the address-owner key
|
||||
registered via `/v1/inbox/register` and verifies the signature.
|
||||
|
||||
`kind` is bound into the canonical payload so a signature for `/poll`
|
||||
cannot be replayed against `/stream` or `/ws`.
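
To see why binding `kind` blocks cross-endpoint replay, here is a sketch of a canonical payload builder. The encoding (a JSON array of pairs in fixed order) and the `kind` values are assumptions made for illustration; this document does not specify the bridge protocol's actual byte layout.

```ts
// Hypothetical canonicalisation: fixed key order, JSON-encoded. The actual
// byte layout used by the bridge protocol is not specified in this document.
type BridgeKind = 'ws' | 'stream' | 'poll'; // assumed values, mirroring the routes

interface BridgeAuthFields {
  address: string;
  kind: BridgeKind;
  since: number;    // cursor the client wants to resume from
  signedAt: number; // client clock, ms since epoch
}

/** Serialises the fields in one fixed order so signer and verifier agree. */
function canonicalBridgePayload(f: BridgeAuthFields): string {
  return JSON.stringify([
    ['address', f.address],
    ['kind', f.kind],
    ['since', f.since],
    ['signedAt', f.signedAt],
  ]);
}
```

The Ed25519 signature would be computed over these bytes. Two requests that differ only in `kind` yield different payloads, so a signature captured from one endpoint fails verification on another.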
|
||||
|
||||
The browser `EventSource` API does not let callers attach custom
|
||||
headers; query parameters are the only portable carrier and so the
|
||||
bridge protocol uses them uniformly across all three transports.
|
||||
|
||||
## Server-side — `createBridgeRoutes`
|
||||
|
||||
```ts
|
||||
import { createBridgeRoutes } from '@shade/inbox-server';
|
||||
import { Hono } from 'hono';
|
||||
|
||||
const inbox = new MemoryInboxStore();
|
||||
const events = new InboxServerEvents();
|
||||
|
||||
const bridge = createBridgeRoutes({
|
||||
store: inbox,
|
||||
crypto,
|
||||
events,
|
||||
longPollTimeoutMs: 25_000, // default — under typical proxy idle limits
|
||||
heartbeatIntervalMs: 15_000, // SSE keepalive comments
|
||||
fallbackPollIntervalMs: 1_000, // when no `events` emitter is wired
|
||||
});
|
||||
|
||||
const app = new Hono();
|
||||
app.route('/', bridge.app);
|
||||
|
||||
Bun.serve({
|
||||
port: 3900,
|
||||
fetch: (req, srv) => app.fetch(req, srv),
|
||||
websocket: bridge.websocket as any,
|
||||
});
|
||||
```
|
||||
|
||||
The bridge subscribes to `InboxServerEvents` (`inbox.blob_stored`) for
|
||||
push-style delivery — when an event fires for a connected address, the
|
||||
server fetches new blobs and forwards them. If no events emitter is
|
||||
wired, the server falls back to a small in-process polling timer at
|
||||
`fallbackPollIntervalMs` cadence.
|
||||
|
||||
## Cursor & resume
|
||||
|
||||
Every `IncomingMessage.receivedAt` is the relay's monotonic cursor for
|
||||
the address. Bridges expose `getCursor()` so applications can persist
|
||||
the high-water mark and pass it as `startCursor` on the next
|
||||
`connect()`:
|
||||
|
||||
```ts
|
||||
const sse = new SseBridge({
|
||||
baseUrl,
|
||||
auth,
|
||||
startCursor: await persistedCursor.load(),
|
||||
});
|
||||
|
||||
await sse.connect({
|
||||
onMessage: async (msg) => {
|
||||
await persistedCursor.save(msg.receivedAt);
|
||||
// …
|
||||
},
|
||||
});
|
||||
```
|
||||
|
||||
For SSE specifically, the server emits an `id:` field per event; the
|
||||
bridge sends it back as `Last-Event-ID` plus the `since=` query
|
||||
parameter on reconnect, so a flapping connection picks up exactly where
|
||||
it left off without duplicates.
|
||||
|
||||
## Reconnect & backoff
|
||||
|
||||
| Bridge | Auto-reconnect | Backoff |
|--------|----------------|---------|
| WS | yes (default) | 250 ms → 10 s exponential |
| SSE | yes (default) | 250 ms → 10 s exponential |
| long-poll | always on (the loop *is* the reconnect) | 2 s on hard error |
|
||||
|
||||
Pass `disableAutoReconnect: true` (WS / SSE) for tests where you want a
|
||||
single attempt and immediate surfaced error.
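
The WS / SSE backoff schedule can be approximated as below. Only the 250 ms floor and the 10 s ceiling come from the table; the doubling factor and the lack of jitter are assumptions, and `reconnectDelayMs` is an illustrative name.

```ts
// Sketch only: the doubling factor and the absence of jitter are assumptions;
// the text states just the 250 ms floor and the 10 s ceiling.
const BACKOFF_INITIAL_MS = 250;
const BACKOFF_MAX_MS = 10_000;

/** Delay before reconnect attempt number `attempt` (0-based). */
function reconnectDelayMs(attempt: number): number {
  const uncapped = BACKOFF_INITIAL_MS * Math.pow(2, attempt);
  return Math.min(uncapped, BACKOFF_MAX_MS);
}
```

Under these assumptions the delays run 250, 500, 1000, 2000, 4000, 8000 ms and then stay at 10 s.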
|
||||
|
||||
## Long-poll concurrency
|
||||
|
||||
The `LongPollBridge` issues exactly one request at a time. The next
request fires after the previous one resolves. This guarantees a
client never holds more than one long-poll request open on the server,
which matches the V3.7 acceptance criterion and keeps capacity planning
simple: max in-flight long-poll requests = number of connected clients.
|
||||
|
||||
## Failure modes
|
||||
|
||||
- **WS handshake rejected (4xxx code).** `WsBridge.connect` rejects.
|
||||
Caller (or `FallbackBridgeTransport`) moves on.
|
||||
- **SSE returns non-200.** `SseBridge.connect` throws a `BridgeError`
|
||||
with `httpStatus`.
|
||||
- **Long-poll returns non-200.** Same — `BridgeError` with `httpStatus`.
|
||||
- **Mid-stream error after connect.** WS/SSE auto-reconnect; long-poll
|
||||
swallows transient errors and continues looping. Errors flow to the
|
||||
caller's `onError` handler.
|
||||
|
||||
## Acceptance test coverage (V3.7)
|
||||
|
||||
`packages/shade-transport-bridge/tests/bridge.test.ts` covers:
|
||||
|
||||
- "Send 100 small messages" — one test per transport, all pass.
|
||||
- "WS blocked by proxy → SSE → long-poll" — fallback test boots a
|
||||
server where the WS endpoint is unreachable and the SSE endpoint
|
||||
returns 502, verifies the chain falls all the way through to
|
||||
long-poll without message loss.
|
||||
- "Long-poll uses ≤ 1 outstanding request" — wraps `fetch` to count
|
||||
in-flight requests over 1.5 s of steady-state operation.
|
||||
- Cursor resume — tears down an SSE connection mid-stream, pushes more
|
||||
blobs, reconnects with the persisted cursor, asserts exactly the new
|
||||
blobs are delivered (no overlap with the pre-disconnect set).
|
||||
- Auth rejection — wrong signing key and unregistered address both
|
||||
produce hard `connect` rejections so the fallback chain advances.
|
||||
156
docs/trust-ux.md
Normal file
@@ -0,0 +1,156 @@
|
||||
# Trust UX — Fingerprint Gates (V3.3)
|
||||
|
||||
> Status: shipped in 0.4.0, GA-frozen in 4.0 — see [V3.3 plan](./archive/V3.3.md).
|
||||
|
||||
Shade ships with a small number of **blocking** verification gates that
|
||||
fire automatically before the operations where MITM risk is highest.
|
||||
Each gate calls a handler you register on the SDK; until the user (or
|
||||
your handler) approves, the operation aborts with
|
||||
`FingerprintNotVerifiedError`.
|
||||
|
||||
The point of the gate model is to be alert-fatigue-free: you don't see
|
||||
a prompt before every chat message, just before the handful of moments
|
||||
that genuinely matter.
|
||||
|
||||
---
|
||||
|
||||
## What the gates protect
|
||||
|
||||
| Gate | Fires when | Default policy |
|------|------------|----------------|
| `first-large-file` | `Shade.upload(...)` for an unverified peer with a known size at or above the configured threshold. | Threshold `10 MiB`. Below = no gate. |
| `backup-import` | `Shade.importBackup(...)` before any state is written. Handler receives the fingerprint of the identity *embedded in the backup*. | Always fires. |
| `new-device-trust` | `Shade.acceptIdentityChange(...)` after a peer rotates identity. The peer's `identity_version` is bumped first so any prior verification is automatically stale. | Always fires. |
| `inbox-fanout` | Reserved for V3.6 (`@shade/inbox`). Per-recipient hook is wired today so apps can register it now. | Always fires. |
|
||||
|
||||
---
|
||||
|
||||
## Registering handlers
|
||||
|
||||
```ts
|
||||
const shade = await createShade({
|
||||
prekeyServer: 'https://prekeys.example.com',
|
||||
storage: 'sqlite:/data/shade.db',
|
||||
});
|
||||
|
||||
shade.beforeFirstLargeFile(10 * 1024 * 1024, async (ctx) => {
|
||||
// ctx.peerAddress, ctx.fingerprint, ctx.fileSize
|
||||
return await ui.confirmFingerprintModal(ctx);
|
||||
});
|
||||
|
||||
shade.beforeBackupImport(async (ctx) => {
|
||||
// ctx.fingerprint = fingerprint of the identity in the backup blob
|
||||
return await ui.confirmBackupOwner(ctx);
|
||||
});
|
||||
|
||||
shade.beforeNewDeviceTrust(async (ctx) => {
|
||||
// ctx.fingerprint = fingerprint of the rotated identity
|
||||
return await ui.confirmDeviceRotation(ctx);
|
||||
});
|
||||
```
|
||||
|
||||
Return `true` to allow the operation and persist a `'user'` verification.
|
||||
Return `false` (or throw) to abort with `FingerprintNotVerifiedError`.
|
||||
|
||||
If you don't register a handler, the gate **logs a one-time warning per
|
||||
peer and proceeds on TOFU**, persisting a `'tofu-after-warning'`
|
||||
verification. This satisfies the V3.3 acceptance criterion that apps
|
||||
without registered gates get sane defaults instead of hard-failing — but
|
||||
it does mean the gate is informational, not a hard wall, in that
|
||||
configuration. Always register handlers in production.
|
||||
|
||||
---
|
||||
|
||||
## Manual verification
|
||||
|
||||
The handler model assumes your app drives the OOB compare/confirm
|
||||
flow. If the user verifies through some other path (QR code scan, audio
|
||||
read-aloud, transitive trust from V3.10), call:
|
||||
|
||||
```ts
|
||||
await shade.markPeerVerified('bob'); // pin current fingerprint
|
||||
await shade.unmarkPeerVerified('bob'); // revoke
|
||||
const ok = await shade.isPeerVerified('bob'); // check status
|
||||
```
|
||||
|
||||
`markPeerVerified` reads the peer's *current* fingerprint and pins it
|
||||
together with the per-peer `identity_version`. When the peer rotates
|
||||
(`acceptIdentityChange`), the version bumps and the saved verification
|
||||
goes stale automatically — `isPeerVerified` will return `false` until
|
||||
the user re-verifies.
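
The version-pinning behaviour can be modelled in a few lines. This is an in-memory sketch with hypothetical names (`VerificationBook`, `PeerState`); it does not reflect how Shade persists verifications.

```ts
// Hypothetical in-memory model; field and class names are illustrative and
// do not mirror Shade's storage schema.
interface PeerState {
  fingerprint: string;
  identityVersion: number;
}

class VerificationBook {
  private peers = new Map<string, PeerState>();
  private pins = new Map<string, PeerState>();

  setPeer(address: string, fingerprint: string): void {
    this.peers.set(address, { fingerprint, identityVersion: 1 });
  }

  markPeerVerified(address: string): void {
    const peer = this.peers.get(address);
    if (!peer) throw new Error('unknown peer: ' + address);
    // Pin the current fingerprint together with the current version.
    this.pins.set(address, { fingerprint: peer.fingerprint, identityVersion: peer.identityVersion });
  }

  acceptIdentityChange(address: string, newFingerprint: string): void {
    const peer = this.peers.get(address);
    if (!peer) throw new Error('unknown peer: ' + address);
    // Bumping the version is what makes any saved verification stale.
    this.peers.set(address, { fingerprint: newFingerprint, identityVersion: peer.identityVersion + 1 });
  }

  isPeerVerified(address: string): boolean {
    const peer = this.peers.get(address);
    const pin = this.pins.get(address);
    if (!peer || !pin) return false;
    return pin.identityVersion === peer.identityVersion && pin.fingerprint === peer.fingerprint;
  }
}
```

Comparing the version as well as the fingerprint means a rotation invalidates the pin even in the unlikely case that the new fingerprint string equals the old one.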
|
||||
|
||||
---
|
||||
|
||||
## Tuning thresholds
|
||||
|
||||
The `first-large-file` threshold is the only knob that's customer-tunable
|
||||
without code changes. The default is conservative:
|
||||
|
||||
- **Default:** `10 MiB`. Big enough that ordinary chat attachments don't
|
||||
trigger; small enough that obvious "exfil candidates" do.
|
||||
- **Lower** (e.g. `1 MiB`) for high-sensitivity deployments — every
|
||||
document goes through the gate.
|
||||
- **Raise** (e.g. `100 MiB`) only for use cases where small uploads are
|
||||
routine and large transfers are deliberate / pre-arranged.
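
The threshold check itself is small; a sketch of the decision might look like this. The function name is hypothetical, and treating an unknown size as "no gate" is an inference from the "known size" wording in the gate table, not a confirmed behaviour.

```ts
// Sketch of the decision described for the `first-large-file` gate. The
// function name and the treatment of unknown sizes are assumptions.
const DEFAULT_LARGE_FILE_THRESHOLD = 10 * 1024 * 1024; // 10 MiB

function firstLargeFileGateFires(
  peerVerified: boolean,
  fileSize: number | undefined,
  threshold: number = DEFAULT_LARGE_FILE_THRESHOLD,
): boolean {
  if (peerVerified) return false;           // gate only guards unverified peers
  if (fileSize === undefined) return false; // size not known up front
  return fileSize >= threshold;             // "at or above" the threshold
}
```

Lowering `threshold` to `1 MiB` makes the same function fire for most documents, which is the high-sensitivity configuration described in the list above.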
|
||||
|
||||
`backup-import` and `new-device-trust` have no threshold by design — the
|
||||
spec mandates an irremovable minimum gate for both, since each one
|
||||
either trusts a fresh identity or overwrites pinned trust wholesale.
|
||||
|
||||
---
|
||||
|
||||
## React widget
|
||||
|
||||
Use `<FingerprintGate />` from `@shade/widgets` to block UI on
|
||||
verification status:
|
||||
|
||||
```tsx
|
||||
import { FingerprintGate } from '@shade/widgets';
|
||||
|
||||
<FingerprintGate peerAddress="bob">
|
||||
<ChatThread peer="bob" />
|
||||
</FingerprintGate>
|
||||
```
|
||||
|
||||
The default fallback shows the safety number, a "Copy OOB text" button,
|
||||
and an "I have verified" button that calls `Shade.markPeerVerified`.
|
||||
Pass a `fallback` render prop to use your own UI, or `onVerified` to
|
||||
react to the unverified → verified transition.
|
||||
|
||||
`<FingerprintCompare />` is the existing observer-dashboard widget; it
|
||||
now exposes the same Copy-OOB / verify actions when an `onVerified`
|
||||
prop is wired.
|
||||
|
||||
---
|
||||
|
||||
## Errors
|
||||
|
||||
`FingerprintNotVerifiedError` carries:
|
||||
|
||||
- `peerAddress` — the address the gate was protecting.
|
||||
- `gate` — `'first-large-file' | 'backup-import' | 'new-device-trust' | 'inbox-fanout'`.
|
||||
- `code = 'SHADE_FINGERPRINT_NOT_VERIFIED'` — maps to HTTP 403.
|
||||
|
||||
Catch it explicitly when wrapping `upload`, `importBackup`, and
|
||||
`acceptIdentityChange`:
|
||||
|
||||
```ts
|
||||
try {
|
||||
await shade.upload({ to: 'bob', input: bytes });
|
||||
} catch (err) {
|
||||
if (err instanceof FingerprintNotVerifiedError) {
|
||||
showVerifyFirst(err.peerAddress);
|
||||
return;
|
||||
}
|
||||
throw err;
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Migration from 0.3.x
|
||||
|
||||
No breaking changes: existing apps gain warning-mode gates automatically
|
||||
(see the no-handler note above). To upgrade to hard gates, register
|
||||
handlers for the operations you use. Your existing `FingerprintCompare`
|
||||
calls keep working; pass `onVerified` to enable the new actions.
|
||||
276
docs/web-workers.md
Normal file
@@ -0,0 +1,276 @@
|
||||
# Web Workers Crypto
|
||||
|
||||
Status: Implemented (V3.8 — `0.4.0`).
|
||||
|
||||
`@shade/crypto-web` ships with an opt-in dedicated Web Worker that keeps
|
||||
AES-GCM, HKDF, HMAC, X25519 and Ed25519 — and full per-lane stream state —
|
||||
off the main thread. Big in-browser uploads (100 MB+) stay smooth without
|
||||
frame drops.
|
||||
|
||||
This doc covers:
|
||||
|
||||
- [When to use it](#when-to-use-it)
|
||||
- [Setup](#setup)
|
||||
- [API](#api)
|
||||
- [Bundler recipes](#bundler-recipes)
|
||||
- [Safari notes](#safari-notes)
|
||||
- [SharedArrayBuffer (COOP/COEP)](#sharedarraybuffer-coopcoep)
|
||||
- [Lifecycle and rotation](#lifecycle-and-rotation)
|
||||
- [Threat-model considerations](#threat-model-considerations)
|
||||
|
||||
---
|
||||
|
||||
## When to use it
|
||||
|
||||
The default `SubtleCryptoProvider` runs on whatever thread you give it.
|
||||
For the SDK that means the main thread. AES-GCM via SubtleCrypto is fast
|
||||
(hardware-accelerated), but a 100 MB file at 256 KiB chunks is ~400 AEAD
|
||||
calls — each one queues a microtask on the main thread. Layered on top of
|
||||
React reflows and large `postMessage` payloads to the network worker, you
|
||||
*will* see frame drops.
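
The "~400 AEAD calls" figure is simple division; the sketch below spells it out. `aeadCallCount` is an illustrative helper, not an SDK function.

```ts
// Worked arithmetic for the chunk count; `aeadCallCount` is an illustrative name.
const CHUNK_SIZE = 256 * 1024; // default chunk size, 256 KiB

function aeadCallCount(fileBytes: number, chunkSize: number = CHUNK_SIZE): number {
  return Math.ceil(fileBytes / chunkSize);
}

console.log(aeadCallCount(100 * 1024 * 1024)); // 100 MiB → 400
console.log(aeadCallCount(100 * 1000 * 1000)); // 100 MB  → 382
```

Either reading of "100 MB" lands close to 400 calls, which is the order of magnitude that matters for main-thread scheduling.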
|
||||
|
||||
Reach for the Worker pipeline when:
|
||||
|
||||
- You upload or download files that span several AEAD chunks
  (roughly ≥ 1 MB) inside a UI-bearing browser tab.
|
||||
- You generate or rotate identity / device keys in a UI thread that must
|
||||
stay interactive.
|
||||
- You do batch AEAD (e.g. backup export over many records).
|
||||
|
||||
You can keep using `SubtleCryptoProvider` for short ops (Signal session
|
||||
encrypt/decrypt for a chat message). The cost of a `postMessage`
round-trip dwarfs the cost of a single 256-byte AES call.
|
||||
|
||||
---
|
||||
|
||||
## Setup
|
||||
|
||||
`@shade/crypto-web` exposes the worker as a separate subpath, so your
|
||||
bundler can resolve it through the standard `new Worker(new URL(...,
|
||||
import.meta.url))` idiom.
|
||||
|
||||
```ts
|
||||
import { createShade } from '@shade/sdk';
|
||||
|
||||
const shade = await createShade({ /* ... */ });
|
||||
|
||||
shade.configureWorkerCrypto({
|
||||
workerUrl: new URL('@shade/crypto-web/worker', import.meta.url),
|
||||
});
|
||||
```
|
||||
|
||||
After `configureWorkerCrypto`, the SDK exposes:
|
||||
|
||||
- `shade.encryptStream({ streamId, streamSecret, ... })` — returns a
|
||||
`TransformStream<Uint8Array, Uint8Array>` and a `laneSha256` promise.
|
||||
- `shade.decryptStream({ streamId, streamSecret, ... })` — inverse.
|
||||
- `shade.getWorkerCrypto()` — direct access to the `WorkerCryptoProvider`
|
||||
for one-off ops (HKDF batches, X25519 batch DH, etc.).
|
||||
|
||||
The worker is spawned on first use and self-terminates after
|
||||
`idleTimeoutMs` (default 30 s) — no manual lifecycle management required.
|
||||
|
||||
---
|
||||
|
||||
## API
|
||||
|
||||
### Stream encryption
|
||||
|
||||
```ts
|
||||
const { stream, laneSha256 } = await shade.encryptStream({
  streamId: streamId,         // 16 random bytes, agreed with peer
  streamSecret: streamSecret, // 32 random bytes, derived via Double Ratchet
  laneId: 0,                  // lane index (use multi-lane for parallel HTTP)
  chunkSize: 256 * 1024,      // optional; default 256 KiB
});
|
||||
|
||||
await file.stream()
|
||||
.pipeThrough(stream)
|
||||
.pipeTo(transferSink); // your HTTP-shipping WritableStream
|
||||
|
||||
const sha256 = await laneSha256; // for end-to-end integrity proof
|
||||
```
|
||||
|
||||
`stream` consumes plaintext and emits one wire-encoded
|
||||
`stream-chunk` envelope per write. `flush` always emits a final chunk
|
||||
with `isLast=true` (even if the trailing slice is empty), so receivers
|
||||
see a clean termination.
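
The "always emit a final `isLast` chunk" rule is easiest to see on plaintext framing alone. The sketch below assumes fixed-size slicing and a made-up `ChunkSketch` shape; it performs no encryption and is not the `stream-chunk` wire format.

```ts
// Illustrative chunker: plaintext framing only, no encryption or wire
// encoding. The ChunkSketch shape is an assumption, not the stream-chunk format.
interface ChunkSketch {
  seq: number;
  isLast: boolean;
  data: Uint8Array;
}

function chunkPlaintext(plaintext: Uint8Array, chunkSize: number): ChunkSketch[] {
  if (chunkSize <= 0) throw new Error('chunkSize must be positive');
  const chunks: ChunkSketch[] = [];
  let offset = 0;
  let seq = 0;
  // Emit every full chunk as a non-final chunk.
  while (plaintext.byteLength - offset >= chunkSize) {
    chunks.push({ seq: seq, isLast: false, data: plaintext.subarray(offset, offset + chunkSize) });
    seq += 1;
    offset += chunkSize;
  }
  // The flush step: the trailing slice (possibly empty) is always the final chunk.
  chunks.push({ seq: seq, isLast: true, data: plaintext.subarray(offset) });
  return chunks;
}
```

With this shape an input that is an exact multiple of `chunkSize` still ends in a zero-length chunk flagged `isLast`, so the receiver never has to guess whether the stream was truncated.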
|
||||
|
||||
### Stream decryption
|
||||
|
||||
```ts
|
||||
const { stream, laneSha256 } = await shade.decryptStream({
|
||||
streamId,
|
||||
streamSecret,
|
||||
laneId: 0,
|
||||
});
|
||||
|
||||
await incomingChunkStream
|
||||
.pipeThrough(stream)
|
||||
.pipeTo(fileSink);
|
||||
|
||||
const sha = await laneSha256;
|
||||
if (!equal(sha, peerLaneSha256)) throw new IntegrityError();
|
||||
```
|
||||
|
||||
Each input chunk MUST be a complete wire envelope. The transport-layer
|
||||
caller is responsible for framing (one envelope per write). Out-of-order
|
||||
or replayed chunks reject the stream — the lane key never crosses thread
|
||||
boundaries, so a man-in-the-middle script in the page can't recover key
|
||||
material to replay against.
|
||||
|
||||
### Direct provider access
|
||||
|
||||
```ts
|
||||
const crypto = await shade.getWorkerCrypto();
|
||||
// Implements `CryptoProvider` — drop-in replacement for SubtleCryptoProvider
|
||||
const { ciphertext, nonce } = await crypto.aesGcmEncrypt(key, plaintext);
|
||||
```
|
||||
|
||||
`randomBytes`, `randomUint32`, `constantTimeEqual`, `zeroize` execute on
|
||||
the calling thread (no round-trip). Async ops forward to the worker.
|
||||
|
||||
---
|
||||
|
||||
## Bundler recipes
|
||||
|
||||
### Vite
|
||||
|
||||
```ts
|
||||
shade.configureWorkerCrypto({
|
||||
workerUrl: new URL('@shade/crypto-web/worker', import.meta.url),
|
||||
});
|
||||
```
|
||||
|
||||
Vite resolves the URL via `import.meta.url` and emits a discrete chunk
|
||||
for the worker. No additional config required for Vite ≥ 5.
|
||||
|
||||
If your build complains about `?worker` syntax, use the explicit URL
|
||||
form (above) — it's the standard Vite idiom.
|
||||
|
||||
### Webpack 5 / Rspack
|
||||
|
||||
Same idiom — Webpack 5 understands `new URL('./worker.js', import.meta.url)`
|
||||
natively as long as the source is ESM:
|
||||
|
||||
```ts
|
||||
new Worker(new URL('@shade/crypto-web/worker', import.meta.url), {
|
||||
type: 'module',
|
||||
});
|
||||
```
|
||||
|
||||
For Webpack 4 or non-ESM builds, you need `worker-loader` (legacy). We
|
||||
do not officially support Webpack 4.
|
||||
|
||||
### Rollup
|
||||
|
||||
Rollup needs `rollup-plugin-web-worker-loader` or a recent
`rollup-plugin-import-meta-url`. The standard idiom works once the
plugin is wired:
|
||||
|
||||
```ts
|
||||
new URL('@shade/crypto-web/worker', import.meta.url)
|
||||
```
|
||||
|
||||
If your bundler can't resolve `@shade/crypto-web/worker`, copy
|
||||
`node_modules/@shade/crypto-web/src/worker.ts` (or the compiled `.js`
|
||||
once we ship dist artefacts) into your `public/` directory and pass an
|
||||
absolute URL:
|
||||
|
||||
```ts
|
||||
shade.configureWorkerCrypto({ workerUrl: '/shade-crypto.worker.js' });
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Safari notes
|
||||
|
||||
Safari ≤ 17 has a smaller `postMessage` transferable budget than Chrome /
|
||||
Firefox. Single transfers above ~64 MB occasionally fail silently. The
|
||||
shipped pipeline already chunks plaintext to 256 KiB before AEAD, so
|
||||
each `postMessage` carries ≤ ~256 KiB + AEAD overhead — well under any
|
||||
known Safari limit.
|
||||
|
||||
If you override `chunkSize`, keep individual buffers below 16 MiB:
|
||||
|
||||
```ts
|
||||
shade.encryptStream({
|
||||
streamId, streamSecret,
|
||||
chunkSize: 8 * 1024 * 1024, // 8 MiB — safe across all browsers
|
||||
});
|
||||
```
|
||||
|
||||
We do not officially support Safari ≤ 14 (no module workers).
|
||||
|
||||
---
|
||||
|
||||
## SharedArrayBuffer (COOP/COEP)
|
||||
|
||||
The default pipeline uses `ArrayBuffer` transfer (zero-copy ownership
|
||||
hand-off). It does **not** require COOP/COEP headers.
|
||||
|
||||
For multi-lane parallel transfers across multiple workers, you may opt
|
||||
in to `SharedArrayBuffer` for the AEAD plaintext buffers. That requires
|
||||
your origin to serve:
|
||||
|
||||
```
|
||||
Cross-Origin-Opener-Policy: same-origin
|
||||
Cross-Origin-Embedder-Policy: require-corp
|
||||
```
|
||||
|
||||
`SharedArrayBuffer` support is gated behind a future `useSharedBuffers`
|
||||
option and is not enabled in V3.8. See `docs/V4.0.md` if/when this lands.
|
||||
|
||||
---
|
||||
|
||||
## Lifecycle and rotation
|
||||
|
||||
```ts
|
||||
const crypto = await shade.getWorkerCrypto();
|
||||
await crypto.rotate(); // tear down the current worker, respawn lazily
|
||||
await crypto.destroy(); // permanent — every subsequent call rejects
|
||||
```
|
||||
|
||||
`shade.shutdown()` calls `destroy()` automatically. The idle-timer fires
|
||||
30 seconds after the last response (configurable via
|
||||
`configureWorkerCrypto({ idleTimeoutMs })`); if the timer fires while
|
||||
calls are pending, it does nothing and reschedules.
|
||||
|
||||
---
|
||||
|
||||
## Threat-model considerations
|
||||
|
||||
- The worker runs in the same origin and the same browsing context as
|
||||
the main thread. It is **not** a sandbox against a compromised page;
|
||||
any script that can `eval` in your tab can also `postMessage` to the
|
||||
worker. The Worker is a *performance* boundary, not a *security*
|
||||
boundary.
|
||||
- Lane keys derived inside the worker stay there; they are never
|
||||
postMessage'd to the main thread. This narrows the window during which
|
||||
a key sits in main-thread heap, which helps against post-mortem heap
|
||||
inspection by a curious extension. It does not help against an active
|
||||
in-page attacker.
|
||||
- `randomBytes` runs on the calling thread (uses `crypto.getRandomValues`
  directly). The worker has its own random source for ops that derive
  inside it; stream nonces are not random, they are derived
  deterministically from `(laneId, seq)`.
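
One way a deterministic `(laneId, seq)` nonce could be laid out is sketched here. The byte layout is an assumption chosen for illustration (4-byte lane index plus 8-byte counter, both big-endian, filling the 96-bit nonce commonly used with AES-GCM); the actual derivation in `@shade/crypto-web` is not described in this document.

```ts
// Hypothetical layout: 4-byte big-endian laneId followed by an 8-byte
// big-endian sequence number. The real derivation may differ.
const MAX_SAFE_SEQ = 9007199254740991; // 2^53 - 1

function deriveNonce(laneId: number, seq: number): Uint8Array {
  if (!(laneId >= 0 && laneId <= 0xffffffff && Math.floor(laneId) === laneId)) {
    throw new RangeError('laneId must be a uint32');
  }
  if (!(seq >= 0 && seq <= MAX_SAFE_SEQ && Math.floor(seq) === seq)) {
    throw new RangeError('seq must be a non-negative safe integer');
  }
  const nonce = new Uint8Array(12);
  const view = new DataView(nonce.buffer);
  view.setUint32(0, laneId);                        // big-endian by default
  view.setUint32(4, Math.floor(seq / 0x100000000)); // high 32 bits of seq
  view.setUint32(8, seq % 0x100000000);             // low 32 bits of seq
  return nonce;
}
```

A counter-based nonce never repeats under one lane key as long as `seq` only increases, which is why no randomness is needed for this step.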
|
||||
|
||||
For the full picture, see `THREAT-MODEL.md`.
|
||||
|
||||
---
|
||||
|
||||
## Verifying main-thread budget
|
||||
|
||||
V3.8 acceptance: a 100 MB upload in Chrome must not block the main
thread for more than 16 ms at P99.
|
||||
|
||||
To verify in your app:
|
||||
|
||||
1. Open Chrome DevTools → Performance.
|
||||
2. Record a 100 MB upload.
|
||||
3. Inspect the main-thread flame chart. Look at "Long Tasks" and
|
||||
"Self time" of `Shade.encryptStream`.
|
||||
4. Confirm no contiguous block exceeds ~16 ms (one frame at 60 fps).
|
||||
|
||||
If you observe long tasks, lower `chunkSize` (more frequent yields) or
|
||||
report the trace — see [`docs/archive/V3.8.md`](./archive/V3.8.md) for
|
||||
the original acceptance criteria.
|
||||
302
docs/webrtc.md
Normal file
@@ -0,0 +1,302 @@
|
||||
# Shade Transport — WebRTC P2P Layer (V3.11)
|
||||
|
||||
`@shade/transport-webrtc` adds a direct peer-to-peer chunk transport on
|
||||
top of the existing `@shade/transfer` engine. When two clients can reach
|
||||
each other through NAT/firewall, large transfers (`@shade/files`,
|
||||
`@shade/transfer`) flow over a single bidirectional `RTCDataChannel`
|
||||
instead of paying the round-trip cost of HTTP-relayed POSTs. When NAT
|
||||
traversal fails, the multi-transport fallback automatically demotes the
|
||||
chain back to HTTP — without losing any chunks already in flight.
|
||||
|
||||
The wire payload is unchanged: every chunk is still a Shade ratchet /
streams envelope (AES-256-GCM under HKDF-derived per-lane keys). The
DTLS layer under the `RTCDataChannel` only protects the WebRTC hop;
routing through a TURN relay does not give the relay operator access to
plaintext.
|
||||
|
||||
```
|
||||
┌───────────────────────────────────────────────────────────────┐
│                       application code                        │
│                                                               │
│           shade.upload({ to: 'bob', input: file })            │
└────────────────────────────────┬──────────────────────────────┘
                                 │
                       ┌─────────▼──────────┐
                       │   TransferEngine   │
                       └─────────┬──────────┘
                                 │ ITransferTransport
                       ┌─────────▼──────────┐
                       │   MultiTransport   │
                       │ Fallback (sticky)  │
                       └────┬─────┬─────┬───┘
                            │     │     │
             ┌──────────────▼┐  ┌─▼─┐  ┌▼─────────────┐
             │ WebRtcTransfer│  │WS │  │ ShadeTransfer│
             │ Transport     │  │…  │  │ HttpTransport│
             └─────┬─────────┘  └───┘  └──────────────┘
                   │ DataChannel binary frames
             ┌─────▼─────────┐
             │ WebRtcConn    │ ←──── SDP/ICE over Shade.send
             │ Manager       │       (ratchet-encrypted)
             └───────────────┘
|
||||
```
|
||||
|
||||
## When to reach for it

| Scenario                            | Default (HTTP) | + WebRTC                                  |
|-------------------------------------|----------------|-------------------------------------------|
| Two clients on the same LAN         | server-relayed | direct, P2P                               |
| One peer behind enterprise NAT only | works          | TURN relay                                |
| Both peers behind symmetric NAT     | works          | TURN relay, or HTTP fallback without TURN |
| One peer offline                    | inbox-buffered | inbox-buffered (HTTP path)                |
| Browser extension with strict CSP   | works          | works (uses RTCPeerConnection)            |

Use cases:

- `@shade/transfer` upload of multi-MB / multi-GB files
- `@shade/files` `read`/`write` of large inline blobs
- Future: `@shade/streams` real-time channels (V5.0 reuses this same DataChannel)

## Quick start (browser)

```ts
import { createShade } from '@shade/sdk';
import { nativeRtcFactory } from '@shade/transport-webrtc';

const shade = await createShade({ prekeyServer: 'https://prekey.example.com' });

// IMPORTANT: configureWebRTC MUST be called BEFORE the first upload() /
// onIncomingTransfer() / transferRoute() call, because those build the
// transfer engine, and the engine captures its transport stack at
// construction time.
shade.configureWebRTC({
  factory: nativeRtcFactory(),
  // Optional: defaults to two public Google STUN servers.
  iceServers: [
    { urls: 'stun:stun.l.google.com:19302' },
    {
      urls: 'turn:turn.example.com:3478',
      username: 'shade',
      credential: 'YOUR_TURN_SECRET',
    },
  ],
});

// `directory` and `file` are supplied by your application
// (a peer-to-base-URL lookup and the payload to upload).
shade.configureTransfers({
  resolveBaseUrl: async (peer) => directory.lookup(peer),
});

await shade.upload({ to: 'bob', input: file }); // → P2P when NAT allows
```

## Quick start (Bun / Node)

Neither Bun nor Node exposes `RTCPeerConnection` natively yet. Use one of:

- [`node-datachannel`](https://github.com/murat-dogan/node-datachannel):
  small, stable, libdatachannel under the hood
- [`@roamhq/wrtc`](https://www.npmjs.com/package/@roamhq/wrtc): fork of
  the `wrtc` (node-webrtc) bindings to Google's WebRTC library

Wrap the chosen library behind an `IRtcFactory` (the package only depends
on a narrow surface: `createPeerConnection`, `createDataChannel`,
`addEventListener`):

```ts
import type { IRtcFactory, IPeerConnection, IDataChannel } from '@shade/transport-webrtc';

// Pseudo-adapter for node-datachannel; the body is left for you to fill in.
class NodeDataChannelFactory implements IRtcFactory {
  createPeerConnection(config): IPeerConnection {
    // Return an adapter that wraps a node-datachannel PeerConnection and
    // hands out IDataChannel wrappers from createDataChannel().
    throw new Error('not implemented');
  }
}

// `iceServers` is your own STUN/TURN list, as in the browser quick start.
shade.configureWebRTC({ factory: new NodeDataChannelFactory(), iceServers });
```

## Connection flow

```
Alice initiates                               Bob receives
───────────────                               ────────────
1. createOffer() → SDP                        2. shade.send delivers offer
                                                 → Bob.createAnswer()
3. shade.send delivers answer                 4. setRemoteDescription(answer)
5. trickle ICE candidates (both directions)   6. trickle ICE candidates
7. DataChannel onopen (both sides)            7. DataChannel onopen
```

All four signaling kinds (`shade.webrtc-offer/v1`, `shade.webrtc-answer/v1`,
`shade.webrtc-ice/v1`, `shade.webrtc-bye/v1`) ride the existing Shade
ratchet; the relay sees only ciphertext envelopes.

### Glare resolution

If both peers call `getOrCreate()` simultaneously, the manager uses a
lexicographic tiebreak: the side with the smaller address keeps the
caller role; the side with the larger address closes its outgoing
connection and accepts the inbound offer instead. Both peers ultimately
converge on a single `WebRtcConnection`.

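The tiebreak rule can be sketched in a few lines. This is an illustration only: `resolveGlare` and `GlareRole` are hypothetical names, and the sketch assumes addresses are compared as plain JavaScript strings, which may differ from how the manager normalises or compares them.

```ts
type GlareRole = 'caller' | 'callee';

/**
 * Decide which role the local side keeps when both peers sent an offer.
 * Assumption: addresses compare as plain strings (UTF-16 code unit order).
 */
function resolveGlare(localAddress: string, remoteAddress: string): GlareRole {
  if (localAddress === remoteAddress) {
    throw new Error('glare resolution needs two distinct addresses');
  }
  // The smaller address keeps the caller role; the larger one closes its
  // own outgoing attempt and answers the inbound offer instead.
  return localAddress < remoteAddress ? 'caller' : 'callee';
}

// Each side evaluates the same rule locally and lands on the opposite role.
const aliceRole = resolveGlare('alice', 'bob');
const bobRole = resolveGlare('bob', 'alice');
```

Because the comparison is deterministic and symmetric, no extra signaling round is needed to agree on who yields.
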
## Backpressure

The `WebRtcTransferTransport` polls `RTCDataChannel.bufferedAmount` and
suspends new sends once the buffer crosses `backpressureThresholdBytes`
(default 4 MiB). This avoids SCTP queue runaway when the application
pushes faster than the network can drain. Tune lower for
memory-constrained clients (mobile / extension contexts).

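A minimal sketch of the polling approach described above, not the transport's actual code. `BufferedChannel`, `sendWithBackpressure`, and `pollIntervalMs` are hypothetical names; only `bufferedAmount` and `send` come from the standard `RTCDataChannel` interface.

```ts
// Minimal slice of RTCDataChannel that this sketch depends on.
interface BufferedChannel {
  readonly bufferedAmount: number;
  send(data: Uint8Array): void;
}

const DEFAULT_THRESHOLD_BYTES = 4 * 1024 * 1024; // 4 MiB, the documented default

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

/**
 * Send one frame, first waiting until the channel's send buffer is at or
 * below the threshold. pollIntervalMs is a hypothetical tuning knob.
 */
async function sendWithBackpressure(
  channel: BufferedChannel,
  frame: Uint8Array,
  thresholdBytes: number = DEFAULT_THRESHOLD_BYTES,
  pollIntervalMs: number = 10,
): Promise<void> {
  while (channel.bufferedAmount > thresholdBytes) {
    await sleep(pollIntervalMs); // yield so the SCTP queue can drain
  }
  channel.send(frame);
}
```

Browsers also offer an event-driven alternative (`bufferedAmountLowThreshold` plus the `bufferedamountlow` event), which avoids the timer at the cost of a slightly larger adapter surface.
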
## Auto-fallback

Configuring WebRTC wires `MultiTransportFallback([webrtc, http])` as the
engine's transport. The chain is sticky-after-first-failure: when WebRTC
raises a `TransferTransportError` (timeout, ICE failed, data channel
closed, frame too large), the fallback advances to HTTP and stays there
for the lifetime of the engine.

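To make "sticky-after-first-failure" concrete, here is a simplified sketch of the idea. `StickyFallback` and `NamedTransport` are hypothetical stand-ins, not the real `MultiTransportFallback`; in particular, this sketch demotes on any thrown error, whereas the real chain reacts to `TransferTransportError`.

```ts
interface NamedTransport<Req, Res> {
  name: string;
  send(request: Req): Promise<Res>;
}

class StickyFallback<Req, Res> {
  readonly failures: { name: string; reason: string }[] = [];
  private readonly chain: NamedTransport<Req, Res>[];
  private index = 0;

  constructor(chain: NamedTransport<Req, Res>[]) {
    if (chain.length === 0) throw new Error('need at least one transport');
    this.chain = chain;
  }

  get activeName(): string {
    return this.chain[this.index].name;
  }

  async send(request: Req): Promise<Res> {
    for (;;) {
      const current = this.chain[this.index];
      try {
        return await current.send(request);
      } catch (err) {
        const isLast = this.index === this.chain.length - 1;
        if (isLast) throw err; // nothing left to demote to
        this.failures.push({ name: current.name, reason: String(err) });
        this.index += 1; // sticky: never climb back up the chain
        // Loop again so the same request is retried on the next transport.
      }
    }
  }
}
```

Retrying the failed request on the next transport is what keeps an in-flight chunk from being lost when the chain demotes.
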
For three-tier composition (e.g. WebRTC → WebSocket → HTTP), build the
fallback yourself and pass a custom transport via the engine deps:

```ts
import { MultiTransportFallback } from '@shade/sdk';

// rtcTransport, wsTransport, httpTransport and metrics are your own instances.
const stack = new MultiTransportFallback([
  { name: 'webrtc', transport: rtcTransport },
  { name: 'ws', transport: wsTransport },
  { name: 'http', transport: httpTransport },
]);
stack.onSwitch((from, to) => metrics.observe('shade.transport.demoted', { from, to }));
```

The `WebRtcConnectionManager`'s connect timeout (default 30 s) is the
upper bound on how long the chain dwells on WebRTC before demoting. The
V3.11 acceptance criterion is "P2P dead → HTTP within 5 s", so set
`connectTimeoutMs: 4_000` in your `configureWebRTC()` call to keep the
upper bound at 4 seconds and meet the SLO with margin.

## ICE server config

| Setting              | Default                            | When to override |
|----------------------|------------------------------------|------------------|
| `iceServers`         | Google public STUN (×2)            | Production: pin your own STUN to avoid Google rate limits, plus your TURN credentials |
| `iceTransportPolicy` | `'all'` (host + reflexive + relay) | `'relay'` to mandate TURN-only routing (e.g. inside a corporate network where direct connectivity must never leak) |
| `bundlePolicy`       | spec default (`'balanced'`)        | rarely |

Public STUN works for ~80% of consumer NATs. The remaining ~20% (symmetric
NAT, paranoid corporate proxies, mobile carrier-grade NAT) need TURN.
Run your own [coturn](https://github.com/coturn/coturn) or use a managed
provider, but **TURN traffic is real bandwidth through your server**, so
budget accordingly. Shade's wire format is at least as efficient over
TURN as over HTTPS (no per-request HTTP framing overhead).

## NAT-traversal: hopes and realities

What works without TURN, in our testing:

- Same NAT (LAN): always
- Two clients behind cone NATs: usually
- One client behind symmetric NAT, the other behind any cone NAT: usually
- Two clients behind symmetric NATs: rarely; in practice this needs TURN

What doesn't work:

- Two clients behind strict carrier-grade NAT (CGNAT): TURN required
- Clients on networks that block UDP entirely: TURN over TCP/443 required

When in doubt, configure TURN over TCP/443: it resembles HTTPS traffic
and gets through nearly every middlebox.

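As an illustration of the TCP/443 advice, an ICE configuration might look like the sketch below. The hostnames and credentials are placeholders, and `IceServer` / `IceConfig` are local stand-ins that mirror the standard `RTCIceServer` / `RTCConfiguration` dictionaries so the snippet does not depend on DOM typings. It assumes your TURN server is set up to accept TLS on port 443.

```ts
// Local stand-ins mirroring the standard RTCIceServer / RTCConfiguration shapes.
interface IceServer {
  urls: string | string[];
  username?: string;
  credential?: string;
}
interface IceConfig {
  iceServers: IceServer[];
  iceTransportPolicy?: 'all' | 'relay';
}

const turnOverTcp443: IceConfig = {
  iceServers: [
    // Placeholder STUN host; direct paths are still tried first.
    { urls: 'stun:stun.example.com:3478' },
    {
      // "turns:" is TURN over TLS; "?transport=tcp" pins the relay leg to TCP.
      urls: 'turns:turn.example.com:443?transport=tcp',
      username: 'shade',
      credential: 'YOUR_TURN_SECRET',
    },
  ],
  // Switch to 'relay' only if direct connectivity must never be attempted.
  iceTransportPolicy: 'all',
};
```

The `iceServers` and `iceTransportPolicy` fields are the ones listed in the ICE server config table; keeping the policy at `'all'` lets ICE prefer a direct path and use the relay only when needed.
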
## Diagnostics

The SDK exposes the live runtime via `shade.getWebRtcRuntime()`:

```ts
const runtime = shade.getWebRtcRuntime();
if (runtime !== null) {
  console.log('active transport:', runtime.fallback.activeName);
  console.log('peers:', [...runtime.manager.byPeer ?? []]);

  runtime.fallback.onSwitch((from, to) => {
    console.warn(`shade transport demoted ${from} → ${to}`);
  });
}
```

The `failures` array on `MultiTransportFallback` records every
demotion's reason; wire it to your observability backend to track
NAT/TURN problems in production.

## Sample code

End-to-end test using `MemoryRtcFactory` (no real network):

```ts
import { MemoryRtcFactory } from '@shade/transport-webrtc';

const factory = new MemoryRtcFactory();
alice.configureWebRTC({ factory });
bob.configureWebRTC({ factory });

await alice.upload({ to: 'bob', input: bytes }); // → P2P loopback
```

See `packages/shade-sdk/tests/webrtc-integration.test.ts` for the full
loopback test, `webrtc-failover.test.ts` for the auto-fallback test, and
`packages/shade-transport-webrtc/tests/` for the unit tests covering
wire format, signaling, glare, and TURN-only configuration.

## Wire format inside the DataChannel

The DataChannel is a single bidirectional pipe shared by every in-flight
stream between two peers. Each frame is a self-describing binary blob:

```
client → server                                                    server → client
───────────────                                                    ───────────────
0x01 chunk         reqId(16) sid(16) lane(u32) seq(u64) env(...)   0x81 chunk-ack     reqId(16) lastSeq(u32) bytesRecv(u32)
0x02 resume-query  reqId(16) sid(16)                               0x82 resume-state  reqId(16) jsonBody(utf-8)
0x03 ping          reqId(16) nonce(u64)                            0x83 pong          reqId(16) nonce(u64)
                                                                   0xFE error         reqId(16) jsonBody(utf-8)
```

`reqId` is a 16-byte random correlation token; the responder echoes it
verbatim so multiple in-flight requests can be matched without a stream
multiplexer on top of SCTP.

The wire matches `ShadeTransferWsTransport` exactly, so adapters for
either transport can interoperate by translating between SCTP
message framing and WS binary frames at the byte level.

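The `0x01 chunk` layout can be exercised with a small encoder/decoder. This is a sketch under stated assumptions: the field order and widths come from the table above, but big-endian byte order is an assumption (the table does not specify it), and `encodeChunkFrame` / `decodeChunkFrame` are hypothetical helpers, not package exports. The 64-bit `seq` is written as two 32-bit halves so the sketch works with plain numbers up to `Number.MAX_SAFE_INTEGER`.

```ts
const FRAME_CHUNK = 0x01;
const CHUNK_HEADER_BYTES = 1 + 16 + 16 + 4 + 8; // type + reqId + sid + lane + seq = 45
const TWO_POW_32 = 4294967296;

interface ChunkFrame {
  reqId: Uint8Array;    // 16-byte correlation token
  sid: Uint8Array;      // 16-byte stream id
  lane: number;         // u32
  seq: number;          // u64 on the wire; safe integers only in this sketch
  envelope: Uint8Array; // opaque Shade envelope bytes
}

function encodeChunkFrame(f: ChunkFrame): Uint8Array {
  if (f.reqId.length !== 16 || f.sid.length !== 16) {
    throw new Error('reqId and sid must be 16 bytes each');
  }
  if (!Number.isSafeInteger(f.seq) || f.seq < 0) {
    throw new Error('seq must be a non-negative safe integer');
  }
  const out = new Uint8Array(CHUNK_HEADER_BYTES + f.envelope.length);
  const view = new DataView(out.buffer);
  out[0] = FRAME_CHUNK;
  out.set(f.reqId, 1);
  out.set(f.sid, 17);
  view.setUint32(33, f.lane, false); // big-endian: an assumption
  view.setUint32(37, Math.floor(f.seq / TWO_POW_32), false); // high 32 bits
  view.setUint32(41, f.seq >>> 0, false);                    // low 32 bits
  out.set(f.envelope, CHUNK_HEADER_BYTES);
  return out;
}

function decodeChunkFrame(bytes: Uint8Array): ChunkFrame {
  if (bytes.length < CHUNK_HEADER_BYTES || bytes[0] !== FRAME_CHUNK) {
    throw new Error('not a chunk frame');
  }
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  return {
    reqId: bytes.slice(1, 17),
    sid: bytes.slice(17, 33),
    lane: view.getUint32(33, false),
    seq: view.getUint32(37, false) * TWO_POW_32 + view.getUint32(41, false),
    envelope: bytes.slice(CHUNK_HEADER_BYTES),
  };
}
```

With this layout the fixed header is 45 bytes, which is worth keeping in mind when sizing chunks against a per-message cap.
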
## Limits

- Max DataChannel message: **256 KiB** (Chrome's safe ceiling). Configure
  `chunkSize` ≤ 256 KiB on uploads that prefer WebRTC. The transport
  raises a clear error when an envelope exceeds the cap; the engine then
  retries via HTTP.
- One DataChannel per peer pair (label `shade-transfer/v1`). Multiple
  in-flight transfers from the same peer pair multiplex via `reqId`.
- No SFU/MCU: group transfers fan out at the application layer.
- DTLS-fingerprint binding to Shade's identity fingerprint is **not** in
  V3.11 (deferred as hardening work: the DataChannel is already inside a
  ratchet-authenticated session, so the practical exposure window is
  limited to in-process MITM scenarios that already require malware).

## Migration

Opt-in. If you don't call `configureWebRTC`, your existing HTTP/WS
transport stack is unchanged.

When you do opt in, the **engine must not be built yet**. The easy way
to ensure this is to call `configureWebRTC` before `configureTransfers`
or before any of `upload` / `onIncomingTransfer` / `transferRoute`.
Receiver-side: the WebRTC manager wires receiver hooks into the engine
during `engine()` construction, so make sure both sides call
`configureWebRTC` and `configureTransfers` before the first
`transferRoute()` call.

## Related modules

- [`@shade/transfer`](../packages/shade-transfer/): engine, lane queues,
  HTTP transport, multi-fallback wrapper.
- [`@shade/streams`](./streams.md): chunk encryption + lane key
  derivation. Indirect dep.
- [`@shade/transport-bridge`](./transport.md): V3.7 bridge layer (WS /
  SSE / long-poll for control envelopes). Orthogonal to V3.11.
- [V5.0: real-time channels](./V5.0.md): downstream consumer of the
  same DataChannel for voice/video/broadcast.