# Shade V2.1 — Improvements (infrastructure, storage, operations, security)

This document describes **improvements** agreed for next-generation work on Shade: a clearer product story, stronger storage, mobile parity, operational hardening, limits on transfer abuse, and a formal security narrative.

**Audience:** **Maintainers and contributors** implementing the changes. Add status fields as items land in code/docs.

---

## 1. Clear “who is the server?” and data flow

**Problem:** New users may think the prekey server is a message hub or that all E2EE traffic goes through the Shade container.

**Goal:** One consistent explanation across the root README, package READMEs, and optional onboarding: **the prekey server distributes public keys and bundles**; **actual messages and (typically) file chunks go through your app’s own channel** (your transport, your backend, your URLs).

**Deliverables (proposal):**

- Diagram + short “keys vs payloads” text in the root README and in `@shade/server` README.
- Link to `THREAT-MODEL.md` from the same section (MITM on first contact ↔ safety numbers).
- Optionally one “concept page” (or extend `MIGRATION.md`) with typical architecture: *A ↔ B via app; both talk to the prekey host for X3DH material*.

**Acceptance criteria:** A new developer without domain background can tell after one reading *what* goes to the Shade server and *what* does not.
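
The “keys vs payloads” split described in this section could also be shown as a few lines of code next to the diagram. The sketch below is illustrative only: the `Traffic` and `Destination` types and the `destinationFor` function are hypothetical names invented for this example and are not part of any Shade package.

```typescript
// Hypothetical types for illustration; not the actual Shade API.
type Destination = "prekey-server" | "app-channel";

type Traffic =
  | { kind: "publish-prekey-bundle" }
  | { kind: "fetch-prekey-bundle" }
  | { kind: "encrypted-message" }
  | { kind: "file-chunk" };

// Routing rule from this section: public key material goes to the prekey
// server, while payloads travel over the application's own transport.
function destinationFor(traffic: Traffic): Destination {
  switch (traffic.kind) {
    case "publish-prekey-bundle":
    case "fetch-prekey-bundle":
      return "prekey-server";
    case "encrypted-message":
    case "file-chunk": // "typically": file chunks normally use the app channel
      return "app-channel";
  }
}
```

A reader who sees that `encrypted-message` never maps to `prekey-server` has the core of the explanation, whatever the real package names turn out to be.
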

---

## 2. Optional encryption of storage (at-rest)

**Problem:** `THREAT-MODEL.md` states that a stolen DB + filesystem can expose private keys because Shade does not encrypt the storage layer by default.

**Goal:** **Opt-in** protection for sensitive state (identity, session, and optionally stream-resume secrets) using keys that **do not** live in plaintext in the DB. Candidate key sources are the OS keychain/Keystore, a passphrase + KDF, or an explicit device key injected by the app.

**Design principles:**

- Default developer experience (dev, simple demos) stays unchanged or includes a clear “insecure mode” warning in docs.
- APIs that can be implemented on each platform (Bun/SQLite, Postgres, web/IndexedDB, Android).
- Document limitations: what remains uncovered (e.g. active memory compromise).

**Acceptance criteria:** Threat model updated for “when encrypted storage is enabled”; at least one reference implementation + migration note.
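
To make the “passphrase + KDF” option concrete, here is a minimal sketch of sealing one sensitive record before it is written to storage. It uses scrypt and AES-256-GCM from `node:crypto`. The function names (`sealRecord`, `openRecord`) and the record layout are assumptions made for this example, not an agreed Shade format.

```typescript
import { createCipheriv, createDecipheriv, randomBytes, scryptSync } from "node:crypto";

// Assumed record layout for this sketch: salt(16) | iv(12) | tag(16) | ciphertext.
const SALT_LEN = 16;
const IV_LEN = 12;
const TAG_LEN = 16;

// Derive a 256-bit key from the passphrase; the key itself is never stored.
function deriveKey(passphrase: string, salt: Buffer): Buffer {
  return scryptSync(passphrase, salt, 32);
}

function sealRecord(plaintext: Buffer, passphrase: string): Buffer {
  const salt = randomBytes(SALT_LEN);
  const iv = randomBytes(IV_LEN);
  const cipher = createCipheriv("aes-256-gcm", deriveKey(passphrase, salt), iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return Buffer.concat([salt, iv, cipher.getAuthTag(), ciphertext]);
}

function openRecord(sealed: Buffer, passphrase: string): Buffer {
  const salt = sealed.subarray(0, SALT_LEN);
  const iv = sealed.subarray(SALT_LEN, SALT_LEN + IV_LEN);
  const tag = sealed.subarray(SALT_LEN + IV_LEN, SALT_LEN + IV_LEN + TAG_LEN);
  const ciphertext = sealed.subarray(SALT_LEN + IV_LEN + TAG_LEN);
  const decipher = createDecipheriv("aes-256-gcm", deriveKey(passphrase, salt), iv);
  decipher.setAuthTag(tag);
  // final() throws if the passphrase is wrong or the record was modified.
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]);
}
```

Running the KDF once per record keeps the sketch short but is slow. A real integration would more likely derive or fetch the key once per session (for example from the OS keychain) and reuse it. As the design principles note, this does not protect against an attacker who can read process memory.
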

---

## 3. Android parity and a published roadmap

**Problem:** `shade-android` is under development; drift from the TS SDK undermines the “byte-compatible” promise.

**Goal:** A **published roadmap** (milestones + what counts as parity vs TS-only) and **CI running shared test vectors** as a merge gate before release.

**Deliverables:**

- Roadmap section in `android/shade-android/README.md` or dedicated `ROADMAP-ANDROID.md` with explicit cross-checkpoints: wire format, fingerprints, rotations, streams (`0x11`) where applicable, resume semantics.
- CI job that fails on Kotlin vs TS vector mismatch.

**Acceptance criteria:** Parity coverage is visible and enforceable; the first critical cross-surface (e.g. core ratchet + proto) is green before a “production” label.
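
The merge gate in this section amounts to comparing each implementation's output against the shared vectors and failing on any difference. The sketch below shows that comparison on the TS side. The `TestVector` shape (a name plus expected bytes as hex) is an assumption, since the actual vector file format is not defined in this document.

```typescript
// Hypothetical shared vector shape; the real vector files may differ.
interface TestVector {
  name: string;
  expectedHex: string;
}

// Returns the names of vectors whose produced output is missing or differs
// from the expected bytes (compared case-insensitively as hex strings).
function findMismatches(vectors: TestVector[], produced: Map<string, string>): string[] {
  return vectors
    .filter((v) => produced.get(v.name)?.toLowerCase() !== v.expectedHex.toLowerCase())
    .map((v) => v.name);
}
```

A CI step would run the same vectors through the Kotlin and TS implementations and exit non-zero whenever either list of mismatches is non-empty. Treating a missing output as a mismatch means a vector that one platform silently skips still fails the gate.
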

---

## 4. Operational hardening: prekey container and production

**Problem:** Many teams deploy the Docker image quickly; mistakes around TLS, backups, and secrets add avoidable risk.

**Goal:** A **production checklist**: TLS termination, volume backup (`/data`), rotation of `SHADE_OBSERVER_TOKEN`, use of `SHADE_PREKEY_PG_URL` vs SQLite, observability hooks, logging levels, and the meaning of the stale-cleanup parameters.

**Deliverables:**

- Extend `docs/DEPLOYMENT.md` or add a short `docs/PRODUCTION-CHECKLIST.md` with bullet defaults.
- Link from the main README under “Deployment”.

**Acceptance criteria:** A checklist operators can follow without reading the whole codebase first.
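
Parts of such a checklist could be backed by a small preflight check that operators run before going live. The sketch below is an assumption-heavy illustration: the function `checkProductionEnv`, the 32-character token threshold, and the idea that an unset `SHADE_PREKEY_PG_URL` means SQLite under `/data` are all guesses made for the example. Only the two variable names and the `/data` path come from this document.

```typescript
type Env = Record<string, string | undefined>;

// Returns human-readable warnings; an empty list means no issue was detected
// by these (deliberately few) checks.
function checkProductionEnv(env: Env): string[] {
  const warnings: string[] = [];

  const token = env["SHADE_OBSERVER_TOKEN"];
  if (!token) {
    warnings.push("SHADE_OBSERVER_TOKEN is not set");
  } else if (token.length < 32) {
    // Assumed minimum length, not a documented Shade rule.
    warnings.push("SHADE_OBSERVER_TOKEN is shorter than 32 characters");
  }

  if (!env["SHADE_PREKEY_PG_URL"]) {
    // Assumption: without a Postgres URL the server stores data in SQLite.
    warnings.push("SHADE_PREKEY_PG_URL is not set; assuming SQLite, so back up the /data volume");
  }

  return warnings;
}
```

Items such as TLS termination or log levels cannot be verified from environment variables alone, so a check like this complements the written checklist rather than replacing it.
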

---

## 5. Abuse and resource limits on the transfer plane

**Problem:** Parallel lanes and large uploads can be abused to exhaust compute or storage if the applications that mount `createTransferRoutes()` do not apply a coherent policy.

**Goal:** Documented **limits and patterns**: authentication (already an active SDK topic), max stream size, TTL for temporary chunk storage, quotas per identity or IP where sensible.

**Deliverables:**

- Guidelines in `docs/streams.md` or a dedicated “Transfer hardening” section.
- Optional helpers or middleware examples in `@shade/transfer` / server routes for common limits (without forcing every deployment into one DB model).

**Acceptance criteria:** A clear “recommended minimum” for production that teams can copy.
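
As an illustration of what a copyable “recommended minimum” might contain, the sketch below combines the three limits named in the goal: a maximum stream size, a per-identity quota, and a TTL for stored chunks. `TransferLimits`, `QuotaTracker`, and `isChunkExpired` are hypothetical names. The sketch keeps state in memory and does not hook into `createTransferRoutes()`, whose options are not described here.

```typescript
// Hypothetical policy object; field names are assumptions for this sketch.
interface TransferLimits {
  maxStreamBytes: number;
  chunkTtlMs: number;
  quotaBytesPerIdentity: number;
}

class QuotaTracker {
  private readonly limits: TransferLimits;
  private readonly used = new Map<string, number>();

  constructor(limits: TransferLimits) {
    this.limits = limits;
  }

  // Records the usage and returns true only if the stream fits both the
  // per-stream size limit and the identity's remaining quota.
  tryReserve(identity: string, streamBytes: number): boolean {
    if (streamBytes > this.limits.maxStreamBytes) return false;
    const current = this.used.get(identity) ?? 0;
    if (current + streamBytes > this.limits.quotaBytesPerIdentity) return false;
    this.used.set(identity, current + streamBytes);
    return true;
  }
}

// A temporary chunk is expired once it has been stored longer than the TTL.
function isChunkExpired(storedAtMs: number, nowMs: number, limits: TransferLimits): boolean {
  return nowMs - storedAtMs > limits.chunkTtlMs;
}
```

Keeping the policy in a plain object and the counters behind a small class leaves the storage choice open. A deployment could back the same interface with SQLite, Postgres, or a cache without changing the calling route.
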

---

## 6. Security review and formal test / narrative

**Problem:** Enterprises and security-conscious users often ask for independent review and a traceable test matrix.

**Goal:** Plan for **independent crypto review** (timing, scope, deliverables) and a **published test / threat matrix** linking `THREAT-MODEL.md` to concrete automated tests (replay, tamper, out-of-order, resume, etc.).

**Deliverables:**

- Internal checklist “preparing for external review” (which files, assumptions, known limits).
- Short section in `SECURITY.md` on review status and how to report findings.

**Acceptance criteria:** One authoritative source for “what is tested automatically” vs “what needs manual/MITM/out-of-band process”.
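
One way to keep the test / threat matrix authoritative is to store it as data and check it mechanically. The sketch below assumes a hypothetical `ThreatEntry` shape. The threat names follow the examples in this section, but the test file names are invented for illustration.

```typescript
// Hypothetical matrix entry: a threat is covered either by automated tests
// or by a documented manual / out-of-band process.
interface ThreatEntry {
  threat: string;
  automatedTests: string[];
  manualProcess?: string;
}

// Threats with neither automated tests nor a documented manual process.
function uncoveredThreats(matrix: ThreatEntry[]): string[] {
  return matrix
    .filter((entry) => entry.automatedTests.length === 0 && !entry.manualProcess)
    .map((entry) => entry.threat);
}

const exampleMatrix: ThreatEntry[] = [
  { threat: "replay", automatedTests: ["replay.test.ts"] },
  { threat: "tamper", automatedTests: ["tamper.test.ts"] },
  { threat: "MITM on first contact", automatedTests: [], manualProcess: "compare safety numbers out of band" },
  { threat: "out-of-order", automatedTests: [] },
];
```

With this example data, `uncoveredThreats(exampleMatrix)` reports only `out-of-order`, the one row that has neither a test nor a manual process. That is the distinction the acceptance criteria ask for.
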

---

## Dependencies and order (proposal)

| Effort | Topic | Note |
|--------|-------|------|
| Low friction | §1, §4 | Documentation and checklists only |
| Medium | §5 | Design + docs + possibly small API helpers |
| High effort | §2, §3 | Platform work and long-term maintenance |
| Coordinated | §6 | Parallel with maturity in §2–§5 |

---

## Document versioning

- **V2.1**: first consolidated plan for the six improvement areas above. Update with dates and short per-section status lines as work starts and completes.

Given where you are now (post-streams 0.2.0), I would prioritize:

1. Social key recovery: solves "what happens if I lose my phone?", which is the big gap in all E2EE systems
2. P2P WebRTC transport: a natural complement to streams; gives a massive throughput gain for free
3. E2EE Pub/Sub: low scope, high value, and a perfect fit for your "magic drop-in" philosophy

These three all share the same property: they are infrastructure features that every app you build from now on will benefit from, not just Dispatch. Fully in line with "Shade as a component that can be pulled into any project".

/btw What about Shade video streaming, VoIP streaming, and the like?

Magic drop-in approach (your philosophy)

Imagine that in a Dispatch app you can just do:

    <ShadeVideoCall to="device:server-admin" />
    <ShadeVoiceButton to={peerAddress} />
    <ShadeBroadcaster streamKey="game-stream-1" />
    <ShadeViewer streamKey="game-stream-1" />

…and it just works: E2EE, P2P when possible, automatic fallback to relay. It is the same pattern as streams 0.2.0, just for real time.
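
The "P2P when possible, automatic fallback to relay" behaviour behind those components could look roughly like the sketch below. `connectWithFallback` and its callback parameters are hypothetical and do not exist in Shade today. The two connect functions stand in for whatever WebRTC and relay setup the transport would actually perform.

```typescript
// Hypothetical helper; neither the name nor the signature comes from Shade.
type TransportKind = "p2p" | "relay";

interface Connected<T> {
  kind: TransportKind;
  connection: T;
}

// Try the direct peer-to-peer path first; on any failure, use the relay.
async function connectWithFallback<T>(
  connectP2p: () => Promise<T>,
  connectRelay: () => Promise<T>,
): Promise<Connected<T>> {
  try {
    return { kind: "p2p", connection: await connectP2p() };
  } catch {
    return { kind: "relay", connection: await connectRelay() };
  }
}
```

A real implementation would also need a timeout on the P2P attempt, since a blocked direct route often hangs rather than failing quickly.
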

Reality check

Video/VoIP is the hardest thing in the entire E2EE world. Signal spent years getting it right. You should:

1. Finish streams 0.2.0 first (this verifies the crypto foundation)
2. Build the P2P WebRTC transport as a separate milestone
3. Then you have all the building blocks, and Voice 0.4.0 becomes 70% reuse

But yes, this absolutely belongs in Shade. Shade as an "all-in-one E2EE platform" is a much stronger position than "just messaging + files". You can become to E2EE what Twilio is to ordinary communication.