docs/archive/V2.1.md

# Shade V2.1 — Improvements (infrastructure, storage, operations, security)

This document describes **improvements** agreed for next-generation work on Shade: a clearer product story, stronger storage, mobile parity, operational hardening, limits against transfer abuse, and a formal security narrative.

**Audience:** **Maintainers and contributors** implementing the changes. Add status fields as items land in code/docs.

---

## 1. Clear “who is the server?” and data flow

**Problem:** New users may think the prekey server is a message hub or that all E2EE traffic goes through the Shade container.

**Goal:** One consistent explanation across the root README, package READMEs, and optional onboarding: **the prekey server distributes public keys and bundles**; **actual messages and (typically) file chunks go through your app’s own channel** (your transport, your backend, your URLs).

**Deliverables (proposal):**

- Diagram + short “keys vs payloads” text in the root README and in `@shade/server` README.
- Link to `THREAT-MODEL.md` from the same section (MITM on first contact ↔ safety numbers).
- Optionally one “concept page” (or extend `MIGRATION.md`) with typical architecture: *A ↔ B via app; both talk to the prekey host for X3DH material*.

**Acceptance criteria:** A new developer without domain background understands in one reading *what* goes to the Shade server and *what* does not.
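
The split between keys and payloads can be illustrated with a toy model. This is a minimal sketch, not Shade code: `PrekeyHost`, `AppChannel`, `PublicBundle`, and `Envelope` are hypothetical names invented for illustration, and the ciphertext bytes are placeholders.

```typescript
// Hypothetical shapes for illustration only; these are not Shade SDK types.
interface PublicBundle {
  identityKey: string;
  signedPrekey: string;
}

interface Envelope {
  from: string;
  to: string;
  ciphertext: Uint8Array;
}

// Stands in for the prekey host: it only ever stores and serves public bundles.
class PrekeyHost {
  private readonly bundles = new Map<string, PublicBundle>();
  readonly log: string[] = [];

  publish(user: string, bundle: PublicBundle): void {
    this.bundles.set(user, bundle);
    this.log.push(`publish:${user}`);
  }

  fetch(user: string): PublicBundle | undefined {
    this.log.push(`fetch:${user}`);
    return this.bundles.get(user);
  }
}

// Stands in for the app's own transport (your backend, your URLs).
class AppChannel {
  readonly delivered: Envelope[] = [];

  send(envelope: Envelope): void {
    this.delivered.push(envelope);
  }
}

const prekeyHost = new PrekeyHost();
const appChannel = new AppChannel();

// Bob publishes public key material once.
prekeyHost.publish("bob", { identityKey: "bob-ik-pub", signedPrekey: "bob-spk-pub" });

// Alice fetches Bob's bundle for session setup, then sends ciphertext via the app.
const bobBundle = prekeyHost.fetch("bob");
if (bobBundle) {
  // Placeholder bytes; real ciphertext would come from the E2EE session.
  appChannel.send({ from: "alice", to: "bob", ciphertext: new Uint8Array([1, 2, 3]) });
}
```

In this model the prekey host's log contains only publish and fetch events, while every envelope passes through the app channel. That is the property the README diagram is meant to convey.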

---

## 2. Optional encryption of storage (at-rest)

**Problem:** `THREAT-MODEL.md` states that a stolen DB + filesystem can expose private keys because Shade does not encrypt the storage layer by default.

**Goal:** **Opt-in** protection for sensitive state (identity, session, optional stream resume secrets) with keys that **do not** live in plaintext in the DB — e.g. OS keychain/Keystore, passphrase + KDF, or an explicit device key injected by the app.

**Design principles:**

- Default developer experience (dev, simple demos) stays unchanged or includes a clear “insecure mode” warning in docs.
- APIs implementable per platform (Bun/SQLite, Postgres, web/IndexedDB, Android).
- Document limitations: what remains uncovered (e.g. active memory compromise).

**Acceptance criteria:** Threat model updated for “when encrypted storage is enabled”; at least one reference implementation + migration note.
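
As one possible shape for the "passphrase + KDF" option, the sketch below wraps a secret with a key derived via scrypt and sealed with AES-256-GCM from Node's built-in `node:crypto`. The `SealedRecord` layout and the `seal` / `unseal` names are assumptions made for illustration; the plan does not prescribe a cipher, a KDF, or a record format.

```typescript
import { createCipheriv, createDecipheriv, randomBytes, scryptSync } from "node:crypto";

// Hypothetical envelope for one encrypted record; not a Shade type.
interface SealedRecord {
  salt: Buffer;       // per-record scrypt salt
  nonce: Buffer;      // 96-bit AES-GCM nonce
  tag: Buffer;        // 128-bit authentication tag
  ciphertext: Buffer;
}

// Derive a 256-bit wrapping key from the passphrase. The key itself is never stored.
function deriveKey(passphrase: string, salt: Buffer): Buffer {
  return scryptSync(passphrase, salt, 32);
}

function seal(plaintext: Buffer, passphrase: string): SealedRecord {
  const salt = randomBytes(16);
  const nonce = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", deriveKey(passphrase, salt), nonce);
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return { salt, nonce, tag: cipher.getAuthTag(), ciphertext };
}

function unseal(record: SealedRecord, passphrase: string): Buffer {
  const key = deriveKey(passphrase, record.salt);
  const decipher = createDecipheriv("aes-256-gcm", key, record.nonce);
  decipher.setAuthTag(record.tag);
  // final() throws if the passphrase is wrong or the record was modified.
  return Buffer.concat([decipher.update(record.ciphertext), decipher.final()]);
}
```

Only `salt`, `nonce`, `tag`, and `ciphertext` would be written to the DB, so a stolen database alone does not reveal the wrapped secret. As the design principles note, this does nothing against an attacker who can read process memory while the key is in use.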

---

## 3. Android parity and a published roadmap

**Problem:** `shade-android` is under development; drift from the TS SDK undermines the “byte-compatible” promise.

**Goal:** A **published roadmap** (milestones + what counts as parity vs TS-only) and **CI running shared test vectors** as a merge gate before release.

**Deliverables:**

- Roadmap section in `android/shade-android/README.md` or dedicated `ROADMAP-ANDROID.md` with explicit cross-checkpoints: wire format, fingerprints, rotations, streams (`0x11`) where applicable, resume semantics.
- CI job that fails on Kotlin vs TS vector mismatch.

**Acceptance criteria:** Parity coverage is visible and enforceable; the first critical cross-platform surface (e.g. core ratchet + proto) is green before a “production” label is applied.
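
A merge gate of this kind reduces to comparing the outputs of both implementations vector by vector. The following is a minimal sketch under assumptions: the `Vector` shape (a name plus a hex string) and the `diffVectors` helper are hypothetical, and how each side produces its vectors is left open.

```typescript
// Hypothetical vector shape: a name plus the hex encoding one implementation produced.
interface Vector {
  name: string;
  hex: string;
}

interface Mismatch {
  name: string;
  ts: string | null;     // null when the vector exists only on the Kotlin side
  kotlin: string | null; // null when the vector exists only on the TS side
}

// Compare TS and Kotlin outputs by vector name; hex case is ignored.
function diffVectors(tsVectors: Vector[], kotlinVectors: Vector[]): Mismatch[] {
  const kotlinByName = new Map<string, string>();
  for (const v of kotlinVectors) kotlinByName.set(v.name, v.hex.toLowerCase());

  const mismatches: Mismatch[] = [];
  for (const v of tsVectors) {
    const tsHex = v.hex.toLowerCase();
    const kotlinHex = kotlinByName.get(v.name) ?? null;
    if (kotlinHex !== tsHex) mismatches.push({ name: v.name, ts: tsHex, kotlin: kotlinHex });
    kotlinByName.delete(v.name);
  }

  // Anything left over was produced by Kotlin but has no TS counterpart.
  kotlinByName.forEach((kotlinHex, name) => {
    mismatches.push({ name, ts: null, kotlin: kotlinHex });
  });
  return mismatches;
}
```

A CI job could print the returned list and exit non-zero whenever it is non-empty, which makes drift between the two implementations a failed build rather than a later bug report.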

---

## 4. Operational hardening — prekey container and production

**Problem:** Many teams deploy the Docker image quickly; mistakes around TLS, backups, and secrets add avoidable risk.

**Goal:** A **production checklist**: TLS termination, volume backup (`/data`), rotation of `SHADE_OBSERVER_TOKEN`, use of `SHADE_PREKEY_PG_URL` vs SQLite, observability hooks, logging levels, meaning of stale cleanup parameters.

**Deliverables:**

- Extend `docs/DEPLOYMENT.md` or add short `docs/PRODUCTION-CHECKLIST.md` with bullet defaults.
- Link from the main README under “Deployment”.

**Acceptance criteria:** A checklist operators can follow without reading the whole codebase first.
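
Parts of such a checklist can be made executable. The sketch below checks the two environment variables named in the goal. The `preflight` function, the 32-character minimum, the `sslmode` check, and the assumption that an unset `SHADE_PREKEY_PG_URL` means SQLite are all illustrative choices, not documented Shade behaviour.

```typescript
type Env = Record<string, string | undefined>;

// Returns human-readable warnings; an empty list means no issue was detected.
function preflight(env: Env): string[] {
  const warnings: string[] = [];

  const token = env.SHADE_OBSERVER_TOKEN;
  if (!token) {
    warnings.push("SHADE_OBSERVER_TOKEN is not set");
  } else if (token.length < 32) {
    // The 32-character minimum is an assumed threshold, not a Shade requirement.
    warnings.push("SHADE_OBSERVER_TOKEN is shorter than 32 characters");
  }

  const pgUrl = env.SHADE_PREKEY_PG_URL;
  if (!pgUrl) {
    // Assumption: without a Postgres URL the server falls back to SQLite.
    warnings.push("SHADE_PREKEY_PG_URL is not set; assuming SQLite, so back up the /data volume");
  } else {
    try {
      const parsed = new URL(pgUrl);
      if (parsed.searchParams.get("sslmode") === "disable") {
        warnings.push("SHADE_PREKEY_PG_URL disables TLS to Postgres (sslmode=disable)");
      }
    } catch {
      warnings.push("SHADE_PREKEY_PG_URL is not a valid URL");
    }
  }

  return warnings;
}
```

An operator could run this against the container's environment before first deploy and treat any warning as a checklist item still open.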

---

## 5. Abuse and resource limits on the transfer plane

**Problem:** Parallel lanes and large uploads can be abused to exhaust server resources or storage when consumer mounts of `createTransferRoutes()` do not share a coherent policy.

**Goal:** Documented **limits and patterns**: authentication (already an active SDK topic), max stream size, TTL for temporary chunk storage, quotas per identity or IP where sensible.

**Deliverables:**

- Guidelines in `docs/streams.md` or a dedicated “Transfer hardening” section.
- Optional helpers or middleware examples in `@shade/transfer` / server routes for common limits (without forcing every deployment into one DB model).

**Acceptance criteria:** A clear “recommended minimum” for production that teams can copy.
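
The three limits named in the goal (max stream size, chunk TTL, per-identity quota) can be combined in a small in-memory policy object. This is a sketch under assumptions: `TransferPolicy`, `TransferLimits`, and their methods are hypothetical names, not part of `@shade/transfer`, and a real deployment would likely persist usage somewhere other than process memory.

```typescript
// Hypothetical limits; field names and values are illustrative only.
interface TransferLimits {
  maxStreamBytes: number;        // largest single stream accepted
  quotaBytesPerIdentity: number; // bytes one identity may hold in temporary storage
  chunkTtlMs: number;            // how long a chunk may stay in temporary storage
}

type Verdict = { ok: true } | { ok: false; reason: string };

interface StoredChunk {
  identity: string;
  bytes: number;
  storedAt: number;
}

class TransferPolicy {
  private readonly limits: TransferLimits;
  private readonly usage = new Map<string, number>();
  private readonly chunks = new Map<string, StoredChunk>();

  constructor(limits: TransferLimits) {
    this.limits = limits;
  }

  // Decide whether a chunk may be stored, given the declared size of its stream.
  admitChunk(chunkId: string, identity: string, streamBytes: number, chunkBytes: number, now: number): Verdict {
    if (this.chunks.has(chunkId)) return { ok: true }; // retries are not double-counted
    if (streamBytes > this.limits.maxStreamBytes) {
      return { ok: false, reason: "stream exceeds max size" };
    }
    const used = this.usage.get(identity) ?? 0;
    if (used + chunkBytes > this.limits.quotaBytesPerIdentity) {
      return { ok: false, reason: "identity quota exceeded" };
    }
    this.usage.set(identity, used + chunkBytes);
    this.chunks.set(chunkId, { identity, bytes: chunkBytes, storedAt: now });
    return { ok: true };
  }

  // Drop chunks at or past the TTL and return their bytes to the owner's quota.
  sweep(now: number): string[] {
    const expired: string[] = [];
    this.chunks.forEach((chunk, id) => {
      if (now - chunk.storedAt >= this.limits.chunkTtlMs) expired.push(id);
    });
    for (const id of expired) {
      const chunk = this.chunks.get(id)!;
      this.usage.set(chunk.identity, (this.usage.get(chunk.identity) ?? 0) - chunk.bytes);
      this.chunks.delete(id);
    }
    return expired;
  }
}
```

Because the policy takes timestamps as arguments and keeps no database dependency, it can sit in front of any storage backend, which matches the deliverable's aim of not forcing every deployment into one DB model.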

---

## 6. Security review and formal test / narrative

**Problem:** Enterprises and security-conscious users often ask for independent review and a traceable test matrix.

**Goal:** Plan for **independent crypto review** (timing, scope, deliverables) and a **published test / threat matrix** linking `THREAT-MODEL.md` to concrete automated tests (replay, tamper, out-of-order, resume, etc.).

**Deliverables:**

- Internal checklist “preparing for external review” (which files, assumptions, known limits).
- Short section in `SECURITY.md` on review status and how to report findings.

**Acceptance criteria:** One authoritative source for “what is tested automatically” vs “what needs manual/MITM/out-of-band process”.
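
One way to keep that source authoritative is to store the matrix as data and check it mechanically. In the sketch below the threat names come from the goal above, while the `MatrixRow` shape, the helper functions, and every test path are hypothetical placeholders rather than files known to exist in the repository.

```typescript
type Verification = "automated" | "manual";

interface MatrixRow {
  threat: string;
  verification: Verification;
  evidence: string; // test file for automated rows, process document for manual rows
}

// Threat names follow the plan; all evidence paths are placeholders.
const matrix: MatrixRow[] = [
  { threat: "replay", verification: "automated", evidence: "tests/security/replay.test.ts" },
  { threat: "tamper", verification: "automated", evidence: "tests/security/tamper.test.ts" },
  { threat: "out-of-order", verification: "automated", evidence: "tests/security/out-of-order.test.ts" },
  { threat: "resume", verification: "automated", evidence: "tests/security/resume.test.ts" },
  { threat: "MITM on first contact", verification: "manual", evidence: "THREAT-MODEL.md (safety numbers)" },
];

// Threats that the threat model requires but the matrix does not mention at all.
function missingCoverage(required: string[], rows: MatrixRow[]): string[] {
  const covered = new Set(rows.map((row) => row.threat));
  return required.filter((threat) => !covered.has(threat));
}

// Threats that rely on a manual or out-of-band process rather than an automated test.
function manualOnly(rows: MatrixRow[]): string[] {
  return rows.filter((row) => row.verification === "manual").map((row) => row.threat);
}
```

Running `missingCoverage` against the list of threats in `THREAT-MODEL.md` would flag any threat that has neither a test nor a documented manual process.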

---

## Dependencies and order (proposal)

| Priority | Topic | Note |
|---------|-------|------|
| Low friction | §1, §4 | Documentation and checklists only |
| Medium | §5 | Design + docs + possibly small API helpers |
| High effort | §2, §3 | Platform work and long-term maintenance |
| Coordinated | §6 | Parallel with maturity in §2–§5 |

---

## Document versioning

- **V2.1** — first consolidated plan for the six improvement areas above. Update with dates and short per-section status lines as work starts and completes.

Given where you are now (post-streams 0.2.0), I would prioritize:

1. Social key recovery: solves "what happens if I lose my phone?", which is the big gap in all E2EE systems
2. P2P WebRTC transport: a natural complement to streams; gives a massive throughput gain for free
3. E2EE Pub/Sub: low scope, high value, and a perfect fit for your "magic drop-in" philosophy

All three share the same property: they are infrastructure features that every app you build from now on will benefit from, not just Dispatch. That is fully in line with "Shade as a component that can be pulled into any project".


What about Shade video streaming, VoIP streaming, and the like?

**Magic drop-in approach (your philosophy)**

Imagine that in a Dispatch app you can simply write:

    <ShadeVideoCall to="device:server-admin" />
    <ShadeVoiceButton to={peerAddress} />
    <ShadeBroadcaster streamKey="game-stream-1" />
    <ShadeViewer streamKey="game-stream-1" />

...and it just works: E2EE, P2P when possible, automatic fallback to relay. It is the same pattern as streams 0.2.0, only for real time.

**Reality check**

Video/VoIP is the hardest thing in the whole E2EE world. Signal spent years getting it right. You should:

1. Finish streams 0.2.0 first (verifies the crypto foundation)
2. Build the P2P WebRTC transport as a separate milestone
3. At that point you have all the building blocks, and Voice 0.4.0 becomes 70% reuse

But yes, this absolutely belongs in Shade. Shade as an "all-in-one E2EE platform" is a much stronger position than "just messaging + files". You can become to E2EE what Twilio is to ordinary communication.

feat(files): @shade/files 0.3.0 — E2EE filesystem RPC primitive

M-Files-1..6 land the full files-RPC layer + everything 0.3.0 needs to
ship. Apps keep their own UI; this layer ships the typed RPC, the
streams bridge for content I/O, and production hooks (rate limit,
retention, fingerprint gate, metrics).

@shade/files (NEW)
- Standard ops: list/stat/mkdir/delete/move/read/write/getThumbnail with
  Zod-validated wire schemas + clean user-handler types.
- Custom ops: typed via TypeScript declaration merging on CustomOpsMap
  + per-op Zod schemas; client.custom('app.foo', {...}) is fully typed.
- Content I/O: inline (≤ 256 KiB plaintext) base64-in-RPC; streams
  (> 256 KiB) ride @shade/transfer via userMetadata.shadeFilesWriteId
  / shadeFilesReadStreamId correlation. Server-side TransformStream
  bridges accept inbound transfers immediately (engine rejects chunks
  that arrive before accept) and park the readable for the matching
  RPC.
- Directory ops: walk(path, opts) async-iterable depth-first walker;
  uploadDirectory()/downloadDirectory() with bounded concurrency pool
  (default 4, cap 16), aggregated progress, abort.
- Production hooks (callback-based, vendor-neutral): rate-limit (op +
  byte), idempotency cache (LRU + TTL + in-flight de-dupe), path
  policy (traversal + percent-decode hardening), fingerprint gate
  (required/optional/reject), pluggable Ed25519 sig verification with
  ±5 min replay window, onMetric sink (standard names).
- React hooks (subpath @shade/files/react): ShadeFilesProvider,
  useShadeFiles, useFileList, useFileTransfer/Upload/Download.
- Shade.files.serve(handler) + Shade.files.client(peer) high-level
  entrypoint in @shade/sdk; lazy + memoized; one handler per Shade.

Wire format bump
- @shade/proto wire VERSION 0x01 → 0x02. Length prefixes changed from
  u16 to u32. The previous u16 silently truncated payloads above
  64 KiB — a hard correctness ceiling that blocked inline file ops
  up to 256 KiB. Wire-incompatible with 0.2.x peers; new sessions
  only. Cross-platform Kotlin port (android/shade-android) updated to
  match; test-vectors/wire-format.json regenerated.

Concurrency safety
- ShadeSessionManager.encrypt/.decrypt now run under per-peer mutex.
  Concurrent decryptions of the same peer raced ratchet state
  (manifested as sporadic "Failed to decrypt — wrong key or tampered
  data" under load — surfaced once concurrent uploadDirectory pumped
  many writes in flight). Encrypt was already serialized via
  Shade.send's encryptChains; decrypt is now serialized at the
  manager layer too.

@shade/streams extension
- StreamMetadata.userMetadata?: Record<string, string> for
  application-level key/value pairs that round-trip verbatim through
  stream-init plaintext. Used by @shade/files for write/read
  correlation; available to any consumer.

@shade/sdk extension
- Shade.files getter (lazy + memoized).
- BackgroundHooks.onPruneFiles + periodic timer (default 5 min) +
  BackgroundTasks.setHook(name, fn) for runtime hook registration.

Bundles in-flight 0.2.0 work
- packages/shade-streams/, packages/shade-transfer/, related
  shade-sdk streams-bridge + shade-widgets transfer hooks were
  uncommitted prior to this session. Including them keeps the
  workspace consistent at 0.3.0 since @shade/files depends on them.

Tests
- 74 new tests in @shade/files (572 → 646 workspace pass; 0 fail;
  3× stable). Coverage spans unit (inline-threshold + concurrency),
  integration (read-write inline + streams up to 1 MiB, walk +
  upload/download directory, custom-op, metrics, SDK namespace
  end-to-end), and security (tampered-envelope sig verification,
  replay window, fingerprint gate, rate-limit + quota).

Release artifacts
- All packages bumped to 0.3.0 via scripts/bump-version.ts.
- scripts/publish-all.ts PACKAGES updated with shade-files in
  topological order (after shade-transfer, before shade-sdk).
- bun run publish:dry clean (14 packed, 0 failed).
- examples/08-files-browser/ — three-process CLI demo (prekey + Bob
  server + Alice CLI) covering list/stat/mkdir/delete/upload/download.
- docs/files.md — full API + design doc.
- CHANGELOG.md 0.3.0 entry.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

											
										
										
2026-05-02 14:00:01 +02:00