Files
Shade/docs/V2.1.md
Sterister fa770d3063
Some checks failed
Test / test (push) Has been cancelled
feat(files): @shade/files 0.3.0 — E2EE filesystem RPC primitive
M-Files-1..6 land the full files-RPC layer + everything 0.3.0 needs to
ship. Apps keep their own UI; this layer ships the typed RPC, the
streams bridge for content I/O, and production hooks (rate limit,
retention, fingerprint gate, metrics).

@shade/files (NEW)
- Standard ops: list/stat/mkdir/delete/move/read/write/getThumbnail with
  Zod-validated wire schemas + clean user-handler types.
- Custom ops: typed via TypeScript declaration merging on CustomOpsMap
  + per-op Zod schemas; client.custom('app.foo', {...}) is fully typed.
- Content I/O: inline (≤ 256 KiB plaintext) base64-in-RPC; streams
  (> 256 KiB) ride @shade/transfer via userMetadata.shadeFilesWriteId
  / shadeFilesReadStreamId correlation. Server-side TransformStream
  bridges accept inbound transfers immediately (engine rejects chunks
  that arrive before accept) and park the readable for the matching
  RPC.
- Directory ops: walk(path, opts) async-iterable depth-first walker;
  uploadDirectory()/downloadDirectory() with bounded concurrency pool
  (default 4, cap 16), aggregated progress, abort.
- Production hooks (callback-based, vendor-neutral): rate-limit (op +
  byte), idempotency cache (LRU + TTL + in-flight de-dupe), path
  policy (traversal + percent-decode hardening), fingerprint gate
  (required/optional/reject), pluggable Ed25519 sig verification with
  ±5 min replay window, onMetric sink (standard names).
- React hooks (subpath @shade/files/react): ShadeFilesProvider,
  useShadeFiles, useFileList, useFileTransfer/Upload/Download.
- Shade.files.serve(handler) + Shade.files.client(peer) high-level
  entrypoint in @shade/sdk; lazy + memoized; one handler per Shade.

Wire format bump
- @shade/proto wire VERSION 0x01 → 0x02. Length prefixes changed from
  u16 to u32. The previous u16 silently truncated payloads above
  64 KiB — a hard correctness ceiling that blocked inline file ops
  up to 256 KiB. Wire-incompatible with 0.2.x peers; new sessions
  only. Cross-platform Kotlin port (android/shade-android) updated to
  match; test-vectors/wire-format.json regenerated.

Concurrency safety
- ShadeSessionManager.encrypt/.decrypt now run under per-peer mutex.
  Concurrent decryptions of the same peer raced ratchet state
  (manifested as sporadic "Failed to decrypt — wrong key or tampered
  data" under load — surfaced once concurrent uploadDirectory pumped
  many writes in flight). Encrypt was already serialized via
  Shade.send's encryptChains; decrypt is now serialized at the
  manager layer too.

@shade/streams extension
- StreamMetadata.userMetadata?: Record<string, string> for
  application-level key/value pairs that round-trip verbatim through
  stream-init plaintext. Used by @shade/files for write/read
  correlation; available to any consumer.

@shade/sdk extension
- Shade.files getter (lazy + memoized).
- BackgroundHooks.onPruneFiles + periodic timer (default 5 min) +
  BackgroundTasks.setHook(name, fn) for runtime hook registration.

Bundles in-flight 0.2.0 work
- packages/shade-streams/, packages/shade-transfer/, related
  shade-sdk streams-bridge + shade-widgets transfer hooks were
  uncommitted prior to this session. Including them keeps the
  workspace consistent at 0.3.0 since @shade/files depends on them.

Tests
- 74 new tests in @shade/files (572 → 646 workspace pass; 0 fail;
  3× stable). Coverage spans unit (inline-threshold + concurrency),
  integration (read-write inline + streams up to 1 MiB, walk +
  upload/download directory, custom-op, metrics, SDK namespace
  end-to-end), and security (tampered-envelope sig verification,
  replay window, fingerprint gate, rate-limit + quota).

Release artifacts
- All packages bumped to 0.3.0 via scripts/bump-version.ts.
- scripts/publish-all.ts PACKAGES updated with shade-files in
  topological order (after shade-transfer, before shade-sdk).
- bun run publish:dry clean (14 packed, 0 failed).
- examples/08-files-browser/ — three-process CLI demo (prekey + Bob
  server + Alice CLI) covering list/stat/mkdir/delete/upload/download.
- docs/files.md — full API + design doc.
- CHANGELOG.md 0.3.0 entry.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 14:00:01 +02:00

152 lines
9.0 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Shade V2.1 — Improvements (infrastructure, storage, operations, security)
This document describes **improvements** agreed for next-generation work on Shade: clearer product story, stronger storage, mobile parity, operational hardening, transfer abuse, and a formal security narrative.
**Audience:** **Maintainers and contributors** implementing the changes. Add status fields as items land in code/docs.
---
## 1. Clear “who is the server?” and data flow
**Problem:** New users may think the prekey server is a message hub or that all E2EE traffic goes through the Shade container.
**Goal:** One consistent explanation across the root README, package READMEs, and optional onboarding: **the prekey server distributes public keys and bundles**; **actual messages and (typically) file chunks go through your apps own channel** (your transport, your backend, your URLs).
**Deliverables (proposal):**
- Diagram + short “keys vs payloads” text in the root README and in `@shade/server` README.
- Link to `THREAT-MODEL.md` from the same section (MITM on first contact ↔ safety numbers).
- Optionally one “concept page” (or extend `MIGRATION.md`) with typical architecture: *A ↔ B via app; both talk to the prekey host for X3DH material*.
**Acceptance criteria:** A new developer without domain background understands in one reading *what* goes to the Shade server and *what* does not.
---
## 2. Optional encryption of storage (at-rest)
**Problem:** `THREAT-MODEL.md` states that a stolen DB + filesystem can expose private keys because Shade does not encrypt the storage layer by default.
**Goal:** **Opt-in** protection for sensitive state (identity, session, optional stream resume secrets) with keys that **do not** live in plaintext in the DB — e.g. OS keychain/Keystore, passphrase + KDF, or an explicit device key injected by the app.
**Design principles:**
- Default developer experience (dev, simple demos) stays unchanged or includes a clear “insecure mode” warning in docs.
- APIs implementable per platform (Bun/SQLite, Postgres, web/IndexedDB, Android).
- Document limitations: what remains uncovered (e.g. active memory compromise).
**Acceptance criteria:** Threat model updated for “when encrypted storage is enabled”; at least one reference implementation + migration note.
---
## 3. Android parity and a published roadmap
**Problem:** `shade-android` is under development; drift from the TS SDK undermines the “byte-compatible” promise.
**Goal:** A **published roadmap** (milestones + what counts as parity vs TS-only) and **CI running shared test vectors** as a merge gate before release.
**Deliverables:**
- Roadmap section in `android/shade-android/README.md` or dedicated `ROADMAP-ANDROID.md` with explicit cross-checkpoints: wire format, fingerprints, rotations, streams (`0x11`) where applicable, resume semantics.
- CI job that fails on Kotlin vs TS vector mismatch.
**Acceptance criteria:** Parity coverage is visible and enforceable; the first critical cross-surface (e.g. core ratchet + proto) is green before a “production” label.
---
## 4. Operational hardening — prekey container and production
**Problem:** Many teams deploy the Docker image quickly; mistakes around TLS, backups, and secrets add avoidable risk.
**Goal:** A **production checklist**: TLS termination, volume backup (`/data`), rotation of `SHADE_OBSERVER_TOKEN`, use of `SHADE_PREKEY_PG_URL` vs SQLite, observability hooks, logging levels, meaning of stale cleanup parameters.
**Deliverables:**
- Extend `docs/DEPLOYMENT.md` or add short `docs/PRODUCTION-CHECKLIST.md` with bullet defaults.
- Link from the main README under “Deployment”.
**Acceptance criteria:** A checklist operators can follow without reading the whole codebase first.
---
## 5. Abuse and resource limits on the transfer plane
**Problem:** Parallel lanes and large uploads can be abused for resource or storage if consumer mounts of `createTransferRoutes()` share no coherent policy.
**Goal:** Documented **limits and patterns**: authentication (already an active SDK topic), max stream size, TTL for temporary chunk storage, quotas per identity or IP where sensible.
**Deliverables:**
- Guidelines in `docs/streams.md` or a dedicated “Transfer hardening” section.
- Optional helpers or middleware examples in `@shade/transfer` / server routes for common limits (without forcing every deployment into one DB model).
**Acceptance criteria:** A clear “recommended minimum” for production that teams can copy.
---
## 6. Security review and formal test / narrative
**Problem:** Enterprises and security-conscious users often ask for independent review and a traceable test matrix.
**Goal:** Plan for **independent crypto review** (timing, scope, deliverables) and a **published test / threat matrix** linking `THREAT-MODEL.md` to concrete automated tests (replay, tamper, out-of-order, resume, etc.).
**Deliverables:**
- Internal checklist “preparing for external review” (which files, assumptions, known limits).
- Short section in `SECURITY.md` on review status and how to report findings.
**Acceptance criteria:** One authoritative source for “what is tested automatically” vs “what needs manual/MITM/out-of-band process”.
---
## Dependencies and order (proposal)
| Priority | Topic | Note |
|---------|-------|------|
| Low friction | §1, §4 | Documentation and checklists only |
| Medium | §5 | Design + docs + possibly small API helpers |
| High effort | §2, §3 | Platform work and long-term maintenance |
| Coordinated | §6 | Parallel with maturity in §2§5 |
---
## Document versioning
- **V2.1** — first consolidated plan for the six improvement areas above. Update with dates and short per-section status lines as work starts and completes.
Gitt der du er nå (post-streams 0.2.0), ville jeg prioritert:
1. Sosial nøkkel-recovery — løser "hva skjer hvis jeg mister telefonen?" som er det store hullet i alle
E2EE-systemer
2. P2P WebRTC transport — naturlig kompletterende til streams; gir massiv throughput-gevinst gratis
3. E2EE Pub/Sub — lavt scope, høy nytte, og passer perfekt med din "magic drop-in"-filosofi
Disse tre har alle samme egenskap: de er infrastruktur-features som hver app du bygger fremover vil dra nytte av,
ikke bare Dispatch. Helt i tråd med "Shade som komponent som dras inn i hvilket som helst prosjekt".
/btw Hva med Shade videostreaming, VOIP streaming, og lignende?
Magic drop-in-tilnærming (din filosofi)
Tenk deg at i en Dispatch-app kan du bare gjøre:
<ShadeVideoCall to="device:server-admin" />
<ShadeVoiceButton to={peerAddress} />
<ShadeBroadcaster streamKey="game-stream-1" />
<ShadeViewer streamKey="game-stream-1" />
…og det bare fungerer, E2EE, P2P når mulig, fallback til relay automatisk. Det er det samme mønsteret som streams
0.2.0, bare for sanntid.
Realisme-sjekk
Video/VOIP er det vanskeligste i hele E2EE-verdenen. Signal brukte år på å få det riktig. Du bør:
1. Ferdigstille streams 0.2.0 først (verifiserer crypto-fundamentet)
2. Bygge P2P WebRTC-transport som separat milestone
3. Da har du alle byggeklossene og Voice 0.4.0 blir 70% gjenbruk
Men ja — dette hører absolutt hjemme i Shade. Shade som "alt-i-ett E2EE-platform" er en mye sterkere posisjon enn
"bare messaging + filer". Du kan bli til E2EE hva Twilio er til vanlig kommunikasjon.