release(v4.0.0): Shade GA — V3.x consolidation + audit prep

V3.1 → V3.12 consolidated and tagged for the first GA release. Wire
format unchanged from 0.4.x — 4.0 peers interoperate with 0.4.x peers
byte-for-byte. The version bump is semantic: audit-cycle complete,
opt-in surface fully exposed, threat model refreshed for every new
surface.

Highlights:
- All 24 @shade/* packages bumped to 4.0.0 in lockstep.
- CHANGELOG 4.0.0 section is the canonical manifest of what landed.
- THREAT-MODEL extended (§10 fingerprint gates, §11 WebRTC P2P, §12
  Web-Worker boundary) + residual-risks table refreshed.
- OpenAPI now covers all 27 routes: prekey, transfer, KT, inbox,
  bridge, observer, /metrics, /healthz, /ready.
- MIGRATION 0.3.x → 4.0 documented + smoke-tested against
  shade migrate-storage on a real SQLite DB.
- docs/audit/REVIEW-BUNDLE.md + SCOPE.md ready for external reviewer.
- scripts/soak.ts harness for the GA-stable 2-week soak window.
- All V*.md plans archived under docs/archive/ with Status: Done.
- Voice/Video carved out into V5.0; 4.0 audit focuses on the frozen
  non-realtime stack.

Tests: TS 1000/1000 + Kotlin 11/11 cross-platform vectors green.
Docker: gt.zyon.no/stian/shade-prekey:4.0.0 builds and reports
  version 4.0.0 on /health.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 18:35:35 +02:00
parent 8b055912b7
commit e6fdf31b49
298 changed files with 37909 additions and 256 deletions


@@ -164,9 +164,190 @@ Nova's `pushDevices.encryptionKey` column is a per-device static AES key. To mig
During the rollout, send notifications with a `v: 1` (legacy) or `v: 2` (Shade) field so old and new clients coexist.
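As a rough illustration of the coexistence scheme, a receiver can branch on the `v` field before choosing a decryption path. This is a minimal sketch: only the `v: 1` / `v: 2` discriminator comes from the text, while the payload field names (`ciphertext`, `envelope`) and the route labels are hypothetical.

```ts
// Hypothetical payload shapes; only the `v` discriminator comes from the guide.
type LegacyNotification = { v: 1; ciphertext: string };
type ShadeNotification = { v: 2; envelope: string };
type PushNotification = LegacyNotification | ShadeNotification;

type Route = 'legacy-aes' | 'shade';

// Parse untrusted JSON and reject anything without a known `v`.
function parseNotification(raw: string): PushNotification {
  const obj: unknown = JSON.parse(raw);
  if (typeof obj !== 'object' || obj === null) {
    throw new Error('notification is not an object');
  }
  const v = (obj as { v?: unknown }).v;
  if (v !== 1 && v !== 2) {
    throw new Error(`unknown notification version: ${String(v)}`);
  }
  return obj as PushNotification;
}

// Decide which decryption path a notification should take.
function routeNotification(n: PushNotification): Route {
  return n.v === 1 ? 'legacy-aes' : 'shade';
}
```

Rejecting unknown versions explicitly keeps an old client from silently misreading a payload it was never built for.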
## Migration to at-rest encryption (V3.2)
Shade 0.4.0 ships `@shade/storage-encrypted` — opt-in AES-256-GCM
encryption of every sensitive payload in the local SQLite/Postgres store.
Existing 0.3.x deploys keep their unencrypted DB and behave exactly as
before; encryption is enabled per-deployment with one CLI command.
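To make "AES-256-GCM with a passphrase-derived key" concrete, here is a minimal sketch using Node's built-in `node:crypto`. It is not the `@shade/storage-encrypted` implementation: the helper names (`deriveKey`, `seal`, `unseal`), the scrypt cost parameters (Node defaults), the 96-bit nonce and the record layout are all assumptions made for illustration.

```ts
import { createCipheriv, createDecipheriv, randomBytes, scryptSync } from 'node:crypto';

// Derive a 32-byte AES-256 key from a passphrase and salt.
// Uses Node's default scrypt cost parameters (an assumption, not Shade's).
function deriveKey(passphrase: string, salt: Buffer): Buffer {
  return scryptSync(passphrase, salt, 32);
}

interface Sealed {
  iv: Buffer;         // nonce, must be unique per payload under one key
  tag: Buffer;        // GCM authentication tag
  ciphertext: Buffer;
}

// Encrypt one payload with a fresh random nonce.
function seal(key: Buffer, plaintext: Buffer): Sealed {
  const iv = randomBytes(12);
  const cipher = createCipheriv('aes-256-gcm', key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return { iv, tag: cipher.getAuthTag(), ciphertext };
}

// Decrypt and authenticate; throws if the tag does not verify.
function unseal(key: Buffer, sealed: Sealed): Buffer {
  const decipher = createDecipheriv('aes-256-gcm', key, sealed.iv);
  decipher.setAuthTag(sealed.tag);
  return Buffer.concat([decipher.update(sealed.ciphertext), decipher.final()]);
}
```

Because GCM is authenticated, a tampered row fails loudly at `decipher.final()` instead of yielding garbage plaintext.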
### One-shot migration (SQLite)
```bash
# Encrypts in place, drops unencrypted tables, leaves a .bak alongside.
shade migrate-storage \
  --key-source passphrase \
  --passphrase "$SHADE_STORAGE_PASSPHRASE" \
  --salt-file /data/shade-client.db.salt
```
For a dry run that validates every row without writing:
`shade migrate-storage … --dry-run`.
### Code-level switch
Replace:
```ts
import { SQLiteStorage } from '@shade/storage-sqlite';
const storage = new SQLiteStorage('/data/shade-client.db');
```
with:
```ts
import { KeyManager, EncryptedSQLiteStorage } from '@shade/storage-encrypted';
const km = await KeyManager.open({
  kind: 'passphrase',
  passphrase: process.env.SHADE_STORAGE_PASSPHRASE!,
  salt: loadSaltFromDisk(),
});
const storage = await EncryptedSQLiteStorage.open({
  dbPath: '/data/shade-client.db',
  keyManager: km,
});
```
The encrypted store implements the same `StorageProvider`, so
`ShadeSessionManager` and the rest of the wiring are unchanged.
See `docs/storage-encryption.md` for the full design, key sources
(passphrase / OS keychain / app-injected) and rotation.
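The snippet above calls `loadSaltFromDisk()` without defining it. One possible shape for such a helper is sketched below. It assumes the salt file holds raw bytes at the path used in the CLI example and that at least 16 bytes are expected; the real on-disk format may differ. The helper deliberately refuses to create a missing salt, since a fresh salt would derive a different key and make an existing encrypted database unreadable.

```ts
import { readFileSync } from 'node:fs';

// Hypothetical helper. The raw-bytes format and 16-byte minimum are assumptions.
function loadSaltFromDisk(
  path = '/data/shade-client.db.salt',
  read: (p: string) => Uint8Array = (p) => readFileSync(p),
): Uint8Array {
  let salt: Uint8Array;
  try {
    salt = read(path);
  } catch (err) {
    // Never regenerate silently: a new salt means a new key.
    throw new Error(`salt file not readable at ${path}: ${String(err)}`);
  }
  if (salt.length < 16) {
    throw new Error(`salt at ${path} is too short (${salt.length} bytes)`);
  }
  return salt;
}
```

The injectable `read` parameter exists only so the helper can be exercised without touching the filesystem.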
## Migrating from 0.3.x to 4.0 (GA)
Shade 4.0 is the GA-frozen baseline. Everything from V3.2 through V3.12 is
merged, externally reviewed, and the wire format is locked. Nothing is
breaking on the wire compared to 0.4.x — peers continue to interoperate.
The 4.0 migration is therefore mostly **opt-in surface activation**
plus a version-bump.
### What stays the same
- Wire envelope `0x02` (RatchetMessage) with u32 length-prefixes.
- Wire envelope `0x11` (stream-chunk) for `@shade/streams`.
- HTTP shape of all `/v1/keys/...` and `/v1/transfer/...` endpoints.
- All `StorageProvider` core method signatures.
- Identity fingerprints, X3DH flow, Ed25519 signature format.
A 0.3.x peer that has not enabled any opt-ins talks to a 4.0 peer
without code changes. The version bump is semantic ("we have completed
the audit cycle"), not breaking.
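For readers unfamiliar with length-prefixed framing, the sketch below shows the general technique behind "a type byte plus u32 length-prefixes". The layout is assumed for illustration only (one type byte, then big-endian u32 lengths each followed by that many bytes); the byte order and field order of the real Shade envelopes are not specified here, so treat this as a generic example rather than a parser for Shade traffic.

```ts
// Envelope type bytes named in the list above.
const ENVELOPE = { ratchetMessage: 0x02, streamChunk: 0x11 } as const;

// Assumed layout: [type: u8] then repeated [len: u32 big-endian][len bytes].
function encodeEnvelope(type: number, fields: Uint8Array[]): Uint8Array {
  const total = 1 + fields.reduce((n, f) => n + 4 + f.length, 0);
  const out = new Uint8Array(total);
  const view = new DataView(out.buffer);
  view.setUint8(0, type);
  let off = 1;
  for (const f of fields) {
    view.setUint32(off, f.length); // DataView writes big-endian by default
    out.set(f, off + 4);
    off += 4 + f.length;
  }
  return out;
}

function decodeEnvelope(buf: Uint8Array): { type: number; fields: Uint8Array[] } {
  if (buf.length < 1) throw new Error('empty envelope');
  const view = new DataView(buf.buffer, buf.byteOffset, buf.byteLength);
  const fields: Uint8Array[] = [];
  let off = 1;
  while (off < buf.length) {
    if (off + 4 > buf.length) throw new Error('truncated length prefix');
    const len = view.getUint32(off);
    off += 4;
    if (off + len > buf.length) throw new Error('truncated field');
    fields.push(buf.subarray(off, off + len));
    off += len;
  }
  return { type: view.getUint8(0), fields };
}
```

Checking every length against the remaining buffer before slicing is what keeps a malformed frame from being read past its end.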
### What's new (opt-in)
| Surface | Package | How to enable |
|---------|---------|---------------|
| At-rest encryption | `@shade/storage-encrypted` | `shade migrate-storage` (see above) |
| Async store-and-forward | `@shade/inbox`, `@shade/inbox-server` | `createInboxServer()` + `new Inbox()` |
| Bridge transports (SSE, long-poll) | `@shade/transport-bridge`, `createBridgeRoutes()` | mount bridge routes; `FallbackBridgeTransport` |
| Web Workers crypto | `@shade/crypto-web/worker` | `shade.configureWorkerCrypto({ workerUrl })` |
| Social key recovery | `@shade/recovery` | `setupRecovery / attachGuardian / requestRecovery` |
| WebRTC P2P transport | `@shade/transport-webrtc` (peer-dep) | `shade.configureWebRTC({ factory })` |
| Key Transparency | `@shade/key-transparency`, `createPrekeyServerWithKT(...)` | server: `keyTransparency: { ... }` config; client: `keyTransparency: { mode, logPublicKey }` on `createShade` |
| Trust UX gates | built-in to `@shade/sdk` | `shade.beforeFirstLargeFile / beforeBackupImport / beforeNewDeviceTrust(...)` |
| Files RPC | `@shade/files` | `shade.files.serve(handler)` + `shade.files.client(peer)` |
Pulling in **none** of these gives you the 1.0-shape API at 4.0 quality
(audit-completed, soak-tested). Pulling in **all** of them gives the
full 4.0 stack.
### Schema additions
`StorageProvider` implementations (sqlite, postgres, encrypted variants)
auto-create the additional tables on `ensureTables()` /
`initialize()`. The 4.0 superset:
```sql
-- V3.2 (storage encryption) — only when EncryptedSQLiteStorage / EncryptedPostgresStorage is used
shade_master_key_meta(...) -- KeyManager fingerprint + scrypt params
shade_field_keys(...) -- per-(table, column) wrapped DEKs
-- V3.3 (fingerprint gates)
peer_verifications(...) -- markPeerVerified persistence
peer_identity_versions(...) -- bump on acceptIdentityChange
-- V3.6 (inbox relay)
shade_inbox_register(...) -- TOFU bind address ↔ signing key
shade_inbox_blobs(...) -- ciphertext blobs with TTL + msgId
-- V3.10 (recovery)
shade_recovery_setup(...) -- per-recoverer state
shade_recovery_deposits(...) -- per-guardian deposited shares
-- V3.12 (KT — server only)
shade_kt_leaves(...) -- append-only Merkle leaves
shade_kt_index(...) -- address-sorted commitment
shade_kt_sths(...) -- signed tree heads
-- streams resume (V0.2.0+, listed for completeness)
stream_state(...) -- at-rest encrypted streamSecret
```
A 0.3.x deploy that upgrades the package without enabling any new
surface gets these tables created on first start; they stay empty
unless the corresponding feature is wired. There is **no destructive
migration**. To verify before upgrading production:
```bash
shade doctor --db-path /data/shade-client.db
```
The CLI reports any mismatch between the on-disk schema and the version
the installed packages expect.
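The kind of check described here can be approximated by diffing the tables a database actually contains against the ones a release expects. The sketch below is not how `shade doctor` is implemented; `missingTables` is a hypothetical helper, and which tables a given backend creates is an assumption you would adjust per deployment.

```ts
// Return the expected tables that are absent from the database.
// For SQLite, `present` could come from:
//   SELECT name FROM sqlite_master WHERE type = 'table'
function missingTables(present: Iterable<string>, expected: readonly string[]): string[] {
  const have = new Set(present);
  return expected.filter((table) => !have.has(table));
}

// Example: table names taken from the schema listing above.
const expectedClientTables = [
  'peer_verifications',
  'peer_identity_versions',
  'shade_recovery_setup',
  'shade_recovery_deposits',
  'stream_state',
];

const report = missingTables(['peer_verifications', 'stream_state'], expectedClientTables);
console.log(report.length === 0 ? 'schema ok' : `missing: ${report.join(', ')}`);
```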
### Step-by-step upgrade (typical app)
1. **Bump dependencies.** Update every `@shade/*` to `^4.0.0` in your
`package.json`. Bun / npm / pnpm pull from the Gitea registry as
per `.npmrc`.
2. **Re-run install.** `bun install` (or your tool of choice). The new
table definitions ship with the storage backends — no schema-edit
PRs against your DB.
3. **Boot once with no new opt-ins.** Existing send/receive should work
byte-identically. `shade doctor` should print all green.
4. **Pick the opt-ins you actually want.** Wire them one at a time
(storage-encryption first, then fingerprint gates, then any of the
recovery / KT / WebRTC / inbox surfaces). Each surface has its own
doc under `docs/` (`storage-encryption.md`, `trust-ux.md`,
`recovery.md`, `key-transparency.md`, `webrtc.md`, `inbox.md`,
`transport.md`, `web-workers.md`, `files.md`).
5. **Run cross-version smoke.** Boot a 0.3.x peer next to a 4.0 peer in
staging; exchange a session; confirm `shade fingerprint` matches on
both ends and a round-trip message decrypts cleanly.
6. **Ship 4.0 to a canary.** Roll forward; the revert path is
`bun install @shade/sdk@^0.4.0`, since there is no DB write that 0.4
cannot also read.
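Step 1 is easy to get partially wrong in a monorepo, so a small lockstep check can help. The sketch below is a hypothetical helper, not part of the Shade CLI: it takes a `dependencies` map as found in `package.json` and lists every `@shade/*` entry whose range is not the wanted one.

```ts
type Deps = Record<string, string>;

// List @shade/* dependencies whose version range differs from `wanted`.
function unbumpedShadeDeps(deps: Deps, wanted = '^4.0.0'): string[] {
  return Object.entries(deps)
    .filter(([name, range]) => name.startsWith('@shade/') && range !== wanted)
    .map(([name, range]) => `${name}@${range}`)
    .sort();
}

// Example input mirroring a partially bumped package.json.
const stale = unbumpedShadeDeps({
  '@shade/sdk': '^4.0.0',
  '@shade/inbox': '^0.4.0',
  hono: '^4.0.0',
});
console.log(stale); // entries still pinned to a pre-4.0 range
```

An exact string comparison is intentionally strict: it flags `4.0.0` or `~4.0.0` too, which is usually what you want when the guide asks for `^4.0.0` everywhere.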
### Operator checklist (prekey container)
If you operate the standalone container (`gt.zyon.no/stian/shade-prekey`):
1. Pull the 4.0 image: `docker pull gt.zyon.no/stian/shade-prekey:4.0.0`.
2. Add new env vars only if you are turning the corresponding surface
on:
- `SHADE_INBOX_PG_URL` / `SHADE_INBOX_DB_PATH` — async store-and-forward.
- `SHADE_INBOX_PRUNE_INTERVAL_MINUTES` — inbox prune cadence.
- `SHADE_BRIDGE_*` — bridge / SSE / long-poll surface.
- `SHADE_KT_*` — Key Transparency mode + signing key path.
- `SHADE_TRANSFER_*` — transfer routes mounted on the same Hono app.
3. Restart with the existing volume; the inbox / KT tables auto-create
on first request.
4. Update `docs/PRODUCTION-CHECKLIST.md` items for any new surface
you've enabled (rate-limit budgets, retention policies, KT
witness-pinning).
5. Verify the [OpenAPI](packages/shade-server/openapi.yaml) endpoints
you advertise to clients now include the routes you mounted.
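When working through this checklist it can help to list which optional surfaces an environment actually configures. The sketch below groups variables by the prefixes named in step 2. It is a hypothetical audit helper: whether the container treats the mere presence of a variable as "enabled" is an assumption, so use it to drive the checklist rather than as a statement of runtime behavior.

```ts
// Prefixes taken from the env var list above.
const SURFACE_PREFIXES = {
  inbox: 'SHADE_INBOX_',
  bridge: 'SHADE_BRIDGE_',
  keyTransparency: 'SHADE_KT_',
  transfer: 'SHADE_TRANSFER_',
} as const;

type Surface = keyof typeof SURFACE_PREFIXES;

// Return the surfaces that have at least one non-empty variable set.
function configuredSurfaces(env: Record<string, string | undefined>): Surface[] {
  const surfaces = Object.keys(SURFACE_PREFIXES) as Surface[];
  return surfaces.filter((surface) =>
    Object.entries(env).some(
      ([key, value]) =>
        key.startsWith(SURFACE_PREFIXES[surface]) && value !== undefined && value !== '',
    ),
  );
}
```

In a Node or Bun process this would typically be called as `configuredSurfaces(process.env)`.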
### What about 4.0 → 4.x?
V4.x is bug-fix only. No wire-bump until V5.0 (voice/video) which
is **additive** — it allocates new envelope types (frame-key prefixes)
that 4.0 clients ignore by design.
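The "ignore by design" behavior is the usual forward-compatibility pattern: dispatch on the type byte and drop anything unrecognized instead of failing. A minimal sketch of that pattern follows. It is not Shade's dispatcher; `makeDispatcher` is a hypothetical name, and the assumption that the type is the first byte of the envelope is mine.

```ts
type Handler = (body: Uint8Array) => void;
type Outcome = 'handled' | 'ignored';

// Build a dispatcher that silently skips envelope types it has no handler for.
function makeDispatcher(handlers: ReadonlyMap<number, Handler>) {
  return (envelope: Uint8Array): Outcome => {
    const type = envelope[0]; // assumed: type byte comes first
    if (type === undefined) return 'ignored'; // empty input
    const handler = handlers.get(type);
    if (handler === undefined) return 'ignored'; // unknown or future type
    handler(envelope.subarray(1));
    return 'handled';
  };
}
```

A client built this way keeps working when a peer starts sending envelope types allocated in a later release, which is what makes an additive wire change safe.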
## Common pitfalls
1. **Don't store private keys in shared databases without encryption at rest** — for shared infrastructure, enable `@shade/storage-encrypted` (V3.2) or use filesystem encryption / PostgreSQL TDE. The default `SQLiteStorage` and `PostgresStorage` write unencrypted.
2. **Don't skip identity verification** — Shade gives you fingerprints (`getIdentityFingerprint()`), but it's the user's responsibility to compare them out-of-band on first contact.
3. **Don't reuse session storage between identities** — each user/device should have its own Shade storage. Mixing identities in one storage will corrupt the ratchet state.
4. **Keep prekey stocks topped up** — call `ensurePreKeyStock()` periodically (e.g., on app start or every hour). When the server runs out of one-time prekeys, new sessions will fall back to using just the signed prekey, which is slightly less secure.
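
For pitfall 4, the "on app start or every hour" advice can be wired up with a small scheduler. The sketch below assumes `ensurePreKeyStock` is available as an async function with no arguments; the text only names the call, so the signature, the `startPrekeyTopUp` wrapper and its injectable timers are hypothetical.

```ts
type Timers = {
  setInterval: (fn: () => void, ms: number) => unknown;
  clearInterval: (handle: unknown) => void;
};

const realTimers: Timers = {
  setInterval: (fn, ms) => setInterval(fn, ms),
  clearInterval: (handle) => clearInterval(handle as ReturnType<typeof setInterval>),
};

// Top up once immediately, then on every interval. Returns a stop function.
function startPrekeyTopUp(
  ensurePreKeyStock: () => Promise<void>,
  intervalMs = 60 * 60 * 1000, // hourly, as suggested above
  timers: Timers = realTimers,
  onError: (err: unknown) => void = (err) => console.error('prekey top-up failed', err),
): () => void {
  const run = (): void => {
    // A failed top-up is reported but must not stop later attempts.
    ensurePreKeyStock().catch(onError);
  };
  run();
  const handle = timers.setInterval(run, intervalMs);
  return () => timers.clearInterval(handle);
}
```

Call the returned function on shutdown or logout so the timer does not keep the process alive or top up prekeys for a signed-out identity.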