Shade/MIGRATION.md
Sterister e6fdf31b49
release(v4.0.0): Shade GA — V3.x consolidation + audit prep
V3.1 → V3.12 consolidated and tagged for the first GA release. Wire
format unchanged from 0.4.x — 4.0 peers interoperate with 0.4.x peers
byte-for-byte. The version bump is semantic: audit-cycle complete,
opt-in surface fully exposed, threat model refreshed for every new
surface.

Highlights:
- All 24 @shade/* packages bumped to 4.0.0 in lockstep.
- CHANGELOG 4.0.0 section is the canonical manifest of what landed.
- THREAT-MODEL extended (§10 fingerprint gates, §11 WebRTC P2P, §12
  Web-Worker boundary) + residual-risks table refreshed.
- OpenAPI now covers all 27 routes: prekey, transfer, KT, inbox,
  bridge, observer, /metrics, /healthz, /ready.
- MIGRATION 0.3.x → 4.0 documented + smoke-tested against
  shade migrate-storage on a real SQLite DB.
- docs/audit/REVIEW-BUNDLE.md + SCOPE.md ready for external reviewer.
- scripts/soak.ts harness for the GA-stable 2-week soak window.
- All V*.md plans archived under docs/archive/ with Status: Done.
- Voice/Video carved out into V5.0; 4.0 audit focuses on the frozen
  non-realtime stack.

Tests: TS 1000/1000 + Kotlin 11/11 cross-platform vectors green.
Docker: gt.zyon.no/stian/shade-prekey:4.0.0 builds and reports
  version 4.0.0 on /health.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 18:35:35 +02:00


Migration Guide

This document describes how to migrate existing systems with ad-hoc encryption to Shade's Signal Protocol implementation.

Why migrate?

If you currently use:

  • A static AES-256-GCM key per pair (e.g., ECDH at handshake, then never rotated)
  • Pre-shared keys distributed at registration time
  • Simple per-device symmetric encryption (like Nova's push notifications)

…then you're missing forward secrecy and post-compromise recovery. Shade gives you both with minimal code changes.

Migration phases

The recommended migration is a three-phase rollout that lets you ship without downtime:

Phase 1: Dual-write

  • Set up the Shade prekey server alongside your existing system
  • New devices register with both systems
  • Old devices continue using the legacy encryption
  • Both encrypted formats are accepted on read
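
The "both formats accepted on read" rule can be sketched as a small dispatcher. This is only an illustration under stated assumptions: the `Incoming` shape, its `format` tag, and the two decryptor callbacks are hypothetical stand-ins for your legacy code and for a Shade-backed decrypt call, not part of any Shade package.

```typescript
// Hypothetical message shape; the `format` tag is an assumption of this sketch.
type Incoming =
  | { format: "legacy"; ciphertext: string; nonce: string }
  | { format: "shade"; envelope: string };

// Stand-ins for the legacy decryptor and a Shade-backed one. `T` is the result
// type; in a real app it would typically be Promise<string>.
interface Decryptors<T> {
  legacy(ciphertext: string, nonce: string): T;
  shade(envelope: string): T;
}

// Phase 1 read path: accept both formats and report which one was used,
// so the share of legacy traffic can be tracked during the rollout.
function readDualFormat<T>(
  msg: Incoming,
  decryptors: Decryptors<T>,
): { result: T; via: "legacy" | "shade" } {
  if (msg.format === "shade") {
    return { result: decryptors.shade(msg.envelope), via: "shade" };
  }
  return { result: decryptors.legacy(msg.ciphertext, msg.nonce), via: "legacy" };
}
```

Returning the `via` label alongside the result keeps the read path and the rollout metrics in one place.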

Phase 2: Switch reads

  • Once the majority of devices are on Shade, prefer Shade for new sessions
  • Continue accepting legacy messages for older clients
  • Monitor decryption failure rates
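
One way to monitor decryption failure rates during Phase 2 is a per-format counter. The class below is a hypothetical sketch (none of these names come from Shade); in practice the numbers would be exported to whatever metrics system you already run.

```typescript
type Format = "legacy" | "shade";

// Counts decrypt outcomes per format so a rise in failures is visible
// before legacy support is removed in Phase 3.
class DecryptStats {
  private ok: Record<Format, number> = { legacy: 0, shade: 0 };
  private failed: Record<Format, number> = { legacy: 0, shade: 0 };

  record(format: Format, success: boolean): void {
    if (success) this.ok[format] += 1;
    else this.failed[format] += 1;
  }

  // Failure rate in [0, 1]; 0 when nothing has been recorded yet.
  failureRate(format: Format): number {
    const total = this.ok[format] + this.failed[format];
    return total === 0 ? 0 : this.failed[format] / total;
  }

  // Share of all observed traffic that already arrives in the Shade format.
  shadeShare(): number {
    const shade = this.ok.shade + this.failed.shade;
    const all = shade + this.ok.legacy + this.failed.legacy;
    return all === 0 ? 0 : shade / all;
  }
}
```

A high `shadeShare()` together with a low Shade failure rate is a reasonable signal that Phase 3 can start; the exact thresholds are a judgment call for your deployment.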

Phase 3: Deprecate

  • Remove legacy encryption code
  • Force all devices to re-pair via Shade
  • Clean up legacy database columns

Concrete examples

Example A: Replacing a static AES tunnel

Before (crypto/e2ee.ts):

import { generateKeyPair, deriveSharedSecret, encrypt, decrypt } from './crypto/e2ee.js';

// During pairing
const myKp = await generateKeyPair();
const sharedSecret = await deriveSharedSecret(myKp.privateKey, peerPublicKey);
db.serverConnection.insert({ sharedSecret: exportSecret(sharedSecret) });

// On every message
const { ciphertext, nonce } = await encrypt(sharedSecret, plaintext);
ws.send({ ciphertext, nonce });

After (with Shade):

import { ShadeSessionManager } from '@shade/core';
import { SubtleCryptoProvider } from '@shade/crypto-web';
import { SQLiteStorage } from '@shade/storage-sqlite';
import { ShadeWebSocket, ShadeFetchTransport } from '@shade/transport';

const crypto = new SubtleCryptoProvider();
const storage = new SQLiteStorage('/data/shade.db');
const manager = new ShadeSessionManager(crypto, storage);
await manager.initialize();

// During pairing — fetch peer's bundle and start session
const transport = new ShadeFetchTransport({
  baseUrl: 'https://prekey.example.com',
  crypto,
  signingPrivateKey: (await storage.getIdentityKeyPair())!.signingPrivateKey,
});
const peerBundle = await transport.fetchBundle('peer-id');
await manager.initSessionFromBundle('peer-id', peerBundle);

// On every message — wrap the WebSocket
const shadeWs = new ShadeWebSocket(rawWs, manager, 'peer-id');
shadeWs.onMessage((plaintext) => handleMessage(plaintext));
await shadeWs.send('Hello peer');

The key differences:

  1. No static shared secret — keys ratchet forward with each message
  2. Identity is persistent — same identity across reconnects, but session keys regenerate
  3. The transport wrapper is transparent — your application code doesn't change
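
To make "keys ratchet forward" concrete, here is a minimal sketch of a symmetric KDF-chain ratchet in the style used by Double Ratchet designs. It is not Shade's internal code: the HMAC-SHA256 construction and the 0x01 / 0x02 constants are illustrative assumptions, and a full ratchet also mixes in fresh Diffie-Hellman outputs, which is not shown.

```typescript
import { createHmac } from "node:crypto";

// One step of a symmetric-key ratchet: derive a message key and the next
// chain key from the current chain key using two distinct HMAC inputs.
function ratchetStep(chainKey: Buffer): { messageKey: Buffer; nextChainKey: Buffer } {
  const messageKey = createHmac("sha256", chainKey).update(Buffer.from([0x01])).digest();
  const nextChainKey = createHmac("sha256", chainKey).update(Buffer.from([0x02])).digest();
  return { messageKey, nextChainKey };
}

// Derive the first `count` message keys from a starting chain key.
function deriveMessageKeys(initialChainKey: Buffer, count: number): Buffer[] {
  const keys: Buffer[] = [];
  let chainKey = initialChainKey;
  for (let i = 0; i < count; i++) {
    const step = ratchetStep(chainKey);
    keys.push(step.messageKey);
    chainKey = step.nextChainKey; // the previous chain key is dropped here
  }
  return keys;
}
```

Because HMAC is one-way, someone who obtains a later chain key cannot recompute earlier message keys. That is the forward-secrecy property a static AES key lacks. Recovery after a compromise additionally depends on the Diffie-Hellman part of the ratchet.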

Example B: Replacing per-device push encryption

Before (per-device static AES key):

// Server side
const device = db.pushDevices.findFirst({ where: { id } });
const key = Buffer.from(device.encryptionKey, 'base64');
const encrypted = encryptPayload(notificationJson, key);
sendToFCM({ data: { enc: encrypted, v: '1' } });

After (Shade per-device session):

// Server side
const manager = new ShadeSessionManager(crypto, storage);
await manager.initialize();

// First time per device: fetch their bundle and establish session
if (!await storage.getSession(`device:${deviceId}`)) {
  const bundle = await prekeyTransport.fetchBundle(`device:${deviceId}`);
  await manager.initSessionFromBundle(`device:${deviceId}`, bundle);
}

const envelope = await manager.encrypt(`device:${deviceId}`, notificationJson);
sendToFCM({ data: { enc: encodeEnvelope(envelope), v: '2' } });

Client side:

// Decode the envelope, decrypt via Shade
val envelope = decodeEnvelope(data["enc"]!!)
val plaintext = shadeManager.decrypt("server", envelope)

Database migration

If your existing system stores symmetric keys in the database:

Before

CREATE TABLE devices (
  id TEXT PRIMARY KEY,
  encryption_key TEXT NOT NULL  -- base64 AES-256
);

After

CREATE TABLE devices (
  id TEXT PRIMARY KEY,
  shade_address TEXT NOT NULL  -- e.g. "device:abc123"
  -- Shade tables (created automatically by SQLiteStorage):
  -- shade_identity, shade_sessions, shade_signed_prekeys, etc.
);

The Shade tables are auto-created when you instantiate the storage backend. No manual migration needed.

Migration for Orchestrator

The Orchestrator project's orchestrator-shared/src/crypto/e2ee.ts provides a static ECDH-derived AES-256-GCM key for the workstation↔server sync tunnel. To migrate:

  1. Add Shade dependencies to orchestrator-shared/package.json
  2. Replace e2ee.ts with imports from @shade/core and @shade/transport
  3. Update the pairing flow in sync-server.ts and sync-client.ts to exchange Shade prekey bundles instead of raw ECDH public keys
  4. Wrap the sync WebSocket with ShadeWebSocket for transparent encryption
  5. Migrate the serverConnection table to a shade_sessions table (or run dual-write during the rollout)

The key insight: Shade replaces the static sharedSecret column with a full ratcheting session, but the WebSocket transport, message types, and application logic don't change.

Migration for Nova (push notifications)

Nova's pushDevices.encryptionKey column is a per-device static AES key. To migrate:

  1. Run a Shade prekey server (Docker container, see examples/05-dokploy-deployment)
  2. On Android device registration, generate Shade identity + upload prekey bundle to the server (instead of generating a raw AES key)
  3. In the Nova backend, fetch the device's bundle and establish a Shade session per device
  4. Encrypt notifications via the Shade session instead of encryptPayload()
  5. On the Android client, decrypt with Shade instead of the static key
  6. Cross-platform interop: this requires the shade-android Kotlin module (not yet built — planned for the M8 milestone)

During the rollout, send notifications with a v: 1 (legacy) or v: 2 (Shade) field so old and new clients coexist.
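
On the sending side, the version field can be chosen per device. The sketch below assumes a hypothetical device record that carries a legacy key, a Shade address, or both while a device is dual-registered. The `Encryptors` callbacks stand in for `encryptPayload()` and for a Shade session encrypt plus envelope encoding; they are synchronous here for brevity, whereas a real Shade encrypt call is asynchronous.

```typescript
// Hypothetical device record during the rollout.
interface PushDevice {
  id: string;
  encryptionKey?: string; // legacy base64 AES key
  shadeAddress?: string; // e.g. "device:abc123"
}

// Stand-ins for the legacy and Shade encryption paths.
interface Encryptors {
  legacy(json: string, keyBase64: string): string;
  shade(address: string, json: string): string;
}

// Prefer Shade (v: "2") when the device has a Shade address, otherwise fall
// back to the legacy key (v: "1").
function buildPushData(
  device: PushDevice,
  json: string,
  enc: Encryptors,
): { enc: string; v: "1" | "2" } {
  if (device.shadeAddress !== undefined) {
    return { enc: enc.shade(device.shadeAddress, json), v: "2" };
  }
  if (device.encryptionKey !== undefined) {
    return { enc: enc.legacy(json, device.encryptionKey), v: "1" };
  }
  throw new Error(`device ${device.id} has no usable encryption material`);
}
```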

Migration to at-rest encryption (V3.2)

Shade 0.4.0 ships @shade/storage-encrypted — opt-in AES-256-GCM encryption of every sensitive payload in the local SQLite/Postgres store. Existing 0.3.x deploys keep their unencrypted DB and behave exactly as before; encryption is enabled per-deployment with one CLI command.

One-shot migration (SQLite)

# Encrypts in place, drops unencrypted tables, leaves a .bak alongside.
shade migrate-storage \
  --key-source passphrase \
  --passphrase "$SHADE_STORAGE_PASSPHRASE" \
  --salt-file /data/shade-client.db.salt

For a dry run that validates every row without writing: shade migrate-storage … --dry-run.

Code-level switch

Replace:

import { SQLiteStorage } from '@shade/storage-sqlite';
const storage = new SQLiteStorage('/data/shade-client.db');

with:

import { KeyManager, EncryptedSQLiteStorage } from '@shade/storage-encrypted';
const km = await KeyManager.open({
  kind: 'passphrase',
  passphrase: process.env.SHADE_STORAGE_PASSPHRASE!,
  salt: loadSaltFromDisk(),
});
const storage = await EncryptedSQLiteStorage.open({
  dbPath: '/data/shade-client.db',
  keyManager: km,
});

The encrypted store implements the same StorageProvider interface, so ShadeSessionManager and the rest of the wiring are unchanged.

See docs/storage-encryption.md for the full design, key sources (passphrase / OS keychain / app-injected) and rotation.
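
For intuition about the passphrase key source, the following sketch shows the general technique of deriving a 256-bit key with scrypt and sealing a value with AES-256-GCM using Node's built-in crypto module. It is not the @shade/storage-encrypted implementation: the scrypt parameters (Node defaults) and the `nonce || tag || ciphertext` layout are assumptions made for this example.

```typescript
import { createCipheriv, createDecipheriv, randomBytes, scryptSync } from "node:crypto";

// Derive a 256-bit key from a passphrase and a stored salt.
function deriveKey(passphrase: string, salt: Buffer): Buffer {
  return scryptSync(passphrase, salt, 32);
}

// Encrypt one field value. Output layout (an assumption of this sketch):
// 12-byte nonce || 16-byte auth tag || ciphertext.
function encryptField(key: Buffer, plaintext: Buffer): Buffer {
  const nonce = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, nonce);
  const body = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return Buffer.concat([nonce, cipher.getAuthTag(), body]);
}

// Decrypt a blob produced by encryptField.
function decryptField(key: Buffer, blob: Buffer): Buffer {
  const nonce = blob.subarray(0, 12);
  const tag = blob.subarray(12, 28);
  const decipher = createDecipheriv("aes-256-gcm", key, nonce);
  decipher.setAuthTag(tag);
  // final() throws if the key is wrong or the blob was modified.
  return Buffer.concat([decipher.update(blob.subarray(28)), decipher.final()]);
}
```

The same passphrase with a different salt yields a different key, which is why the salt file has to be kept alongside the database and included in backups.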

Migrating from 0.3.x to 4.0 (GA)

Shade 4.0 is the GA-frozen baseline. Everything from V3.2 through V3.12 is merged and externally reviewed, and the wire format is locked. Nothing is breaking on the wire compared to 0.4.x, so peers continue to interoperate. The 4.0 migration is therefore mostly opt-in surface activation plus a version bump.

What stays the same

  • Wire envelope 0x02 (RatchetMessage) with u32 length-prefixes.
  • Wire envelope 0x11 (stream-chunk) for @shade/streams.
  • HTTP shape of all /v1/keys/... and /v1/transfer/... endpoints.
  • All StorageProvider core method signatures.
  • Identity fingerprints, X3DH flow, Ed25519 signature format.

A 0.3.x peer that has not enabled any opt-ins talks to a 4.0 peer without code changes. The version bump is semantic ("we have completed the audit cycle"), not breaking.
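
As an illustration of what "u32 length-prefixes" means in practice, here is a generic length-prefixed framing sketch. The layout is assumed for the example only (one type byte, then fields each preceded by a big-endian u32 length); the actual field order and byte order of the 0x02 envelope should be taken from the wire specification, not from this code.

```typescript
// Assumed layout: [type: u8] then repeated [len: u32 big-endian][payload].
function encodeFrame(type: number, fields: Uint8Array[]): Uint8Array {
  const total = 1 + fields.reduce((n, f) => n + 4 + f.length, 0);
  const out = new Uint8Array(total);
  const view = new DataView(out.buffer);
  out[0] = type;
  let offset = 1;
  for (const field of fields) {
    view.setUint32(offset, field.length, false); // false = big-endian
    out.set(field, offset + 4);
    offset += 4 + field.length;
  }
  return out;
}

// Inverse of encodeFrame; rejects frames whose lengths run past the buffer.
function decodeFrame(bytes: Uint8Array): { type: number; fields: Uint8Array[] } {
  if (bytes.length < 1) throw new Error("empty frame");
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const fields: Uint8Array[] = [];
  let offset = 1;
  while (offset < bytes.length) {
    if (offset + 4 > bytes.length) throw new Error("truncated length prefix");
    const len = view.getUint32(offset, false);
    offset += 4;
    if (offset + len > bytes.length) throw new Error("truncated field");
    fields.push(bytes.subarray(offset, offset + len));
    offset += len;
  }
  return { type: bytes[0], fields };
}
```

Checking every length against the remaining buffer before slicing is the important part: a decoder that trusts the prefix can be made to read out of bounds or allocate far more than the frame contains.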

What's new (opt-in)

| Surface | Package | How to enable |
| --- | --- | --- |
| At-rest encryption | @shade/storage-encrypted | shade migrate-storage (see above) |
| Async store-and-forward | @shade/inbox, @shade/inbox-server | createInboxServer() + new Inbox() |
| Bridge transports (SSE, long-poll) | @shade/transport-bridge, createBridgeRoutes() | mount bridge routes; FallbackBridgeTransport |
| Web Workers crypto | @shade/crypto-web/worker | shade.configureWorkerCrypto({ workerUrl }) |
| Social key recovery | @shade/recovery | setupRecovery / attachGuardian / requestRecovery |
| WebRTC P2P transport | @shade/transport-webrtc (peer-dep) | shade.configureWebRTC({ factory }) |
| Key Transparency | @shade/key-transparency, createPrekeyServerWithKT(...) | server: keyTransparency: { ... } config; client: keyTransparency: { mode, logPublicKey } on createShade |
| Trust UX gates | built-in to @shade/sdk | shade.beforeFirstLargeFile / beforeBackupImport / beforeNewDeviceTrust(...) |
| Files RPC | @shade/files | shade.files.serve(handler) + shade.files.client(peer) |

Pulling in none of these gives you the 1.0-shape API at 4.0 quality (audit-completed, soak-tested). Pulling in all of them gives the full 4.0 stack.

Schema additions

StorageProvider implementations (sqlite, postgres, encrypted variants) auto-create the additional tables on ensureTables() / initialize(). The 4.0 superset:

-- V3.2 (storage encryption) — only when EncryptedSQLiteStorage / EncryptedPostgresStorage is used
shade_master_key_meta(...)        -- KeyManager fingerprint + scrypt params
shade_field_keys(...)             -- per-(table, column) wrapped DEKs

-- V3.3 (fingerprint gates)
peer_verifications(...)           -- markPeerVerified persistence
peer_identity_versions(...)       -- bump on acceptIdentityChange

-- V3.6 (inbox relay)
shade_inbox_register(...)         -- TOFU bind address ↔ signing key
shade_inbox_blobs(...)            -- ciphertext blobs with TTL + msgId

-- V3.10 (recovery)
shade_recovery_setup(...)         -- per-recoverer state
shade_recovery_deposits(...)      -- per-guardian deposited shares

-- V3.12 (KT — server only)
shade_kt_leaves(...)              -- append-only Merkle leaves
shade_kt_index(...)               -- address-sorted commitment
shade_kt_sths(...)                -- signed tree heads

-- streams resume (V0.2.0+, listed for completeness)
stream_state(...)                 -- at-rest encrypted streamSecret

A 0.3.x deploy that upgrades the package without enabling any new surface gets the tables that apply to its setup created on first start (the V3.2 tables only with an encrypted storage backend, the KT tables only on the server); they stay empty unless the corresponding feature is wired. There is no destructive migration. To verify before upgrading production:

shade doctor --db-path /data/shade-client.db

The CLI reports any mismatch between the on-disk schema and the version the installed packages expect.
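
For a sense of what such a schema check involves, here is a hedged sketch that compares the tables a deploy expects against the tables found on disk. It does not describe how `shade doctor` is implemented. The table names come from the schema list above, while the surface keys and the function itself are hypothetical.

```typescript
// Table names from the schema list above, grouped by the surface that needs them.
const EXPECTED_BY_SURFACE: Record<string, string[]> = {
  fingerprintGates: ["peer_verifications", "peer_identity_versions"],
  inbox: ["shade_inbox_register", "shade_inbox_blobs"],
  recovery: ["shade_recovery_setup", "shade_recovery_deposits"],
};

// Returns the expected tables that are absent from the database.
// Unknown surface names contribute no expectations.
function missingTables(surfaces: string[], onDisk: Iterable<string>): string[] {
  const present = new Set(onDisk);
  return surfaces
    .flatMap((surface) => EXPECTED_BY_SURFACE[surface] ?? [])
    .filter((table) => !present.has(table));
}
```

For SQLite, the on-disk list can be read with `SELECT name FROM sqlite_master WHERE type = 'table'`.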

Step-by-step upgrade (typical app)

  1. Bump dependencies. Update every @shade/* to ^4.0.0 in your package.json. Bun / npm / pnpm pull from the Gitea registry as per .npmrc.
  2. Re-run install. bun install (or your tool of choice). The new table definitions ship with the storage backends — no schema-edit PRs against your DB.
  3. Boot once with no new opt-ins. Existing send/receive should work byte-identically. shade doctor should print all green.
  4. Pick the opt-ins you actually want. Wire them one at a time (storage-encryption first, then fingerprint gates, then any of the recovery / KT / WebRTC / inbox surfaces). Each surface has its own doc under docs/ (storage-encryption.md, trust-ux.md, recovery.md, key-transparency.md, webrtc.md, inbox.md, transport.md, web-workers.md, files.md).
  5. Run cross-version smoke. Boot a 0.3.x peer next to a 4.0 peer in staging; exchange a session; confirm shade fingerprint matches on both ends and a round-trip message decrypts cleanly.
  6. Ship 4.0 to a canary. Roll forward; revert path is bun install @shade/sdk@^0.4.0 — there is no DB write that 0.4 cannot also read.
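
Step 1 is easy to get partly wrong when many @shade/* packages are involved. A small check like the following can run in CI; it is a sketch with hypothetical names that only inspects a parsed package.json and assumes you pin every Shade package to the same range.

```typescript
interface PackageJson {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

// Returns the @shade/* dependencies whose version range is not exactly
// `expected`, formatted as "name@range" and sorted for stable output.
function shadeDepsOutOfLockstep(pkg: PackageJson, expected = "^4.0.0"): string[] {
  const all = { ...pkg.devDependencies, ...pkg.dependencies };
  return Object.entries(all)
    .filter(([name, range]) => name.startsWith("@shade/") && range !== expected)
    .map(([name, range]) => `${name}@${range}`)
    .sort();
}
```

An empty result means every Shade package in the manifest is on the expected range; anything else lists the stragglers to bump.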

Operator checklist (prekey container)

If you operate the standalone container (gt.zyon.no/stian/shade-prekey):

  1. Pull the 4.0 image: docker pull gt.zyon.no/stian/shade-prekey:4.0.0.
  2. Add new env vars only if you are turning the corresponding surface on:
    • SHADE_INBOX_PG_URL / SHADE_INBOX_DB_PATH — async store-and-forward.
    • SHADE_INBOX_PRUNE_INTERVAL_MINUTES — inbox prune cadence.
    • SHADE_BRIDGE_* — bridge / SSE / long-poll surface.
    • SHADE_KT_* — Key Transparency mode + signing key path.
    • SHADE_TRANSFER_* — transfer routes mounted on the same Hono app.
  3. Restart with the existing volume; the inbox / KT tables auto-create on first request.
  4. Update docs/PRODUCTION-CHECKLIST.md items for any new surface you've enabled (rate-limit budgets, retention policies, KT witness-pinning).
  5. Verify the OpenAPI endpoints you advertise to clients now include the routes you mounted.
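
Since env vars are only added for surfaces being turned on, the set of configured surfaces can be inferred from their prefixes and compared with what you advertise. This is a sketch under assumptions: the prefixes come from the checklist above, but the surface names and the rule "any non-empty variable with the prefix means enabled" are hypothetical.

```typescript
// Prefixes from the checklist above; the surface names are hypothetical labels.
const SURFACE_PREFIXES: Record<string, string> = {
  inbox: "SHADE_INBOX_",
  bridge: "SHADE_BRIDGE_",
  keyTransparency: "SHADE_KT_",
  transfer: "SHADE_TRANSFER_",
};

// A surface counts as configured when at least one non-empty variable
// with its prefix is present in the environment.
function enabledSurfaces(env: Record<string, string | undefined>): string[] {
  const setNames = Object.keys(env).filter((name) => (env[name] ?? "") !== "");
  return Object.entries(SURFACE_PREFIXES)
    .filter(([, prefix]) => setNames.some((name) => name.startsWith(prefix)))
    .map(([surface]) => surface);
}
```

In a Node or Bun process this would be called as `enabledSurfaces(process.env)` at startup and logged, which makes a forgotten or misspelled variable visible early.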

What about 4.0 → 4.x?

V4.x is bug-fix only. There is no wire-format bump until V5.0 (voice/video), and that bump is additive: it allocates new envelope types (frame-key prefixes) that 4.0 clients ignore by design.
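
The "ignore by design" behaviour can be pictured as a dispatcher that skips unknown envelope types instead of failing. The sketch assumes the type is carried in the first byte, which is an assumption of this example; only the 0x02 and 0x11 values come from this guide.

```typescript
// Envelope type bytes named in this guide.
const RATCHET_MESSAGE = 0x02;
const STREAM_CHUNK = 0x11;

type Dispatch =
  | { kind: "ratchet"; body: Uint8Array }
  | { kind: "stream-chunk"; body: Uint8Array }
  | { kind: "ignored"; type: number };

// Routes known envelope types and reports unknown ones without throwing,
// so a newer peer's additional types do not break an older client.
function dispatchEnvelope(bytes: Uint8Array): Dispatch {
  if (bytes.length === 0) throw new Error("empty envelope");
  const type = bytes[0];
  const body = bytes.subarray(1);
  switch (type) {
    case RATCHET_MESSAGE:
      return { kind: "ratchet", body };
    case STREAM_CHUNK:
      return { kind: "stream-chunk", body };
    default:
      return { kind: "ignored", type };
  }
}
```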

Common pitfalls

  1. Don't store private keys in shared databases without encryption at rest — for shared infrastructure, enable @shade/storage-encrypted (V3.2) or use filesystem encryption / PostgreSQL TDE. The default SQLiteStorage and PostgresStorage write unencrypted.
  2. Don't skip identity verification — Shade gives you fingerprints (getIdentityFingerprint()), but it's the user's responsibility to compare them out-of-band on first contact.
  3. Don't reuse session storage between identities — each user/device should have its own Shade storage. Mixing identities in one storage will corrupt the ratchet state.
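  4. Keep prekey stocks topped up. Call ensurePreKeyStock() periodically (e.g., on app start or every hour). When the server runs out of one-time prekeys, new sessions fall back to the signed prekey alone. The handshake still works, but without the extra one-time-prekey DH the first messages of a session have weaker forward secrecy and replay protection until the ratchet advances.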
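
The periodic top-up from pitfall 4 can be wired with a small scheduler. The `PreKeyMaintainer` interface is a hypothetical minimal shape for whatever object exposes ensurePreKeyStock() in your setup, and the assumption that it returns a promise should be checked against the actual API.

```typescript
// Hypothetical minimal shape of the object that exposes ensurePreKeyStock().
interface PreKeyMaintainer {
  ensurePreKeyStock(): Promise<void>;
}

// Runs a top-up immediately and then on a fixed interval (default: hourly).
// Returns a function that stops the schedule.
function startPreKeyTopUp(
  maintainer: PreKeyMaintainer,
  intervalMs = 60 * 60 * 1000,
): () => void {
  const run = (): void => {
    // A failed top-up is logged and retried on the next tick rather than
    // crashing the process.
    maintainer
      .ensurePreKeyStock()
      .catch((err: unknown) => console.error("prekey top-up failed", err));
  };
  run();
  const timer = setInterval(run, intervalMs);
  return () => clearInterval(timer);
}
```

Call the returned function on shutdown so the timer does not keep the process alive.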