# Web Workers Crypto

Status: Implemented (V3.8, `0.4.0`).

`@shade/crypto-web` ships with an opt-in dedicated Web Worker that keeps AES-GCM, HKDF, HMAC, X25519 and Ed25519, along with full per-lane stream state, off the main thread. Big in-browser uploads (100 MB+) stay smooth without frame drops.

This doc covers:

- [When to use it](#when-to-use-it)
- [Setup](#setup)
- [API](#api)
- [Bundler recipes](#bundler-recipes)
- [Safari notes](#safari-notes)
- [SharedArrayBuffer (COOP/COEP)](#sharedarraybuffer-coopcoep)
- [Lifecycle and rotation](#lifecycle-and-rotation)
- [Threat-model considerations](#threat-model-considerations)
- [Verifying main-thread budget](#verifying-main-thread-budget)

---

## When to use it

The default `SubtleCryptoProvider` runs on whatever thread you give it. For the SDK that means the main thread. AES-GCM via SubtleCrypto is fast (hardware-accelerated), but a 100 MB file at 256 KiB chunks is ~400 AEAD calls, and each one queues a microtask on the main thread. Layer that on top of React reflows and large `postMessage` payloads to the network worker and you *will* see frame drops.

Reach for the Worker pipeline when:

- You upload or download files that don't fit in a single AEAD chunk (≥ ~1 MB) inside a UI-bearing browser tab.
- You generate or rotate identity / device keys in a UI thread that must stay interactive.
- You do batch AEAD (e.g. backup export over many records).

You can keep using `SubtleCryptoProvider` for short ops (Signal session encrypt/decrypt for a chat message). There, the cost of a `postMessage` round-trip dwarfs the cost of a single 256-byte AES call.

---

## Setup

`@shade/crypto-web` exposes the worker as a separate subpath, so your bundler can resolve it through the standard `new Worker(new URL(..., import.meta.url))` idiom.

```ts
import { createShade } from '@shade/sdk';

const shade = await createShade({ /* ... */ });

shade.configureWorkerCrypto({
  workerUrl: new URL('@shade/crypto-web/worker', import.meta.url),
});
```

After `configureWorkerCrypto`, the SDK exposes:

- `shade.encryptStream({ streamId, streamSecret, ... })`: returns a `TransformStream` and a `laneSha256` promise.
- `shade.decryptStream({ streamId, streamSecret, ... })`: the inverse.
- `shade.getWorkerCrypto()`: direct access to the `WorkerCryptoProvider` for one-off ops (HKDF batches, X25519 batch DH, etc.).

The worker is spawned on first use and self-terminates after `idleTimeoutMs` (default 30 s), so no manual lifecycle management is required.

---

## API

### Stream encryption

```ts
const { stream, laneSha256 } = await shade.encryptStream({
  streamId,              // 16 random bytes, agreed with peer
  streamSecret,          // 32 random bytes, derived via Double Ratchet
  laneId: 0,             // lane index (use multi-lane for parallel HTTP)
  chunkSize: 256 * 1024, // optional; default 256 KiB
});

await file.stream()
  .pipeThrough(stream)
  .pipeTo(transferSink); // your HTTP-shipping WritableStream

const sha256 = await laneSha256; // for end-to-end integrity proof
```

`stream` consumes plaintext and emits one wire-encoded `stream-chunk` envelope per write. `flush` always emits a final chunk with `isLast=true` (even if the trailing slice is empty), so receivers see a clean termination.

### Stream decryption

```ts
const { stream, laneSha256 } = await shade.decryptStream({
  streamId,
  streamSecret,
  laneId: 0,
});

await incomingChunkStream
  .pipeThrough(stream)
  .pipeTo(fileSink);

// `equal` and `IntegrityError` stand in for your own comparison helper and error type
const sha = await laneSha256;
if (!equal(sha, peerLaneSha256)) throw new IntegrityError();
```

Each input chunk MUST be a complete wire envelope. The transport-layer caller is responsible for framing (one envelope per write). Out-of-order or replayed chunks cause the stream to reject. The lane key never crosses thread boundaries, so a man-in-the-middle script in the page can't recover key material to replay against.

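
The framing requirement can be met in many ways. As a minimal sketch, assume a hypothetical transport that prefixes each envelope with a 4-byte big-endian length. That prefix, `frameEnvelope`, and `EnvelopeDeframer` are illustrative names for this example only; they are not part of `@shade/crypto-web` or its wire format.

```ts
// Sender side: prepend a 4-byte big-endian length to one complete envelope.
// The length prefix is an assumption of this sketch, not Shade's wire format.
function frameEnvelope(envelope: Uint8Array): Uint8Array {
  const framed = new Uint8Array(4 + envelope.length);
  new DataView(framed.buffer).setUint32(0, envelope.length, false);
  framed.set(envelope, 4);
  return framed;
}

// Receiver side: accept transport bytes split at arbitrary boundaries and
// hand back only complete envelopes, in arrival order.
class EnvelopeDeframer {
  private buffered = new Uint8Array(0);

  // Feed the next slice of transport bytes; returns every envelope that is
  // now complete. Incomplete trailing bytes are kept for the next call.
  push(bytes: Uint8Array): Uint8Array[] {
    const merged = new Uint8Array(this.buffered.length + bytes.length);
    merged.set(this.buffered, 0);
    merged.set(bytes, this.buffered.length);

    const envelopes: Uint8Array[] = [];
    let offset = 0;
    while (merged.length - offset >= 4) {
      const view = new DataView(merged.buffer, merged.byteOffset + offset, 4);
      const length = view.getUint32(0, false);
      if (merged.length - offset - 4 < length) break; // body not fully here yet
      envelopes.push(merged.slice(offset + 4, offset + 4 + length));
      offset += 4 + length;
    }
    this.buffered = merged.slice(offset);
    return envelopes;
  }

  // True when no partial envelope is waiting for more bytes.
  get isDrained(): boolean {
    return this.buffered.length === 0;
  }
}
```

Each element returned by `push` can then be passed to the decrypt stream as exactly one write, for example from inside your own `TransformStream`. If `isDrained` is false when the transport closes, the last envelope was truncated and the transfer should be treated as failed.
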
### Direct provider access

```ts
const crypto = await shade.getWorkerCrypto();
// Implements `CryptoProvider`: drop-in replacement for SubtleCryptoProvider
const { ciphertext, nonce } = await crypto.aesGcmEncrypt(key, plaintext);
```

`randomBytes`, `randomUint32`, `constantTimeEqual`, and `zeroize` execute on the calling thread (no round-trip). Async ops forward to the worker.

---

## Bundler recipes

### Vite

```ts
shade.configureWorkerCrypto({
  workerUrl: new URL('@shade/crypto-web/worker', import.meta.url),
});
```

Vite resolves the URL via `import.meta.url` and emits a discrete chunk for the worker. No additional config is required for Vite ≥ 5. If your build complains about `?worker` syntax, use the explicit URL form (above), which is the standard Vite idiom.

### Webpack 5 / Rspack

Same idiom: Webpack 5 understands `new URL('./worker.js', import.meta.url)` natively as long as the source is ESM:

```ts
new Worker(new URL('@shade/crypto-web/worker', import.meta.url), {
  type: 'module',
});
```

For Webpack 4 or non-ESM builds, you need `worker-loader` (legacy). We do not officially support Webpack 4.

### Rollup

Rollup needs `rollup-plugin-web-worker-loader` or a recent `rollup-plugin-import-meta-url`. The standard idiom works once the plugin is wired:

```ts
new URL('@shade/crypto-web/worker', import.meta.url)
```

If your bundler can't resolve `@shade/crypto-web/worker`, copy `node_modules/@shade/crypto-web/src/worker.ts` (or the compiled `.js` once we ship dist artefacts) into your `public/` directory and pass an absolute URL:

```ts
shade.configureWorkerCrypto({ workerUrl: '/shade-crypto.worker.js' });
```

---

## Safari notes

Safari ≤ 17 has a smaller `postMessage` transferable budget than Chrome / Firefox. Single transfers above ~64 MB occasionally fail silently. The shipped pipeline already chunks plaintext to 256 KiB before AEAD, so each `postMessage` carries ≤ ~256 KiB + AEAD overhead, well under any known Safari limit.

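
To make that margin concrete, the arithmetic can be sketched as follows. `planChunks` is a hypothetical helper, not an SDK function. The 16-byte default overhead is the standard AES-GCM tag length; any extra envelope header bytes are not specified here, so they are left to the caller via `perChunkOverheadBytes`.

```ts
interface ChunkPlan {
  chunkCount: number;      // data-carrying chunks needed for the payload
  maxMessageBytes: number; // upper bound on a single postMessage payload
}

// Hypothetical helper: estimates how many chunks a payload needs and how
// large each worker message can get. It counts data-carrying chunks only and
// does not model the final-chunk marker described under "Stream encryption".
function planChunks(
  totalBytes: number,
  chunkSize: number = 256 * 1024,
  perChunkOverheadBytes: number = 16, // AES-GCM tag; envelope header is an unknown
): ChunkPlan {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError('chunkSize must be a positive integer');
  }
  if (totalBytes < 0) {
    throw new RangeError('totalBytes must not be negative');
  }
  return {
    chunkCount: Math.ceil(totalBytes / chunkSize),
    maxMessageBytes: chunkSize + perChunkOverheadBytes,
  };
}

// 100 MiB at the default chunk size: 400 chunks of at most 262160 bytes each.
const defaultPlan = planChunks(100 * 1024 * 1024);
```
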
If you override `chunkSize`, keep individual buffers below 16 MiB:

```ts
shade.encryptStream({
  streamId,
  streamSecret,
  chunkSize: 8 * 1024 * 1024, // 8 MiB, safe across all browsers
});
```

We do not officially support Safari ≤ 14 (no module workers).

---

## SharedArrayBuffer (COOP/COEP)

The default pipeline uses `ArrayBuffer` transfer (zero-copy ownership hand-off). It does **not** require COOP/COEP headers.

For multi-lane parallel transfers across multiple workers, you may opt in to `SharedArrayBuffer` for the AEAD plaintext buffers. That requires your origin to serve:

```
Cross-Origin-Opener-Policy: same-origin
Cross-Origin-Embedder-Policy: require-corp
```

`SharedArrayBuffer` support is gated behind a future `useSharedBuffers` option and is not enabled in V3.8. See `docs/V4.0.md` if/when this lands.

---

## Lifecycle and rotation

```ts
const crypto = await shade.getWorkerCrypto();

await crypto.rotate();  // tear down the current worker, respawn lazily
await crypto.destroy(); // permanent: every subsequent call rejects
```

`shade.shutdown()` calls `destroy()` automatically. The idle timer fires 30 seconds after the last response (configurable via `configureWorkerCrypto({ idleTimeoutMs })`); if the timer fires while calls are pending, it does nothing and reschedules.

---

## Threat-model considerations

- The worker runs in the same origin and the same browsing context as the main thread. It is **not** a sandbox against a compromised page; any script that can `eval` in your tab can also `postMessage` to the worker. The Worker is a *performance* boundary, not a *security* boundary.
- Lane keys derived inside the worker stay there; they are never postMessage'd to the main thread. This narrows the window during which a key sits in main-thread heap, which helps against post-mortem heap inspection by a curious extension. It does not help against an active in-page attacker.
- `randomBytes` runs on the calling thread (uses `crypto.getRandomValues` directly). The worker has its own random source for ops that derive inside it; stream nonces do not depend on it, since they are derived deterministically from `(laneId, seq)`.

For the full picture, see `THREAT-MODEL.md`.

---

## Verifying main-thread budget

V3.8 acceptance criterion: a 100 MB upload in Chrome with no main-thread block longer than 16 ms at P99.

To verify in your app:

1. Open Chrome DevTools → Performance.
2. Record a 100 MB upload.
3. Inspect the main-thread flame chart. Look at "Long Tasks" and "Self time" of `Shade.encryptStream`.
4. Confirm no contiguous block exceeds ~16 ms (one frame at 60 fps).

If you observe long tasks, lower `chunkSize` (more frequent yields) or report the trace. See [`docs/archive/V3.8.md`](./archive/V3.8.md) for the original acceptance criteria.
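
For a rough in-app check alongside the DevTools trace, frame-to-frame gaps can be sampled and summarised. This is only a sketch: `sampleFrameGaps` and `percentile` are hypothetical helpers, not part of the SDK, and frame gaps are a proxy for main-thread blocking rather than the acceptance measurement itself. The scheduler is passed in so the sampler can be driven by `requestAnimationFrame` in a browser or by a fake clock in tests.

```ts
type FrameScheduler = (callback: (timestampMs: number) => void) => unknown;

// Nearest-rank percentile over a list of samples (p in the range 0..100).
function percentile(samples: readonly number[], p: number): number {
  if (samples.length === 0) return 0;
  const sorted = [...samples].sort((x, y) => x - y);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.min(sorted.length, Math.max(1, rank)) - 1];
}

// Records the gap between consecutive frame callbacks until stopped.
// Returns a stop function that ends sampling and yields the gaps in ms.
function sampleFrameGaps(schedule: FrameScheduler): () => number[] {
  const gaps: number[] = [];
  let last: number | undefined;
  let running = true;

  const tick = (now: number): void => {
    if (!running) return;
    if (last !== undefined) gaps.push(now - last);
    last = now;
    schedule(tick); // keep sampling on the next frame
  };
  schedule(tick);

  return () => {
    running = false;
    return gaps;
  };
}
```

In a browser you would call `const stop = sampleFrameGaps((cb) => requestAnimationFrame(cb))` before starting the upload, then inspect `percentile(stop(), 99)` once it finishes. On a 60 Hz display the nominal gap is about 16.7 ms, so a P99 well above that suggests dropped frames worth investigating in the flame chart.
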