# Web Workers Crypto
Status: Implemented (V3.8 — `0.4.0`).
`@shade/crypto-web` ships with an opt-in dedicated Web Worker that keeps
AES-GCM, HKDF, HMAC, X25519 and Ed25519 — and full per-lane stream state —
off the main thread. Big in-browser uploads (100 MB+) stay smooth without
frame drops.
This doc covers:
- [When to use it](#when-to-use-it)
- [Setup](#setup)
- [API](#api)
- [Bundler recipes](#bundler-recipes)
- [Safari notes](#safari-notes)
- [SharedArrayBuffer (COOP/COEP)](#sharedarraybuffer-coopcoep)
- [Lifecycle and rotation](#lifecycle-and-rotation)
- [Threat-model considerations](#threat-model-considerations)
---
## When to use it
The default `SubtleCryptoProvider` runs on whatever thread you give it.
For the SDK that means the main thread. AES-GCM via SubtleCrypto is fast
(hardware-accelerated), but a 100 MB file at 256 KiB chunks is ~400 AEAD
calls — each one queues a microtask on the main thread. Layered on top of
React reflows and large `postMessage` payloads to the network worker, you
*will* see frame drops.
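
As a rough check on that figure, the number of data-bearing AEAD calls is the payload size divided by the chunk size, rounded up. The sketch below uses a hypothetical helper, `aeadCallCount`, that is not part of `@shade/sdk`. It treats "100 MB" as 100 MiB and ignores any terminating envelope the stream may add.

```ts
// Hypothetical helper (not an SDK export): counts the AEAD calls needed
// when plaintext is split into fixed-size chunks.
function aeadCallCount(totalBytes: number, chunkSize: number = 256 * 1024): number {
  if (!Number.isInteger(totalBytes) || totalBytes < 0) {
    throw new RangeError('totalBytes must be a non-negative integer');
  }
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError('chunkSize must be a positive integer');
  }
  return Math.ceil(totalBytes / chunkSize);
}

const MiB = 1024 * 1024;
const calls = aeadCallCount(100 * MiB); // 400
```

With decimal megabytes (100,000,000 bytes) the same helper returns 382, so the estimate holds either way.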
Reach for the Worker pipeline when:
- You upload or download files that don't fit in a single AEAD chunk
(≥ ~1 MB) inside a UI-bearing browser tab.
- You generate or rotate identity / device keys in a UI thread that must
stay interactive.
- You do batch AEAD (e.g. backup export over many records).
You can keep using `SubtleCryptoProvider` for short ops, such as Signal
session encrypt/decrypt for a chat message. There, the cost of a
`postMessage` round trip dwarfs the cost of a single 256-byte AES call.
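
One way to act on this guidance is a small size-based switch. This is only a sketch: `chooseRoute` and the 1 MiB threshold are assumptions taken from the "≥ ~1 MB" hint above, not SDK behaviour, and the right cut-off depends on your own measurements.

```ts
// Assumption: the "~1 MB" guidance, expressed as 1 MiB.
const WORKER_THRESHOLD_BYTES = 1024 * 1024;

type Route = 'main-thread' | 'worker';

// Hypothetical helper: small payloads stay on the calling thread, large
// payloads go through the worker pipeline.
function chooseRoute(
  payloadBytes: number,
  thresholdBytes: number = WORKER_THRESHOLD_BYTES,
): Route {
  return payloadBytes >= thresholdBytes ? 'worker' : 'main-thread';
}

const chatMessageRoute = chooseRoute(256);            // 'main-thread'
const uploadRoute = chooseRoute(100 * 1024 * 1024);   // 'worker'
```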
---
## Setup
`@shade/crypto-web` exposes the worker as a separate subpath, so your
bundler can resolve it through the standard `new Worker(new URL(...,
import.meta.url))` idiom.
```ts
import { createShade } from '@shade/sdk';

const shade = await createShade({ /* ... */ });
shade.configureWorkerCrypto({
  workerUrl: new URL('@shade/crypto-web/worker', import.meta.url),
});
```
After `configureWorkerCrypto`, the SDK exposes:
- `shade.encryptStream({ streamId, streamSecret, ... })` — returns a
`TransformStream<Uint8Array, Uint8Array>` and a `laneSha256` promise.
- `shade.decryptStream({ streamId, streamSecret, ... })` — inverse.
- `shade.getWorkerCrypto()` — direct access to the `WorkerCryptoProvider`
for one-off ops (HKDF batches, X25519 batch DH, etc.).
The worker is spawned on first use and self-terminates after
`idleTimeoutMs` (default 30 s) — no manual lifecycle management required.
---
## API
### Stream encryption
```ts
const { stream, laneSha256 } = await shade.encryptStream({
  streamId: streamId,         // 16 random bytes, agreed with peer
  streamSecret: streamSecret, // 32 random bytes, derived via Double Ratchet
  laneId: 0,                  // lane index (use multi-lane for parallel HTTP)
  chunkSize: 256 * 1024,      // optional; default 256 KiB
});

await file.stream()
  .pipeThrough(stream)
  .pipeTo(transferSink);      // your HTTP-shipping WritableStream

const sha256 = await laneSha256; // for end-to-end integrity proof
```
`stream` consumes plaintext and emits one wire-encoded
`stream-chunk` envelope per write. `flush` always emits a final chunk
with `isLast=true` (even if the trailing slice is empty), so receivers
see a clean termination.
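
The chunking and termination behaviour described above can be illustrated with a small stand-alone chunker. This is a sketch of the idea, not the worker's actual code: `FixedSizeChunker`, `PlainChunk` and the `seq` field are hypothetical names, and encryption and wire encoding are left out entirely.

```ts
// Hypothetical shape of a plaintext slice before AEAD and wire encoding.
interface PlainChunk {
  seq: number;
  isLast: boolean;
  data: Uint8Array;
}

class FixedSizeChunker {
  private buffer: Uint8Array = new Uint8Array(0);
  private seq = 0;
  private readonly chunkSize: number;

  constructor(chunkSize: number = 256 * 1024) {
    if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
      throw new RangeError('chunkSize must be a positive integer');
    }
    this.chunkSize = chunkSize;
  }

  // Buffers input and returns every full chunk that is now available.
  push(input: Uint8Array): PlainChunk[] {
    const merged = new Uint8Array(this.buffer.length + input.length);
    merged.set(this.buffer, 0);
    merged.set(input, this.buffer.length);

    const out: PlainChunk[] = [];
    let offset = 0;
    while (merged.length - offset >= this.chunkSize) {
      const data = merged.slice(offset, offset + this.chunkSize);
      out.push({ seq: this.seq++, isLast: false, data });
      offset += this.chunkSize;
    }
    this.buffer = merged.slice(offset);
    return out;
  }

  // Always returns a final chunk, even when no bytes are left over.
  flush(): PlainChunk {
    const last: PlainChunk = { seq: this.seq++, isLast: true, data: this.buffer };
    this.buffer = new Uint8Array(0);
    return last;
  }
}
```

Wrapped in a `TransformStream`, `push` would back the `transform` callback and `flush` the `flush` callback, which is how a receiver can always rely on seeing one chunk with `isLast` set.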
### Stream decryption
```ts
const { stream, laneSha256 } = await shade.decryptStream({
  streamId,
  streamSecret,
  laneId: 0,
});

await incomingChunkStream
  .pipeThrough(stream)
  .pipeTo(fileSink);

const sha = await laneSha256;
if (!equal(sha, peerLaneSha256)) throw new IntegrityError();
```
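
The `equal` helper in the snippet above is not defined in this document. One possible stand-in, assuming both digests arrive as `Uint8Array`, is a byte comparison that does not exit early on the first mismatch. The name `equalBytes` is hypothetical, and JavaScript engines give no hard constant-time guarantee, so treat this as a sketch rather than a vetted primitive.

```ts
// Hypothetical helper: compares two byte arrays without returning early
// on the first differing byte.
function equalBytes(a: Uint8Array, b: Uint8Array): boolean {
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    diff |= a[i] ^ b[i]; // accumulates any differing bits
  }
  return diff === 0;
}
```

If the provider's own `constantTimeEqual` is reachable from your code, it is probably the better choice.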
Each input chunk MUST be a complete wire envelope. The transport-layer
caller is responsible for framing (one envelope per write). An
out-of-order or replayed chunk causes the stream to reject. The lane key
never crosses thread boundaries, so a script in the page that intercepts
or reorders chunks cannot recover the key material it would need to forge
valid ones.
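
The ordering rule can be pictured as a strict sequence check. The sketch below assumes each envelope exposes a sequence number starting at 0 and the `isLast` flag; `SequenceGuard` is a hypothetical name, and the real pipeline may enforce ordering differently (for example through nonce derivation and AEAD failure).

```ts
// Hypothetical guard: accepts chunks only in strict order and rejects
// anything after the final chunk.
class SequenceGuard {
  private next = 0;
  private finished = false;

  accept(seq: number, isLast: boolean): void {
    if (this.finished) {
      throw new Error('chunk received after the final chunk');
    }
    if (seq < this.next) {
      throw new Error(`replayed chunk: seq ${seq}`);
    }
    if (seq > this.next) {
      throw new Error(`out-of-order chunk: expected ${this.next}, got ${seq}`);
    }
    this.next += 1;
    if (isLast) this.finished = true;
  }
}
```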
### Direct provider access
```ts
const crypto = await shade.getWorkerCrypto();
// Implements `CryptoProvider` — drop-in replacement for SubtleCryptoProvider
const { ciphertext, nonce } = await crypto.aesGcmEncrypt(key, plaintext);
```
`randomBytes`, `randomUint32`, `constantTimeEqual`, `zeroize` execute on
the calling thread (no round-trip). Async ops forward to the worker.
---
## Bundler recipes
### Vite
```ts
shade.configureWorkerCrypto({
  workerUrl: new URL('@shade/crypto-web/worker', import.meta.url),
});
```
Vite resolves the URL via `import.meta.url` and emits a discrete chunk
for the worker. No additional config required for Vite ≥ 5.
If your build complains about `?worker` syntax, use the explicit URL
form (above) — it's the standard Vite idiom.
### Webpack 5 / Rspack
Same idiom — Webpack 5 understands `new URL('./worker.js', import.meta.url)`
natively as long as the source is ESM:
```ts
new Worker(new URL('@shade/crypto-web/worker', import.meta.url), {
  type: 'module',
});
```
For Webpack 4 or non-ESM builds, you need `worker-loader` (legacy). We
do not officially support Webpack 4.
### Rollup
Rollup needs a worker plugin such as `rollup-plugin-web-worker-loader` or
a recent `rollup-plugin-import-meta-url`. The standard idiom works once
the plugin is wired:
```ts
new URL('@shade/crypto-web/worker', import.meta.url)
```
If your bundler can't resolve `@shade/crypto-web/worker`, take
`node_modules/@shade/crypto-web/src/worker.ts`, compile it to JavaScript
(browsers cannot load TypeScript directly), and place the output in your
`public/` directory. Once we ship dist artefacts you can copy the compiled
`.js` instead. Then pass an absolute URL:
```ts
shade.configureWorkerCrypto({ workerUrl: '/shade-crypto.worker.js' });
```
---
## Safari notes
Safari ≤ 17 has a smaller `postMessage` transferable budget than Chrome /
Firefox. Single transfers above ~64 MB occasionally fail silently. The
shipped pipeline already chunks plaintext to 256 KiB before AEAD, so
each `postMessage` carries ≤ ~256 KiB + AEAD overhead — well under any
known Safari limit.
If you override `chunkSize`, keep individual buffers below 16 MiB:
```ts
shade.encryptStream({
  streamId, streamSecret,
  chunkSize: 8 * 1024 * 1024, // 8 MiB, safe across all browsers
});
```
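
If `chunkSize` comes from user settings or configuration, a guard can enforce the 16 MiB ceiling before the value reaches `encryptStream`. The helper below is a sketch with hypothetical names (`assertSafeChunkSize`, `MAX_SAFE_CHUNK_BYTES`); the SDK may or may not perform its own validation.

```ts
// Ceiling from the guidance above: individual buffers stay below 16 MiB.
const MAX_SAFE_CHUNK_BYTES = 16 * 1024 * 1024;

// Hypothetical guard: returns the chunk size unchanged if it is usable,
// otherwise throws before any stream is created.
function assertSafeChunkSize(chunkSize: number): number {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError('chunkSize must be a positive integer');
  }
  if (chunkSize >= MAX_SAFE_CHUNK_BYTES) {
    throw new RangeError('chunkSize must stay below 16 MiB');
  }
  return chunkSize;
}

const safeChunkSize = assertSafeChunkSize(8 * 1024 * 1024); // 8388608
```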
We do not officially support Safari ≤ 14 (no module workers).
---
## SharedArrayBuffer (COOP/COEP)
The default pipeline uses `ArrayBuffer` transfer (zero-copy ownership
hand-off). It does **not** require COOP/COEP headers.
For multi-lane parallel transfers across multiple workers, you may opt
in to `SharedArrayBuffer` for the AEAD plaintext buffers. That requires
your origin to serve:
```
Cross-Origin-Opener-Policy: same-origin
Cross-Origin-Embedder-Policy: require-corp
```
`SharedArrayBuffer` support is gated behind a future `useSharedBuffers`
option and is not enabled in V3.8. See `docs/V4.0.md` if/when this lands.
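
For planning purposes, the two headers and a runtime capability check can be sketched as follows. `CROSS_ORIGIN_ISOLATION_HEADERS` and `canUseSharedBuffers` are hypothetical names, not SDK exports. The check relies on the standard `crossOriginIsolated` global, which browsers set to `true` only when both headers are in effect.

```ts
// The two response headers listed above, as a plain record that a server
// or dev-server middleware could merge into its responses.
const CROSS_ORIGIN_ISOLATION_HEADERS: Readonly<Record<string, string>> = {
  'Cross-Origin-Opener-Policy': 'same-origin',
  'Cross-Origin-Embedder-Policy': 'require-corp',
};

// Hypothetical check: true only when the page is cross-origin isolated
// and SharedArrayBuffer is actually available.
function canUseSharedBuffers(): boolean {
  const isolated = Reflect.get(globalThis, 'crossOriginIsolated') === true;
  return isolated && typeof SharedArrayBuffer !== 'undefined';
}
```

A caller would fall back to the default `ArrayBuffer` transfer path whenever `canUseSharedBuffers()` returns `false`.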
---
## Lifecycle and rotation
```ts
const crypto = await shade.getWorkerCrypto();
await crypto.rotate(); // tear down the current worker, respawn lazily
await crypto.destroy(); // permanent — every subsequent call rejects
```
`shade.shutdown()` calls `destroy()` automatically. The idle timer fires
30 seconds after the last response by default (configurable via
`configureWorkerCrypto({ idleTimeoutMs })`); if it fires while calls are
still pending, it leaves the worker running and reschedules itself.
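
The idle-timer rule can be modelled in a few lines. This is a sketch of the described behaviour, not the provider's implementation: `IdleTerminator`, `callStarted` and `callFinished` are hypothetical names, and the scheduler is injectable only so the logic can be exercised without real timers.

```ts
type Cancel = () => void;
type Schedule = (fn: () => void, ms: number) => Cancel;

// Default scheduler backed by real timers.
const realSchedule: Schedule = (fn, ms) => {
  const handle = setTimeout(fn, ms);
  return () => clearTimeout(handle);
};

class IdleTerminator {
  private pending = 0;
  private cancel: Cancel | null = null;
  private readonly terminate: () => void;
  private readonly idleTimeoutMs: number;
  private readonly schedule: Schedule;

  constructor(terminate: () => void, idleTimeoutMs = 30000, schedule: Schedule = realSchedule) {
    this.terminate = terminate;
    this.idleTimeoutMs = idleTimeoutMs;
    this.schedule = schedule;
  }

  callStarted(): void {
    this.pending += 1;
  }

  // Each response restarts the idle countdown.
  callFinished(): void {
    this.pending -= 1;
    this.arm();
  }

  private arm(): void {
    if (this.cancel) this.cancel();
    this.cancel = this.schedule(() => this.onFire(), this.idleTimeoutMs);
  }

  private onFire(): void {
    this.cancel = null;
    if (this.pending > 0) {
      this.arm(); // calls still in flight: keep the worker and reschedule
      return;
    }
    this.terminate();
  }
}
```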
---
## Threat-model considerations
- The worker runs in the same origin and the same browsing context as
the main thread. It is **not** a sandbox against a compromised page;
any script that can `eval` in your tab can also `postMessage` to the
worker. The Worker is a *performance* boundary, not a *security*
boundary.
- Lane keys derived inside the worker stay there; they are never
postMessage'd to the main thread. This narrows the window during which
a key sits in main-thread heap, which helps against post-mortem heap
inspection by a curious extension. It does not help against an active
in-page attacker.
- `randomBytes` runs on the calling thread (uses `crypto.getRandomValues`
directly). The worker has its own random source for ops that need
randomness inside it; stream nonces are not among them, since they are
derived deterministically from `(laneId, seq)`.
For the full picture, see `THREAT-MODEL.md`.
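
To make the deterministic nonce idea concrete, here is one way a 12-byte AES-GCM nonce could be built from `(laneId, seq)`. The byte layout (4-byte big-endian lane, 8-byte big-endian sequence) and the name `deriveNonce` are assumptions for illustration only; this document does not specify the actual encoding Shade uses.

```ts
// Hypothetical layout: [ laneId: u32 BE | seq: u64 BE ] = 12 bytes.
// The property that matters is that no (laneId, seq) pair repeats under
// the same key.
function deriveNonce(laneId: number, seq: number): Uint8Array {
  if (!Number.isInteger(laneId) || laneId < 0 || laneId > 0xffffffff) {
    throw new RangeError('laneId must fit in an unsigned 32-bit integer');
  }
  if (!Number.isSafeInteger(seq) || seq < 0) {
    throw new RangeError('seq must be a non-negative safe integer');
  }
  const nonce = new Uint8Array(12);
  const view = new DataView(nonce.buffer);
  view.setUint32(0, laneId, false);                        // bytes 0..3
  view.setUint32(4, Math.floor(seq / 0x100000000), false); // high 32 bits of seq
  view.setUint32(8, seq >>> 0, false);                     // low 32 bits of seq
  return nonce;
}
```

Because the nonce is a pure function of lane and sequence, both sides can recompute it without sending it on the wire, and a chunk presented under the wrong sequence number fails authentication.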
---
## Verifying main-thread budget
V3.8 acceptance: a 100 MB upload in Chrome must not block the main thread
for more than 16 ms at P99.
To verify in your app:
1. Open Chrome DevTools → Performance.
2. Record a 100 MB upload.
3. Inspect the main-thread flame chart. Look at "Long Tasks" and
"Self time" of `Shade.encryptStream`.
4. Confirm no contiguous block exceeds ~16 ms (one frame at 60 fps).
If you observe long tasks, lower `chunkSize` (more frequent yields) or
report the trace — see [`docs/archive/V3.8.md`](./archive/V3.8.md) for
the original acceptance criteria.
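
If you export main-thread task durations from a trace, the P99 criterion can be checked mechanically. The sketch below assumes you already have the durations as an array of milliseconds; `percentile` and `meetsFrameBudget` are hypothetical helpers using the nearest-rank definition. Note that the browser Long Tasks API only reports tasks longer than 50 ms, so it cannot observe a 16 ms threshold on its own.

```ts
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samples: readonly number[], p: number): number {
  if (samples.length === 0) throw new RangeError('no samples');
  if (p <= 0 || p > 100) throw new RangeError('p must be in (0, 100]');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[rank - 1];
}

// Hypothetical check for the acceptance criterion: P99 task duration
// must not exceed the frame budget.
function meetsFrameBudget(taskDurationsMs: readonly number[], budgetMs = 16): boolean {
  return percentile(taskDurationsMs, 99) <= budgetMs;
}
```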