# Social Key Recovery (`@shade/recovery`)

V3.10 closes the biggest UX hole in any E2EE system: **"What happens
if I lose my phone?"** Shade's social-recovery flow lets a user
designate `n` guardians (family / friends / co-workers) at setup time
such that any `k` of them (the threshold) can together restore the
user's identity onto a new device, without any single guardian
being able to do it alone, and without the prekey server ever seeing
the recovered key material.

The whole flow runs entirely over existing 1:1 Shade sessions; no
server-side recovery agent, no escrow service, no "cloud guardian".

---

## Threat model recap

| # | Adversary | Recovered? |
|---|-----------|------------|
| 1 | Coalition of ≤ k-1 guardians | **No** (information-theoretic, by Shamir construction) |
| 2 | Prekey server alone | **No** (server only relays Double-Ratchet ciphertext) |
| 3 | Single malicious guardian who forges a share | **Detected**: AES-GCM tag mismatch on the backup blob; `requestRecovery` exhaustively tries threshold-sized subsets and rejects when none authenticate |
| 4 | Social engineering (impersonator calls a guardian) | **Mitigated, not eliminated**: guardians MUST confirm the new device's safety number out-of-band (OOB) before approving (see `<RecoveryApprove />`) |
| 5 | Compromised guardian device | **Out of scope** (see "Guardian compromise" below) |
| 6 | Compromised primary device at setup time | **Out of scope**: recovery only protects against losing the device; if setup material is exfiltrated, all bets are off |

---

## Setup

### What the user does

1. Pick `n` guardians from their existing peers.
2. Pick a threshold `k` (typically `⌈n/2⌉`, the widget default, which
   keeps any coalition of fewer than half the guardians below the
   threshold while still surviving the loss of one or two).
3. Run `setupRecovery(...)`.
4. Print / record a **recovery card** with:
   - The user's own address
   - `setupId`
   - `k` and `n`
   - The list of guardian addresses
   - Setup-time safety number

The recovery card is the only piece of state the user must keep
out-of-band (for example on paper or in a password manager). Without
it, the user cannot drive recovery on a new device, because the new
device needs to know who the guardians are.

### What happens cryptographically

```text
recoveryKey = random(32 bytes)
backupBlob  = Shade.exportBackup(passphrase = "shade-rk:" + base64url(recoveryKey),
                                 knownAddresses = [...])
shares[i]   = Shamir-split(recoveryKey, k, n)
```

For each guardian `i`:

```text
share-deposit envelope:
  shadeRecovery: 1
  type: "share-deposit"
  flowId, setupId, originalAddress
  threshold (k), guardianCount (n), shareIndex (i)
  shareBytes: base64url( encodeShare(shares[i]) )
  backupBlob: Shade.exportBackup output (identical for every guardian)
  setupFingerprint, createdAt
```

The envelope rides through `Shade.send` like any other plaintext:
double-ratchet encrypted, AAD-bound, replay-safe.

The `recoveryKey` is **zeroized** on the primary device immediately
after the split returns. The primary therefore retains no recovery
secret, only `setupId` and the public roster.
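
To make the `Shamir-split` step concrete, here is a minimal sketch of splitting and recombining a byte string over GF(2^8), using one random polynomial per secret byte. The field choice, the `Share` shape, and the names `gfMul`, `gfInv`, `split` and `combine` are assumptions made for this illustration; the package's actual share encoding (`encodeShare`) is not shown in this document. The random source is injected to keep the sketch platform-neutral, and real code must pass a cryptographically secure generator.

```ts
// Illustrative Shamir secret sharing over GF(2^8).
// Hypothetical helper names; not the @shade/recovery implementation.

interface Share {
  x: number;     // share index, 1..n (x = 0 is where the secret lives)
  y: Uint8Array; // one evaluated byte per secret byte
}

// Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1.
function gfMul(a: number, b: number): number {
  let p = 0;
  for (let i = 0; i < 8; i++) {
    if (b & 1) p ^= a;
    const carry = a & 0x80;
    a = (a << 1) & 0xff;
    if (carry) a ^= 0x1b;
    b >>= 1;
  }
  return p;
}

// Multiplicative inverse as a^254, since a^255 = 1 for every non-zero a.
function gfInv(a: number): number {
  if (a === 0) throw new RangeError('0 has no inverse');
  let r = 1;
  for (let i = 0; i < 254; i++) r = gfMul(r, a);
  return r;
}

function split(
  secret: Uint8Array,
  k: number,
  n: number,
  randomBytes: (len: number) => Uint8Array, // MUST be a CSPRNG in real use
): Share[] {
  if (k < 2 || k > n || n > 255) throw new RangeError('need 2 <= k <= n <= 255');
  const shares: Share[] = [];
  for (let x = 1; x <= n; x++) shares.push({ x, y: new Uint8Array(secret.length) });
  for (let i = 0; i < secret.length; i++) {
    // Degree k-1 polynomial: constant term is the secret byte, the rest random.
    const coeffs = new Uint8Array(k);
    coeffs[0] = secret[i];
    coeffs.set(randomBytes(k - 1), 1);
    for (const s of shares) {
      let acc = 0;
      for (let j = k - 1; j >= 0; j--) acc = gfMul(acc, s.x) ^ coeffs[j]; // Horner's rule
      s.y[i] = acc;
    }
  }
  return shares;
}

// Lagrange interpolation at x = 0; subtraction in GF(2^8) is XOR.
function combine(shares: Share[]): Uint8Array {
  const secret = new Uint8Array(shares[0].y.length);
  for (let i = 0; i < secret.length; i++) {
    let acc = 0;
    for (const si of shares) {
      let basis = 1;
      for (const sj of shares) {
        if (sj.x !== si.x) basis = gfMul(basis, gfMul(sj.x, gfInv(sj.x ^ si.x)));
      }
      acc ^= gfMul(si.y[i], basis);
    }
    secret[i] = acc;
  }
  return secret;
}
```

Passing any `k` of the returned shares to `combine` reproduces the secret. Note that `combine` cannot tell a forged share from an honest one; it simply returns a wrong key, which is why the flow relies on the backup blob's AES-GCM tag to authenticate the result.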

### What each guardian stores

Per (`originalAddress`, `setupId`):

```text
{
  shareIndex,        // 1..n
  shareBytes,        // base64url-encoded Shamir share
  backupBlob,        // identical for every guardian
  setupFingerprint,  // for sanity-checks at recovery time
  guardianCount, threshold,
  receivedAt
}
```

The guardian's app provides a `RecoveryStore` implementation. The
package ships `MemoryRecoveryStore` for tests and small one-shot
demos; production guardian apps MUST supply a persistent store
(IndexedDB, AsyncStorage, SQLite, etc.). See "Persistence
recommendations" below.
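
The sketch below shows one way a store keyed per (`originalAddress`, `setupId`) could look. This document does not list the methods of `RecoveryStore`, so the interface name `RecoveryStoreSketch`, the four method names (`put`, `get`, `delete`, `list`), the field types, and the `MapBackedStore` class are all assumptions for illustration. A production store would swap the `Map` for a persistent backend.

```ts
// Hypothetical guardian-side store. Method names and field types are
// assumptions; only the record fields come from the document.

interface GuardianDeposit {
  originalAddress: string;
  setupId: string;
  shareIndex: number;       // 1..n
  shareBytes: string;       // base64url-encoded Shamir share
  backupBlob: string;       // identical for every guardian (encoding assumed)
  setupFingerprint: string; // for sanity-checks at recovery time
  guardianCount: number;
  threshold: number;
  receivedAt: number;       // epoch milliseconds (assumed unit)
}

interface RecoveryStoreSketch {
  put(deposit: GuardianDeposit): Promise<void>;
  get(originalAddress: string, setupId: string): Promise<GuardianDeposit | undefined>;
  delete(originalAddress: string, setupId: string): Promise<void>;
  list(): Promise<GuardianDeposit[]>;
}

// JSON-encoding the pair gives an unambiguous composite key even when an
// address happens to contain separator characters.
const depositKey = (originalAddress: string, setupId: string): string =>
  JSON.stringify([originalAddress, setupId]);

class MapBackedStore implements RecoveryStoreSketch {
  private readonly records = new Map<string, GuardianDeposit>();

  async put(deposit: GuardianDeposit): Promise<void> {
    this.records.set(depositKey(deposit.originalAddress, deposit.setupId), deposit);
  }

  async get(originalAddress: string, setupId: string): Promise<GuardianDeposit | undefined> {
    return this.records.get(depositKey(originalAddress, setupId));
  }

  async delete(originalAddress: string, setupId: string): Promise<void> {
    this.records.delete(depositKey(originalAddress, setupId));
  }

  async list(): Promise<GuardianDeposit[]> {
    return Array.from(this.records.values());
  }
}
```

Keeping the methods asynchronous lets the same interface front IndexedDB, AsyncStorage, or SQLite without changing callers.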

---

## Recovery

### What the user does on the new device

1. Boot a fresh Shade with a temporary identity.
2. Read the recovery card.
3. In the recovery widget, type / paste:
   - `originalAddress`
   - `setupId`
   - `threshold`
   - The guardian roster
4. Read the new device's safety number (the widget displays it
   prominently) to each guardian over a side channel: phone call,
   in person, whatever they trust.
5. Wait for `≥ k` guardians to approve.

### What happens cryptographically

To each guardian, the new device sends:

```text
recovery-request envelope:
  shadeRecovery: 1
  type: "recovery-request"
  flowId, originalAddress, setupId
  requesterFingerprint (= safety number of the temporary identity)
  requestedAt
```

Each guardian's `attachGuardian` handler:

1. Looks up its stored deposit by `(originalAddress, setupId)`. If
   missing, replies with `share-decline` (`reason = "unknown setup"`).
2. Invokes the `approve` callback with the requester's address, the
   requester's fingerprint, and the original device's setup-time
   fingerprint. The callback is the **OOB-confirmation gate**: it MUST
   require an explicit user click after the user has verified the
   fingerprint. The `<RecoveryApprove />` widget enforces this with a
   two-checkbox gate.
3. On approve → ships `share-grant`. On reject → ships
   `share-decline` with a short reason.
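
The three handler steps above can be sketched as a single function with the store lookup and the approval prompt injected. Everything here is hypothetical: the function name `handleRecoveryRequest`, the reply shapes, the boolean return of `approve`, and the assumption that a `share-grant` carries the `backupBlob` along with the share are illustrative guesses, not the package's wire format.

```ts
// Hypothetical guardian-side decision logic. Field names follow the
// envelopes described above; the reply shapes are assumptions.

interface RecoveryRequest {
  flowId: string;
  originalAddress: string;
  setupId: string;
  requesterFingerprint: string;
}

interface StoredDeposit {
  shareIndex: number;
  shareBytes: string;
  backupBlob: string;
  setupFingerprint: string;
}

interface ApproveContext {
  requesterAddress: string;
  requesterFingerprint: string;
  setupFingerprint: string;
}

type GuardianReply =
  | { type: 'share-grant'; flowId: string; shareIndex: number; shareBytes: string; backupBlob: string }
  | { type: 'share-decline'; flowId: string; reason: string };

async function handleRecoveryRequest(
  from: string,
  request: RecoveryRequest,
  lookup: (originalAddress: string, setupId: string) => Promise<StoredDeposit | undefined>,
  approve: (ctx: ApproveContext) => Promise<boolean>,
): Promise<GuardianReply> {
  // Step 1: no deposit for this (originalAddress, setupId) means decline.
  const deposit = await lookup(request.originalAddress, request.setupId);
  if (!deposit) {
    return { type: 'share-decline', flowId: request.flowId, reason: 'unknown setup' };
  }

  // Step 2: the OOB-confirmation gate. `approve` must not resolve to true
  // until the human guardian has verified the fingerprint and clicked.
  const approved = await approve({
    requesterAddress: from,
    requesterFingerprint: request.requesterFingerprint,
    setupFingerprint: deposit.setupFingerprint,
  });

  // Step 3: ship a grant or a decline.
  if (!approved) {
    return { type: 'share-decline', flowId: request.flowId, reason: 'rejected by guardian' };
  }
  return {
    type: 'share-grant',
    flowId: request.flowId,
    shareIndex: deposit.shareIndex,
    shareBytes: deposit.shareBytes,
    backupBlob: deposit.backupBlob,
  };
}
```

Checking for the deposit before prompting means a guardian is never asked to approve a request they could not serve anyway.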

The new device collects grants, and as soon as `k` arrive:

1. Combines the `k` shares via Lagrange interpolation at `x = 0` to
   reconstruct `recoveryKey`.
2. Derives `passphrase = "shade-rk:" + base64url(recoveryKey)`.
3. Calls `Shade.importBackup(backupBlob, passphrase)`. The
   AES-GCM tag in the blob authenticates the reconstruction. **A
   forged share is detected here.**
4. If a guardian forged a share, `importBackup` throws. The
   reconstruction loop then tries every other threshold-sized subset
   of the grants received until one authenticates. (The V3.10
   acceptance criterion "no coalition of (k-1) guardians can rebuild
   the secret" is the safety invariant; the AEAD tag identifies which
   subset is honest.)
5. If every subset fails, `RecoveryReconstructionError` is raised
   and the user is told that at least one guardian is malicious.
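
The subset search in steps 4 and 5 can be sketched as follows. The names `combinations` and `reconstructFromGrants` are hypothetical, and `tryImport` stands in for "combine the shares, derive the passphrase, call `importBackup`"; it is assumed to throw when the AES-GCM tag does not verify. The sketch is synchronous for brevity, and a plain `Error` stands in for `RecoveryReconstructionError`.

```ts
// Hypothetical reconstruction loop: try each threshold-sized subset of the
// collected grants until one authenticates.

// All k-element combinations of `items`, in index order.
function combinations<T>(items: T[], k: number): T[][] {
  if (k === 0) return [[]];
  if (items.length < k) return [];
  const [head, ...tail] = items;
  const withHead = combinations(tail, k - 1).map((c) => [head, ...c]);
  return withHead.concat(combinations(tail, k));
}

function reconstructFromGrants<G, R>(
  grants: G[],
  k: number,
  tryImport: (subset: G[]) => R, // assumed to throw on authentication failure
): R {
  for (const subset of combinations(grants, k)) {
    try {
      return tryImport(subset);
    } catch {
      // At least one share in this subset was forged; try the next subset.
    }
  }
  // Stand-in for RecoveryReconstructionError.
  throw new Error('no threshold-sized subset authenticated');
}
```

The number of subsets is the binomial coefficient C(m, k) for `m` grants, for example 10 when `m = 5` and `k = 3`, so an exhaustive search stays cheap at the guardian counts discussed here. With exactly `k` grants there is only one subset, so routing around a forged share requires at least one extra grant.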

After `importBackup` succeeds, the new device hosts the original
identity and immediately calls `Shade.rotate()` to retire the
recovered key material from the conversation graph (the old session
keys persisted in the backup blob are now considered compromised,
having been used for recovery).

> **The `Shade.beforeBackupImport` gate fires automatically.**
> Without a registered handler the SDK falls back to TOFU-with-warning
> (consistent with the V3.3 contract). Production apps SHOULD register
> a handler that prompts the user for one more confirmation before the
> identity rotates.

---

## Acceptance criteria status

- [x] **3-of-5 recovery works end-to-end on two separate Shade
      instances.** See `tests/integration.test.ts`.
- [x] **No coalition of (k-1) guardians can reconstruct
      `recoveryKey`.** Property test asserts this with `fast-check`
      across random k/n configurations.
      See `tests/shamir.test.ts` and
      `tests/adversarial.test.ts`.
- [x] **Guardian-side widget requires fingerprint-confirmation
      before sending.** `<RecoveryApprove />` enforces a
      two-checkbox gate; `tests/adversarial.test.ts` exercises
      both the matching-OOB and rejecting-OOB code paths.

---

## Persistence recommendations

The `RecoveryStore` interface is intentionally small (4 methods).
Pick the implementation that fits your platform:

| Platform            | Suggested backing store                                            |
|---------------------|--------------------------------------------------------------------|
| Browser (PWA)       | IndexedDB (one object store, idb)                                  |
| Browser (extension) | `chrome.storage.local`                                             |
| React Native        | AsyncStorage (with crypto-protected blob)                          |
| Bun / Node server   | SQLite via `@shade/storage-sqlite` extension table OR a side file  |
| Android (native)    | Room / EncryptedSharedPreferences                                  |

Whatever you pick, the records ARE NOT secret on their own (without
the shares of `k - 1` other guardians they're useless), but they
should still be stored encrypted-at-rest like any other Shade state.
Do not commit them to plaintext logs or network-replicated state.

---

## Guardian-UX guide

### How many guardians?

| n, k | Survives | Comment |
|------|----------|---------|
| n=3, k=2 | 1 lost guardian | Minimum useful: one device away from danger |
| n=5, k=3 | 2 lost guardians | Sweet spot for most users |
| n=7, k=4 | 3 lost guardians | Suitable when you genuinely have 7+ trustworthy people |
| k=n | 0 lost guardians | DO NOT USE: every guardian becomes a single point of failure |

The widget defaults to `k = ⌈n/2⌉`, which is liberal but
collusion-resistant for `n ≥ 3`. Apps targeting paranoid users may
want to bump that to `⌈2n/3⌉`.
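
The two threshold formulas and the "Survives" column reduce to a few lines of arithmetic. The function names below are hypothetical helpers for illustration, not part of `@shade/recovery`.

```ts
// Hypothetical helpers for picking and sanity-checking a threshold.

// Widget default described above: k = ceil(n / 2).
function defaultThreshold(n: number): number {
  return Math.ceil(n / 2);
}

// Stricter option for paranoid users: k = ceil(2n / 3).
function strictThreshold(n: number): number {
  return Math.ceil((2 * n) / 3);
}

// Number of guardians that can be lost while recovery still works.
function tolerableLosses(n: number, k: number): number {
  if (!Number.isInteger(n) || !Number.isInteger(k) || k < 1 || k > n) {
    throw new RangeError('need integers with 1 <= k <= n');
  }
  return n - k;
}
```

For `n = 5` these give `k = 3` by default (two guardians may be lost) and `k = 4` in the strict variant (only one may be lost), which is the trade-off between availability and collusion resistance.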

### Replacing a guardian

If a guardian dies, loses their device permanently, or you no longer
trust them:

1. Pick a replacement.
2. Run `setupRecovery` again with the new roster. This generates a
   fresh `setupId` and a fresh `recoveryKey`. The old shares become
   garbage for the new setup (they cannot decrypt the new
   `backupBlob`, which is encrypted under the new key).

The widget records the new `setupId` on the recovery card. Treat
this as a hard rotation; the user MUST re-record the card.

### Guardian health checks

Periodically (the V3.10 plan suggests a quarterly prompt), the user
should confirm each guardian is still reachable. Any guardian who
can't be reached in two consecutive prompts SHOULD trigger a
re-setup with a fresh roster. The widget UX for this is to be added
in a follow-up release; the primitive is in place.
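
Until the widget UX lands, an app could track the "two consecutive prompts" rule itself. The class below is a hypothetical sketch of that bookkeeping only; `GuardianHealthTracker` and its methods are not part of the package, and how reachability is actually probed is left to the app.

```ts
// Hypothetical bookkeeping for guardian health checks.

class GuardianHealthTracker {
  // Consecutive missed prompts per guardian address.
  private readonly misses = new Map<string, number>();

  // Record the outcome of one health-check prompt for one guardian.
  // A successful check resets the guardian's counter to zero.
  record(guardian: string, reachable: boolean): void {
    const previous = this.misses.get(guardian) ?? 0;
    this.misses.set(guardian, reachable ? 0 : previous + 1);
  }

  // Guardians unreachable in at least `limit` consecutive prompts
  // (two, per the recommendation above). A non-empty result is the
  // signal to re-run setup with a fresh roster.
  unreachable(limit = 2): string[] {
    const out: string[] = [];
    this.misses.forEach((count, guardian) => {
      if (count >= limit) out.push(guardian);
    });
    return out;
  }
}
```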

---

## Wiring example

```ts
import {
  setupRecovery,
  attachGuardian,
  requestRecovery,
  MemoryRecoveryStore,
} from '@shade/recovery';

// On the primary device:
const result = await setupRecovery({
  shade,
  guardians: ['bob', 'carol', 'dan', 'eve', 'faythe'],
  threshold: 3,
  deliver: async (to, envelope) => {
    // wire to your app's existing message-delivery layer
    await myMessageOutbox.send(to, envelope);
  },
});
console.log(result.setupId); // record this on the recovery card

// On each guardian device:
const stop = attachGuardian({
  shade,
  // MemoryRecoveryStore is for tests only; see "Persistence recommendations" above
  store: myPersistentStore,
  approve: async (ctx) => {
    // Show ctx.requesterFingerprint to the user.
    // Block until they confirm OOB and click "Release share".
    return await myUI.askApproval(ctx);
  },
  // Wrap in an arrow function so `send` keeps its `this` binding.
  deliver: (to, envelope) => myMessageOutbox.send(to, envelope),
});

// On the new device:
const recovered = await requestRecovery({
  shade: temporaryShade, // fresh identity for now
  originalAddress: 'alice',
  setupId: 'sid-from-recovery-card',
  threshold: 3,
  guardians: ['bob', 'carol', 'dan', 'eve', 'faythe'],
  deliver: (to, envelope) => myMessageOutbox.send(to, envelope),
  onProgress: (p) => myUI.showProgress(p),
});
// `temporaryShade` now hosts the original identity.
```

---

## Out of scope (V3.10)

- **Cloud guardian / Shade-operated recovery agent.** Explicit
  non-goal; the spec rejects any centralized component that can
  recover on its own.
- **Auto-distribution.** The user must explicitly pick guardians.
- **Multi-share-per-guardian.** Each guardian holds exactly one
  share. Apps that need redundancy should bump `n`, not give the
  same guardian multiple shares.
- **Guardian ZK-proofs of liveness.** A guardian who refuses to
  respond is treated as offline; we don't try to compel them.