release(v4.2.1): fix concurrent-ratchet desync via OutboundQueue waiter cursor
Pull-mode httpClient + drainer + parallel RPCs against the same peer deteriorated after ~10s with `DecryptionError`. Two bugs combined:

- `OutboundQueue.enqueue` woke `drain` waiters with a `since=0` snapshot, replaying already-processed events into `Shade.acceptTransferEnvelope` → `manager.decrypt` twice. The duplicate consumed an already-used skipped key and corrupted the Double Ratchet receive chain.
- `ratchetDecrypt` then propagated the corruption: a same-DH message behind the chain with no cached skipped key fell through to `kdfChainKey` on the ahead state and rewound `chain.counter`, permanently desyncing the chain.

Fix `OutboundQueue` to honor each waiter's `since`, and harden `ratchetDecrypt` so any future duplicate fails cleanly without mutating state. Adds regression coverage at all three layers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
43
CHANGELOG.md
@@ -5,6 +5,49 @@ All notable changes to Shade are documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [4.2.1] — 2026-05-04 — Concurrent-ratchet desync under pull-mode drainer

A consumer running `shade.files.httpClient(server, { outboundQueueUrl, ... })`
alongside parallel RPC traffic against the same peer would, after ~10s of
load, see every subsequent message fail with
`DecryptionError: Failed to decrypt message — wrong key or tampered data`.
Two bugs combined to cause this; both are fixed in `4.2.1` with regression
coverage.

### Fixed

#### `@shade/transfer` — `OutboundQueue` waiter cursor
`enqueue` woke pending `drain` waiters with a `since=0` snapshot (the
full event log) instead of using the waiter's own `since`. A poll that
parked at the head and was woken by a fresh enqueue therefore replayed
every event the waiter had already processed. Downstream, the queue
fed `Shade.acceptTransferEnvelope`, so each replayed envelope went
through `manager.decrypt` twice. The second decrypt consumed an
already-used skipped key and corrupted the Double Ratchet receive
chain. Each `PendingWaiter` now records its `since` cursor and is
delivered only events with `id > since`.
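
The cursor behavior described above could be sketched roughly as follows. This is an illustration only, not the `@shade/transfer` implementation: `SketchOutboundQueue`, `QueueEvent`, the callback-style `drain(since, deliver)` signature, and the shape of `PendingWaiter` are all assumptions made for the example.

```typescript
// Hypothetical shapes for illustration; not the real @shade/transfer types.
interface QueueEvent {
  id: number; // monotonically increasing, starting at 1
  payload: string;
}

interface PendingWaiter {
  since: number; // cursor the waiter passed to drain()
  deliver: (events: QueueEvent[]) => void;
}

class SketchOutboundQueue {
  private log: QueueEvent[] = [];
  private waiters: PendingWaiter[] = [];
  private nextId = 1;

  // Deliver events newer than `since` right away, or park until the next enqueue.
  drain(since: number, deliver: (events: QueueEvent[]) => void): void {
    const ready = this.eventsAfter(since);
    if (ready.length > 0) {
      deliver(ready);
    } else {
      this.waiters.push({ since, deliver });
    }
  }

  enqueue(payload: string): void {
    this.log.push({ id: this.nextId++, payload });
    const parked = this.waiters;
    this.waiters = [];
    for (const waiter of parked) {
      // The buggy behavior was equivalent to eventsAfter(0): the whole log.
      waiter.deliver(this.eventsAfter(waiter.since));
    }
  }

  private eventsAfter(since: number): QueueEvent[] {
    return this.log.filter((event) => event.id > since);
  }
}

// Usage: a waiter that has already processed ids 1 and 2 parks at the head.
const queue = new SketchOutboundQueue();
queue.enqueue("a");
queue.enqueue("b");
const seen: number[] = [];
queue.drain(2, (events) => {
  for (const event of events) seen.push(event.id);
});
queue.enqueue("c"); // wakes the waiter with only the event whose id is 3
```

In this sketch the parked waiter receives only the newly enqueued event, so nothing it already handled is fed into the decrypt path a second time.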
#### `@shade/core` — `ratchetDecrypt` defense-in-depth
A same-DH message whose `counter` was already behind the chain, and
that did NOT match a cached skipped key, fell through to a path that
called `kdfChainKey` on the *current* (ahead) chain key and then set
`chain.counter = message.counter + 1`, permanently desyncing the
ratchet so every subsequent decrypt failed with a wrong-key error. Such
messages are now rejected with `DecryptionError` without any state
mutation, so a downstream replay (transport bug, retry, intermittent
network) cannot poison the session.
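
A minimal sketch of the guard, under loud assumptions: `toyKdf` is a NON-cryptographic stand-in for `kdfChainKey`, keys are plain numbers, and `ChainState`, `SketchMessage`, and `sketchRatchetDecrypt` are hypothetical names invented for this example. Deriving into local variables and committing only on success is one way to get the "no state mutation" property; the actual `@shade/core` code may be structured differently.

```typescript
class DecryptionError extends Error {}

// Hypothetical, simplified receive-chain state (real keys are byte arrays).
interface ChainState {
  chainKey: number;
  counter: number; // next counter expected on this chain
  skipped: Map<number, number>; // counter -> cached message key
}

interface SketchMessage {
  counter: number;
  keyTag: number; // stand-in for ciphertext: the message key the sender used
}

// Toy, NON-cryptographic stand-in for the real kdfChainKey.
function toyKdf(chainKey: number): { next: number; messageKey: number } {
  return {
    next: (chainKey * 31 + 7) % 1000003,
    messageKey: (chainKey * 17 + 3) % 1000003,
  };
}

function sketchRatchetDecrypt(chain: ChainState, msg: SketchMessage): string {
  // Out-of-order message with a cached key: use the key once, then discard it.
  const cached = chain.skipped.get(msg.counter);
  if (cached !== undefined) {
    if (cached !== msg.keyTag) throw new DecryptionError("wrong key");
    chain.skipped.delete(msg.counter);
    return `ok:${msg.counter}`;
  }
  // Behind the chain with no cached key: a replay. Reject before touching state.
  if (msg.counter < chain.counter) {
    throw new DecryptionError("message counter is behind the chain");
  }
  // Derive into locals so a failed decrypt leaves `chain` untouched.
  let key = chain.chainKey;
  const newlySkipped = new Map<number, number>();
  for (let c = chain.counter; c < msg.counter; c++) {
    const skippedStep = toyKdf(key);
    newlySkipped.set(c, skippedStep.messageKey);
    key = skippedStep.next;
  }
  const step = toyKdf(key);
  if (step.messageKey !== msg.keyTag) throw new DecryptionError("wrong key");
  // Commit only after the message verified.
  newlySkipped.forEach((k, c) => chain.skipped.set(c, k));
  chain.chainKey = step.next;
  chain.counter = msg.counter + 1;
  return `ok:${msg.counter}`;
}

// Usage: replaying an already-decrypted message throws and changes nothing.
const chain: ChainState = { chainKey: 42, counter: 0, skipped: new Map() };
const first: SketchMessage = { counter: 0, keyTag: toyKdf(42).messageKey };
sketchRatchetDecrypt(chain, first); // advances the chain to counter 1
try {
  sketchRatchetDecrypt(chain, first); // replay of the same message
} catch (err) {
  // DecryptionError; chain.counter and chain.chainKey are unchanged
}
```

The point of the sketch is the ordering: the behind-the-chain check runs before any key derivation, so a duplicate cannot rewind `counter` or advance the chain key.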

### Tests
- `packages/shade-files/tests/integration/concurrent-ratchet.test.ts` —
  100 parallel `httpClient` RPCs while the drainer runs, plus a mixed
  workload of 50 RPCs + 50 raw `shade.send` deliveries with Bob
  echoing replies through the queue. Both surface the bug pre-fix.
- `packages/shade-transfer/tests/outbound-queue.test.ts` — direct
  regression on the waiter `since` cursor.
- `packages/shade-core/tests/ratchet.test.ts` — replay of an
  already-decrypted message must throw cleanly without breaking
  subsequent decrypts on the same chain.

## [4.2.0] — 2026-05-03 — Pull-mode streams for browser @shade/files
`4.1.0` shipped HTTP RPC for browser clients but capped them at inline