Phase D.6.0-B findings — "quiescent state" was observation-window bias; CDS disconnects ~30 s after handshake
Update 2026-04-19: hypothesis H1 CONFIRMED
D.6.1-B tested H1 (zero UUID as session gate) by swapping the 16-byte placeholder for a fresh RFC 4122 v4 UUID per session. Result: CDS held the connection ESTABLISHED for the full 130 s observation window, zero EOF, only TCP keepalives. The "CDS disconnects at 23-36 s" behaviour documented below was entirely caused by the zero UUID. H2/H3/H4 no longer need testing. See ../iter-07-uuid-patched/findings.md for the full breakthrough decode and D.6.2's next axis (what CDS expects from us now that handshake is accepted).
Status: verified — 2026-04-19
This was the first D.6 research step. We deployed the D.6.0-A verbose-logging
Go backend (commit 5d47ae5) on havoc, ran three interactive
devicectl triggers meant to push CDS into a post-handshake
RemoteXPC service exchange, and captured the full session. The
core finding is not the post-handshake request we expected
to see. It is that CDS does not sit silently
after our handshake: it closes the TCP connection ~30 s
after our #8 big Handshake lands, then retries with a
fresh session (same bytes, same result). Every frame of every
session decoded cleanly through internal/xpc/ and the
D.6.0-A verbose logger.
TL;DR
The iter-4 / D.5 "quiescent state" finding needs to be
reinterpreted: CDS's silence after our #8 big Handshake is
not "happy and waiting" — it is "evaluating our Handshake, then
deciding it's unusable, then disconnecting." In iter-1 through
iter-5 we always pkill'd the listener within ~5-13 s of
handshake completion, well before the natural CDS timeout at
~30 s. We never observed the timeout because we never waited
for it.
D.6.0-B waited. Natural timeout observed three times in the same run:
| session | handshake complete | CDS EOF | delta |
|---|---|---|---|
| 1 | 02:13:51 | 02:14:27 | 36 s |
| 2 | 02:14:27 | 02:14:59 | 32 s |
| 3 | 02:14:59 | 02:15:22 | 23 s |
The client-facing symptom in devicectl:
com.apple.Mercury.error 1000 "The connection was interrupted".
None of our three triggers (device info, manage pair, or
device info processes) succeeded in pushing CDS past the
handshake, because each one starts by opening a fresh session
that just runs through our byte-exact handshake again and then
is torn down by CDS.
So D.6.0-B did not capture a "first post-handshake request" — there is no such request to capture until we fix whatever makes CDS reject the session. What it did capture is the failure mode that iter-1 through iter-5 could never surface under their short pkill windows.
What the sessions look like in verbose
Each of the three sessions emits the same 38-line verbose trace. Session 1 excerpt (identical to iter-5 smoke, PLUS the EOF line at the end that iter-5 never waited for):
[02:13:51] accepted: remote=[::1]:54675 session=1
[02:13:51] session=1 server handshake sent (SETTINGS + WINDOW_UPDATE)
[02:13:51] session=1 frame type=SETTINGS stream=0 len=12 flags=0x00
[02:13:51] session=1 sent SETTINGS-ACK
[02:13:51] session=1 frame type=WINDOW_UPDATE stream=0 len=4 flags=0x00
[02:13:51] session=1 frame type=HEADERS stream=1 len=0 flags=0x04
[02:13:51] session=1 dispatcher: emit #3 HEADERS(s1, len=0)
[02:13:51] session=1 frame type=SETTINGS stream=0 len=0 flags=0x01
[02:13:51] session=1 recv SETTINGS-ACK
[02:13:51] session=1 frame type=DATA stream=1 len=44 flags=0x00
[02:13:51] session=1 verbose recv <empty dict>
[02:13:51] session=1 verbose send <empty dict>
[02:13:51] session=1 dispatcher: emit #4 DATA(s1, 44 B empty-dict)
[02:13:51] session=1 frame type=HEADERS stream=3 len=0 flags=0x04
[02:13:51] session=1 dispatcher: emit #6 HEADERS(s3, len=0)
[02:13:51] session=1 frame type=DATA stream=1 len=24 flags=0x00
[02:13:51] session=1 verbose recv [no payload] # flags=0x00000201
[02:13:51] session=1 verbose send [no payload]
[02:13:51] session=1 dispatcher: emit #5 DATA(s1, 24 B sync)
[02:13:51] session=1 frame type=DATA stream=3 len=24 flags=0x00
[02:13:51] session=1 verbose recv [no payload] # flags=0x00400001 INIT_HANDSHAKE
[02:13:51] session=1 verbose send [no payload]
[02:13:51] session=1 dispatcher: emit #7 DATA(s3, 24 B INIT_HANDSHAKE mirror)
[02:13:51] session=1 verbose send DATA(s1, 14124 B): flags=0x00000101 msgid=2
[02:13:51] session=1 verbose send MessageType => string "Handshake"
[02:13:51] session=1 verbose send MessagingProtocolVersion => uint64 7
[02:13:51] session=1 verbose send Services => <dict 62 entries>
[02:13:51] session=1 verbose send Properties => <dict 46 entries>
[02:13:51] session=1 verbose send UUID => <uuid 00000000000000000000000000000000>
[02:13:51] session=1 dispatcher: emit #8 DATA(s1, 14124 B big Handshake)
[02:14:27] session=1 read loop: peer closed (EOF) ← 36 seconds of silence, then CDS gives up
Sessions 2 and 3 are byte-identical up to the dispatcher output (our Go backend is deterministic) and end with EOF after 32 s and 23 s respectively. The three retries are CDS's own client-side logic reconnecting after each Mercury error.
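For readers matching the trace against raw bytes, the `frame type=... stream=... len=... flags=...` lines correspond to the fixed 9-byte HTTP/2 frame header. The following is a minimal sketch of that decoding, not the backend's actual framer: the names `frameHeader` and `parseFrameHeader` are hypothetical, and the output format is simply copied from the excerpt above.

```go
package main

import (
	"errors"
	"fmt"
)

// frameHeader mirrors the fixed 9-byte HTTP/2 frame header (RFC 9113, section 4.1).
// The type name is illustrative, not taken from the iosmux code base.
type frameHeader struct {
	Length   uint32 // 24-bit payload length
	Type     uint8
	Flags    uint8
	StreamID uint32 // 31-bit stream identifier (reserved bit masked off)
}

// typeNames covers only the frame types that appear in the trace above.
var typeNames = map[uint8]string{
	0x0: "DATA",
	0x1: "HEADERS",
	0x4: "SETTINGS",
	0x8: "WINDOW_UPDATE",
}

// parseFrameHeader decodes the first 9 bytes of b as a frame header.
func parseFrameHeader(b []byte) (frameHeader, error) {
	if len(b) < 9 {
		return frameHeader{}, errors.New("short frame header")
	}
	stream := uint32(b[5])<<24 | uint32(b[6])<<16 | uint32(b[7])<<8 | uint32(b[8])
	return frameHeader{
		Length:   uint32(b[0])<<16 | uint32(b[1])<<8 | uint32(b[2]),
		Type:     b[3],
		Flags:    b[4],
		StreamID: stream & 0x7fffffff,
	}, nil
}

// String renders the header in the same shape as the verbose log lines.
func (h frameHeader) String() string {
	name, ok := typeNames[h.Type]
	if !ok {
		name = fmt.Sprintf("0x%02x", h.Type)
	}
	return fmt.Sprintf("frame type=%s stream=%d len=%d flags=0x%02x",
		name, h.StreamID, h.Length, h.Flags)
}

func main() {
	// HEADERS on stream 1, empty payload, END_HEADERS (0x04) set.
	h, err := parseFrameHeader([]byte{0, 0, 0, 0x1, 0x04, 0, 0, 0, 1})
	if err != nil {
		panic(err)
	}
	fmt.Println(h) // prints "frame type=HEADERS stream=1 len=0 flags=0x04"
}
```

The sample input reproduces the `HEADERS stream=1 len=0 flags=0x04` line from session 1; the XPC wrapper flags (for example 0x00400001) live inside the DATA payload and are not covered by this sketch.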
Why this reinterprets iter-4 and the D.5 smoke
iter-4's findings.md said:
CDS enters a quiescent state after receiving our 9-frame reply (#0–#8) and simply waits. [...] The TCP connection stays open indefinitely from CDS's side; iter-4 was ended by
pkill.
And the D.5 smoke said:
Go framer did NOT return EOF naturally; exited via our pkill at t+13s (log line:
read loop: local close). Same as iter-4 behavior (iter-4 listener was also killed manually).
Both are literally true (CDS did not close within the observation window) but interpretively incomplete (the observation window was too short to see CDS's decision). The "quiescent" language implied stable acceptance; the actual behaviour is delayed rejection.
This does not invalidate the iter-1 → iter-4 dispatch-table work.
The HTTP/2 framing is correct, the XPC envelope is correct, and
the frame sequence matches what a real iPhone emits in the Q2
pcap. The problem is at the semantic content of #8 Handshake
(or whatever CDS validates after #8). Our frame is
structurally accepted by CDS's h2c + RemoteXPC parsers
(no PROTOCOL_ERROR, no RST_STREAM, no GOAWAY); it is
semantically rejected once CDS tries to use it for
anything.
Candidate hypotheses for the rejection
These are ranked by likelihood, NOT verified. Each is one test away from confirm-or-refute.
H1: the placeholder UUID
The top-level UUID field in our #8 big Handshake is 16 zero
bytes — redacted at source in iphone_replay_bytes.py (see the
REDACTED_AT_SOURCE constant). Real iPhones emit a session-bound
UUID there. CDS may use this field as a session identifier and
refuse to progress when it is the zero UUID.
Cheap to test: synthesize a valid random UUID on each session, patch it into the Handshake bytes before emit, re-run.
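A minimal sketch of that test, assuming the byte offset of the UUID inside the embedded Handshake fixture is already known. The helper names `newUUIDv4` and `patchUUID`, the offset parameter, and the stand-in fixture in `main` are all hypothetical; the real offset has to come from the decoded #8 frame.

```go
package main

import (
	"crypto/rand"
	"errors"
	"fmt"
)

// newUUIDv4 returns 16 random bytes with the RFC 4122 version (4) and
// variant (10xx) bits set.
func newUUIDv4() ([16]byte, error) {
	var u [16]byte
	if _, err := rand.Read(u[:]); err != nil {
		return u, err
	}
	u[6] = (u[6] & 0x0f) | 0x40 // version 4
	u[8] = (u[8] & 0x3f) | 0x80 // RFC 4122 variant
	return u, nil
}

// patchUUID returns a copy of handshake with the 16 bytes at offset replaced
// by u. It refuses to patch unless the target is the all-zero placeholder,
// which guards against a wrong (assumed) offset.
func patchUUID(handshake []byte, offset int, u [16]byte) ([]byte, error) {
	if offset < 0 || offset+16 > len(handshake) {
		return nil, errors.New("offset out of range")
	}
	for _, b := range handshake[offset : offset+16] {
		if b != 0 {
			return nil, errors.New("target is not the zero placeholder")
		}
	}
	out := make([]byte, len(handshake))
	copy(out, handshake)
	copy(out[offset:], u[:])
	return out, nil
}

func main() {
	// Stand-in fixture: 4 marker bytes followed by the 16-byte zero placeholder.
	fixture := append([]byte{0xde, 0xad, 0xbe, 0xef}, make([]byte, 16)...)
	u, err := newUUIDv4()
	if err != nil {
		panic(err)
	}
	patched, err := patchUUID(fixture, 4, u)
	if err != nil {
		panic(err)
	}
	fmt.Printf("patched UUID bytes: %x\n", patched[4:20])
}
```

Patching a copy rather than the fixture itself keeps the embedded bytes deterministic, so each session can draw its own UUID without affecting the next one.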
H2: stale or mismatched Services dict
Services advertises 62 services with specific port numbers.
Real tunneld binds those ports on the real iPhone; CDS later
tries to connect to some of them. Our backend does not bind any
of them. If CDS post-handshake-validates by trying to connect to
com.apple.lockdown.remote.trusted on the advertised port and
getting ECONNREFUSED, it would disconnect.
Test: grep the pcap (iosmux-d6.pcap) for any post-handshake
TCP SYN to a port in the Services dict. If yes, hypothesis is
likely. If no, it's ruled out.
H3: missing heartbeat / keepalive
pymobiledevice3's XPC flag enum has PING (0x00000002) and
there's a HEARTBEAT_REQUEST / HEARTBEAT_RESPONSE pair
(0x00010000 / 0x00020000) documented but not used in the iter-0
pcap. CDS may expect periodic server-initiated PING frames after
handshake, and the ~30 s timeout is CDS's own keepalive window
expiring.
Test: patch the dispatcher to emit a PING frame every 10 s on stream 0 after #8, see if that extends the window.
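One way this test could be wired up is sketched below. It reads "PING frame on stream 0" as an HTTP/2 PING (type 0x6), which is an assumption: an XPC-level message carrying the PING flag would need a different wrapper. The names `pingFrame` and `pingLoop` are hypothetical and not part of the existing dispatcher.

```go
package main

import (
	"bytes"
	"context"
	"encoding/binary"
	"fmt"
	"io"
	"time"
)

// pingFrame builds an HTTP/2 PING frame (RFC 9113, section 6.7): a 9-byte
// header with length 8, type 0x6, no flags, stream 0, then 8 opaque bytes.
func pingFrame(opaque uint64) []byte {
	f := make([]byte, 17)
	f[2] = 8   // 24-bit length field = 8
	f[3] = 0x6 // frame type PING
	// f[4] (flags) and f[5:9] (stream ID) stay zero: not an ACK, stream 0.
	binary.BigEndian.PutUint64(f[9:], opaque)
	return f
}

// pingLoop writes one PING per interval until ctx is cancelled or a write fails.
func pingLoop(ctx context.Context, w io.Writer, interval time.Duration) error {
	t := time.NewTicker(interval)
	defer t.Stop()
	var seq uint64
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-t.C:
			seq++
			if _, err := w.Write(pingFrame(seq)); err != nil {
				return err
			}
		}
	}
}

func main() {
	// Demo against an in-memory buffer with a short interval; the test
	// described above would use the session connection and 10 * time.Second.
	var buf bytes.Buffer
	ctx, cancel := context.WithTimeout(context.Background(), 55*time.Millisecond)
	defer cancel()
	_ = pingLoop(ctx, &buf, 10*time.Millisecond)
	fmt.Printf("wrote %d PING frames\n", buf.Len()/17)
}
```

On a live session the loop would run in its own goroutine after #8 is emitted, so writes to the connection would need to be serialized with the dispatcher's own writes (for example behind a shared mutex).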
H4: the tunneld UDID mismatch
Tunneld advertises its UDID in the JSON response CDS reads
before connecting. Our #8 Handshake has a zeroed UniqueDeviceID
inside the dict. CDS may be cross-validating and disconnecting
on mismatch.
Test: patch the Handshake bytes so UniqueDeviceID matches the
tunneld-advertised UDID. Same shape of change as H1 (in-place
byte patch on the embedded fixture).
Implications
Iter-1 through iter-4's Q3 "closed under replay" conclusion needs a footnote, not a retraction. The replay-layer investigation IS closed — we proved the HTTP/2 + XPC framing is correct. But "reached quiescent" should have said "reached a semantically-invalid handshake that CDS will discard after 30 s". We update iter-04 and iter-05-go-backend findings with that clarification.
Phase D.6.1 does NOT start with a handler, because there is no incoming request to handle. It starts with fixing the handshake so CDS does not disconnect. The four hypotheses above are the first-level experiment candidates.
Phase D's original theory — that our Go backend replies with real data sourced from a real iPhone-adjacent capture — remains correct. The test we ran today used the redacted iter-01 capture as the reply source. Redaction at source invalidates some of those fields (UUID zeroed, UDID zeroed, etc.). The backend needs a non-redacted source for session-bound fields. That source is either:
- The tunneld JSON response (authoritative for UDID)
- A fresh random value per session (legitimate for UUID, which is session-scoped by protocol definition)
- The actual services that an iosmux-backend-owned tunneld replacement knows how to speak (future D.7 scope)
What's open after this
- Confirm which of H1 / H2 / H3 / H4 is load-bearing (one dispatch-bisect iteration per hypothesis, same cadence as iter-1 through iter-4)
- Once the 30 s timeout goes away, the original D.6.0 goal becomes tractable: observe what CDS sends once it accepts our handshake
Artifacts
Under iter-06-pair-trigger/:
- verbose-session.log: 8,936 B, 116 lines total across 3 sessions. All handshake-only; no post-handshake frames. Pre-scan confirmed zero identifiers leaked; no redaction applied (file stored as produced).
- iosmux-d6.pcap: 56,099 B loopback capture across the three sessions. Scanned for UDID/GUID/hostname patterns: 0 matches. Safe to commit as-is per iter-02/03/04 policy.
How to reproduce
On havoc (assumes iter-4-era SPIKE tunneld still running):
# Cross-compile + deploy (with verbose)
GOOS=darwin GOARCH=amd64 CGO_ENABLED=0 go build \
-o /tmp/iosmux-backend-darwin ./cmd/iosmux-backend/
scp /tmp/iosmux-backend-darwin havoc:/tmp/iosmux-backend
ssh havoc 'chmod +x /tmp/iosmux-backend'
# Run verbose backend
ssh havoc 'nohup env IOSMUX_BACKEND_VERBOSE=1 \
/tmp/iosmux-backend -listen [::1]:34719 \
> /tmp/iosmux-backend.log 2>&1 &'
# Trigger CDS
ssh havoc-root 'killall CoreDeviceService'
ssh havoc 'devicectl device info details --device <UDID>'
# WAIT — this is the critical bit iter-1-5 missed.
# Do not pkill the backend for at least 60 seconds.
sleep 60
# Then collect
ssh havoc 'pkill -f iosmux-backend'
scp havoc:/tmp/iosmux-backend.log /local/path/
The session log should contain at least one peer closed (EOF)
entry with a delta of 20-40 s from the corresponding
accepted: line. That's the reproducible timeout signature.
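The signature check can be automated. The sketch below assumes the log lines have exactly the shape shown in the session 1 excerpt (`[HH:MM:SS] accepted: ... session=N` and `[HH:MM:SS] session=N read loop: peer closed (EOF)`); the function name `eofDeltas` is hypothetical, and a run that crosses midnight is not handled.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
	"time"
)

var (
	// Patterns modelled on the verbose excerpt; adjust if the logger changes.
	acceptRe = regexp.MustCompile(`^\[(\d{2}:\d{2}:\d{2})\] accepted: .*session=(\d+)`)
	eofRe    = regexp.MustCompile(`^\[(\d{2}:\d{2}:\d{2})\] session=(\d+) read loop: peer closed \(EOF\)`)
)

// eofDeltas maps each session id to the time between its "accepted:" line
// and its "peer closed (EOF)" line. Sessions without both lines are skipped.
func eofDeltas(log string) (map[string]time.Duration, error) {
	accepted := map[string]time.Time{}
	deltas := map[string]time.Duration{}
	sc := bufio.NewScanner(strings.NewReader(log))
	for sc.Scan() {
		line := sc.Text()
		if m := acceptRe.FindStringSubmatch(line); m != nil {
			t, err := time.Parse("15:04:05", m[1])
			if err != nil {
				return nil, err
			}
			accepted[m[2]] = t
		} else if m := eofRe.FindStringSubmatch(line); m != nil {
			t, err := time.Parse("15:04:05", m[1])
			if err != nil {
				return nil, err
			}
			if start, ok := accepted[m[2]]; ok {
				deltas[m[2]] = t.Sub(start)
			}
		}
	}
	return deltas, sc.Err()
}

func main() {
	sample := "[02:13:51] accepted: remote=[::1]:54675 session=1\n" +
		"[02:14:27] session=1 read loop: peer closed (EOF)\n"
	d, err := eofDeltas(sample)
	if err != nil {
		panic(err)
	}
	for id, delta := range d {
		inWindow := delta >= 20*time.Second && delta <= 40*time.Second
		fmt.Printf("session=%s delta=%s in-window=%v\n", id, delta, inWindow)
	}
	// prints "session=1 delta=36s in-window=true"
}
```

Fed the collected iosmux-backend.log, a session whose delta falls in the 20-40 s window matches the timeout signature described above; a session with no EOF entry at all simply does not appear in the result.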