Phase 2: Wrapper Read Overflow Fix — Test Results

Date: 2026-04-13 (session 7)
Commit tested: 59ca5cd — Fix wrapper read overflow at sdr+104 + research synthesis
Status: Fix works as intended but surfaced a NEW issue (CDS+0xB896 now blocks check-in)

Goal of this test

Deploy ONLY the wrapper read-overflow fix (without touching CDS+0xB896 or any other hook) and observe behavior. The hypothesis was that some of the crashes previously attributed to action dispatch were actually caused by g_hook_wrapper being a garbage pointer read from sdr+104 (16 bytes beyond the 88-byte SDR object).
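
The overflow comes down to reading a pointer-sized field at an offset past the end of the object. The following is a minimal sketch of a bounds-checked read, assuming the 88-byte object size and the offset 104 stated above. `SDR_OBJECT_SIZE` and `read_u64_checked` are hypothetical names for illustration and do not come from iosmux_inject.m.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: object size taken from the text above (88 bytes). */
#define SDR_OBJECT_SIZE 88u

/* Hypothetical helper: copy an 8-byte field out of an object only if the
 * whole field lies inside the object. Returns false on overflow. */
static bool read_u64_checked(const void *obj, size_t obj_size,
                             size_t offset, uint64_t *out)
{
    assert(obj != NULL && out != NULL);
    if (offset > obj_size || obj_size - offset < sizeof(uint64_t)) {
        return false;  /* field would extend past the end of the object */
    }
    memcpy(out, (const uint8_t *)obj + offset, sizeof(uint64_t));
    return true;
}
```

With these assumptions, a read at offset 104 is rejected, while a read at offset 80 (the last 8 bytes of the object) is accepted. A check of this kind would have turned the silent garbage read into an explicit failure.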

Deployment

scp iosmux_inject.m havoc:~/iosmux/inject/
ssh havoc "cd ~/iosmux/inject && make clean && make"
ssh havoc-root "cp .../iosmux_inject.dylib /Library/Developer/CoreDevice/..."

Build succeeded. dylib signed ad-hoc. Deployed to /Library/Developer/CoreDevice/iosmux_inject.dylib with timestamp 2026-04-12 18:41.

Observed behavior

What works (new, positive)

CDS process starts on devicectl trigger (launchd spawns)
Our inject constructor runs, full init sequence completes
OS_remote_device created with real RSD connection
RSDDeviceWrapper init returned: 0x600001448180 — success
g_hook_wrapper = 0x600001448180 — real valid pointer, not garbage
Wrapper for CDS helper hook: 0x600001448180 (set in Step 9b) — Step 12 no longer reads from sdr+104; takes the value directly from Step 9b's local variable
All hooks install without errors
"PoC registration complete. Device visible in Xcode." is logged
CDS stays alive across repeated devicectl calls (PID 1301 stable, exit code 0)
No crash reports for CoreDeviceService
62 services parsed from RSD Handshake (was 74 pre-iOS update — iPhone now on iOS 26.4.1)

What doesn't work (new issue)

devicectl list devices times out with XPCError 1000: connection interrupted after ~15 seconds:

devicectl[1342:7112] DeviceManager sending check-in request: 3D152F75-...
(15 seconds of silence)
devicectl[1342:7113] Timed out waiting for CoreDeviceService to fully initialize
CoreDeviceService[1301:7118] [peer] invalidated because client process exited

System log analysis:

devicectl connects to CDS XPC service — peer connection established
devicectl sends DeviceManagerCheckInRequest { identifier: <uuid> }
CDS does not log any response
No "Client connected: ... Handling DeviceManagerCheckInRequest" message
No "Published DeviceManagerCheckInCompleteEvent" message
devicectl times out after 15 seconds and disconnects
CDS sees the disconnect and logs the invalidation — but itself stays alive

Interpretation

The check-in request reaches the CDS XPC layer but does NOT reach the Mercury XPCMessageDispatcher that would normally produce the "Handling DeviceManagerCheckInRequest" log line. Something in our hook set appears to be swallowing it silently.

Leading hypothesis: CDS+0xB896 hook blocks check-in

Prior research (docs/research/action-interception-full-picture.md, E+F section) identified CDS+0xB896 as the single callsite for CoreDeviceUtilities.invoke(anyOf:usingContentsOf:). The current hook is mov al, 1; ret — return true unconditionally, never call the original function.

If DeviceManagerCheckInRequest dispatch also goes through this same invoke(anyOf:usingContentsOf:) path, then our hook intercepts it and returns "handled" without doing any work, and the Swift async continuation is left dangling. devicectl never gets a reply.

This matches the observed symptom: check-in request reaches CDS but never gets an async reply, so devicectl times out.
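
The hypothesized failure mode can be illustrated with a toy model. This is a sketch under stated assumptions, not a description of how Mercury or CDS actually dispatch: `reply_ctx`, `dispatch`, `original_invoke`, and `stub_invoke` are all hypothetical, and the rule that the dispatcher replies on its own only when the handler reports "not handled" is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a pending reply (the async continuation). */
typedef struct {
    bool replied;
} reply_ctx;

typedef bool (*invoke_fn)(reply_ctx *ctx);

/* Models the original handler: does the work and completes the reply. */
static bool original_invoke(reply_ctx *ctx)
{
    assert(ctx != NULL);
    ctx->replied = true;
    return true;
}

/* Models `mov al, 1; ret`: claims success and never touches the reply. */
static bool stub_invoke(reply_ctx *ctx)
{
    (void)ctx;
    return true;
}

/* Models a dispatcher (assumption): it sends an error reply itself only
 * when the handler reports "not handled". Returns whether the client
 * received any reply at all. */
static bool dispatch(invoke_fn invoke)
{
    reply_ctx ctx = { .replied = false };
    if (!invoke(&ctx)) {
        ctx.replied = true;  /* fallback error reply */
    }
    return ctx.replied;
}
```

In this model `dispatch(original_invoke)` yields a reply, while `dispatch(stub_invoke)` yields none: the stub reports success, so the fallback never fires, and nobody completes the reply. That is the shape of the symptom seen from devicectl's side.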

Why did this not affect earlier sessions?

Before the wrapper fix, g_hook_wrapper was a garbage pointer from the heap. CDS would crash somewhere in the flow, often because the CDS+0xCCB0 hook returned that garbage to CDS, which then dereferenced it. The crash happened BEFORE check-in dispatch could reach invoke(anyOf:usingContentsOf:), so the CDS+0xB896 hook never became the blocker.

Now that the wrapper is valid, CDS does not crash, and check-in dispatch reaches invoke(anyOf:usingContentsOf:) — where CDS+0xB896 swallows it.

In other words: the wrapper bug was masking the CDS+0xB896 bug. Fixing one exposes the other. This is exactly what E+F research predicted:

The return true path is the very bug that causes "error communicating" — it short-circuits the Swift async continuation and leaves Mercury holding an uncompleted reply context. Keeping it as a safety net makes things WORSE, not better, because the leak only matters for messages that actually reached Mercury.

What this means about previous "working" sessions

The sessions where we believed "device visible in Xcode as connected" may have been showing CACHED state from devicectl/Xcode, or the success window was very narrow between CDS startup and the crash. We do not have strong evidence that check-in ever actually succeeded through our injected CDS — we have log lines from the inject showing successful construction, but no system log from CDS showing a successful check-in response.

Need to verify against system log history from earlier sessions (may not be available after reboots).

Implications for trust in earlier findings

We found ONE fatal latent bug (wrapper read overflow) that hid for multiple sessions. It caused silent garbage-pointer returns to CDS via our hooks. Nothing in our test output or inject logs ever flagged it.

This means any other silent memory/ABI bugs in our inject code could be hiding behind crashes attributed to other causes. A systematic code audit is warranted before continuing to build on the current implementation.

See code-audit-findings.md (to be written after audit research completes) for follow-up.

Next minimal test

Replace the CDS+0xB896 hook's current mov al, 1; ret (return true) with one of three options:

Remove the hook entirely (restore original callq)
Passthrough trampoline (movabs r10, <orig_invoke>; jmp r10)
Conditional: only intercept if XPC event has our device UUID

All three predict successful check-in. Predictions differ on what happens when Xcode Pair is clicked:

Option	Expected Pair click behavior
Remove hook	Original invoke() runs — may crash as before (but on valid wrapper now)
Passthrough trampoline	Same as remove — original invoke() runs
Conditional intercept	Our device UUID → swallowed (may still break async); others → passthrough
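
The conditional option can be sketched as a wrapper that compares a 16-byte UUID before deciding whether to call through. This is only an illustration: whether the XPC event at this callsite exposes a device UUID in this form is an assumption, and `conditional_invoke`, `counting_original`, and `g_original_calls` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef bool (*invoke_fn)(const uint8_t event_uuid[16]);

/* Hypothetical stand-in for the original invoke(); counts its calls. */
static int g_original_calls = 0;
static bool counting_original(const uint8_t event_uuid[16])
{
    (void)event_uuid;
    g_original_calls++;
    return true;
}

/* Swallow the event only when it carries our device UUID; otherwise
 * pass it through to the original function unchanged. */
static bool conditional_invoke(const uint8_t event_uuid[16],
                               const uint8_t our_uuid[16],
                               invoke_fn original)
{
    assert(event_uuid != NULL && our_uuid != NULL && original != NULL);
    if (memcmp(event_uuid, our_uuid, 16) == 0) {
        return true;  /* intercepted: original is never called */
    }
    return original(event_uuid);
}
```

Note that the intercepted branch still returns without completing any reply, which is why the table marks it as possibly still breaking the async path for our device.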

The passthrough trampoline is the recommended option from E+F research — it preserves the hook infrastructure for future evolution while making it benign.
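
For reference, the trampoline body movabs r10, <orig_invoke>; jmp r10 encodes to 13 bytes on x86-64. The sketch below only builds those bytes in a caller-supplied buffer. `encode_passthrough_trampoline` is a hypothetical helper name, and where the stub lives and how it is made executable are outside the scope of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TRAMPOLINE_SIZE 13u

/* Hypothetical helper: emit `movabs r10, target; jmp r10` into out.
 * Returns the number of bytes written. */
static size_t encode_passthrough_trampoline(uint8_t out[TRAMPOLINE_SIZE],
                                            uint64_t target)
{
    assert(out != NULL);
    out[0] = 0x49;  /* REX.W + REX.B: 64-bit operand, register r8..r15 */
    out[1] = 0xBA;  /* B8+rd with rd = 2 (low bits of r10): mov r10, imm64 */
    for (size_t i = 0; i < 8; i++) {
        out[2 + i] = (uint8_t)(target >> (8 * i));  /* little-endian imm64 */
    }
    out[10] = 0x41; /* REX.B */
    out[11] = 0xFF; /* FF /4: jmp r/m64 */
    out[12] = 0xE2; /* ModRM: mod=11, reg=/4, rm=2 (r10 with REX.B) */
    return TRAMPOLINE_SIZE;
}
```

Because the stub ends in a jmp rather than a call, the original function returns directly to the hooked callsite, so arguments and the return value pass through untouched. r10 is a caller-saved scratch register in the System V AMD64 ABI; whether it is free to clobber under the Swift calling convention used at this particular callsite has not been verified here.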