Forward Roadmap: Consolidated Plan
Date: 2026-04-13 (session 7)
Supersedes: plan-next-steps.md (which is now a historical snapshot of
workstream progress through session 6)
This document consolidates the findings from:
- Phase 2 test results (wrapper fix deployed, check-in now hangs): phase2-wrapper-fix-test-results.md
- Behavior analysis (CDS+0xB896 confirmed as the hang cause): phase2-behavior-analysis.md
- Code audit (16 more bugs found of various severity): code-audit-findings.md
- Action interception strategy (Mercury Codable vs flat dict): action-interception-full-picture.md
Current state of the repo
Confirmed working
- Device registration pipeline: inject builds, dyld loads our dylib, all hooks install without errors
- RSD connection to iPhone via pymobiledevice3 tunnel (62 services in Handshake on iOS 26.4.1)
- `OS_remote_device` construction and attachment to SDR via `handleDiscoveredSDR`
- `RSDDeviceWrapper` init completes and returns a real pointer
- DYLD_INTERPOSE on 7 shared-cache functions (linked against RemoteServiceDiscovery.framework and RemoteXPC.framework)
- CDS stays alive across repeated `devicectl` calls after the wrapper fix
Confirmed broken (verified)
- CDS+0xB896 hook (`mov al, 1; ret`) swallows every incoming XPC message via the `invoke(anyOf:)` prefilter, including `DeviceManagerCheckInRequest`. `devicectl list devices` times out with "connection interrupted" after 15s.
- Wrapper read overflow at `sdr+104`: FIXED in commit 59ca5cd. `g_hook_wrapper` was being read from 16 bytes past the end of an 88-byte SDR object. The fix holds; `g_hook_wrapper` is now a valid pointer.
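The overflow class above (a read at offset 104 into an 88-byte object) is the kind of mistake a size-checked accessor can reject up front. The following is a minimal sketch, not the actual fix from commit 59ca5cd: `iosmux_checked_read` is a hypothetical helper name, and it assumes the caller knows the real object size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of the existing inject code.
 * Copies `len` bytes from `obj + offset` into `out` only if the whole
 * range lies inside an object of `obj_size` bytes. */
static bool iosmux_checked_read(const void *obj, size_t obj_size,
                                size_t offset, size_t len, void *out)
{
    assert(obj != NULL && out != NULL);
    /* Compare against the remaining space so offset + len cannot wrap. */
    if (offset > obj_size || len > obj_size - offset)
        return false;
    memcpy(out, (const uint8_t *)obj + offset, len);
    return true;
}
```

With an 88-byte object, a pointer-sized read at offset 80 succeeds and the same read at offset 104 is refused instead of returning whatever happens to follow the allocation.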
Suspected broken (code audit, not yet runtime-verified)
Three bugs of the exact same class as the wrapper read overflow (code audit
C1, H1, H6) plus several likely Swift ABI mistakes (H3, H4). All currently
"work" by accident. See code-audit-findings.md for full details.
The broader concern: if one latent memory bug hid for multiple sessions, the other listed bugs could bite us any time Apple ships an update or we reorder a hook.
Trust posture
We cannot trust the inject code at face value until we eliminate at least the CRITICAL audit findings. Every further test is noise until we do this — each new experiment could be blocked by a bug we already know exists but haven't fixed.
The approach going forward must be:
- Fix CRITICAL bugs first — they are cheap and their presence invalidates any test result
- Then make the minimal intended change (remove Step 19)
- Then test
- Then one step at a time
Stage 0: Safety belt (fix CRITICAL audit findings first)
Effort: ~1 hour of mechanical fixes, no research needed.
| # | Fix | File:Line |
|---|---|---|
| S0.1 | C2: compute `orig_target` from the rel32 of the `callq` instead of the hardcoded `cds_base + 0xB0ECE` | iosmux_inject.m:1298 |
| S0.2 | C1: replace the immediate bake of `g_hook_wrapper` with a `mov rax, [rip+global]` pattern. Assert `g_hook_wrapper != NULL` before installing hook pages. | iosmux_inject.m:1207–1213, 1339–1345 |
| S0.3 | C4: wrap `iosmux_hook_with_wrapper` and `iosmux_with_wrapper_replacement` in `#if 0` with an explanatory comment | iosmux_inject.m:498–524 |
| S0.4 | C5: add `"r12","r13","rbx"` to the clobber list on the `updateIdentifier` asm block | iosmux_inject.m:1014–1031 |
| S0.5 | C6: `Block_copy` before invoking the `connected_callback` block, `Block_release` after | iosmux_inject.m:1065–1078 |
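For S0.1, deriving the call target from the instruction bytes could look roughly like the sketch below. It assumes the patched site is a 5-byte near call (opcode `E8` followed by a little-endian signed rel32), which is what `callq` to a fixed target normally assembles to. `iosmux_call_target` is a hypothetical name, not a function in iosmux_inject.m.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper. Decodes a 5-byte x86-64 near call (E8 + rel32)
 * whose first byte lives at runtime address `insn_addr`. */
static bool iosmux_call_target(const uint8_t *insn, uint64_t insn_addr,
                               uint64_t *target)
{
    assert(insn != NULL && target != NULL);
    if (insn[0] != 0xE8)
        return false;               /* not `call rel32`: refuse to guess */

    /* Assemble the little-endian displacement byte by byte. */
    uint32_t raw = (uint32_t)insn[1]
                 | (uint32_t)insn[2] << 8
                 | (uint32_t)insn[3] << 16
                 | (uint32_t)insn[4] << 24;
    int32_t rel;
    memcpy(&rel, &raw, sizeof rel); /* reinterpret as signed */

    /* The displacement is relative to the next instruction. */
    *target = insn_addr + 5 + (uint64_t)(int64_t)rel;
    return true;
}
```

Returning `false` on an unexpected opcode means a binary update that moves or rewrites the call site fails loudly at install time rather than silently jumping to a stale offset.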
Gate: after Stage 0, verify the build still works and inject still completes init successfully. No behavior change expected — these are defensive fixes only.
Stage 1: Unblock check-in (the intended next test)
Effort: 5 minutes to disable + test.
S1.1 Disable Step 19 (CDS+0xB896 hook) entirely
Simplest option: wrap the whole Step 19 block in if (0) with an explanatory
comment pointing at phase2-behavior-analysis.md. Functionally equivalent to
a passthrough trampoline but with zero extra machinery.
S1.2 Validate check-in
Tests in order:
- Build + deploy dylib
- `killall CoreDeviceService` to force relaunch on next client
- Open a CDS system log stream in one terminal: `log stream --predicate 'process == "CoreDeviceService"' --style compact --info`
- Run `devicectl list devices` in another terminal
- Verify within 2 seconds:
  - devicectl returns successfully with device listed
  - CDS log shows `Client connected: [pid] (no name). Handling DeviceManagerCheckInRequest`
  - CDS log shows `Published DeviceManagerCheckInCompleteEvent`
- Run `devicectl list devices` several more times to confirm stability
S1.3 Stage 1 exit criteria
All three positive checks pass. No SIGSEGV. No timeout. If any fail, fall through to Stage 1 fallback.
S1.4 Stage 1 fallback
If check-in STILL fails after Stage 1:
- Look at the CDS log to determine WHERE it's blocking:
  - Log shows `Handling DeviceManagerCheckInRequest` → blocker is downstream (typed handler, ServiceDeviceManager, publish path). Next suspect: `handleDiscoveredSDR` side effects on our injected SDR.
  - Log does NOT show `Handling DeviceManagerCheckInRequest` → blocker is still in the pre-dispatch path. Next suspects:
    - DYLD_INTERPOSE on one of the 7 functions interfering with initialisation
    - CDS+0xCCB0 or CDS+0x5E2D0 hook firing in unexpected context
    - One of the Stage 0 fixes introduced a new bug
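The triage above reduces to two substring checks on the captured CDS log. As an illustration only, a small classifier might look like the following; the enum and function names are hypothetical, and it assumes the log text has been captured to a string that covers exactly one `devicectl list devices` attempt.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names; the two marker strings are the CDS log lines
 * this plan expects to see during a successful check-in. */
typedef enum {
    CHECKIN_OK,                   /* request handled and completion published */
    CHECKIN_BLOCKED_DOWNSTREAM,   /* request seen, completion never published */
    CHECKIN_BLOCKED_PREDISPATCH   /* request never reached the handler */
} checkin_state;

static checkin_state iosmux_classify_checkin(const char *log_text)
{
    assert(log_text != NULL);
    if (strstr(log_text, "Published DeviceManagerCheckInCompleteEvent"))
        return CHECKIN_OK;
    if (strstr(log_text, "Handling DeviceManagerCheckInRequest"))
        return CHECKIN_BLOCKED_DOWNSTREAM;
    return CHECKIN_BLOCKED_PREDISPATCH;
}
```

A `CHECKIN_BLOCKED_PREDISPATCH` result points at the interposes, the remaining CDS hooks, or a Stage 0 regression. `CHECKIN_BLOCKED_DOWNSTREAM` points at the typed handler and publish path.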
Stage 2: Xcode visibility smoke test
Effort: 5 minutes to observe.
- With Stage 1 verified, open Xcode → Window → Devices and Simulators
- Observe whether the device appears in the sidebar
- Capture which fields Xcode shows (name, OS version, busy/paired state)
- Do NOT click Pair — that's a separate gated test
- Capture Xcode's XPC traffic to CDS via the upcoming Stage 3 logging interpose
Stage 2 exit criteria
Device row appears in Xcode. Its state may be "connecting" or similar — that's OK at this point. The goal is just to confirm Xcode sees it.
Stage 3: Logging-only Mercury interpose
Effort: 2-4 hours including capture + analysis.
S3.1 Add a pure observational hook
Interpose xpc_connection_send_message and xpc_connection_send_message_with_reply
(not set_event_handler — we want to capture both directions). Filter the
incoming peer connection name to "com.apple.CoreDevice.CoreDeviceService".
For matching messages:
- `xpc_copy_description(msg)` → write to `/tmp/iosmux_mercury.log`
- Also attempt to introspect the `mangledTypeName` value and log it separately
- Pass through unchanged: absolutely no behavior change
S3.2 Capture the traffic we care about
With the interpose active:
- `devicectl list devices`: captures `DeviceManagerCheckInRequest`, `ProvisioningProvidersListRequest`, the `CompleteEvent`, etc.
- Open Xcode Devices window: captures whatever Xcode's device-manager does on initial connect
- (Without clicking Pair) look for any `acquireusageassertion` traffic; this might be triggered by just opening the window
S3.3 Build the envelope catalog
Deliverable: docs/research/mercury-envelope-catalog.md with actual
xpc_copy_description output for each captured message. For each envelope:
- Identify the top-level keys
- Identify the `mangledTypeName` value
- Extract the Codable payload shape (as much as we can see)
This replaces all our current guesses about the wire format.
Stage 3 exit criteria
At least the following envelopes captured in full:
- DeviceManagerCheckInRequest + CompleteEvent
- ProvisioningProvidersListRequest (the "Ignoring" message)
- acquireusageassertion request + reply (if Xcode sends one)
- Any Pair-flow messages Xcode sends before we click the button
Stage 4: Targeted acquireusageassertion interceptor
Effort: 4-8 hours depending on envelope complexity.
Primary target: acquireusageassertion. If Xcode gets a successful reply, its
client-side _shadowUseAssertion is set, hasConnection=true, and the Pair
button disappears entirely. We then never need to handle PairAction — which
is good because PairAction has a ChallengeAnswer sub-protocol we can't
currently implement.
Approach selection (gated on Stage 3 output)
Decide based on envelope complexity:
Option A — XPC-level Mercury Codable faker
Interpose xpc_connection_set_event_handler (per research A+B), wrap the
handler for com.apple.CoreDevice.CoreDeviceService connections. For messages
with a mangledTypeName matching CoreDevice.AcquireBUsageAssertionActionDeclaration
(or similar — TBD from Stage 3), forge a reply dict by cloning a known-good
reply's byte layout and substituting UUIDs. Send via
xpc_connection_send_message on the reply connection obtained from
xpc_dictionary_get_remote_connection(reply).
Pros: doesn't need Swift async ABI work. Cons: requires mimicking exact Codable byte layout.
Option B — Swift-level ActionImplementation.invoke() hook
Hook the specific AcquireBUsageAssertionActionDeclaration.invoke() function
(find offset in CoreDeviceUtilities, patch callsites in CDS binary). At this
level, the action is already decoded into a typed Swift object and we interact
with a strongly-typed continuation.
Pros: no Codable wire format work. Cons: Swift async ABI, need to synthesize and resume a continuation correctly from C code.
S4.1 Prototype + deploy
Implement chosen option, deploy, test that:
- CDS still stays alive
- `devicectl list devices` still works
- Xcode Devices window now shows device in a different (more "connected") state
- The Pair button disappears from Xcode Devices UI
- `_shadowUseAssertion` is set on the Xcode side (observable via DVTCoreDeviceCore logging or lldb attach to Xcode)
Stage 4 exit criteria
Pair button no longer shown. Xcode treats the device as usable. No crashes.
Stage 5: DeviceInfo completeness and service wiring
Effort: multiple days.
Now that Xcode considers the device usable, fill in the holes:
S5.1 DeviceInfo fields
Set the 11 DeviceInfo fields listed in pair-button-and-cfnetwork.md (UNVERIFIED
list — set one at a time and observe Xcode behavior):
- transportType, platform, deviceType, reality, osVersion, osBuild, udid, authenticationType, developerModeStatus, bootState, isMobileDeviceOnly
Source values from the RSD Handshake Properties dict we already parse.
S5.2 Service routing verification
Ensure that when Xcode asks CDS to open a service socket (e.g. for
com.apple.streaming_zip_conduit.shim.remote or
com.apple.internal.dt.remote.debugproxy), our existing DYLD_INTERPOSE on
remote_service_create_connected_socket actually gets invoked and returns a
working TCP socket to the tunnel.
Test: try to install a simple app from Xcode. Even if it doesn't complete, we should see the service connection reach iPhone via the tunnel.
S5.3 Forward vs handle action routing
Come back to the list from Stage 4 research and actually implement forwarding for the 8 "forward" actions (createservicesocket, appinstall, transferfiles, etc.). Each needs its own Mercury reply construction if Option A was chosen in Stage 4, or its own invoke() hook if Option B.
Stage 6: Integration verification
End-to-end:
- Install a test app from Xcode to the "remote" iPhone
- Attach debugger from Xcode
- Set a breakpoint, run the app, hit the breakpoint
- Stop cleanly
This is the ultimate goal. If we hit it, we have working Xcode integration.
Risk register
| Risk | Impact | Mitigation |
|---|---|---|
| Stage 0 fix introduces a new bug | Stage 1 test is invalid | Test Stage 0 output by running inject once, checking log for successful registration. No behavior change expected. |
| Stage 1 removal of CDS+0xB896 hook re-crashes on Pair | Pair path regresses to Session 5 state | Stage 1 does NOT click Pair. That's for later. Current state is non-functional anyway. |
| Stage 3 interpose misses something | Envelope catalog incomplete | Can re-run with more capture. Observational only, no side effects. |
| Stage 4 Mercury Codable forging fails | Stuck on Pair button | Fall back to Option B (Swift invoke hook) |
| Swift async continuation ABI is too hard | Option B fails | Consider DVTCoreDeviceCore-side interpose on Xcode process (last resort) |
Deferred items
Medium-severity audit findings (M1-M10) are deferred until we reach Stage 5. They're unlikely to bite in the current Stage 1-4 path but should be fixed before productisation.
Other deferred:
- `create_service_endpoint` 128-listener array fix
- Wire decoder input validation
- Go relay blocking HTTP calls (Go relay is deprecated anyway)
Stopping rule
At any stage, if reality diverges from prediction:
- STOP. Do not layer more changes on unknown state.
- Run a minimal diagnostic (log stream, lldb attach, etc.)
- Document finding in a new research doc
- Update this roadmap if necessary
- Only then resume
The wrapper bug is a direct consequence of violating this rule previously.