AUA history — D24..D35a empirical localisation era (2026-04-30..2026-05-01)¶
Status: verified — historical archive
D24 (post-D23 dynamic re-probe — common funnel) + D30 (AFM peer
keep-alive deploy — peer lifetime is not the gate) + D31 (lldb v2
— peer[1013] race, SUCCESS path observed) + D32 (lldb to live CDS
— kernel-driven invalidation) + D33 (devicectl injection
feasibility) + D34/D35a (Heisenbug discovery). Archived from
aua-side-channel-mechanism.md during the 2026-05-02 anchor refactor.
D24 update: common funnel finding (post-D23 deploy, 2026-04-30 evening)¶
D24 dynamic re-probe with apparatus already in post-D23 state (P-1a NOP-evict deployed). Goal: localise residual errorCode 3 that survives D23. Three pre-registered hypotheses (W1 secondary-validation, W2 separate-cache, W3 analytics-gate) — D24 falsified all three; named the actual mechanism W4.
W4 — assertionq dispatcher is the common funnel; side-channel peer is the second input¶
Empirical (notes/d24 §F.1 stack walk, §L unified-log correlation):
- D23 P-1a NOP-evict still effective: bp4 (the D23-NOP'd `removeCachedXPCConnection` symbol) NEVER fired during the D24 trigger run. Action-XPC connection eviction path is fully suppressed.
- XPCError 1001 IS BACK, but on `peer[2019]` of a SIDE-CHANNEL XPC connection (different connection from the action XPC). Mercury's `XPCSideChannel`-installed event handler delivers it (unified.log line 47, t=14:34:57.171). Verbatim: `[com.apple.dt.coredevice:useassertion] Recieved error from side channel peer: XPCError(errorCode: 1001, "The connection was invalidated.")`
- CD's [useassertion] subsystem subscribes to the Mercury event and posts a block to `com.apple.dt.coredevice.remotedevice.default.assertionq`.
- The assertionq dispatcher runs `___30eb0` AGAIN, but at a DIFFERENT internal call site this time:
  - D21 (cache-eviction input): `___30eb0+2034` → `___324a0+98` → `swift_continuation_throwingResume`
  - D24 (side-channel input): `___30eb0+1910` → `___324a0+56` → `swift_continuation_throwingResumeWithError`
  - Same wrapper functions, different internal RetPC offsets: this proves the dispatcher chain is a SHARED FUNNEL with multiple input paths.
- errorCode 3 user-visible at unified.log line 50 (t=14:34:57.285): `Failed to acquire usage assertion ... CoreDeviceError(errorCode: 3, "Failed to acquire assertion")`. Literal string `Failed to acquire assertion` lives in CoreDevice `__cstring` at file offset 0x369860 (notes/d24 §J.1).
D24 fix vector ranking (notes/d24 §N)¶
Q-1a wins for the same reason D23 P-1a won: hook the FUNNEL, not the inputs. The dispatcher receives errors from at least two upstream paths (cache eviction + side-channel teardown); patching the funnel addresses both with one hook. Expected D25 implementation pattern: NOP-style or wrap-callback patch at CD+0x31621 (= ___30eb0+1905, the call instruction whose return PC is the captured frame-3 RetPC CD+0x31626).
If D25 Q-1a deploys cleanly and works, D23 P-1a NOP-evict on removeCachedXPCConnection becomes redundant (funnel hook already covers cache-eviction input). Decision to revert/retain D23 deferred to post-D25 acceptance test.
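The offset arithmetic behind the proposed hook site can be checked mechanically. The following is a minimal sketch, assuming the D24 offsets are decimal (as the D31 table later reconciles) and that the call instruction is 5 bytes long, which is what the 1905/1910 pair implies. The variable names are illustrative only and do not come from the project source.

```python
# All values are file offsets relative to the CoreDevice image ("CD+...").
FUNC_ENTRY = 0x30EB0       # ___lldb_unnamed_symbol_30eb0
RET_PC_DECIMAL = 1910      # D24 frame-3 RetPC offset, reported in decimal
CALL_LEN = 5               # assumption: 5-byte call instruction

ret_pc = FUNC_ENTRY + RET_PC_DECIMAL   # address the call returns to
call_site = ret_pc - CALL_LEN          # address of the call instruction itself

print(hex(ret_pc))                 # 0x31626
print(hex(call_site))              # 0x31621
print(call_site - FUNC_ENTRY)      # 1905
```

The result matches the figures quoted above: the captured RetPC is CD+0x31626 and the candidate patch site is CD+0x31621.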
W1/W2/W3 falsifications (notes/d24 §M)¶
- W1 (post-success validation on reply payload): rejected. Action XPC reply decodes as `success()` with no error key (notes/d24 §I.2). Construction of the error happens AFTER the reply-decode path completes.
- W2 (separate cache or registry): rejected. Regex bps for `UsageAssertionCache`, `checkAssertion`, `validateAssertion`, `registerAssertion` matched ZERO symbols in CoreDevice (notes/d24 §C). No second cache.
- W3 (analytics-side gate): rejected. Analytics emit at unified.log lines 44-45 fires BETWEEN the side-channel error (.171) and the user-visible error (.285); analytics is a passive observer, not a gate.
Apparatus integrity (D24 also non-destructive)¶
iosmux_inject.dylib sha256 still 52df2cc643dbc6c24d3debd6eba8682ded9fd8cf5d1f9ee3c57962e0171b4803 (D23 v2). CoreDevice sha256 unchanged. iPhone (iosmux) connected (no DDI). Zero CDS/devicectl crashes during D24 probe. D23 hook still installed at runtime (verified by inject-log lines 562-564 in notes/d24 §A).
D30 update: AFM peer keep-alive — FAILED (2026-04-30 evening)¶
D30 was the first empirical test of "Why GAMBIT's AFM endpoint emit alone does not satisfy V2" §"AFM is emitted over an outbound connection iosmux creates with xpc_connection_create_from_endpoint. Apple expects a persistent inbound peer connection FROM CDS to live for the assertion's lifetime." Specifically D30 tested only the lifetime half of that statement: keep the peer alive past success() reply and observe whether errorCode 3 still fires.
Hypothesis¶
Explicit xpc_connection_cancel + xpc_release of the AFM peer connection (the send_barrier { cancel } ; release pattern introduced by commit 28d7d89) makes Apple's AUA wrapper at CD+0x31750 observe XPCError(connectionInvalid, 1001) on the same Mercury.XPCSideChannel-installed event handler that delivers AFM, and InProgressClientAssertion.state transitions to .invalidated(error) instead of .fulfilled — continuation resumes with .failure → CoreDeviceError(errorCode: 3) thrown. Note: D20 iter-3 (c49429b, reverted) had previously tested "no peer at all" (suppressed AFM emit entirely) → identical 9-13 ms latency to errorCode 3. D30 is the genuinely-untested case "open + send + keep alive forever".
Fix candidate (deployed)¶
Three edits to inject/iosmux_gambit.m:
- Globals at top of TU (after LOG macro): `static NSMutableArray<id> *g_afm_retained_peers = nil; static NSLock *g_afm_retained_lock = nil;`
- Replace `gambit_emit_afm_via_endpoint` cancel-after-send block: instead of `xpc_connection_send_barrier(ep_conn, ^{ xpc_connection_cancel(ep_conn); }); xpc_release(ep_conn);`, append `(__bridge id)ep_conn` into `g_afm_retained_peers` under `g_afm_retained_lock`. ARC keeps refcount > 0 for the lifetime of the CDS process. No xpc_release.
- Init lock + array at top of `gambit_install_hook` (the constructor entry called from `iosmux_inject_init`).
Full source diffs preserved in commit history of inject/iosmux_gambit.m (subsequently reverted at distillation time).
Apparatus state at deploy:
- New build sha: `6c9b07bad7ac383c8a66d4660a1e04e41411608ff65286cce0eb808afde89f49` (D23 v2 was `52df2cc6...4803`).
- Size delta: 193680 − 193664 = +16 bytes. Tiny because `NSLock` and `NSMutableArray` are existing Foundation classes (no new ObjC class metadata emitted): only a handful of `objc_msgSend` call-sites + 2 zero-init static pointer slots in `__DATA`. Build sha differs from baseline → code change landed (not cached artefact).
- CDS respawned via `killall CoreDeviceService` (PID 749 → 1940). Both hook install paths (AUA keepalive NOP at `0x10d817be0` + GAMBIT trampoline at `0x10c637f50`) succeeded with `rc=0`. ASLR slide differed from prior run (`0x10d80a000` vs `0x111765000`); install code computed new slide correctly.
- Backup of D23 v2 dylib at `/home/op/backups/iosmux/d30-pre-deploy/iosmux_inject.dylib.d23v2` (sha verified `52df2cc6...4803`).
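As a cross-check on the slide computation, the hook address logged for this run can be converted back to a file offset. This is a minimal sketch, assuming `0x10d80a000` is the CoreDevice `__TEXT` base for this run (the log above calls it the slide) and that the keepalive NOP targets `removeCachedXPCConnection` at `CD+0xdbe0`, as recorded in the D33 section. The helper name `file_offset` is hypothetical.

```python
def file_offset(runtime_addr: int, image_base: int) -> int:
    """Convert a runtime address back to an offset within its image."""
    if runtime_addr < image_base:
        raise ValueError("address lies below the image base")
    return runtime_addr - image_base

D30_BASE = 0x10D80A000      # base observed in the D30 run
PRIOR_BASE = 0x111765000    # base observed in the prior run
NOP_SITE = 0x10D817BE0      # AUA keepalive NOP address logged at install

offset = file_offset(NOP_SITE, D30_BASE)
print(hex(offset))                  # 0xdbe0
# Where the same offset would have landed under the prior run's base:
print(hex(PRIOR_BASE + offset))     # 0x111772be0
```

The recovered offset agrees with the `CD+0xdbe0` NOP-evict site, which is consistent with the install code having computed the new slide correctly.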
Verdict: FAILURE¶
Same outcome class as D23 v2 baseline. Citations from the captured 60s unified-log window (timestamps verbatim from havoc):
Citation 1 — failure is the only AUA outcome present, no Successfully acquired line anywhere:
2026-04-30 21:29:52.008620+0700 localhost devicectl[2001]:
(CoreDevice) [com.apple.dt.coredevice:useassertion]
Failed to acquire usage assertion on device
E8A190DD-64F5-44A4-8D57-28E99E316D60 due to error:
CoreDeviceError(errorCode: 3, errorUserInfo:
["NSLocalizedDescription": "Failed to acquire assertion"])
errorCode 3, identical NSLocalizedDescription string, ~10 ms after success() reply at 21:29:51.998775. Latency identical to D23-v2 baseline.
Citation 2 — GAMBIT path executed correctly: AFM emitted on the same peer the brief identifies, the keep-alive code path DID run (no cancel barrier scheduled, peer kept in g_afm_retained_peers):
[inject] GAMBIT: synthesised reply for aid=com.apple.coredevice.action.acquireusageassertion label=acquireusageassertion
[inject] GAMBIT: emitted AFM endpoint=0x7fc754107cb0 did=E8A190DD-64F5-44A4-8D57-28E99E316D60
(/tmp/iosmux-d30-inject-tail.log lines 49-50, immediately prior to AUA failure at 21:29:52.008.)
The action reply was a success() per devicectl's own log:
2026-04-30 21:29:51.998775+0700 localhost devicectl[2001]:
(CoreDeviceUtilities) [com.apple.dt.coredevice:action]
Received reply from action
(type=AcquireDeviceUsageAssertionActionDeclaration,
invocation=33377F43-05D2-4117-9C02-BF4C1EE7848C):
success()
So the synchronous action-RPC path is fine. AUA wrapper's later state-machine resolution to .fulfilled does NOT depend on the AFM peer connection's lifetime.
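The "~10 ms" latency between the `success()` reply and the errorCode 3 line can be reproduced from the two timestamps quoted above. A small sketch follows; the function name `latency_ms` is illustrative.

```python
from datetime import datetime

# Unified-log timestamp layout as it appears in the citations above.
FMT = "%Y-%m-%d %H:%M:%S.%f%z"

def latency_ms(earlier: str, later: str) -> float:
    """Milliseconds elapsed between two unified-log timestamps."""
    t0 = datetime.strptime(earlier, FMT)
    t1 = datetime.strptime(later, FMT)
    return (t1 - t0).total_seconds() * 1000.0

reply = "2026-04-30 21:29:51.998775+0700"     # success() reply
failure = "2026-04-30 21:29:52.008620+0700"   # errorCode 3 line

print(round(latency_ms(reply, failure), 3))   # 9.845
```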
Critical absence — XPCError 1001 path eliminated¶
D24's W4 evidence chain showed peer[2019] of the side-channel surfaces XPCError(errorCode: 1001, "The connection was invalidated.") on Mercury's event handler, and CD's [useassertion] subsystem subscribes to that and posts to the assertionq dispatcher. In the D30 unified-log window: zero XPCError 1001 lines, zero peer[2019] references, zero Recieved error from side channel peer strings. The 1001 surfacing was caused by the cancel-after-send pattern; D30 eliminated the cancel; the 1001 surfacing stopped. But errorCode 3 still fires.
This proves the dispatcher chain has a third uncatalogued input source (input #3) that produces Failed to acquire assertion even when (a) action XPC reply is success(), (b) cache-eviction is NOP'd (D23 P-1a), and (c) AFM peer never invalidates (D30).
What D30 ruled out¶
- Hypothesis: AFM peer cancel-after-send is what causes AUA wrapper to see connection invalidate and resolve `.invalidated(error)`. RULED OUT. With cancel removed and peer held for entire CDS process lifetime, AUA outcome is byte-identical (same errorCode 3, same NSLocalizedDescription). Peer lifetime is not the gate.
What D30 leaves on the table (open hypothesis space for D31 research probe)¶
Listed exhaustively in the D30 REOPEN admonition near the top of this doc: AFM identifier value mismatch, Mercury XPCSideChannel framing layer, AFM timing race vs success-callback, separate notification channel (darwin notify / property update), DSS field validation (monotonicIdentifier etc).
Apparatus integrity (D30 also non-destructive)¶
- iPhone (iosmux) still `connected (no DDI)`: no degradation.
- CoreDevice sha256 unchanged (`bea205e2c64622d144bcc7664ee104083d0e192aca206739cca345dc7c420495`).
- Zero new diagnostic reports in `/Library/Logs/DiagnosticReports/`: final dump's 5 entries are byte-identical to baseline (`apfsd`, `simdiskimaged`, `shutdown_stall` noise types only). No CDS / devicectl / remoted crash.
- Backup at `/home/op/backups/iosmux/d30-pre-deploy/iosmux_inject.dylib.d23v2` intact (sha `52df2cc6...4803`); D23 v2 baseline restorable via single `scp` + `killall CoreDeviceService`.
Distillation status¶
D30 raw notes (notes/d30-afm-peer-keepalive.md, ~32 KB, gitignored) deleted at distillation time per feedback_notes_are_temporary_buffer. Empirical findings, FALSIFIED hypothesis, citations, and apparatus state preserved in this section + the REOPEN admonition near the top of this doc + the Q-D66-15 D30 Resolution log entry in docs/plans/d66-research-questions.md.
D31 update: input #3 IS peer[1013] race — same dispatcher as input #2; SUCCESS path observed (2026-05-01)¶
D31 lldb dynamic probe localised the third input source named in the D30 admonition. The empirical finding changes the architectural question from "what's the third gate" to "how to win a race that is deterministically winnable".
Construction site for input #3¶
| Symbol | Module | File offset | Role |
|---|---|---|---|
| `CoreDeviceError.init(code:userInfo:)` | CoreDeviceUtilities | `+0xbabf0` | Final ctor; receives `code:Int32 = 3` in `rdi` |
| `Swift.Error<...CoreDevice._Error>.init(Int32, String)` | CoreDeviceUtilities | `+0xc10c6` | Inner-call call-site |
| `Swift.Error<...CoreDevice._Error>.init(τ_0_0, Optional<String>)` | CoreDeviceUtilities | `+0xc2780` | Outer-wrap adds `userInfo["NSLocalizedDescription"] = "Failed to acquire assertion"` |
| `static Swift.Error<...CoreDevice._Error>.xpcError.getter` | CoreDeviceUtilities | `+0xc2978` | Bridge from XPCError to CoreDeviceError |
Caller chain (input #3 anchor inside CoreDevice)¶
| Symbol | Module | File offset | PC at hit | Role |
|---|---|---|---|---|
| `___lldb_unnamed_symbol_30230` | CoreDevice | `+0x30230` | `+0x30368` (inner) / `+0x303cd` (outer wrap) | Input #3 synthesis caller; calls `xpcError.getter` then wraps with NSLocalizedDescription |
| `___lldb_unnamed_symbol_31700` | CoreDevice | `+0x31700` | `+0x31734` | Outer caller; invokes `___30230` |
Throw funnel — same dispatcher as D24 W4 input #2¶
D31 confirms input #3 traverses the same ___30eb0/___3bc80/___324a0 dispatcher as input #2; only the internal RetPC offset differs (build-dependent):
| Symbol | File offset entry | PC at HIT (D31, current build) | D24 measurement (input #2, prior build) |
|---|---|---|---|
| `___324a0` | `+0x324a0` | `+0x324d8` (offset `+0x38`) | `+0x324f8` (recorded as offset `+0x56`; D24 reported "+56" decimal, which is `+0x38`, so this reconciles with the D31 value in the same way as `___30eb0`) |
| `___3bc80` | `+0x3bc80` | `+0x3bcbd` (offset `+0x3d`) | `+0x3bcbd` (offset `+0x3d`) |
| `___30eb0` | `+0x30eb0` | `+0x31626` (offset `+0x776`) | `+0x317c0` (offset `+0x910`; D24 reported "+1910" decimal, which is actually `+0x776`, so offsets reconcile; prior doc had ambiguous formatting) |
| `___1e300` | `+0x1e300` | `+0x1e319` (offset `+0x19`) | `+0x1e319` (offset `+0x19`) |
Inputs #2 and #3 share the same final dispatcher; they differ only in WHICH XPC peer's invalidation drives the 1001.
THE PEER¶
The XPC peer that drives the 1001 in input #3:
<SystemXPCPeerConnection 0x... { <connection: 0x... {
name = com.apple.xpc.anonymous.0x...peer[1013].0x...,
listener = false, pid = 1013, euid = 501, egid = 20,
asid = 100024 } }>
pid = 1013 is the CoreDeviceService daemon. This is the primary CDS↔devicectl side-channel connection (distinct from the AFM peer[2019] that D30 retained).
D30's AFM-peer keep-alive addressed peer[2019]. peer[1013] was untouched. Its post-reply invalidation continued to fire and continued to be caught by the same [useassertion] observer that synthesises errorCode 3.
Race condition — success is deterministically winnable¶
D31 captured two outcomes under identical apparatus state:
- Run #2 (success): `[useassertion] Successfully acquired usage assertion E8A190DD-...` followed by graceful invalidation. The 1001 STILL fires post-reply, but the success-resume thread won the race.
- Run #3 (failure, iter 1 of 5-iteration retry): `[useassertion] Failed to acquire usage assertion ... CoreDeviceError(errorCode: 3)`. The side-channel observer drain pump won the race.
Two competing threads in devicectl:
- Thread A (success-resume path): drives `swift_continuation_throwingResume` (no error variant). Triggered by GAMBIT-synthesised `success(...)` reply unwinding through CoreDevice's reply-handling code. Stack: `___324a0 → ___3bc80 → ___30eb0 → ___1e300 → _dispatch_call_block_and_release`.
- Thread B (side-channel observer drain pump): catches XPCError 1001 from peer[1013] invalidation, synthesises `CoreDeviceError(code:3, userInfo:[NSLocalizedDescription: "Failed to acquire assertion"])` via the construction chain above. Stack: `CDU+0x55510 / +0x55540 / +0x52510 / +0x52630 / +0x54060 / +0x56350 / +0x54ab0` family on `_dispatch_lane_serial_drain` → `___31700+0x34` → `___30230+0x138` (inner, `CD+0x30368`) / `+0x19d` (outer, `CD+0x303cd`) → ctor chain.
Whichever thread first calls into the AUA continuation wins. Run #2 shows that Thread A CAN win; the inference is that it wins deterministically if the post-reply teardown path is short-circuited or delayed long enough.
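The "first caller wins" behaviour can be pictured with a toy one-shot outcome holder. This is only a model under stated assumptions: the class `OneShotOutcome` is hypothetical, and the source does not establish whether the real continuation ignores or traps on a second resume. The sketch simply ignores it.

```python
import threading

class OneShotOutcome:
    """Toy model: the first caller to resolve wins; later calls are ignored."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._outcome = None

    def resolve(self, outcome: str) -> bool:
        """Record the outcome if none is set yet; return True if this call won."""
        with self._lock:
            if self._outcome is not None:
                return False
            self._outcome = outcome
            return True

    @property
    def outcome(self):
        return self._outcome

# Run #2 ordering: the success-resume path (Thread A) gets there first.
run2 = OneShotOutcome()
run2.resolve("fulfilled")
run2.resolve("errorCode 3")
print(run2.outcome)   # fulfilled

# Run #3 ordering: the side-channel drain pump (Thread B) gets there first.
run3 = OneShotOutcome()
run3.resolve("errorCode 3")
run3.resolve("fulfilled")
print(run3.outcome)   # errorCode 3
```

In this model the 1001 still arrives in both runs; only the arrival order decides which outcome is recorded, which mirrors the Run #2 observation.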
Image base addresses (current build, CDS PID 1013)¶
For lldb breakpoint resolution and offset arithmetic on the post-D30-revert apparatus:
| Image | __TEXT base | sha256 |
|---|---|---|
| CoreDeviceService.xpc | `0x102b1d000` | (Apple, unchanged) |
| CoreDevice.framework | `0x104c7b000` | `bea205e2...0495` |
| CoreDeviceUtilities.framework | `0x103a99000` | (Apple, unchanged) |
| CoreDeviceInternal.framework | `0x1032ad000` | (Apple, unchanged) |
| iosmux_inject.dylib | `0x1032d4000` | `52df2cc6...4803` |
Process locality¶
Client-side (devicectl). Both BP2 hits and the throw at BP1 fire inside the devicectl process. The CDS daemon (PID 1013) reply is not the immediate trigger — the trigger is the post-reply peer invalidation propagated from CDS to devicectl via the side-channel.
This reconciles with D30 unified-log line 14 (devicectl[2001]: ... Failed to acquire usage assertion ...).
Strategy implications for D32¶
Three layered options ranked by stability:
- Suppress peer[1013] invalidation upstream of the 1001 emission: find what triggers peer[1013] invalidation CDS-side and block at AUA reply time. Most stable: closes the race at its cause instead of chasing downstream symptoms. Selected for D32 per dispatcher decision (rationale: prior D23/D30 NOP attempts pulled chained downstream gates one after another; cause-side blocking is the only architecturally clean exit).
- NOP the side-channel observer drain pump (CDU `+0x55510` family): risky. Same pump may handle legitimate errors elsewhere. Blast radius unknown.
- NOP CD+0x30230 synthesis: most localised but not AUA-specific. CD+0x303cd path also reached for non-AUA errors. Likely breaks other CoreDevice surfaces.
D32 (next): research probe to localise peer[1013] invalidation source CDS-side.
Apparatus integrity (D31 also non-destructive)¶
- iosmux_inject.dylib sha unchanged (`52df2cc6...4803`).
- CoreDevice sha unchanged (`bea205e2...0495`).
- iPhone (iosmux) `connected (no DDI)`.
- CDS PID stable at 1013 across all 3 lldb attaches (lldb attached to devicectl, not CDS, so no daemon respawn).
- Zero new diagnostic reports.
- Backup at `/home/op/backups/iosmux/d30-pre-deploy/iosmux_inject.dylib.d23v2` intact.
Distillation status¶
D31 raw notes (notes/d31-aua-errorcode3-localisation.md, ~575 lines, gitignored) deleted at distillation time per feedback_notes_are_temporary_buffer. Empirical findings + race-condition evidence + construction site + caller chain + strategy disposition preserved in this section + the Q-D66-15 D31 Resolution log entry in docs/plans/d66-research-questions.md.
D32 update: peer[1013] invalidation is KERNEL-driven via MACH_NOTIFY_NO_SENDERS — no CDS-side hook point exists (2026-05-01)¶
D32 lldb dynamic probe (general-purpose research agent, attached to live CDS PID 1013, 12 BPs covering xpc_connection_cancel, _xpc_connection_cancel, xpc_remote_connection_cancel, _xpc_connection_dispose, Mercury XPCSideChannel.{deinit,__deallocating_deinit}, Mercury RemoteXPCConnection.{deinit,__deallocating_deinit}, RemoteXPCConnection.unsafePeer(from:forServiceNamed:), XPCSideChannel.send(message:), CoreDevice.XPCSideChannel.sendCancelledMessage(), and the CoreDeviceUtilities.invoke(anyOf:usingContentsOf:) action sentinel) falsified D31's "cause-side suppression CDS-side" strategy direction. The peer[1013] invalidation has NO CDS-application-level instruction site.
What D32 found¶
Two distinct XPC connection lifecycle traces during AUA window:
Connection A (iosmux's AFM peer, address 0x7f89c6e075e0): cancelled by iosmux itself via __gambit_emit_afm_via_endpoint_block_invoke_2 → xpc_connection_cancel → _xpc_connection_cancel. Confirms current D23 v2 baseline state (cancel-after-send pattern, post-D30-revert). Connection A lifetime is irrelevant to errorCode 3 (D30 already proved this).
Connection B (peer[1013], address 0x7f89c6a09040): NEVER cancelled by any CDS-side application code. Cancel chain has zero CD/CDU/Mercury frames. Triggered by kernel-delivered MACH_NOTIFY_NO_SENDERS mach notification.
The kernel-driven invalidation chain¶
Frame Function Module / offset
#00 _xpc_connection_cancel libxpc + 0x4d2f7
#01 do_mach_notify_no_senders +0x3c libxpc + 0x407c0
#02 _Xmach_notify_no_senders +0x21 libxpc + 0x40761
#03 notify_server +0x4e libxpc + 0x3ff32
#04 _xpc_connection_pass2mig +0x8e libxpc + 0x3fe77
#05 _xpc_connection_mach_event +0x4d5 libxpc + 0x39755
#06+ _dispatch_client_callout4 libdispatch
_dispatch_mach_msg_invoke +0x1b7
_dispatch_lane_serial_drain
This chain is entirely inside libxpc.dylib (system library), responding to a mach kernel notification. There is no CoreDevice / CoreDeviceUtilities / Mercury / CoreDeviceService function in the invalidation chain. There is no iosmux_inject function in the chain either.
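The "no application frames" claim amounts to a simple membership check over the modules in the backtrace. The sketch below transcribes the chain above into (function, module) pairs and filters it. The `APP_MODULES` set and the helper `app_frames` are illustrative names, not part of the probe tooling.

```python
# Modules that would count as CDS-side application or inject code.
APP_MODULES = {
    "CoreDevice",
    "CoreDeviceUtilities",
    "Mercury",
    "CoreDeviceService",
    "iosmux_inject",
}

# (function, module) pairs transcribed from the invalidation chain above.
CHAIN = [
    ("_xpc_connection_cancel", "libxpc"),
    ("do_mach_notify_no_senders", "libxpc"),
    ("_Xmach_notify_no_senders", "libxpc"),
    ("notify_server", "libxpc"),
    ("_xpc_connection_pass2mig", "libxpc"),
    ("_xpc_connection_mach_event", "libxpc"),
    ("_dispatch_client_callout4", "libdispatch"),
    ("_dispatch_mach_msg_invoke", "libdispatch"),
    ("_dispatch_lane_serial_drain", "libdispatch"),
]

def app_frames(chain):
    """Return the functions in the chain that belong to application modules."""
    return [fn for fn, module in chain if module in APP_MODULES]

print(app_frames(CHAIN))                                   # []
print(sorted({module for _, module in CHAIN}))             # ['libdispatch', 'libxpc']
```

An empty result is what rules out a CDS-side hook point: there is no frame in the chain that the existing inject could patch.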
Secondary cleanup chain at t+126ms confirms peer identity via _xpc_connection_remove_peer_impl +0x3c → _xpc_connection_remove_peer — this is internal libxpc code that removes an anonymous accepted peer from a listener's peer table. The connection IS exactly the name = com.apple.xpc.anonymous.0x...peer[1013]... shape D31 documented.
What this means¶
MACH_NOTIFY_NO_SENDERS fires when the OTHER side (devicectl) drops its last mach send right on the connection's underlying mach port. The trigger is in devicectl's process, not in CDS. When devicectl finishes processing the AUA reply and releases its handle to peer[1013], the kernel delivers the notification to CDS, libxpc tears down its peer table entry, and devicectl-side immediately observes its own connection handle's invalidation on the side-channel event handler — XPCError(errorCode: 1001, "The connection was invalidated.").
peer[1013] cannot be kept alive CDS-side without devicectl also keeping its end alive. devicectl's release is governed by its own process-internal logic — no CDS-side hook can reach it.
What D31's preferred strategy actually requires¶
D31's "suppress peer[1013] invalidation upstream of 1001 emission" strategy is sound in principle but the upstream lives in devicectl, not in CDS. Cause-side suppression therefore requires a new injection surface — a separate inject dylib loaded into devicectl via DYLD_INSERT_LIBRARIES or equivalent. Pre-existing project plan documented in ADR-0009 §Decision Option C consequences and stage2.md Phase D.6.6 alternatives ("P-1b: devicectl-side inject"); previously rejected as "REJECTED — overkill, 12-20 hours, new injection mechanism". The empirical D32 finding reopens that rejection because the previously-preferred CDS-side path is now empirically falsified.
Three remaining strategy options for D33¶
- devicectl-side inject: new dylib, loaded via DYLD_INSERT_LIBRARIES (or LC_LOAD_DYLIB on a patched devicectl copy following the CDS pattern). Hook the function inside devicectl that releases peer[1013]'s handle; defer the release past AUA continuation resolution. Most stable architectural exit but expands inject footprint to a new process. devicectl may have hardened runtime / library validation that needs codesign accommodation.
- NOP side-channel observer drain pump in devicectl (CDU `+0x55510` family). Hook the drain pump that catches XPCError 1001 and short-circuit when the error is from peer[1013] in AUA context. More targeted than blanket-NOPing the pump: surgical hook on `xpcError.getter` (CDU+0xc2978) entry filtered by error code + caller context. Blast radius bounded to AUA path.
- NOP CD+0x30230 synthesis in devicectl. Same critique as D31: not AUA-specific, breaks other CoreDevice surfaces.
D32 deliverable per brief: source localised (in libxpc, not in CDS application code). Strategy decision for D33 deferred to dispatcher.
Apparatus integrity (D32 also non-destructive)¶
- iosmux_inject.dylib sha unchanged (`52df2cc6...4803`).
- CoreDevice sha unchanged (`bea205e2...0495`).
- iPhone (iosmux) `connected (no DDI)`.
- CDS PID stable at 1013 across lldb attach/detach cycles (running-attach worked first try; no need for launch-under-lldb fallback).
- Zero new diagnostic reports.
Distillation status¶
D32 raw notes (notes/d32-peer1013-invalidation-source.md, ~349 lines, gitignored) deleted at distillation time per feedback_notes_are_temporary_buffer. Empirical findings + kernel-driven invalidation chain + Connection A/B distinction + strategy disposition preserved in this section + the Q-D66-15 D32 Resolution log entry in docs/plans/d66-research-questions.md.
D33 update: devicectl-side injection feasible; Option 1 reframed (symptom not cause); Option 2 recommended (2026-05-01)¶
D33 lldb dynamic probe (general-purpose research agent, lldb attach to launching devicectl, three sub-probes covering codesign/entitlements survey, DYLD_INSERT_LIBRARIES empirical test, and BP-based Option 1/2 hook target localisation) closed three concrete questions before D34 implementation work begins.
Q1 — DYLD_INSERT_LIBRARIES injection: FEASIBLE¶
devicectl is at /Library/Developer/PrivateFrameworks/CoreDevice.framework/Versions/A/Resources/bin/devicectl, sha256 4fede2dd...bf6c, Mach-O universal [x86_64 + arm64e], signed by Apple com.apple.CoreDevice.devicectl, TeamIdentifier 59GAB85EFG.
Codesign properties relevant to injection:
- Library validation flag (`0x2000`) PRESENT.
- Hardened runtime flag NOT present (`runtime` flag absent).
- Three Apple-private entitlements (`com.apple.private.CoreDevice.takeDeviceSysdiagnose`, `com.apple.private.sysdiagnose`, `com.apple.videoconference.allow-conferencing`); none restrict dyld inserts.
- NO `com.apple.security.cs.allow-dyld-environment-variables` (not needed because hardened runtime is off).
- NO `com.apple.security.cs.disable-library-validation` (LV is on, no explicit disable).
Empirical test: ad-hoc-signed no-op probe dylib /tmp/iosmux-d33-probe.dylib loaded into devicectl via DYLD_INSERT_LIBRARIES=... env var. Probe constructor printed banner to stderr ([d33] DYLD_INSERT_LIBRARIES probe loaded pid=3533); devicectl exited 0; no DYLD_INSERT_LIBRARIES ignored / library validation / AMFI rejection. Library-validation flag does NOT block ad-hoc signed dylibs in this configuration.
Conclusion: D34 does NOT need to design an insert_dylib LC_LOAD_DYLIB patch path. Standard DYLD_INSERT_LIBRARIES injection works.
Q2 — Option 1 hook target REFRAMED — peer release is libxpc-internal, not app code¶
Original D32 framing assumed Option 1 = "hook devicectl's release of `peer[<CDS-PID>]`". Sub-probe P3 (`xpc_release` BP for connections matching `xpc_connection_get_pid == CDS-PID`) found:
- Two peer connections from CDS to devicectl per AUA invocation:
  - `0x7faad11056d0`: first peer, carries AUA reply (HITs 1-7 in P3)
  - `0x7faad11043c0`: second peer, side-channel "result peer" named `peer[1013]` (HIT 8, the cancel-event recipient)
- Second peer's `xpc_release` fires from a libdispatch worker thread:
#0 xpc_release+0x0
#1 libxpc.dylib`_xpc_connection_mach_event+0x418
#2 libdispatch.dylib`_dispatch_client_callout4+0x7
#3 libdispatch.dylib`_dispatch_mach_cancel_invoke+0x40
#4 libdispatch.dylib`_dispatch_mach_invoke+0x399
#5 libdispatch.dylib`_dispatch_root_queue_drain_deferred_wlh+0x113
#6 libdispatch.dylib`_dispatch_workloop_worker_thread+0x367
ZERO frames in the release stack pass through CD/CDU/Mercury Swift app code. The release is a passive libxpc Mach-event reaction to CDS-side dropping the peer (the symmetric image of D32's CDS-side kernel-driven invalidation).
The actionable Option 1 hook target that the agent identified: Mercury.SystemXPCPeerConnection.__deallocating_deinit at Mercury+0x4ea10 (sister .deinit at +0x4e9d0). BUT this is a Swift type's deinit chain that fires AFTER libxpc has already released the connection — hooking it doesn't prevent the release, just modifies post-release Swift cleanup.
Crucial reframing: Option 1 cannot achieve "prevent peer release" as the user originally hoped. The release is libxpc-internal Mach handling. Both Option 1 and Option 2 are now symptom-suppression strategies (intercepting different layers of the post-release error propagation), not cause-prevention.
Q3 — Option 2 hook target viability: CONFIRMED¶
CDU __TEXT base in current devicectl run: 0x1020a4000 (slide differs per launch — must be resolved at runtime). All D31 §E offsets resolve to expected symbols:
| File offset | Resolved symbol | Notes |
|---|---|---|
| `CDU+0xbabf0` | `CoreDeviceUtilities.CoreDeviceError.init(code:userInfo:)` | NAMED Swift symbol; direct dlsym target |
| `CDU+0xc2940` | `static Swift.Error<...>.xpcError.getter` (entry) | NAMED Swift symbol |
| `CDU+0xc2978` | `static Swift.Error<...>.xpcError.getter +0x38` | (same function, +0x38 PC) |
| `CDU+0x55510` | `___lldb_unnamed_symbol_55510` | Unnamed Swift function (drain pump) |
| `CDU+0x54ab0` | `___lldb_unnamed_symbol_54ab0` | Unnamed Swift function (drain pump) |
| `Mercury+0x57b40` | `Mercury.XPCError.xpcError.getter` | NAMED; the wire-XPCError layer |
| `Mercury+0x5cd90` | Mercury generic xpcError getter variant | NAMED |
Named symbols are dlsym-resolvable directly. Unnamed pump symbols are reachable via slide arithmetic from a sibling named symbol — the same approach already proven for the CDS-side inject's CD+0xdbe0 NOP-evict.
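The sibling-relative slide arithmetic described here reduces to recovering the image base from one known (runtime address, file offset) pair. The sketch below assumes the named symbol's runtime address has already been obtained (for example from dlsym) and uses the D33 run's CDU base only as a stand-in value. The helper name `sibling_address` is hypothetical.

```python
def sibling_address(named_runtime: int, named_offset: int, target_offset: int) -> int:
    """Derive an unnamed symbol's runtime address from a named sibling in the same image."""
    image_base = named_runtime - named_offset   # recover this launch's base
    return image_base + target_offset

CDU_BASE = 0x1020A4000    # base in the D33 run; differs on every launch
INIT_OFFSET = 0xBABF0     # CoreDeviceError.init(code:userInfo:), named
PUMP_OFFSET = 0x55510     # ___lldb_unnamed_symbol_55510, unnamed drain pump

# Stand-in for the address dlsym would return for the named symbol in that run.
init_runtime = CDU_BASE + INIT_OFFSET

pump_runtime = sibling_address(init_runtime, INIT_OFFSET, PUMP_OFFSET)
print(hex(pump_runtime))   # 0x1020f9510
```

Because the base is recomputed from the named symbol on every launch, the hard-coded base above is never needed at runtime. The file offsets themselves remain build-dependent and would need re-verification against a new CoreDeviceUtilities build.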
Bonus finding: CoreDevice.ActionConnectionCache.removeCachedXPCConnection at CD+0xdbe0 is ALSO present in devicectl process — the SAME class type with the SAME method exists on both sides of the CDS↔devicectl boundary. iosmux's CDS-side inject NOPs this in CDS context; the devicectl-side instance currently runs unmodified. Whether devicectl-side removeCachedXPCConnection is part of the AUA failure path is a possible follow-up but out of scope for D33.
Strategy comparison for D34¶
| Dimension | Option 1 (Mercury deinit) | Option 2 (CDU error path) |
|---|---|---|
| Symptom vs cause | Symptom-side (Swift cleanup chain after libxpc release) | Symptom-side (error construction) |
| Hook point named | Yes: `Mercury.SystemXPCPeerConnection.__deallocating_deinit` | Yes: `CoreDeviceError.init`, `xpcError.getter` |
| Discrimination needed | Yes — must filter for CDS-PID-peer connections in hook | Yes — must filter for errorCode == 1001 AND AUA invocation context |
| Side-effects | Memory-leak risk (preventing deinit chain leaves connection-bookkeeping graph in CD/Mercury caches) | Lower — error construction is stateless; suppression doesn't keep dead state alive |
| Failure mode | Hook misses → connection still released (deinit was already too late anyway) | Hook misses → error surfaces normally |
| Implementation complexity | Higher — Swift deinit + refcount semantics on Mercury internal type | Lower — Swift function prologue rewrite, well-understood pattern |
Recommended option for D34: Option 2¶
Agent recommendation: Option 2. Rationale:
- Both options are symptom-suppression: Option 1's "cause-side" framing was empirically falsified (release is libxpc-internal, not app code; Mercury deinit fires AFTER libxpc release). Given symmetric symptom-side scope, simpler symbol set wins.
- Named Swift symbols dlsym-resolvable: `CoreDeviceError.init(code:userInfo:)` and `xpcError.getter` are NOT stripped. dlsym → patch prologue → trampoline back. Same proven pattern as CDS-side `gambit_install_hook` from `inject/iosmux_gambit.m`.
- No refcount/deinit hazards: Option 1 prevents a Swift type's deinit chain; Mercury's internal connection-bookkeeping graph may rely on the deinit cascade, and preventing it could leak Mercury state across AUA invocations. Option 2 hooks stateless error-construction functions, so there is no state-machine entanglement.
- Filter discrimination is bounded: Option 2 filter: `code == 1001` AND caller-stack contains AUA path symbols. Option 1 filter: peer's `xpc_connection_get_pid == CDS-PID` AND deinit context is AUA-related. Both feasible but Option 2's discriminator is on values (errorCode integer) not on object pointers (xpc_pid).
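The Option 2 discriminator can be sketched as a pure predicate over an error code and a list of caller return addresses. This is an illustration, not the D34 implementation. The function `should_suppress` is hypothetical, the PC set reuses the CoreDevice file offsets observed in the D31 caller chain, and exact-PC matching is an assumption (a real filter would more likely match function ranges and would need re-deriving per build).

```python
from typing import Iterable

XPC_CONNECTION_INVALID = 1001   # errorCode seen on the side-channel peer

# Return PCs (CoreDevice file offsets) observed on the AUA failure path in D31.
AUA_RETURN_PCS = frozenset({0x30368, 0x303CD, 0x31734})

def should_suppress(error_code: int, caller_offsets: Iterable[int]) -> bool:
    """True only for a 1001 error whose caller stack passes through the AUA path."""
    if error_code != XPC_CONNECTION_INVALID:
        return False
    return any(pc in AUA_RETURN_PCS for pc in caller_offsets)

print(should_suppress(1001, [0x31734, 0x30368]))   # True
print(should_suppress(1001, [0x12345]))            # False
print(should_suppress(3, [0x31734]))               # False
```

Checking the integer code first keeps the common non-1001 case cheap and means the stack is only inspected for the one error class the hook cares about.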
Apparatus integrity (D33 also non-destructive)¶
- iosmux_inject.dylib sha unchanged (`52df2cc6...4803`).
- CoreDevice sha unchanged (`bea205e2...0495`).
- devicectl sha unchanged (`4fede2dd...bf6c`); Apple binary untouched throughout the probe.
- iPhone (iosmux) `connected (no DDI)`.
- CDS PID stable at 1013 across all probe steps.
- Zero new diagnostic reports.
- Test probe dylib at `/tmp/iosmux-d33-probe.dylib` on havoc; will not survive reboot.
Distillation status¶
D33 raw notes (notes/d33-devicectl-feasibility.md, ~562 lines, gitignored) deleted at distillation time per feedback_notes_are_temporary_buffer. Empirical findings + injection feasibility verdict + Option 1 reframing + Option 2 verification + comparison table + recommendation preserved in this section + the Q-D66-15 D33 Resolution log entry in docs/plans/d66-research-questions.md.
D34+D35a updates: Swift Deinit chain is LATE cleanup; HIT #18 trigger UNRESOLVED; Heisenbug (path-A vs path-B); dtrace recommended for D36 (2026-05-01)¶
D34 (lldb attach to launching devicectl, BPs on connection lifecycle) and D35a (follow-up lldb probe for cancel-initiation trigger) both ran on 2026-05-01. Together they REFINE the picture but leave the cause-side trigger partly UNRESOLVED.
D34 — peer[] full lifecycle captured¶
D34's run captured peer[<CDS-PID>] = 0x7fdbe6404340:
| t | event | source |
|---|---|---|
| 103.5 | BIRTH _xpc_connection_init | libxpc internal |
| 107.36 | xpc_retain ×2 | inside _dispatch_mach_cancel_invoke handler |
| 107.510 (HIT #18) | xpc_connection_cancel | libxpc — _xpc_connection_mach_event+0x310, ZERO Swift/app frames |
| 107.556-108.500 | release/dispose cascade | libxpc cleanup |
| 111.269 (HIT #23) | Swift DeviceUsageAssertion.deinit chain | runJobInEstablishedExecutorContext → devicectl+sym_100016970+0x395 → CoreDevice.DeviceUsageAssertion.deinit+0x227 → Swift._IndexBox.__deallocating_deinit → Mercury.SystemXPCPeerConnection.__deallocating_deinit+0x2d → [OS_xpc_connection _xref_dispose] → _xpc_connection_last_xref_cancel → xpc_connection_cancel |
Critical finding: the HIT #23 Swift chain fires about 3.76 seconds AFTER the HIT #18 libxpc cancel (t=111.269 vs t=107.510). The Swift DeviceUsageAssertion.deinit (with _IndexBox capture of Mercury.SystemXPCPeerConnection) is late lateral cleanup, NOT the trigger. By the time the Swift Task continuation runs and DeviceUsageAssertion goes out of scope, libxpc has already cancelled the connection.
H1 (Swift action wrapper holds peer) is structurally true: CoreDevice.DeviceUsageAssertion does capture the peer, through an _IndexBox of Mercury.SystemXPCPeerConnection. But hooking its deinit doesn't prevent HIT #18's earlier libxpc cancel.
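The ordering argument above rests on simple timestamp arithmetic, which can be checked directly from the lifecycle table. The sketch below only restates the table values; the dictionary keys and the `gap` helper are illustrative names, not part of any tooling used in D34.

```python
# Timestamps in seconds, copied from the D34 lifecycle table.
events = {
    "birth": 103.5,      # BIRTH _xpc_connection_init
    "hit18": 107.510,    # xpc_connection_cancel from libxpc, zero Swift/app frames
    "hit23": 111.269,    # Swift DeviceUsageAssertion.deinit chain
}


def gap(earlier: str, later: str) -> float:
    """Seconds elapsed between two recorded events, rounded to milliseconds."""
    return round(events[later] - events[earlier], 3)


# How long after the libxpc cancel the Swift deinit chain fired.
deinit_lag = gap("hit18", "hit23")

# A positive lag means the deinit chain cannot have caused HIT #18.
deinit_is_late_cleanup = deinit_lag > 0
```

Here `deinit_lag` evaluates to 3.759, so the deinit chain runs well after the cancel it was suspected of triggering.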
D35a — Heisenbug: lldb perturbation alters AUA path¶
D35a's brief was to find the trigger of HIT #18 by BP'ing on dispatch_source_cancel family + _xpc_connection_mach_event ENTRY (capturing dispatch_mach_reason_t in $rsi). Empirical result: path-A failure mode (D34 reference) did NOT reproduce under lldb instrumentation — the failure took a DIFFERENT path.
Path-A vs Path-B (NEW empirical finding):
| | Path-A (D34 reference) | Path-B (D35a under lldb) |
|---|---|---|
| Main XPC reply | `success(...)` | `success(...)` |
| Successfully acquired log | YES ([useassertion] Successfully acquired usage assertion E8A190DD-...) | NO (skipped — analytics shows executionDuration=0) |
| Side-channel peer error log | YES ([useassertion] Recieved error from side channel peer: XPCError 1001 ... peer[1013].0x...) | NO — log line absent entirely |
| Final outcome | Failed to acquire usage assertion ... CoreDeviceError errorCode 3 | Failed to acquire usage assertion ... CoreDeviceError errorCode 3 |
| Path | success → side-channel cancel → 1001 → errorCode 3 | success → errorCode 3 directly, no side-channel error |
D35a observed dispatch_channel_cancel ONCE, at t=3.958, from devicectl-internal Swift Concurrency: runJobInEstablishedExecutorContext → devicectl+sym_100095660+0x75 → devicectl+sym_100092c60+0x548. This is app-level dispatch cancel initiation, but it operates on a generic dispatch_channel_t, not specifically on peer[<CDS-PID>].
D35a's _xpc_connection_mach_event BP fired 17 times, but none of the hits matched the CDS-PID filter (xpc_connection_get_pid returned a non-CDS PID for every hit). The side-channel peer connection either wasn't created at all in this run, or was created but the filter mechanism didn't recognize it.
Implication: HIT #18's libxpc cancel handler runs only when path-A is taken. Under the timing perturbation introduced by lldb BP-based instrumentation, path-B fires instead and the side-channel peer is apparently never instantiated (or at least never matches the filter), which explains why $rsi could never be captured.
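The path-A vs path-B distinction in the table reduces to which log lines are present in a run. A minimal sketch of that classification follows, assuming the unified-log output is available as plain text lines. The function name and the `"unclassified"` / `"no-failure"` labels are hypothetical; the marker strings are taken from the log excerpts quoted in this document, including the `Recieved` spelling as logged.

```python
from typing import Iterable

ACQUIRED = "Successfully acquired usage assertion"
SIDE_CHANNEL_ERROR = "Recieved error from side channel peer"  # spelling as logged
FAILED = "Failed to acquire usage assertion"


def classify_failure_path(log_lines: Iterable[str]) -> str:
    """Label a run as path-A or path-B from the log lines it produced."""
    lines = list(log_lines)

    def seen(needle: str) -> bool:
        return any(needle in line for line in lines)

    if not seen(FAILED):
        return "no-failure"
    if seen(ACQUIRED) and seen(SIDE_CHANNEL_ERROR):
        return "path-A"   # success -> side-channel cancel -> 1001 -> errorCode 3
    if not seen(ACQUIRED) and not seen(SIDE_CHANNEL_ERROR):
        return "path-B"   # success -> errorCode 3 directly
    return "unclassified"  # mixed evidence; inspect the run manually
```

Runs that show only one of the two markers are deliberately left unclassified, since the table documents no such combination.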
Verdict on H1/H2/H3/H4 from D35a: NONE EMPIRICALLY CONFIRMED¶
The original four hypotheses for HIT #18 trigger remain UNRESOLVED:
- H1 (GAMBIT cancel-after-send) — neither confirmed nor falsified; couldn't observe under lldb.
- H2 (Apple-CDS cleanup separate from D23 path) — same.
- H3 (libxpc internal heuristic) — same.
- H4 (devicectl Swift dispatch_source_cancel) — weak partial signal (one `dispatch_channel_cancel` hit), but on the wrong target.
Plus a NEW finding: errorCode 3 has at least two coalesce paths in production. GAMBIT-defeat may need to address both, not just the side-channel peer error one.
Recommendation for D36 — switch observation method¶
D35a recommendation: abandon lldb BP-based instrumentation for HIT #18 trigger. lldb's per-BP overhead (SIGSTOP/SIGCONT cycle, Python callback eval) perturbs the timing enough to suppress path-A reliably. Alternatives:
1. dtrace pid provider on devicectl — `dtrace -n 'pid$target:libxpc:_xpc_connection_mach_event:entry { @[arg0, arg1] = count(); }'`. Lower per-probe overhead (no SIGSTOP), captures arg0=conn arg1=reason directly to ring buffer. Bypasses lldb's per-hit Python-callback latency.
2. CDS-side inject capture — extend GAMBIT logging in `iosmux_inject_dylib` to log every mach msg emitted via `xpc_connection_create_from_endpoint` peer (destination port name, msgh_id, msgh_bits). Sidesteps devicectl-side observation entirely. Correlates with devicectl `os_log` stream timing for indirect proof of path-A trigger.
3. `os_log` predicate stream with `subsystem:com.apple.coredevice category:useassertion` at `--debug` level — captures internal trace points that don't print at default level. May reveal whether side-channel peer was created in path-A vs not in path-B without lldb in the loop.
Strongest combination: (1) dtrace + (3) os_log, both running concurrently with a normal xcrun devicectl device info details invocation, no lldb. dtrace's lower overhead preserves path-A timing; os_log gives userspace-level observability without affecting trace.
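One way to set up the combined run is to build both command lines up front and start the log stream before the traced invocation. The sketch below only constructs the argument vectors and does not execute anything. It assumes that DTrace is permitted to attach on the host and is run with sufficient privileges, that `dtrace -c` is used to launch the target so that `$target` is bound, and that the shorthand predicate above translates to the `log stream --predicate` expression shown. The helper names are hypothetical.

```python
import shlex

# Probe text as given in the recommendation above.
PROBE = (
    "pid$target:libxpc:_xpc_connection_mach_event:entry "
    "{ @[arg0, arg1] = count(); }"
)
# Target invocation as named in the text; extra device arguments are not shown.
TARGET = "xcrun devicectl device info details"
# Assumption: predicate-syntax equivalent of the subsystem/category shorthand.
PREDICATE = 'subsystem == "com.apple.coredevice" AND category == "useassertion"'


def dtrace_argv(probe: str = PROBE, target: str = TARGET) -> list:
    """argv for dtrace launching the target itself (-c binds $target)."""
    return ["dtrace", "-n", probe, "-c", target]


def log_stream_argv(predicate: str = PREDICATE) -> list:
    """argv for a debug-level unified-log stream filtered to the AUA category."""
    return ["log", "stream", "--level", "debug", "--predicate", predicate]


if __name__ == "__main__":
    # Start the log stream first so it is already listening when dtrace launches.
    print(shlex.join(log_stream_argv()))
    print(shlex.join(dtrace_argv()))
```

Launching the target through `-c` rather than attaching to a running PID avoids an attach race on a short-lived process, which matters here because the path-A window is timing sensitive.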
Apparatus integrity (D34 + D35a both non-destructive)¶
- iosmux_inject.dylib sha unchanged (`52df2cc6...4803`).
- CoreDevice sha unchanged (`bea205e2...0495`).
- devicectl sha unchanged (`4fede2dd...bf6c`).
- iPhone (iosmux) `connected (no DDI)`.
- CDS PID stable at 1013 across both probes.
- Zero new diagnostic reports.
Distillation status¶
D34 raw notes (~2.9 KB stub, §A only populated; rest analyzed inline in dispatcher chat) and D35a raw notes (notes/d35a-libxpc-cancel-trigger.md, ~17 KB, §A-§H populated, §I PENDING) deleted at distillation time per feedback_notes_are_temporary_buffer. Empirical findings + path-A vs path-B distinction + Heisenbug observation + D36 method recommendations preserved in this section + the Q-D66-15 D34/D35a Resolution log entry in docs/plans/d66-research-questions.md.
See also¶
- aua-side-channel-mechanism.md
- aua-history-d36-d37.md — D36 method pivot to host-side static disasm following D35a Heisenbug.
- aua-history-d38-d40.md — D38..D40 falsifications and pivot to gate-side approach.