Re: Issues with the device eject path in XenVif
On 13/04/2026 15:35, Paul Durrant wrote:
> On 13/04/2026 14:22, Tu Dinh wrote:
>> On 13/04/2026 14:41, Paul Durrant wrote:
>>> On 09/04/2026 16:29, Tu Dinh wrote:
>>>> Hi all,
>>>>
>>>> I'm currently trying to fix some lingering issues with VIF unplug,
>>>> which will let me replace the MRSW lock with a simpler and faster
>>>> implementation.
>>>>
>>>> Pdo->Eject/PdoRequestEject (e.g. in XenVif) is signaled by the
>>>> FrontendEject worker thread, which watches backend/vif/DOMID/X/online
>>>> among a few other things. I've run into several issues with this code
>>>> path:
>>>>
>>>> - When removing the VIF using `xe vif-unplug force=true`, the entire
>>>> xenstore key of the backend is removed without a chance to tear down
>>>> the connection. However, the watch on BACKEND/online will be triggered
>>>> before the watch on device/vif, which causes the PDO to be marked as
>>>> ejected, and so goes through the QUERY_REMOVE_DEVICE/REMOVE_DEVICE
>>>> instead of being a surprise removal.
>>>> - In the REMOVE_DEVICE case, NDIS will wait for packets to be returned
>>>> before continuing. Yet we cannot make progress because the backend has
>>>> already disappeared, so the system will hang. This can be reproduced by
>>>> doing an unplug with force=true while having some outbound traffic, but
>>>> the timing is quite tight with the current code.
>>>> - BACKEND/online is an internal, backend-specific value that is not
>>>> documented in xenstore-paths or netif.h. So frontends should not use
>>>> this value. I also find converting a VIF unplug to a query remove based
>>>> on reading BACKEND/online somewhat dubious.
>>>>
>>>> I've considered several options for a fix, which I have documented
>>>> below:
>>>>
>>>> 1. Make FrontendIsBackendOnline return a status code if BACKEND/online
>>>> doesn't exist, and treat an error to read the key as a surprise
>>>> removal.
>>>>    - This ends up being unworkable, since QEMU will always first set
>>>> BACKEND/online to 0 even if the VIF is being force-unplugged.
>>>
>>> I still think this is the right way to deal with force unplug. Is there
>>> a tell-tale you can look for to see if it is forced? (E.g. has the
>>> frontend xenstore area completely gone?)
>>>
>>
>> What I observe during a force VIF unplug is an unplug request
>> (BACKEND/online=0 / PdoRequestEject) shortly followed by the backend
>> being wiped out. I couldn't find any tell I could use to distinguish the
>> force unplug case from the normal one.
>>
>> Maybe it can be fixed by attaching the watchdog thread's event to a
>> watch on the backend, then (for transmitters) faking responses in the
>> watchdog thread if we detect that the backend has disappeared.
>>
>
> The PdoRequestEject is triggered off the state change watch though, isn't
> it? In the case of force does the state still change to 'closing'? I'd
> have thought the node would be removed, in which case the state would go
> to 'unknown' instead.
>

There's no watchdog thread waiting for BACKEND/state, and NDIS waits for
packet return during initial handling of IRP_MN_REMOVE_DEVICE before
xennet/xenvif is entered. So for now there's no opportunity for
FrontendClose/FrontendWaitForBackendXenbusStateChange to be called in
order to update the state to Unknown.

>>>>
>>>> 2. Make FrontendIsBackendOnline check the backend's existence (i.e.
>>>> reading the backend key instead of backend/online).
>>>>    - This changes the unplug order slightly, but looks like the
>>>> cleanest solution. Though I'm not sure if it breaks cancelling of
>>>> device removal requests.
>>>>
>>>> 3. Remove the eject codepath and rely on FdoScan instead.
>>>>    - This might break a few things that assume the presence of this
>>>> codepath.
>>>>
>>>> I'd be glad to hear your opinions on this matter.
>>>>
>>>> Thanks,
>>>>

--
Ngoc Tu Dinh | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech
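
To make the difference between options 1 and 2 concrete, here is a minimal
user-space sketch of the two checks run against the force-unplug sequence
described in the thread (online set to 0 first, then the backend area
removed). It is not XenVif code: mock_store, store_read, backend_status and
the two classify_* functions are hypothetical names, the xenstore is
replaced by an in-memory table, and the path backend/vif/1/0 is an
illustrative value only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical in-memory stand-in for xenstore: a flat path/value table. */
typedef struct {
    const char *path;
    const char *value;
} store_entry;

typedef struct {
    const store_entry *entries;
    size_t count;
} mock_store;

/* Return the value at path, or NULL if the key does not exist. */
static const char *store_read(const mock_store *s, const char *path)
{
    for (size_t i = 0; i < s->count; i++)
        if (strcmp(s->entries[i].path, path) == 0)
            return s->entries[i].value;
    return NULL;
}

typedef enum {
    BACKEND_PRESENT,  /* nothing to do */
    BACKEND_OFFLINE,  /* online == "0": treated as an eject request */
    BACKEND_GONE      /* key missing: treated as a surprise removal */
} backend_status;

/* Option 1 (sketch): decide from BACKEND/online only; a failed read
 * counts as a surprise removal. */
static backend_status classify_by_online(const mock_store *s,
                                         const char *backend)
{
    char path[256];
    snprintf(path, sizeof(path), "%s/online", backend);

    const char *value = store_read(s, path);
    if (value == NULL)
        return BACKEND_GONE;
    return strcmp(value, "0") == 0 ? BACKEND_OFFLINE : BACKEND_PRESENT;
}

/* Option 2 (sketch): ignore BACKEND/online and only ask whether the
 * backend key itself still exists. */
static backend_status classify_by_existence(const mock_store *s,
                                            const char *backend)
{
    return store_read(s, backend) != NULL ? BACKEND_PRESENT : BACKEND_GONE;
}
```

In this model, the first step of a forced unplug (online=0 with the backend
key still present) is reported as BACKEND_OFFLINE by the option 1 check, so
it looks the same as an orderly unplug, while the option 2 check reports
BACKEND_PRESENT and only reacts once the backend key has been removed. The
sketch does not model watch ordering or the cancellation of removal
requests, which the thread leaves open.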