
Re: XenVif div by zero on Tx path after resume.


  • To: win-pv-devel@xxxxxxxxxxxxxxxxxxxx
  • From: "Durrant, Paul" <xadimgnik@xxxxxxxxx>
  • Date: Thu, 21 Apr 2022 14:23:21 +0100
  • Delivery-date: Thu, 21 Apr 2022 13:32:18 +0000
  • List-id: Developer list for the Windows PV Drivers subproject <win-pv-devel.lists.xenproject.org>

On 21/04/2022 14:15, Martin Harvey wrote:


-----Original Message-----
From: win-pv-devel <win-pv-devel-bounces@xxxxxxxxxxxxxxxxxxxx> On Behalf Of 
Martin Harvey
Sent: 21 April 2022 14:09
To: paul@xxxxxxx; win-pv-devel@xxxxxxxxxxxxxxxxxxxx
Subject: RE: XenVif div by zero on Tx path after resume.

So, this is simple:

- We are failing because TransmitterQueuePacket gets called before 
VifSuspendCallbackLate, something which "in theory" should be impossible.

So, perhaps, although the processor corral once worked, it no longer does?


Looking in xenbus's sync.c and suspend.c, I can't see any issue... but there could be something subtle going on (like Windows deciding all DPCs will be threaded under the hood).

An alternative explanation, which I forgot to mention, is that the suspend 
callbacks are being registered too late, so there's a window between the device 
appearing to Windows to be set up and functional, and the suspend callback 
being properly registered.


It should not be possible to suspend while anything is running at DISPATCH, because the CPU corralling is deferred for as long as that is the case (the capture requires a DPC to run). So, as long as all the init is being done at DISPATCH, there should be no race.
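
To make that argument concrete, here is a toy model in plain C. It is only a sketch of the reasoning, not xenbus or XenVif code: the struct, the function names and the two-level IRQL enum are all hypothetical, and the only rule it encodes is the assumption stated above, that the capture DPC cannot run on a CPU that is already at DISPATCH.

```c
#include <stdbool.h>
#include <stddef.h>

#define NCPU 4

/* Simplified IRQL levels (hypothetical); only the ordering matters here. */
enum irql { PASSIVE = 0, DISPATCH = 2 };

struct cpu {
    enum irql irql;        /* level the CPU is currently running at */
    bool capture_pending;  /* a capture DPC is queued but has not run yet */
    bool captured;         /* the capture DPC has run; the CPU is corralled */
};

/* Hypothetical stand-in for a suspend request: queue a capture DPC on every CPU. */
static void request_suspend(struct cpu cpus[], size_t n)
{
    for (size_t i = 0; i < n; i++) {
        cpus[i].capture_pending = true;
        cpus[i].captured = false;
    }
}

/* Run queued DPCs. A DPC only gets to run on a CPU that is below DISPATCH. */
static void drain_dpcs(struct cpu cpus[], size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (cpus[i].capture_pending && cpus[i].irql < DISPATCH) {
            cpus[i].capture_pending = false;
            cpus[i].captured = true;
        }
    }
}

/* In this model the suspend may proceed only once every CPU is captured. */
static bool suspend_can_proceed(const struct cpu cpus[], size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!cpus[i].captured)
            return false;
    }
    return true;
}
```

In this model, if one CPU is doing init at DISPATCH when request_suspend() is called, drain_dpcs() leaves that CPU uncaptured and suspend_can_proceed() stays false until the CPU drops back below DISPATCH and the DPCs are drained again.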

  Paul




 

