
RE: XenVif div by zero on Tx path after resume.


  • To: "paul@xxxxxxx" <paul@xxxxxxx>, "win-pv-devel@xxxxxxxxxxxxxxxxxxxx" <win-pv-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Martin Harvey <martin.harvey@xxxxxxxxxx>
  • Date: Thu, 21 Apr 2022 13:09:14 +0000
  • Delivery-date: Thu, 21 Apr 2022 13:09:26 +0000
  • List-id: Developer list for the Windows PV Drivers subproject <win-pv-devel.lists.xenproject.org>
  • Thread-topic: XenVif div by zero on Tx path after resume.


-----Original Message-----
From: win-pv-devel <win-pv-devel-bounces@xxxxxxxxxxxxxxxxxxxx> On Behalf Of Durrant, Paul
Sent: 19 April 2022 11:39
To: win-pv-devel@xxxxxxxxxxxxxxxxxxxx
Subject: Re: XenVif div by zero on Tx path after resume.

> Even the late suspend callback runs on a single vCPU at DISPATCH, with all 
> other vCPUs spinning at DISPATCH. Thus the only thing that should be able to 
> pre-empt it is an interrupt. Hence there *should* be no scope for the 
> network stack to send any packets until the callback has completed its work.

> With all the power state management done in thread context, it is 
> automatically blocked by any suspend/resume because of the vCPU corralling 
> and the fact that the active vCPU runs the entire cycle at DISPATCH or 
> higher. Hence no need for any further synchronization.

 
> xen|BUGCHECK: ====>
> xen|BUGCHECK: ASSERTION_FAILURE: FFFFF80113373A40 FFFFF80113373A60 000000000000144E 0000000000000000
> xen|BUGCHECK: FILE: E:\jenkins\workspace\nvif_private_martinhar_CA-355670\local\src\xenvif\transmitter.c LINE: 5198
> xen|BUGCHECK: TEXT: !NT_SUCCESS(status)
> 

> So the question remains, how are we hitting the failure? Your source 
> lines and mine clearly don't match. Exactly which assertion is failing?

Ahha. I simply included an assertion in TransmitterQueuePacket that fires if 
Frontend->NumQueues == 0
(actually an assertion of !NT_SUCCESS(status) on a local workaround that 
checks the number of queues).
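
For anyone following along without my private branch, the check is roughly of the following shape. This is only a user-space sketch, not the XenVif code: FRONTEND_SKETCH, SelectTxQueue and the status values are hypothetical names made up for illustration, and I am assuming the divide by zero comes from reducing a packet hash modulo the queue count.

```c
#include <assert.h>   /* for callers that assert on the result, as my workaround does */
#include <stdint.h>

/* Hypothetical stand-ins for the driver's types; not the XenVif definitions. */
typedef int32_t STATUS;
#define STATUS_OK        ((STATUS)0)
#define STATUS_NOT_READY ((STATUS)-1)

typedef struct {
    /* Assumed to read 0 until the late suspend callback has re-established the queues. */
    uint32_t NumQueues;
} FRONTEND_SKETCH;

/*
 * Pick a transmit queue for a packet hash.
 * Returns STATUS_NOT_READY, leaving *Index untouched, when there are no
 * queues, instead of evaluating Hash % 0.
 */
STATUS
SelectTxQueue(const FRONTEND_SKETCH *Frontend, uint32_t Hash, uint32_t *Index)
{
    if (Frontend->NumQueues == 0)
        return STATUS_NOT_READY;

    *Index = Hash % Frontend->NumQueues;
    return STATUS_OK;
}
```

Asserting on the returned status at the call site is what turns the silent divide by zero into the bugcheck quoted above.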

So, this is simple:

- We are failing because TransmitterQueuePacket gets called before 
VifSuspendCallbackLate, something which "in theory" should be impossible.

So, perhaps, although the processor corral once worked, it no longer does?

MH.

 

