Re: [win-pv-devel] [PATCH 2/2] Fix race calculating SrbExt->Count
In the analysis of my hangs it also turned out that an incorrect value of SrbExt->Count caused the loss of the SRB. In PdoCompleteResponse I added the following debug code:

-    if (InterlockedDecrement(&SrbExt->ReqCount) == 0) {
+    c = (ULONG) InterlockedDecrement(&SrbExt->Count);
+    if (c + 1 > Pdo->maxReqsPerSrb)
+    {
+        Pdo->maxReqsPerSrb = c + 1;
+        Warning("new maximum reqs/SRB = %08x, PDO = %I64x, SRB = %I64x\n", Pdo->maxReqsPerSrb, (ULONGLONG) Pdo, (ULONGLONG) Srb);
+    }
+    if (c == 0) {

Then in the debug log I saw the following line

XENVBD|PdoCompleteResponse:new maximum reqs/SRB = ffffffff, PDO = ffffe001ff1b2020, SRB = ffffe00200d439f0

just seconds before the PDO resets started. That means the count had dropped to -2. However, I have not (yet) found the reason for that.

Thanks for finding the race. Since the issue is quite clear now, I don't think the debug code is needed.

Attached is my patch (only for the count issue), with the "Count" field renamed to make it clearer what kind of count it is.

Regards
Andreas

On 16.02.2017 11:11, owen.smith@xxxxxxxxxx wrote:
> From: Owen Smith <owen.smith@xxxxxxxxxx>
>
> It is possible under heavy loads for the backend to start completing
> sub-requests of a Srb before the SrbExt Count is set. This would leave
> the count unable to reach 0 (as 1 or more requests are skipped by
> count being overridden)

Attachment:
srbext-count.patch
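
For readers who want to see the race in isolation, here is a minimal sketch in portable C11. It is not the XENVBD code and not necessarily what either patch does: srb_ext_t, complete_one, racy_submit, biased_submit and logged_to_count are hypothetical names, the interleaving is replayed on a single thread, and the "bias" scheme is just one common way to avoid storing the count after completions may already have started. logged_to_count only redoes the arithmetic behind the log line, where the printed value is c + 1.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the SRB extension: only the counter is modelled. */
typedef struct {
    atomic_int count; /* sub-requests still outstanding */
} srb_ext_t;

/* Completion path: returns 1 if this completion brought the count to zero. */
static int complete_one(srb_ext_t *ext)
{
    return (atomic_fetch_sub(&ext->count, 1) - 1) == 0;
}

/* Racy order: 'early' completions run before the count is stored.
 * Returns how many times the count was seen reaching zero. */
static int racy_submit(int nreqs, int early)
{
    srb_ext_t ext;
    int zero_hits = 0;

    assert(early >= 0 && early <= nreqs);
    atomic_init(&ext.count, 0);
    for (int i = 0; i < early; i++)
        zero_hits += complete_one(&ext);   /* count goes negative */
    atomic_store(&ext.count, nreqs);       /* overwrites the early decrements */
    for (int i = early; i < nreqs; i++)
        zero_hits += complete_one(&ext);
    return zero_hits;
}

/* Assumed alternative: the submitter holds one extra reference and counts
 * each request before issuing it, so the count is never overwritten. */
static int biased_submit(int nreqs, int early)
{
    srb_ext_t ext;
    int zero_hits = 0;

    assert(early >= 0 && early <= nreqs);
    atomic_init(&ext.count, 1);              /* submitter's own reference */
    for (int i = 0; i < nreqs; i++) {
        atomic_fetch_add(&ext.count, 1);     /* count the request first */
        if (i < early)
            zero_hits += complete_one(&ext); /* backend completes at once */
    }
    zero_hits += complete_one(&ext);         /* submitter drops its reference */
    for (int i = early; i < nreqs; i++)
        zero_hits += complete_one(&ext);
    return zero_hits;
}

/* Recover the signed counter from the logged value, which is c + 1. */
static int32_t logged_to_count(uint32_t logged)
{
    uint32_t c = logged - 1u;                /* undo the "+ 1" */
    return (c > (uint32_t)INT32_MAX) ? -(int32_t)(uint32_t)~c - 1 : (int32_t)c;
}
```

In this model racy_submit(3, 1) never sees the count reach zero, so the SRB would never be completed, while biased_submit(3, 1) sees it reach zero exactly once. logged_to_count(0xffffffffu) gives -2, matching the value read off the log line above.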