
Re: [Xen-devel] [PATCH] mem_event: use wait queue when ring is full


  • To: "Olaf Hering" <olaf@xxxxxxxxx>
  • From: "Andres Lagar-Cavilla" <andres@xxxxxxxxxxxxxxxx>
  • Date: Thu, 12 Jan 2012 08:11:42 -0800
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxx, tim@xxxxxxx, adin@xxxxxxxxxxxxxx
  • Delivery-date: Thu, 12 Jan 2012 16:10:54 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

> On Wed, Jan 11, Andres Lagar-Cavilla wrote:
>
>> > mem_event: use wait queue when ring is full
>> >
>> > This change is based on an idea/patch from Adin Scannell.
>>
>> Olaf,
>> thanks for the post. We'll have to NACK this patch in its current
>> form; it hard-reboots our host during our testing.
>
> That's very unfortunate. I have seen such unexpected reboots myself a
> few weeks ago. I suspect they were caused by an incorrect debug change
> I had on top of my wait queue changes. The fixes Keir provided a few
> weeks ago may also have helped.
>
> Is it an otherwise unmodified xen-unstable build, or do you use other
> patches as well? What are your environment and workload in dom0 and
> domU?
>
> It would be very good to know why the reboots happen. Perhaps such
> failures cannot be debugged without special hardware or a detailed
> code review.
>
>
> I just tested an otherwise unmodified xen-unstable build and did not
> encounter reboots while ballooning a single 2G guest up and down. The
> guest simply hung after a few iterations, most likely because v7 of my
> patch again (or still?) has the math wrong in the ring accounting. I
> will check what the issue is. I think v6 was OK in that respect, but I
> will double-check that older version as well.
>
>
>> What we did is take this patch and amalgamate it with some bits from
>> our ring management approach. We're ready to submit that, along with
>> a description of how we test it. It works for us, and it involves
>> wait queues for corner cases.
>
> Now, if the patch you just sent out uses wait queues as well, and
> using wait queues causes sudden host reboots for reasons not yet
> known, how is your patch any better, other than that the reboots
> don't appear to happen anymore?

I believe you were missing some unlocks, which were triggering ASSERTs
on the way into a wait queue.
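Roughly, the problematic pattern looks like the sketch below. This is a
minimal user-space model with illustrative names (spin_lock,
wait_event_checked, ring_put_request are not Xen's actual API): going to
sleep on a wait queue while still holding a spinlock trips an assertion,
so the lock must be dropped first.

```c
#include <assert.h>
#include <stdbool.h>

/* Count of locks currently held; a stand-in for "atomic context". */
static int locks_held;

static void spin_lock(void)   { locks_held++; }
static void spin_unlock(void) { locks_held--; }

/* Stand-in for going to sleep on a wait queue: sleeping while holding
 * a spinlock would deadlock, so the model asserts against it, which is
 * the kind of ASSERT the missing unlocks were triggering. */
static void wait_event_checked(void)
{
    assert(locks_held == 0);
}

static void ring_put_request(bool ring_full)
{
    spin_lock();
    /* ... place the request on the ring ... */
    if (ring_full) {
        spin_unlock();        /* the previously missing unlock ... */
        wait_event_checked(); /* ... must come before sleeping */
        return;
    }
    spin_unlock();
}
```

The fix is simply to make sure every path that reaches the wait queue
has released its locks first.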

In any case, the patch was crashing; we spent quite some time merging it
all toward the endgame we all want (wait queues and better ring logic),
and now it doesn't seem to crash.

But obviously our testing rigs are quite different, which is a good thing.

I'll post the mem access testing code, with a description of how we drive
that test.

Thanks!
Andres
>
> I did not use anything but paging for testing, perhaps I should also run
> some access tests. How should I use tools/tests/xen-access/xen-access.c?
>
> Olaf
>



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

