
Re: [Xen-devel] [PATCH] mem_event: use wait queue when ring is full


  • To: "Olaf Hering" <olaf@xxxxxxxxx>
  • From: "Andres Lagar-Cavilla" <andres@xxxxxxxxxxxxxxxx>
  • Date: Thu, 12 Jan 2012 08:11:42 -0800
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxx, tim@xxxxxxx, adin@xxxxxxxxxxxxxx
  • Delivery-date: Thu, 12 Jan 2012 16:10:54 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

> On Wed, Jan 11, Andres Lagar-Cavilla wrote:
>
>> > mem_event: use wait queue when ring is full
>> >
>> > This change is based on an idea/patch from Adin Scannell.
>>
>> Olaf,
>> thanks for the post. We'll have to NACK this patch in its current
>> form; it hard-reboots our host during our testing.
>
> That's very unfortunate. I have seen such unexpected reboots myself a
> few weeks ago. I suspect they were caused by an incorrect debug change
> I had on top of my wait queue changes. The fixes Keir provided a few
> weeks ago may also have helped.
>
> Is it an otherwise unmodified xen-unstable build, or do you use other
> patches as well? What are your environment and workload in dom0 and
> domU?
>
> It would be very good to know why the reboots happen. Perhaps such
> failures cannot be debugged without special hardware or a detailed
> code review.
>
>
> I just tested an otherwise unmodified xen-unstable build and did not
> encounter reboots while ballooning a single 2G guest up and down. The
> guest simply hung after a few iterations, most likely because v7 of my
> patch again (or still?) has the math wrong in the ring accounting. I
> will check what the issue is. I think v6 was OK in that respect, but I
> will double-check that older version as well.
>
>
>> What we did is take this patch and amalgamate it with some bits from
>> our ring management approach. We're ready to submit that, along with
>> a description of how we test it. It works for us, and it involves
>> wait queues for corner cases.
>
> Now, if the patch you just sent out uses wait queues as well, and
> using wait queues causes sudden host reboots for reasons not yet
> known, how is your patch any better, other than that the reboots
> don't appear to happen anymore?

I believe you were missing some unlocks, which were triggering ASSERTs
on the way into a wait queue.
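Roughly, the problematic pattern looks like the sketch below. This is a
minimal user-space model with illustrative names (spin_lock,
wait_event_checked, ring_put_request are not Xen's actual API): going to
sleep on a wait queue while still holding a spinlock trips an assertion,
so the lock must be dropped first.

```c
#include <assert.h>
#include <stdbool.h>

/* Count of locks currently held; a stand-in for "atomic context". */
static int locks_held;

static void spin_lock(void)   { locks_held++; }
static void spin_unlock(void) { locks_held--; }

/* Stand-in for going to sleep on a wait queue: sleeping while holding
 * a spinlock would deadlock, so the model asserts against it, which is
 * the kind of ASSERT the missing unlocks were triggering. */
static void wait_event_checked(void)
{
    assert(locks_held == 0);
}

static void ring_put_request(bool ring_full)
{
    spin_lock();
    /* ... place the request on the ring ... */
    if (ring_full) {
        spin_unlock();        /* the previously missing unlock ... */
        wait_event_checked(); /* ... must come before sleeping */
        return;
    }
    spin_unlock();
}
```

The fix is simply to make sure every path that reaches the wait queue
has released its locks first.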

In any case, the patch was crashing; we spent quite some time merging it
all toward the endgame we all want (wait queues and better ring logic),
and now it doesn't seem to crash.

But obviously our testing rigs are quite different, which is a good thing.

I'll post the mem access testing code, with a description of how we drive
that test.

Thanks!
Andres
>
> I did not use anything but paging for testing, perhaps I should also run
> some access tests. How should I use tools/tests/xen-access/xen-access.c?
>
> Olaf
>



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

