Re: [Xen-devel] Need help with fixing the Xen waitqueue feature


  • To: Olaf Hering <olaf@xxxxxxxxx>
  • From: Keir Fraser <keir.xen@xxxxxxxxx>
  • Date: Wed, 23 Nov 2011 21:03:39 +0000
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
  • Delivery-date: Wed, 23 Nov 2011 21:04:53 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>
  • Thread-index: AcyqFSPigRbmOh5ymkyxVkLvJHSMcgADjugi
  • Thread-topic: [Xen-devel] Need help with fixing the Xen waitqueue feature

On 23/11/2011 19:21, "Keir Fraser" <keir.xen@xxxxxxxxx> wrote:

> On 23/11/2011 18:31, "Olaf Hering" <olaf@xxxxxxxxx> wrote:
> 
>> On Wed, Nov 23, Keir Fraser wrote:
>> 
>>> We have quite a big waitqueue problem actually. The current scheme of
>>> per-cpu stacks doesn't work nicely, as the stack pointer will change if a
>>> vcpu goes to sleep and then wakes up on a different cpu. This really doesn't
>>> work nicely with preempted C code, which may implement frame pointers and/or
>>> arbitrarily take the address of on-stack variables. The result will be
>>> hideous cross-stack corruptions, as these frame pointers and cached
>>> addresses of automatic variables will reference the wrong cpu's stack!
>>> Fixing or detecting this in general is not possible afaics.
>> 
>> Yes, I was thinking about that wakeup on different cpu as well.
>> As a quick fix/hack, perhaps the scheduler could make sure the vcpu
>> wakes up on the same cpu?
> 
> Could save old affinity and then vcpu_set_affinity. That will have to do for
> now. Actually it should work okay as long as toolstack doesn't mess with
> affinity meanwhile. I'll sort out a patch for this.
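
To illustrate the cross-stack hazard described in the quoted text above, here is a minimal, self-contained sketch. It is a toy model, not Xen code: `toy_frame`, `toy_stack`, `toy_sleep_on`, `toy_wake_on` and `toy_pointer_is_stale` are hypothetical names invented for this example, and each per-cpu "stack" is reduced to a single frame holding one variable and a cached pointer to it.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_NR_CPUS 2

/* Hypothetical stand-in for a saved stack frame: one automatic variable
 * plus a cached absolute pointer to it (like a frame pointer or &local). */
struct toy_frame {
    int value;
    int *cached_addr;
};

/* One toy "stack" per physical cpu; this is not Xen's real layout. */
static struct toy_frame toy_stack[TOY_NR_CPUS];

/* The vcpu goes to sleep on 'cpu': its frame lives on that cpu's stack. */
static struct toy_frame *toy_sleep_on(unsigned int cpu, int value)
{
    assert(cpu < TOY_NR_CPUS);
    toy_stack[cpu].value = value;
    toy_stack[cpu].cached_addr = &toy_stack[cpu].value;
    return &toy_stack[cpu];
}

/* The vcpu wakes on 'to': the saved frame is copied byte for byte, so
 * cached_addr still holds an address inside the old cpu's stack. */
static struct toy_frame *toy_wake_on(unsigned int from, unsigned int to)
{
    assert(from < TOY_NR_CPUS && to < TOY_NR_CPUS);
    if (from != to)
        memcpy(&toy_stack[to], &toy_stack[from], sizeof(toy_stack[to]));
    return &toy_stack[to];
}

/* True when the cached pointer no longer refers to this frame's variable. */
static bool toy_pointer_is_stale(const struct toy_frame *f)
{
    return f->cached_addr != &f->value;
}
```

In this model, sleeping on cpu 0 and waking on cpu 1 leaves `cached_addr` pointing into cpu 0's stack, so a write through it modifies the wrong cpu's data. Waking on the same cpu keeps the pointer valid, which is the property the quick fix relies on.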

I've attached three patches for you to try. They apply in sequence:
00: A fixed version of "domain_crash on stack overflow".
01: Reorders prepare_to_wait so that the vcpu will always be on the
    waitqueue on exit (even if it has just been woken).
02: Ensures the vcpu wakes up on the same cpu that it slept on.

We need all of these. They just need testing to make sure they aren't
horribly broken. You should be able to test on a multi-processor host again
with these.
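
The idea behind 02 (save the old affinity, restrict the vcpu to the cpu it slept on, restore on wakeup) can be sketched roughly as below. This is only an illustration under stated assumptions, not the attached patch: `toy_vcpu`, its fields, and the `toy_*` functions are hypothetical, and the affinity is modelled as a plain 64-bit mask rather than Xen's real cpumask and vcpu_set_affinity interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical vcpu model; field names are illustrative, not Xen's. */
struct toy_vcpu {
    unsigned int processor;   /* cpu the vcpu last ran on */
    uint64_t affinity;        /* bitmask of cpus the vcpu may run on */
    uint64_t saved_affinity;  /* mask before pinning; valid while pinned */
    bool pinned;
};

/* Before sleeping: remember the old mask and restrict to the current cpu. */
static void toy_pin_for_wait(struct toy_vcpu *v)
{
    assert(v->processor < 64);
    assert(!v->pinned);
    v->saved_affinity = v->affinity;
    v->affinity = UINT64_C(1) << v->processor;
    v->pinned = true;
}

/* Toy scheduler: use 'hint' if allowed, else the lowest allowed cpu. */
static unsigned int toy_pick_cpu(const struct toy_vcpu *v, unsigned int hint)
{
    if (hint < 64 && ((v->affinity >> hint) & 1))
        return hint;
    for (unsigned int cpu = 0; cpu < 64; cpu++)
        if ((v->affinity >> cpu) & 1)
            return cpu;
    return v->processor;  /* empty mask: stay where we are */
}

/* After wakeup: restore the mask saved by toy_pin_for_wait(). */
static void toy_unpin_after_wake(struct toy_vcpu *v)
{
    assert(v->pinned);
    v->affinity = v->saved_affinity;
    v->pinned = false;
}
```

While pinned, the toy scheduler can only choose the cpu the vcpu slept on. Note that in this sketch any affinity change made while the vcpu is pinned would be overwritten on restore, which mirrors the caveat about the toolstack changing affinity in the meantime.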

 -- Keir

>  -- Keir
> 
>> Olaf
> 
> 

Attachment: 00-prep-to-wait-dom-crash
Description: Binary data

Attachment: 01-prep-to-wait-reorder
Description: Binary data

Attachment: 02-waitq-set-vcpu-affinity
Description: Binary data

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel