
Re: [Xen-devel] credit scheduler and HYPERVISOR_yield()


  • To: "George Dunlap" <gdunlap@xxxxxxxxxxxxx>
  • From: Emmanuel Ackaouy <ackaouy@xxxxxxxxx>
  • Date: Mon, 15 Oct 2007 19:13:01 +0200
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxx, John Levon <levon@xxxxxxxxxxxxxxxxx>
  • Delivery-date: Mon, 15 Oct 2007 10:13:28 -0700
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

I suspect yield() was first devised as a simple synchronization mechanism
for uni-processor round robin schedulers.

Then strict priorities were added to make certain tasks (like pagers) run
more aggressively than "normal" ones. As long as these high priority
threads don't use the yield() mechanism, things are fine. I believe you are
pointing out that from the perspective of the yield() mechanism, all
time-share priorities (UNDER and OVER) should be considered one and
the same because they are not strict priorities. This is a good observation
and I agree with you (as long as reasonable uses of yield() don't cause
fairness to go out the window).

However, before you go and fix yield(), you might want to consider this:

1- It's been proposed before that things like dom0 VCPUs be scheduled
with a priority strictly greater than any domU VCPU. If strict priorities are
introduced into the Xen scheduler at some point in the future, code that
assumes that a yield() from a VCPU will allow all other runnable VCPUs
in the system a chance to run ahead of it will break (again).

2- Priorities aside, on an SMP host (i.e. all computers) with distributed
run queues, it is nontrivial to guarantee that a VCPU will not be
rescheduled until all other runnable VCPUs have had a chance to run
first. If you can come up with a simple and scalable way to do it, great.
I suspect you will need to approximate this definition of yield() though,
perhaps by using some form of directed yield, targeted at one or more
VCPUs, as you have suggested.

3- Yield really isn't a great model to do synchronization in an SMP world.
If you're going to para-virtualize your IPI and spinlock paths, as you
pointed out in your last mail, you might as well do something that can be
directed and block if necessary.



I guess my point is that instead of working real hard to try and maintain
the old yield behavior ("don't run again until all other runnable VCPUs
have had a chance to run first") on an SMP scheduler which potentially
also has to deal with strict priorities, you'd be better off spending your
energy on building and optimizing simpler and more targeted
synchronization mechanisms and using those instead. User level
threads libraries may be a good place to look for inspiration if you're
really worried about the costs of supervisor to hypervisor context
switches. I'm not a huge fan of shared pages but it was popular to
write papers about them for user level thread synchronization back in
the 90s.

In the case of IPIs, you're already going into the hypervisor so you
should be able to do something straightforward with a sleeping
semaphore. Maybe you spin a little before you sleep though to give
running VCPUs a chance to respond before you give up the end of
your time slice.
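
As a rough illustration of the spin-then-sleep idea for IPI waits, here is
a user-space sketch in C using pthreads and C11 atomics. It is not Xen
code: the names spin_sem, spin_sem_wait, spin_sem_post and SEM_SPIN_LIMIT
are all made up for the example, and the spin count is an arbitrary
placeholder. The waiter polls a counter for a bounded number of tries and
only then blocks on a condition variable.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tuning knob: how many polls before giving up the CPU. */
#define SEM_SPIN_LIMIT 1000

typedef struct {
    atomic_int      count;  /* posts that have not been consumed yet */
    pthread_mutex_t mtx;    /* orders sleepers against posters */
    pthread_cond_t  cv;     /* where waiters sleep once spinning fails */
} spin_sem;

void spin_sem_init(spin_sem *s, int initial)
{
    assert(initial >= 0);
    atomic_init(&s->count, initial);
    pthread_mutex_init(&s->mtx, NULL);
    pthread_cond_init(&s->cv, NULL);
}

/* Consume one post if one is available; never blocks. */
bool spin_sem_trywait(spin_sem *s)
{
    int c = atomic_load(&s->count);
    while (c > 0) {
        /* On failure c is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&s->count, &c, c - 1))
            return true;
    }
    return false;
}

/* Spin briefly in case the responder is running, then sleep. */
void spin_sem_wait(spin_sem *s)
{
    for (int i = 0; i < SEM_SPIN_LIMIT; i++) {
        if (spin_sem_trywait(s))
            return;
    }
    pthread_mutex_lock(&s->mtx);
    while (!spin_sem_trywait(s))
        pthread_cond_wait(&s->cv, &s->mtx);
    pthread_mutex_unlock(&s->mtx);
}

/* Called by the responder (e.g. the IPI target) when it is done. */
void spin_sem_post(spin_sem *s)
{
    atomic_fetch_add(&s->count, 1);
    /* Taking the mutex here means a waiter that has just failed its
     * final check is already asleep by the time we signal it. */
    pthread_mutex_lock(&s->mtx);
    pthread_cond_signal(&s->cv);
    pthread_mutex_unlock(&s->mtx);
}
```

In a hypervisor the blocking step would presumably be a block-type
hypercall rather than a condition variable, and the post side would avoid
the mutex when nobody is sleeping; the sketch keeps both simple on
purpose.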

For spinlocks, I suspect turning a spinlock into a sleeping lock after
a reasonable number of spins would work well too.
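
A minimal sketch of that spinlock idea, again in user-space C with
pthreads and C11 atomics rather than anything from the Xen tree. The
names hybrid_lock, hybrid_lock_acquire, hybrid_lock_release and
LOCK_SPIN_LIMIT are hypothetical, and what counts as a "reasonable
number of spins" is left as an untuned constant.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical spin budget before falling back to sleeping. */
#define LOCK_SPIN_LIMIT 1000

typedef struct {
    atomic_bool     locked;   /* the actual lock word */
    pthread_mutex_t mtx;      /* protects waiters and the sleep/wake path */
    pthread_cond_t  cv;       /* sleepers park here */
    int             waiters;  /* number of threads currently asleep */
} hybrid_lock;

void hybrid_lock_init(hybrid_lock *l)
{
    atomic_init(&l->locked, false);
    pthread_mutex_init(&l->mtx, NULL);
    pthread_cond_init(&l->cv, NULL);
    l->waiters = 0;
}

/* One attempt to grab the lock word; never blocks. */
bool hybrid_lock_try(hybrid_lock *l)
{
    bool expected = false;
    return atomic_compare_exchange_strong(&l->locked, &expected, true);
}

void hybrid_lock_acquire(hybrid_lock *l)
{
    /* Fast path: behave like a plain spinlock for a bounded time. */
    for (int i = 0; i < LOCK_SPIN_LIMIT; i++) {
        if (hybrid_lock_try(l))
            return;
    }
    /* Slow path: the holder is probably not running, so sleep. */
    pthread_mutex_lock(&l->mtx);
    l->waiters++;
    while (!hybrid_lock_try(l))
        pthread_cond_wait(&l->cv, &l->mtx);
    l->waiters--;
    pthread_mutex_unlock(&l->mtx);
}

void hybrid_lock_release(hybrid_lock *l)
{
    assert(atomic_load(&l->locked));
    atomic_store(&l->locked, false);
    /* Always taking the mutex is slow but keeps the wakeup race-free:
     * a waiter that failed its check is asleep before we signal. */
    pthread_mutex_lock(&l->mtx);
    if (l->waiters > 0)
        pthread_cond_signal(&l->cv);
    pthread_mutex_unlock(&l->mtx);
}
```

A spinner can still take the lock ahead of a freshly woken sleeper, so
this makes no fairness promise; the sleeper just rechecks and goes back
to sleep.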

In the long run, it would probably be beneficial to remove most uses
of the generic yield mechanism.

Emmanuel.




 

