
Re: [Xen-devel] [RFC][PATCH] scheduler: credit scheduler for client virtualization

  • To: "NISHIGUCHI Naoki" <nisiguti@xxxxxxxxxxxxxx>
  • From: "George Dunlap" <George.Dunlap@xxxxxxxxxxxxx>
  • Date: Fri, 5 Dec 2008 11:37:11 +0000
  • Cc: Ian.Pratt@xxxxxxxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxx, disheng.su@xxxxxxxxx, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
  • Delivery-date: Fri, 05 Dec 2008 03:37:39 -0800
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

On Fri, Dec 5, 2008 at 2:47 AM, NISHIGUCHI Naoki
<nisiguti@xxxxxxxxxxxxxx> wrote:
> Oh, I misread the word "battery". I understand what "a battery of tests"
> means.
> By the way, what tests do you concretely do? I have no idea on these tests.

For basic workload tests, a couple are pretty handy.  vConsolidate is
a good test, but pretty hard to set up; I should be able to manage it
with our infrastructure here, though.  Other tests include:
* kernel-build (i.e., time how long it takes to build the Linux
kernel) and/or ddk-build (the Windows equivalent)
* specjbb (a cpu-intensive workload)
* netperf (for networks)

For testing its effect on the network, the paper I mentioned has three
workloads that it combines in different ways:
* cpu (just busy spinning)
* sustained network (netbench): throughput
* network ping: latency.

> OK.
> We must consider also a sleeping vcpu. The vcpu will be added to the queue
> by wakeup. So, we can set the timer to 2ms only if the next waiting vcpu on
> the queue or the sleeping vcpu is also BOOST.
> My thought about 2ms is: the period that the vcpu will be executed next is
> 2ms. Therefore, time slice of the vcpu is changed according to the number of
> existing vcpus. In a word, we may set the timer to 2ms or less. But I think
> that the number of vcpus will not be so much. Is this supposition wrong? And
> how about time slice of 2ms or less?

I think I understand you to mean: If we set the timer for 10ms, and in
the meantime another vcpu wakes up and is set at BOOST, then it won't
get a chance to run for another 10ms.  And you're suggesting that we
run the scheduler at 2ms if there are any vcpus that *may* wake up and
be at BOOST, just in case; and you don't think this situation will
happen very often.  Is that correct?

Unfortunately, in consolidated server workloads you're pretty likely
to have more vcpus than physical cpus, so I think this case would come
up pretty often.  Furthermore, 2ms is really too short a scheduling
quantum for normal use, especially for HVM domains, which have to take
a vmexit/vmenter cycle to handle every interrupt.  (I did some tests
back when we were using the SEDF scheduler, and the scheduling alone
was a 4-5% overhead for HVM domains.)
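
To make the quantum argument concrete, here is a back-of-the-envelope
model, not a measurement: assume every quantum of useful work is
followed by a fixed per-switch cost.  The function name and the
fixed-cost model are assumptions made for illustration; the real cost
for an HVM domain depends on the vmexit/vmenter path and is not modelled
here.

```c
/*
 * Hypothetical model: each scheduling quantum of useful work is
 * followed by a fixed switch cost.  Returns the fraction of cpu time
 * spent on scheduling rather than on guest work.
 */
static double sched_overhead(double quantum_us, double switch_cost_us)
{
    return switch_cost_us / (quantum_us + switch_cost_us);
}
```

Under this model, for any switch cost that is small compared to the
quantum, going from a 10ms quantum to a 2ms quantum multiplies the
overhead by a little under five.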

But I don't think we actually have a problem here: if a vcpu wakes up
and is promoted to BOOST, won't it "tickle" the runqueues to find
somewhere for it to run?  At the very least the current cpu should be
able to run it, or if it's already running one at BOOST, it can set its
own timer to 2ms.  In any case, I think handling this corner case with
some extra code is preferable to running a 2ms timer any time it
*might* happen.
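
A minimal sketch of that wakeup path, under stated assumptions: the
struct, the enum, and the names wake_boosted, SLICE_NORMAL_MS and
SLICE_BOOST_MS are all hypothetical and are not the credit scheduler's
real definitions.  It only illustrates the idea of shortening the timer
on demand instead of always running it at 2ms.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical priorities and timer values; not the real Xen ones. */
enum prio { PRIO_BOOST, PRIO_UNDER, PRIO_OVER };

#define SLICE_NORMAL_MS 10
#define SLICE_BOOST_MS   2

struct pcpu {
    enum prio running;   /* priority of the vcpu currently running */
    int boost_waiting;   /* BOOST vcpus queued behind it */
    int timer_ms;        /* time until the scheduler next runs */
    bool resched;        /* set when this cpu has been "tickled" */
};

/* Called when a vcpu wakes up and is promoted to BOOST on this cpu. */
static void wake_boosted(struct pcpu *c)
{
    assert(c != NULL);

    if (c->running != PRIO_BOOST) {
        /* Lower-priority work is running: ask for a reschedule now. */
        c->resched = true;
    } else {
        /* Already running a BOOST vcpu: queue the new one and shorten
         * the timer so it waits at most SLICE_BOOST_MS. */
        c->boost_waiting++;
        if (c->timer_ms > SLICE_BOOST_MS)
            c->timer_ms = SLICE_BOOST_MS;
    }
}
```

In this sketch a cpu running non-BOOST work keeps its normal 10ms timer
and is simply preempted, so the 2ms timer is only armed when two BOOST
vcpus actually compete for the same cpu.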

> OK.
> I'll separate individual changes from current patch and post each patch.

Thanks!  I'll take them for a spin today.
