
[Xen-devel] RE: A credit scheduler issue

Keir, Emmanuel,
  Thanks for the detailed answers, and your views. I agree that my small
change should not affect correctness. I didn't see the migration often;
I saw the dom0 VCPUs migrate 4 or 5 times between boot and the start of
xend. I think we should avoid these migrations; why throw away the warm
cache state?
    How solid is the credit scheduler now for DomUs on an SMP box, on
32-bit, PAE, and 64-bit? It would be a useful data point for me while
debugging the HVM guest issues with the credit scheduler.

Thanks & Regards,
Nitin
Open Source Technology Center, Intel Corp

>-----Original Message-----
>From: Emmanuel Ackaouy [mailto:ack@xxxxxxxxxxxxx]
>Sent: Friday, June 30, 2006 3:46 AM
>To: Kamble, Nitin A
>Cc: xen-devel@xxxxxxxxxxxxxxxxxxx; Keir Fraser; Ian Pratt
>Subject: Re: A credit scheduler issue
>Hi Nitin,
>On Thu, Jun 29, 2006 at 06:13:51PM -0700, Kamble, Nitin A wrote:
>>        I am trying to debug the credit scheduler to solve the many
>>    instability issues we have found with the credit scheduler.
>Great. As Keir pointed out though the problems you are seeing
>may not actually be in the credit scheduler itself.
>>        While debugging I noticed an odd behavior: when running on a
>>    2-CPU system, dom0 gets 2 VCPUs by default. And even if there are no
>>    other domains running in the system, the dom0 VCPUs are getting
>>    migrated to different pcpus by the load balancer. I think it is due
>>    to the load balancing happening in the credit scheduler; it is not
>>    necessary and is wasteful to move VCPUs when the number of VCPUs in
>>    the system is equal to the number of pcpus.
>>        I would like to know your thinking about this behavior. Is it
>>    intended in the design?
>This should be very rare. If a VCPU were woken up and put on
>the runq of an idle CPU, a peer physical CPU that is in the
>scheduler code at that exact time could potentially pick up
>the just woken up VCPU.
>We can do things to shorten this window, like not picking up a
>VCPU from a remote CPU that is currently idle and therefore
>probably racing with us to run said newly woken VCPU on
>its runq. But I'm not sure this happens frequently enough to
>warrant the added complexity. On top of that, it seems to
>me this is more likely to happen to VCPUs that aren't doing
>very much work and therefore would not suffer a performance
>loss from occasionally migrating between physical CPUs.
>Are you seeing a lot of these migrations?
>>    I added this small fix to the scheduler to fix this behavior, and
>>    I see the stability of Xen improved. Win2003 boot was crashing with
>>    an unhandled MMIO error on xen64 earlier with the credit scheduler;
>>    I am not seeing that crash with this small fix anymore. It is quite
>>    possible there are more bugs I need to catch for HVM domains in the
>>    credit scheduler. I would like to know your thoughts on this change.
>I don't agree with this change.
>When a VCPU is the only member of a CPU's runq, it's still
>waiting for a _running_ VCPU to yield or block. We should
>absolutely be picking up such a VCPU to run elsewhere on
>an idle CPU. Else, you'd end up with two VCPUs time-slicing
>on a processor while other processors in the system are idle.
>Your change effectively turns off migration on systems where
>the number of active VCPUs is less than 2 multiplied by the
>number of physical CPUs. I can see why that would hide any
>bugs in the context migrating paths, but that doesn't make
>it right. :-)
>>    csched_runq_steal(struct csched_pcpu *spc, int cpu, int pri)
>>    {
>>        struct list_head *iter;
>>        struct csched_vcpu *speer;
>>        struct vcpu *vc;
>>
>>        /* If there is only one VCPU in the queue, then stealing it
>>         * from the queue is not going to help with load balancing.
>>         */
>>        if ( spc->runq.next->next == &spc->runq )
>>            return NULL;
>>    Thanks & Regards,
>>    Nitin
>>    Open Source Technology Center, Intel Corp

Xen-devel mailing list
