
Re: [Xen-devel] [PATCH] scheduler rate controller



Hi, George

Sorry for the late reply. I also ran into issues booting Xen with your patch, 
which is the same as the credit_3.patch that I attached.
So I modified it into credit_1.patch and credit_2.patch, both of which work 
well.
1) credit_1 uses "scheduling frequency counting" to decide the value of 
sched_ratelimit_us, which makes it adaptive.
2) credit_2 uses a constant sched_ratelimit_us value of 1000. 

Although the performance comparison data is still in progress, I would like to 
hear some feedback from you first. 
I should be able to share the data very soon, once the system becomes stable.


Best regards,

Lv, Hui


-----Original Message-----
From: dunlapg@xxxxxxxxx [mailto:dunlapg@xxxxxxxxx] On Behalf Of George Dunlap
Sent: Thursday, November 03, 2011 12:29 PM
To: Lv, Hui
Cc: George Dunlap; Dario Faggioli; Tian, Kevin; xen-devel@xxxxxxxxxxxxxxxxxxx; 
Keir (Xen.org); Dong, Eddie; Duan, Jiangang
Subject: Re: [Xen-devel] [PATCH] scheduler rate controller

On Sat, Oct 29, 2011 at 11:05 AM, Lv, Hui <hui.lv@xxxxxxxxx> wrote:
> I have tried one approach very similar to your idea:
> 1) check whether the currently running vcpu has run for less than 1ms; if so, 
> we return the current vcpu directly without preemption.
> It tries to guarantee that a vcpu can run for as long as 1ms, if it wants to.
> It can reduce the scheduling frequency to some degree, but not very 
> significantly, because 1ms is too light/weak in comparison to the 10ms delay 
> (which the SRC patch used).

Hey Hui, sorry for the delay in response -- FYI I'm at XenSummit Korea 
now, and I'll be on holiday next week.

Do you have the patch that you wrote for the 1ms delay handy, and any numbers 
from the runs you did?  I'm a bit surprised that a 1ms delay didn't have much 
effect; but in any case, dialing that value up should help -- e.g., if we 
changed it to 10ms, it should behave similarly to the patch 
that you sent before.

> As you said, if we apply the several_ms_delay, it will happen whether the 
> system is normal or not (i.e., whether the scheduling frequency is 
> excessive). This may have the consequence that 1) under normal conditions, 
> it produces worse QoS than without such a delay,

Perhaps; but the current credit scheduler may already allow a VM to run 
exclusively for 30ms, so I don't think that overall it should have a big 
influence.

> 2) under the excessive-frequency condition, the mitigation effect of a 1ms 
> delay may be too weak. In addition, your idea is to delay scheduling instead 
> of reducing it, which means the total number of scheduling events would 
> probably not change.

Well, it will prevent preemption; so as long as at least one VM does not yield, 
it will cap the number of schedule events at 1000 per second.  If all 
VMs yield, then you can't really reduce the number of scheduling events anyway 
(even with your preemption-disable patch).

> I think one possible solution is to make the value of the 1ms delay 
> adaptive according to the system status (low load or high load). If 
> so, the SRC patch just covers the excessive condition currently :). 
> That's why I mentioned treating the normal and excessive conditions 
> separately, and influencing the normal case as little as possible, 
> because we never know the consequences without a large amount of testing 
> work. :)

Yes, exactly. :-)

> Some of my stupid thinking :)

Well, you've obviously done a lot more looking recently than I have. :-)

I'm attaching a prototype minimum timeslice patch that I threw together last 
week.  It currently hangs during boot, but it will give you the idea of what I 
was thinking of.

Hui, can you let me know what you think of the idea, and if you find it 
interesting, could you try to fix it up, and test it?  Testing it with bigger 
values like 5ms would be really interesting.

 -George

>
> Best regards,
>
> Lv, Hui
>
>
> -----Original Message-----
> From: George Dunlap [mailto:george.dunlap@xxxxxxxxxx]
> Sent: Saturday, October 29, 2011 12:19 AM
> To: Dario Faggioli
> Cc: Lv, Hui; George Dunlap; Duan, Jiangang; Tian, Kevin; 
> xen-devel@xxxxxxxxxxxxxxxxxxx; Keir (Xen.org); Dong, Eddie
> Subject: RE: [Xen-devel] [PATCH] scheduler rate controller
>
> On Fri, 2011-10-28 at 11:09 +0100, Dario Faggioli wrote:
>> Not sure yet, I can imagine it's tricky and I need to dig a bit more 
>> in the code, but I'll let you know if I find a way of doing that...
>
> There are lots of reasons why the SCHEDULE_SOFTIRQ gets raised.  But I think 
> we want to focus on the scheduler itself raising it as a result of the 
> .wake() callback.  Whether the .wake() happens as a result of a HW interrupt 
> or something else, I don't think really matters.
>
> Dario and Hui, neither of you has commented on my idea, which is simply: 
> don't preempt a VM if it has run for less than some amount of time (say, 
> 500us or 1ms).  If a higher-priority VM is woken up, see how long the current 
> VM has run.  If it's less than 1ms, set a 1ms timer and call schedule() then.
>
>> > > More generally speaking, I see how this feature can be useful, 
>> > > and I also think it could live in the generic schedule.c code, 
>> > > but (as George was saying) the algorithm by which rate-limiting 
>> > > is happening needs to be well known, documented and exposed to 
>> > > the user (more than by means of a couple of perf-counters).
>> > >
>> >
>> > One question is: what is the right place to document such 
>> > information? I'd like to make it as clear as possible to the users.
>> >
>> Well, I don't know; maybe a WARN (a WARN_ONCE-like thing would 
>> probably be better), or in general something that leaves a footprint in 
>> the logs, so that one can find out by means of `xl dmesg' or similar. 
>> Obviously, I'm not suggesting printk-ing each suppressed schedule 
>> invocation, or the overhead would get even worse... :-P
>>
>> I'm thinking of something that happens the very first time the 
>> limiting fires, or maybe once every some period/number of suppressions, 
>> just to remind the user that he's getting weird behaviour because 
>> _he_enabled_ rate-limiting. Hopefully, that might also be useful for 
>> the user to fine-tune the limiting parameters, although I 
>> think the perf counters are already quite well suited for this.
>
> As much as possible, we want the system to Just Work.  Under normal 
> circumstances it wouldn't be too unusual for a VM to have a several-ms delay 
> between receiving a physical interrupt and being scheduled; I think that if 
> the 1ms delay works, having it on all the time would probably be the best 
> solution.  That's another reason I'm in favor of trying it -- it's simple and 
> easy to understand, and doesn't require detecting when to "turn it on".
>
>  -George
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel
>
>

Attachment: credit_1.patch
Description: credit_1.patch

Attachment: credit_2.patch
Description: credit_2.patch

Attachment: credit_3.patch
Description: credit_3.patch


 

