
Re: [Xen-devel] [PATCH v8]xen: sched: convert RTDS from time to event driven model



On Mon, Mar 14, 2016 at 7:48 AM, Dario Faggioli
<dario.faggioli@xxxxxxxxxx> wrote:
> On Sun, 2016-03-13 at 11:43 -0400, Meng Xu wrote:
>> On Sat, Mar 12, 2016 at 5:21 PM, Chen, Tianyang <tiche@xxxxxxxxxxxxxx
>> > wrote:
>> > On 03/11/2016 11:54 PM, Meng Xu wrote:
>> > > One more thing we should think about is:
>> > > How can we "prove/test" the correctness of the scheduler?
>> > > Can we use xentrace to record the scheduling trace and then write
>> > > a
>> > > userspace program to check the scheduling trace is obeying the
>> > > priority rules of the scheduler.
>> > >
>> > Thanks for the review Meng, I am still exploring xentrace and it
>> > can output
>> > scheduling events such as which vcpu is running on a pcpu. I think
>> > it's
>> > possible for the userspace program to check RTDS, based on
>> > cur_budget and
>> > cur_deadline. We need to have a very clear outline of rules, for
>> > the things
>> > we are concerned about. When you say correctness, what does it
>> > include? I'm
>> > thinking about rules for when a vcpu should preempt, tickle and
>> > actually be
>> > picked.
>> What you said should be included...
>> What I have in mind is checking the invariants of the EDF scheduling
>> policy.
>> For example, at any time, the running VCPUs should have higher
>> priority than the VCPUs in the runq;
>> at any time, the runq and the replenishment queue should be sorted
>> according to EDF policy.
>>
> This would be rather useful, but it's really difficult. It was "a
> thing" already when I was doing research on RT systems, i.e., a few
> years ago.
>
> Fact is, there will always be (hopefully transitory) situations where
> the schedule is not compliant with EDF, because of scheduling
> overhead, timer resolution, irqs waiting to be re-enabled, etc.
> The hard part, as far as I can remember, is distinguishing between an
> actual transient state and a real bug in the coding of the algorithm.
>
> At the time, there was some work on it, and my research group was also
> interested in doing something similar for the EDF scheduler we were
> pushing to Linux. We never got to do much, though, and the only
> reference I can recall of and find, right now, of others' work is this:
>
> https://www.cs.unc.edu/~mollison/unit-trace/index.html
> http://www.cs.unc.edu/~mollison/pubs/ospert09.pdf

Right! I knew about this one from LITMUS, and it is great! Every time
Bjorn updates LITMUS, he only needs to run a bunch of tests to make
sure the update does not mess things up.
If we could have something like this, that would be awesome!
I suspect they face a similar situation to the one we have here: they
also have scheduling latency, timer resolution issues, etc.
We could probably ask them.

Actually, if we can bound the time spent in the transient state, that
will also be useful! It would at least tell us how closely the
scheduler follows the gEDF scheduling policy. :-)
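To make the idea concrete, here is a minimal sketch of what such a
userspace checker could look like. Everything in it is an assumption:
the `Event` record format, the "run"/"queue" event kinds, and the
TOLERANCE value are all hypothetical stand-ins for whatever a parsed
xentrace/xenalyze log would actually provide. It checks the gEDF
invariant discussed above (no running VCPU may have a later deadline
than a queued one) while tolerating bounded transient violations:

```python
from collections import namedtuple

# Hypothetical, simplified trace record; a real tool would parse
# xentrace/xenalyze output into something like this first.
Event = namedtuple("Event", "time kind vcpu deadline")  # kind: "run" or "queue"

TOLERANCE = 100  # in us; how long a transient EDF violation may last


def check_edf(events):
    """Return (vcpu, start_time) pairs for violations outliving TOLERANCE.

    Invariant checked: every running VCPU's deadline is no later than
    the earliest deadline among queued (runnable, not running) VCPUs.
    Events are assumed sorted by time.
    """
    running, queued = {}, {}   # vcpu -> current deadline
    pending = {}               # vcpu -> time its violation started
    violations = []
    for ev in events:
        if ev.kind == "run":
            queued.pop(ev.vcpu, None)
            running[ev.vcpu] = ev.deadline
        elif ev.kind == "queue":
            running.pop(ev.vcpu, None)
            queued[ev.vcpu] = ev.deadline
        # A running VCPU is "bad" if some queued VCPU has an earlier deadline.
        if queued:
            earliest_queued = min(queued.values())
            bad = {v for v, d in running.items() if d > earliest_queued}
        else:
            bad = set()
        # Only report a violation once it has outlived the tolerance window.
        for v in bad:
            start = pending.setdefault(v, ev.time)
            if ev.time - start > TOLERANCE:
                violations.append((v, start))
                pending.pop(v)
        # Violations that resolved themselves were transient; forget them.
        for v in list(pending):
            if v not in bad:
                pending.pop(v)
    return violations
```

For example, if v1 runs with deadline 500 while v2 sits queued with
deadline 200 for longer than the tolerance window, the checker reports
v1. This only checks the invariant at event boundaries, so its
resolution is limited by how often the trace emits events; that
limitation seems inherent to any trace-based approach.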

>
> It was for LITMUS-RT, so adaptation would be required to make it
> process a xentrace/xenalyze event log (provided it's finished and
> working, which I don't think it is).
>
> I can ask my old colleagues if they remember more, and whether anything
> happened since I left, in the RT community about that (although, the
> latter, you guys are in a way better position than me to check! :-P).
>

Sure! I will also ask around and will get back to this list later.

Thanks,

Meng

-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
