
Re: [Xen-devel] [PATCH v2] Modified RTDS scheduler to use an event-driven model instead of polling.



[Trimming the Cc-list a bit, to avoid bothering Wei and Jan]

On Tue, 2015-07-07 at 22:56 -0700, Meng Xu wrote:
> Hi Dario,
> 
Hi,

> 2015-07-07 7:03 GMT-07:00 Dario Faggioli <dario.faggioli@xxxxxxxxxx>:
> >
> > On Mon, 2015-07-06 at 22:51 -0700, Meng Xu wrote:

> > So, it looks to me that, as far as (1) and (2) are concerned, since we
> > are "just" inserting a vCPU in the runq, if we have M pCPUs, and we know
> > whether we inserted it within the first M spots, we already have what we
> > want, or am I missing something? And if __runq_insert() now (with Dagaen
> > patch) tells us this, well, we can simplify the tickling logic, can't
> > we?
> 
> I think you might assume that the first M VCPUs in the runq are the
> currently running VCPUs on the M pCPUs. Am I correct? (From what you
> described in the following example, I think I'm correct. ;-) )
> 
Mmm... Interesting. Yes, I was. I was basing that assumption on this
chunk of Dagaen's patch:

    // If we become one of top [# CPUs] in the runq, tickle it
    // TODO: make this work when multiple tickles are required
    if ( new_position > 0 && new_position <= prv->NUM_CPUS )
        runq_tickle(ops, svc);

And I forgot (and did not go check) about the __q_remove() in
rt_schedule(). My bad again.

But then, since we don't have the running vCPUs in the runq, how is the
code above supposed to be correct?
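
(Just to be clear about what I mean by __runq_insert() "telling us" the
position: something along these lines. This is a rough, untested sketch,
not Dagaen's actual code; names like rt_runq(), __q_elem() and
__vcpu_on_q() only approximate what's in sched_rt.c, and the depletedq
handling is left out entirely.)

    static int
    __runq_insert(const struct scheduler *ops, struct rt_vcpu *svc)
    {
        struct list_head *runq = rt_runq(ops);
        struct list_head *iter;
        int pos = 0;

        ASSERT( !__vcpu_on_q(svc) );

        /* Keep the runq EDF-ordered (earliest deadline first) and count
         * how many entries we walk past: that is svc's position. */
        list_for_each ( iter, runq )
        {
            struct rt_vcpu *iter_svc = __q_elem(iter);

            if ( svc->cur_deadline <= iter_svc->cur_deadline )
                break;
            pos++;
        }
        list_add_tail(&svc->q_elem, iter);

        return pos; /* 0 means svc is now at the head of the runq */
    }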

> > With an example:
> > We are waking up (or re-inserting, in rt_context_saved()) vCPU j. We
> > have 6 pCPUs. __runq_insert() tells us that it put vCPU j at the 3rd
> > place in the runq. This means vCPU j should be set to run as soon as
> > possible. So, if vCPU j is 3rd in runq, either
> >  (a) there are only 3 runnable vCPUs (i.e., if we are waking up j, there
> >      were 2 of them, and j is the third; if we are in context_saved,
> >      there already were 3, and j just got its deadline postponed, or
> >      someone else got its one replenished);
> >  (b) there are more than 3 runnable vCPUs, i.e., there is at least a 4th
> >      vCPU --say vCPU k-- in the runq, which was the 3rd before vCPU j
> >      was woken (or re-inserted), but now became the 4th, because
> >      deadline(j)<deadline(k).
> > In case (a), there are for sure idle pCPUs, and we should tickle one of
> > them.
> 
> I can tell that you make the above assumption from here.
> 
> However, in the current implementation, the runq does not hold the
> currently running VCPUs on the pCPUs. We remove the vcpu from the runq
> in the rt_schedule() function. What you described above makes perfect
> sense "if" we decide to make the runq hold the currently running VCPUs.
> 
Yep. And it indeed seems to me that we may well think about doing so. It
would make it possible to base scheduling decisions (and optimizations)
on the insertion position, and at the same time I don't see many
downsides to it, do you?

> Actually, after thinking about the example you described, I think we
> can hold the currently running VCPUs *and* the current idle pCPUs in the
> scheduler-wide structure; 
>
What do you mean by 'current idle pCPUs'? I said something similar as
well, and what I meant was a cpumask with bit i set if the i-th pCPU is
idle; do you mean the same thing?

About the running vCPUs, why not just leave them in the actual runq?
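
(To make the cpumask idea a bit more concrete, something like the below
is what I have in mind. It's only a sketch: the 'idlers' field and the
tickle_idler() helper are made up here, while 'tickled' and the various
queues are roughly what rt_private already has in sched_rt.c.)

    struct rt_private {
        spinlock_t lock;            /* scheduler-wide lock */
        struct list_head sdom;      /* list of domains */
        struct list_head runq;      /* runnable (and running?) vCPUs */
        struct list_head depletedq; /* vCPUs with depleted budget */
        cpumask_t tickled;          /* pCPUs already tickled (existing) */
        cpumask_t idlers;           /* NEW (hypothetical): idle pCPUs */
    };

    /* Case (a): if there is an idle pCPU, just kick one of them. A pCPU
     * would clear its bit in 'idlers' when it picks up actual work. */
    static bool_t tickle_idler(struct rt_private *prv)
    {
        unsigned int cpu = cpumask_first(&prv->idlers);

        if ( cpu >= nr_cpu_ids )
            return 0;

        cpumask_set_cpu(cpu, &prv->tickled);
        cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
        return 1;
    }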

> In other words, we can have another runningq
> (not runq) and an idle_pcpu list in the rt_private; now all VCPUs are
> stored in three queues: runningq, runq, and depletedq, in increasing
> priority order.
> 
Perhaps, but I'm not sure I see the need for another list. Again, why not
just leave them in the runq? I appreciate this is a rather big change
(although, perhaps, it sounds bigger than it actually is), but I think it
could be worth pursuing.

For double checking, asserting, and making sure that we are able to
identify the running svc-s, we have the __RTDS_scheduled flag.
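
(I.e., if the running vCPUs stay in the runq, something as trivial as the
below would be enough to tell them apart when walking it. The
vcpu_is_running() wrapper is made up here, but RTDS_scheduled is the mask
form of the flag already defined in sched_rt.c, set in rt_schedule() and
cleared in rt_context_saved().)

    /* Sketch: true if svc is currently running on some pCPU. */
    #define vcpu_is_running(svc)  ( (svc)->flags & RTDS_scheduled )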

> When we make the tickle decision, we only need to scan the idle_pcpu
> and then runningq to figure out which pCPU to tickle. All of other
> design you describe still hold here, except that the position where a
> VCPU is inserted into runq cannot directly give us which pCPU to
> tickle. What do you think?
> 
I think I'd like to know why you think adding another queue is necessary,
instead of just leaving the vCPUs in the actual runq. Is there something
bad about that which I'm missing?

> > In case (b) there may be idle pCPUs (and, if that's the case, we
> > should tickle one of them, of course) or not. If not, we need to go
> > figure out which pCPU to tickle, which is exactly what runq_tickle()
> > does, but we at least know for sure that we want to tickle the pCPU
> > where vCPU k runs, or others where vCPUs with deadline greater than vCPU
> > k run.
> >
> > Does this make sense?
> 
> Yes, if we decide to hold the currently running VCPUs in
> scheduler-wide structure: it can be runq or runningq.
> 
Yes, but if we use two queues, we defeat at least part of this
optimization/simplification.
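
(Putting the pieces together, the call site in rt_vcpu_wake() /
rt_context_saved() could then look roughly like this. This is entirely
hypothetical, building on the sketches above: __runq_insert() returning
the 0-based position, the idlers mask, and the NUM_CPUS count from
Dagaen's snippet.)

    int pos = __runq_insert(ops, svc);

    if ( pos < prv->NUM_CPUS )
    {
        /* svc is within the first M entries: it should run somewhere. */
        if ( !tickle_idler(prv) )
            /* Case (b): no idle pCPU; preempt the pCPU running the vCPU
             * with the latest deadline, which is what runq_tickle()
             * already knows how to find. */
            runq_tickle(ops, svc);
    }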

> > Still, I think I gave enough material for an actual optimization. What
> > do you think?
> 
> Yes. It is very clear.
> The only thing is how we are going to "realize" your assumption. :-)
>
:-D

> I'm not so sure if we do need the runningq to hold the currently
> running VCPUs, since we can still use "for_each_cpu()" to find all
> pCPUs used in the scheduler and get the currently running VCPU on each.
> Adding a new field to hold all such running VCPUs might help speed up
> the searching, but need more space. So there is a balance between
> space and time.
> 
I don't think the added space would be a problem, but I don't see why we
need it.
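
(FWIW, the scan you describe would look more or less like the below,
which is pretty much what runq_tickle() does already when hunting for the
latest-deadline pCPU; and that is exactly the work that keeping the
running vCPUs in the runq, in deadline order, would let us avoid.
'online' stands for the cpupool's online mask; the rest is illustrative.)

    unsigned int cpu, target = nr_cpu_ids;
    s_time_t latest_deadline = 0;

    for_each_cpu ( cpu, online )
    {
        struct vcpu *v = curr_on_cpu(cpu);

        if ( is_idle_vcpu(v) )
        {
            target = cpu;      /* an idle pCPU: best possible choice */
            break;
        }
        if ( rt_vcpu(v)->cur_deadline > latest_deadline )
        {
            latest_deadline = rt_vcpu(v)->cur_deadline;
            target = cpu;      /* latest-deadline pCPU seen so far */
        }
    }
    /* Tickle 'target' only if it is idle, or if latest_deadline is later
     * than the waking vCPU's own deadline. */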

> Thank you for your guidance and best regards,
> 
Thanks to you for actually doing this work. :-)

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
