[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v5][RFC]xen: sched: convert RTDS from time to event driven model
On Thu, 2016-02-25 at 12:29 -0500, Tianyang Chen wrote: > On 2/25/2016 5:34 AM, Dario Faggioli wrote: > > > > > Which one ASSERT() fails? > > > The replq_insert() fails because it's already on the replenishment > queue > when rt_vcpu_wake() is trying to insert a vcpu again. > > (XEN) Xen call trace: > (XEN) [<ffff82d08012a003>] sched_rt.c#rt_vcpu_wake+0xf0/0x17f > (XEN) [<ffff82d08012be0c>] vcpu_wake+0x213/0x3d4 > (XEN) [<ffff82d08012c327>] vcpu_unblock+0x4b/0x4d > (XEN) [<ffff82d080169cea>] vcpu_kick+0x20/0x6f > (XEN) [<ffff82d080169d65>] vcpu_mark_events_pending+0x2c/0x2f > (XEN) [<ffff82d08010762a>] > event_2l.c#evtchn_2l_set_pending+0xa9/0xb9 > (XEN) [<ffff82d08010822a>] evtchn_send+0x158/0x183 > (XEN) [<ffff82d0801096fc>] do_event_channel_op+0xe21/0x147d > (XEN) [<ffff82d0802439e2>] lstar_enter+0xe2/0x13c > (XEN) > (XEN) > (XEN) **************************************** > (XEN) Panic on CPU 0: > (XEN) Assertion '!__vcpu_on_replq(svc)' failed at sched_rt.c:527 > (XEN) **************************************** > Ok, makes sense. > > > And like you > > > said, mostly spurious sleep > > > happens when a vcpu is running and it could happen in other > > > cases, > > > although rare. > > > > > I think I said already there's no such thing as "spurious sleep". > > Or at > > least, I can't think of anything that I would define a spurious > > sleep, > > if you do, please, explain what situation you're referring to. > > > I meant to say spurious wakeup... > If, for spurious wakeup, we refer to wakeups happening when the vcpu is either running or runnable (and hence in the runqueue already), which I do, we don't even get to call __replq_insert() in those cases. I mean, we leave rt_vcpu_wake() before that, don't we? > If rt_vcpu_sleep() removes vcpus from > replenishment queue, it's perfectly safe for rt_vcpu_wake() to > insert > them back. > It's safe against sleeps, not against blockings. That's the point I'm trying to make. > So, I'm suspecting it's the spurious wakeup that's causing > troubles because vcpus are not removed prior to rt_vcpu_wake(). > However, > the two RETURNs at the beginning of rt_vcpu_wake() should catch that > shouldn't it? > Exactly! > > In any case, one way of dealing with vcpus blocking/offlining/etc > > could > > be to, in context_saved(), in case we are not adding the vcpu back > > to > > the runq, cancel its replenishment event with __replq_remove(). > > > > (This may make it possible to avoid doing it in rt_vcpu_sleep() > > too, > > but you'll need to check and test.) > > > > Can you give this a try. > That makes sense. Doing it in context_saved() kinda implies that if > a > vcpu is sleeping and taken off, its replenishment event should be > removed. On the other hand, the logic is the same as removing it in > rt_vcpu_sleep() but just at different times. > It is, but, if done properly, it catches more cases, or at least that's what I'm after. > Well, I have tried it and > the check still needs to be there in rt_vcpu_wake(). I will send the > next version so it's easier to look at. > If you're still seeing assertion failures, perhaps context_saved() is not the right place where to do that... I'll think more about this... Anyway, yes, let's see the code. Please, also send again, as a follow up email, the assertion failure log you get if you remove the check. Thanks and Regards, Dario -- <<This happens because I choose it to happen!>> (Raistlin Majere) ----------------------------------------------------------------- Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK) Attachment:
signature.asc _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |