[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v8]xen: sched: convert RTDS from time to event driven model

>> So IMHO, the replenishment handler will be called in interrupt
>> handler
>> *and* with interrupt enabled.
>> The only difference between lock_irq() and lock_irqsave() is that
>> lock_irq() can only be called with interrupt enabled. (lock_irq will
>> check if the interrupt is enabled before it disable the interrupt.)
>> So I think it is safe to use lock_irq() inside replenishment handler,
>> unless there is somewhere call this handler *with interrupt
>> disabled*.
> I totally agree: _irq() is ok. (Last sentence it's a bit pointless,
> considering that this function is a timer handler, and I would not
> expect to have a timer handler called by much else than one of the
> timer's interupt. :-) )
>> OK. I changed the spin_lock_irqsave to spin_lock_irq based on this
>> patch. The system works well: system can boot up and will not crash
>> after I create a VM
>> So I think spin_lock_irq should work...
> Glad to hear that.
>> Of course, spin_lock_irqsave() is a more safe choice, with several
>> extra instruction overhead.
>> And in order to be sure _irq works, we need further tests for sure.
> Nah, we're talking about correctness, which is something one should be
> really be able to assess by looking at the code, rather than by
> testing! Looking at the code, one concludes that spin_lock_irq() is
> what should be used. If testing causes crashes due to using it, either:
>  - we were wrong when looking at the code, and we should look better!
>  - there is a bug somewhere else.
> As said, I'm glad that testing so far is confirming the analysis that
> seemed the correct one. :-)
> And I wouldn't say that _irqsave() is "more safe choice". It's either
> necessary, and then it should be used, or unnecessary, and then it
> should not.

OK. From what I saw/read in the code, _irq is enough. Further testing
just to "make sure" I didn't miss anything. :-)

> There may be situations whether it is hard or impossible to tell
> whether interrupt are disabled already or not (because, e.g., there are
> multiple call paths, with either IRQs disabled or not), and in those
> cases, using _irqsave() can be seen as wanting to be on the safer
> side... But this is not one of those cases, so using _irq() is what's
> right and not less safe than anything else.
>> I actually checked somewhere else in schedule.c and couldn't find a
>> place that the priv->lock is used in an irq context with interrupt
>> disabled.
>> So I guess maybe we overuse the spin_lock_irqsave() in the scheduler?
> Perhaps, but I've got a patch series, which I'm trying to find some
> time to finish, where I basically convert almost all locking in
> schedule.c in just plain spin_lock() --i.e., allowing us to keep IRQs
> enabled. If I manage to, _irq() or _irqsave() would not matter any
> longer. :-)
> But I'm confused, by the fact that you mention schedule.c, whether you
> are talking about locking that happens inside RTDS code, or in
> schedule.c (me, I'm talking about the latter).

Some of the functions in RTDS code assumes that lock is grab in the
caller, which is in the schedule.c. That's what I meant by looking at
the locking inside schedule.c.
When RTDS functions is called by function in schedule.c with lock
held, I want to make sure the lock is not held in an interrupt
disabled context, i.e., interrupt has not been disabled before it
tries to grab the lock.

>> I'm ok that we keep using spin_lock_irqsave() for now.
> If you're talking about this patch, then no, _irq() should be used.

OK. then let's use _irq() then unless it causes problem.

>> But maybe
>> later, it will be a better idea to explore if spin_lock_irq() can
>> replace all spin_lock_irqsave() in the RTDS scheduler?
> Maybe, but as I said, I'd like to explore other approaches (I am
> already, actually).

Looking forward to your findings. :-)

>> BTW, I'm unsure about the impact of such change on ARM right now.
> And I'm unsure about why you think this is, or should be, architecture
> dependant... can you elaborate?

I don't think it should have difference in x86 and in ARM. However,
perviously, I remembered that RTDS does not work in ARM because the
critical section in context switch in ARM is much longer than that in
x86. That's why RTDS reports error in ARM in terms of locks and was
fixed by global logic guys.

That's why I'm a little bit concern or hold my opinion on the impact
on ARM, since I didn't run the code on ARM yet and have no test data
to back up my understanding.

Thanks and Best Regards,


Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania

Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.