Xen project Mailing List

Re: [Xen-devel] [PATCH] x86/IRQ: prevent vector sharing within IO-APICs

To: "Andrew Cooper" <andrew.cooper3@xxxxxxxxxx>

From: "Jan Beulich" <JBeulich@xxxxxxxx>

Date: Tue, 15 Nov 2011 11:10:21 +0000

Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>

Delivery-date: Tue, 15 Nov 2011 03:14:25 -0800

List-id: Xen developer discussion <xen-devel.lists.xensource.com>

>>> On 14.11.11 at 18:33, Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote: > On 14/11/11 16:29, Jan Beulich wrote: >>>>> On 14.11.11 at 16:27, Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote: >>> On 14/11/11 14:20, Jan Beulich wrote: >>>> I'm having some more fundamental problem with this original change >>>> of yours: The assignment of the new vector happens in the context >>>> of move_native_irq(), which gets called *after* doing the manual >>>> EOI (and hence also *after* the vector != desc->arch.vector check). >>>> How does that do what it is supposed to? >>>> >>>> (Below the [untested] patch that I would propose to do what I described >>>> above, pending your clarification regarding the original change.) >>>> >>>> Jan >>> Are you sure? I was under the impression that for safety, you have to >>> move the IRQ before ack'ing it at the local APIC, to prevent another IRQ >>> coming in while you are updating the structures. >> Just take another look at end_level_ioapic_irq(). > > If it makes a difference, I was debugging the issue on boxes without > x2APIC, so there was no direct EOI support. Given the number of times Directed EOI has nothing to do with x2apic. > the code conditionally returns depending on direct_eoi_enabled, I would > not be surprised if there are two separate codepaths. >... >> But the crucial thing is that desc->arch.vector gets written during step 2, >> and hence the condition around the EOI can never be true unless an >> interrupt surfaces in the window between the EOI in step 2 and the >> disabling of it in move_native_irq(). Was it just this special case that >> your original change addressed (the change description doesn't hint in >> that direction). > > Given that the bug was certainly a timing issue, and most of the time > the IRQ migrated, everything worked fine, It is quite possible that it > was a new IRQ appearing in that window. Having said that, even if a new > IRQ arrived, its remote IRR should not be set until the LAPIC has acked > it, at which point we should be in the IRQ handler for it which seems > rather hard if we are still in the handler for the previous IRQ. No - the LAPIC acking the IO-APIC request doesn't mean the IRQ is already in the process of being handled by software. In particular, while the first IRQ is in progress, the IRR bit could get set by a second interrupt instance between the EOI and the disabling (in move_native_irq()), bit EFLAGS.IF would still be zero. Once sti (or an equivalent thereof) would get executed, that second interrupt would get delivered from the LAPIC to the execution unit, and the mask-and-ack would invoke irq_complete_move(), which in turn might send the cleanup IPI (which in turn wouldn't get delivered until EFLAGS.IF became 1 again). Depending on whether the IPI arrives before or after end_level_ioapic_irq() runs for this second interrupt instance, the vectors may or may appear to be out of sync (and here the description of your original change seems wrong - you're using the new vector, not the old one, as you want to make sure that the vector written actually matches the already updated RTE(s)). > Perhaps there is a weird interaction whereby acking a masked level > interrupt doesn't reset the IRR. > > I have to admit that the logic seems a bit broken, and it certainly > seems to be more complicated than it needs to be. > >>> An alternative to 3) can occur when using the IRQ_MOVE_CLEANUP_VECTOR >>> IPI which looks through all pending cleanups on the current pcpu and >>> cleans them up. >>> >>> Anyway, it was certainly the case that I saw the RTE being updated >>> before the EOI was sent, which is why I had to change the >>> hw_irq_controler.end interface to include the vector from the pending >>> EOI stack, so end_level_ioapic_irq could EOI the new vector if it was >>> different from the vector the lapic saw. >> That's another thing I would want to revert: There really shouldn't be >> a need to pass in the old vector, as it is getting stored in >> __assign_irq_vector(). If the value wasn't valid anymore at the point >> we'd need it, we'd have a more severe problem - the vector gets >> marked as not in use by smp_irq_move_cleanup_interrupt(), and >> hence it wouldn't be valid anymore to play with it (read: EOI it) as >> it may have got re-used already. As vector_irq[] gets updated there >> too, that should result in "No irq handler for vector ...", but it would >> not make it into end_level_ioapic_irq() in such a case. >> >> Jan > > That is fair enough. When I submitted this patch, irq_cfg did not have > old_vector in it - I added old_vector in a later patch. I really should > have remembered that I could then revert the change to the interface. The point is that we need to be sure that ->arch.old_vector is still valid at this point. Yesterday I though I convinced myself that it would be, but today I'm again not certain anymore. Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxx http://lists.xensource.com/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.