[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v2 07/10] xen/arm: Release maintenance interrupt when CPU is hot-unplugged
On 27/04/18 15:38, Mirela Simonovic wrote: Can you please give a pointer to code in Linux? The only place I see the GICC disabled is when using versatile TC2. It looks like it is just because they don't have PSCI support so Linux has to do the power management. That's not a platform we are ever going to fully support in Xen.Hi, On Fri, Apr 27, 2018 at 4:15 PM, Tim Deegan <tim@xxxxxxx> wrote:Hi, At 10:28 +0100 on 27 Apr (1524824906), Julien Grall wrote:On 26/04/18 15:23, Tim Deegan wrote:At 11:08 +0100 on 26 Apr (1524740921), Julien Grall wrote:On 20/04/18 13:25, Mirela Simonovic wrote:This looks a bit weird. AFAIU, if you disable the CPU interface, then you should never receive interrupt after. So why would you re-enable them? I realize the code in __cpu_disbale do that, but this looks quite wrong to me. There are no way to receive queued timer interrupt afterwards.That is what I took as a reference, but I asked myself the same. There is (extremely small, but it exists) time window between disabling irq locally and disabling CPU interface. An interrupt received in that time window would propagate to the CPU but I'm not sure would happen after the GIC CPU interface is disabled and interrupts are locally enabled. That is the only explanation I can come up with, although I believe the answer is nothing. Since you're at ARM you could check this internally.Speaking with Andre (in CC), the GIC CPU interface may have forwarded an interrupt to the processor before it gets disabled. So when the interrupt will be re-enabled, the processor will jump to the interrupt exception entry. However, looking at the spec (4-78 in ARM IHI 0048B.b), Xen will read a spurious interrupt ID from GICC_IAR. So I am not sure what the point of that code. It looks like it has been taken from x86, but some bits are missing. AFAIU, x86 will only suspend the timer afterwards (see time_suspend). I am not fully sure why this code is there on Arm. Whether we expect a timer interrupt to come up. Stefano, Tim, do you have any insight on that code?Sorry, no. I pretty clearly copied this logic from x86, which copied it directly from Linux at some point in the past. I don't know why x86 does it this way, and I haven't dived into linux to find out. :) But draining the outstanding IRQs seems like a polite thing to do if you're ever going to re-enable this CPU (at least without resetting it first).I am not entirely sure what you mean by draining, do you mean they will serviced by Xen? If so, what kind of interrupts do you expect to be serviced (e.g PPI, SPIs) ?All I mean is, when you disable the GICC (or APIC, or whatever), you know that it won't send any more interrupts to the CPU. But you would like to also be certain that any interrupts it already sent to the CPU get processed now. Otherwise, if you bring the CPU up again later that interrupt could still be there. Better to get it out of the way now, right? AIUI that's what x86 is doing by re-enabling interrupts and waiting a bit, which seems a bit crude but OK. ARM could maybe do the same thing by disabling GICC, dsb(), then disable interrupts. But I don't understand the interface between GICD, GICC and CPU well enough to reason about it properly. It's also possible that there's some subtlety of the timer interrupt handling that I don't know about -- I _think_ that the reason timer interrupts are relevant is that they're generated inside the APIC, so that even when no interrupts are routed to the core, the APIC could still generate one as it's being shut down.Clearly, this code does not seem to be doing what we are expecting. Speaking the Marc Z. (GIC maintainers in Linux), there are no need to disable the GIC CPU interface in the hypervisor/OS. You are going to shutdown the CPU and it will be reset when you are coming back.I don't think this assumption is guaranteed to be correct. In current implementation of ATF and if no additional security software runs the assumption would likely be correct, but it wouldn't be correct in general. Linux does disable GICC in such a scenario. So changing this could cause problems in some scenarios, while keeping it makes no harm. While I guess this code makes no harm, it does not do what is expected (i.e draining the interrupt). I can't see any reason to keep wrong code, we should really aim to have code that match the architecture. And better to fix it when we discover the problem rather than waiting until we rediscovered it later. So at least a patch to update the code/comment should be done. Thanks for the feedback, I really appreciate it. Cheers, -- Julien Grall _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |