[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v3] x86/i8259: do not assume interrupts always target CPU0
On Thu, Oct 26, 2023 at 09:59:42AM +0200, Jan Beulich wrote: > On 24.10.2023 16:53, Roger Pau Monne wrote: > > Sporadically we have seen the following during AP bringup on AMD platforms > > only: > > > > microcode: CPU59 updated from revision 0x830107a to 0x830107a, date = > > 2023-05-17 > > microcode: CPU60 updated from revision 0x830104d to 0x830107a, date = > > 2023-05-17 > > CPU60: No irq handler for vector 27 (IRQ -2147483648) > > microcode: CPU61 updated from revision 0x830107a to 0x830107a, date = > > 2023-05-17 > > > > This is similar to the issue raised on Linux commit 36e9e1eab777e, where > > they > > observed i8259 (active) vectors getting delivered to CPUs different than 0. > > > > On AMD or Hygon platforms adjust the target CPU mask of i8259 interrupt > > descriptors to contain all possible CPUs, so that APs will reserve the > > vector > > at startup if any legacy IRQ is still delivered through the i8259. Note > > that > > if the IO-APIC takes over those interrupt descriptors the CPU mask will be > > reset. > > > > Spurious i8259 interrupt vectors however (IRQ7 and IRQ15) can be injected > > even > > when all i8259 pins are masked, and hence would need to be handled on all > > CPUs. > > > > Continue to reserve PIC vectors on CPU0 only, but do check for such spurious > > interrupts on all CPUs if the vendor is AMD or Hygon. Note that once the > > vectors get used by devices detecting PIC spurious interrupts will no > > longer be > > possible, however the device driver should be able to cope with spurious > > interrupts. Such PIC spurious interrupts occurring when the vector is in > > use > > by a local APIC routed source will lead to an extra EOI, which might > > unintentionally clear a different vector from ISR. Note this is already the > > current behavior, so assume it's infrequent enough to not cause real issues. > > > > Finally, adjust the printed message to display the CPU where the spurious > > interrupt has been received, so it looks like: > > > > microcode: CPU1 updated from revision 0x830107a to 0x830107a, date = > > 2023-05-17 > > cpu1: spurious 8259A interrupt: IRQ7 > > microcode: CPU2 updated from revision 0x830104d to 0x830107a, date = > > 2023-05-17 > > > > Amends: 3fba06ba9f8b ('x86/IRQ: re-use legacy vector ranges on APs') > > Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx> > > Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx> > with one nit (which I think can be taken care of when committing): > > > --- a/xen/arch/x86/irq.c > > +++ b/xen/arch/x86/irq.c > > @@ -1920,7 +1920,16 @@ void do_IRQ(struct cpu_user_regs *regs) > > kind = ""; > > if ( !(vector >= FIRST_LEGACY_VECTOR && > > vector <= LAST_LEGACY_VECTOR && > > - !smp_processor_id() && > > + (!smp_processor_id() || > > + /* > > + * For AMD/Hygon do spurious PIC interrupt > > + * detection on all CPUs, as it has been observed > > + * that during unknown circumstances spurious PIC > > + * interrupts have been delivered to CPUs > > + * different than the BSP. > > + */ > > + (boot_cpu_data.x86_vendor & (X86_VENDOR_AMD | > > + X86_VENDOR_HYGON))) && > > Afaict these two lines need indenting by one more blank, to account > for the parentheses enclosing the || operands. Indeed, please adjust at commit if you don't mind. Thanks, Roger.
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |