[PATCH v2] x86/i8259: do not assume interrupts always target CPU0


  • To: xen-devel@xxxxxxxxxxxxxxxxxxxx
  • From: Roger Pau Monne <roger.pau@xxxxxxxxxx>
  • Date: Mon, 23 Oct 2023 14:46:35 +0200
  • Cc: Roger Pau Monne <roger.pau@xxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Mon, 23 Oct 2023 13:32:08 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Sporadically we have seen the following during AP bringup on AMD platforms
only:

microcode: CPU59 updated from revision 0x830107a to 0x830107a, date = 2023-05-17
microcode: CPU60 updated from revision 0x830104d to 0x830107a, date = 2023-05-17
CPU60: No irq handler for vector 27 (IRQ -2147483648)
microcode: CPU61 updated from revision 0x830107a to 0x830107a, date = 2023-05-17

This is similar to the issue described in Linux commit 36e9e1eab777e, where
i8259 (active) vectors were observed being delivered to CPUs other than CPU0.

On AMD or Hygon platforms adjust the target CPU mask of i8259 interrupt
descriptors to contain all possible CPUs, so that APs will reserve the vector
at startup if any legacy IRQ is still delivered through the i8259.  Note that
if the IO-APIC takes over those interrupt descriptors the CPU mask will be
reset.

Spurious i8259 interrupt vectors however (IRQ7 and IRQ15) can be injected even
when all i8259 pins are masked, and hence would need to be handled on all CPUs.

Do not reserve the PIC spurious vectors on all CPUs, but do check for such
spurious interrupts on all CPUs if the vendor is AMD or Hygon.  Note that once
the vectors get used by devices, detecting PIC spurious interrupts will no
longer be possible; however, the device should be able to cope with spurious
interrupts.  Such PIC spurious interrupts occurring when the vector is in use
by a local APIC routed source will lead to an extra EOI, which might
unintentionally clear a different vector from ISR.  Note this is already the
current behavior, so assume it's infrequent enough to not cause real issues.

Finally, adjust the printed message to display the CPU where the spurious
interrupt has been received, so it looks like:

microcode: CPU1 updated from revision 0x830107a to 0x830107a, date = 2023-05-17
cpu1: spurious 8259A interrupt: IRQ7
microcode: CPU2 updated from revision 0x830104d to 0x830107a, date = 2023-05-17

Fixes: 3fba06ba9f8b ('x86/IRQ: re-use legacy vector ranges on APs')
Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
---
Changes since v1:
 - Do not reserve spurious PIC vectors on APs, but still check for spurious
   PIC interrupts.
 - Reword commit message.
---
Not sure if the Fixes tag is the most appropriate here, since AFAICT this is a
hardware glitch, but it makes it easier to see to which versions the fix should
be backported, because Xen's previous behavior was to reserve all legacy
vectors on all CPUs.
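
For review purposes, the two vendor-dependent decisions described in the commit
message can be modelled in isolation.  The following is only a rough standalone
sketch, not the Xen code: the VENDOR_* values and the three helper names are
hypothetical stand-ins chosen for illustration, and the only assumption carried
over from the patch is that vendor IDs are distinct bit flags, so they can be
tested with a bitwise AND.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-ins for the vendor identifiers.  The numeric values are
 * illustrative only; all that matters is that each vendor is a distinct bit.
 */
enum {
    VENDOR_INTEL = 1u << 0,
    VENDOR_AMD   = 1u << 1,
    VENDOR_HYGON = 1u << 4,
};

/* True for the vendors on which stray i8259 vectors have been reported. */
static bool vendor_is_amd_like(unsigned int vendor)
{
    return (vendor & (VENDOR_AMD | VENDOR_HYGON)) != 0;
}

/*
 * Models which CPUs run the PIC spurious-interrupt check when a legacy vector
 * arrives without a handler: always CPU0, and every CPU on AMD/Hygon.
 */
static bool should_check_spurious(unsigned int cpu, unsigned int vendor)
{
    return cpu == 0 || vendor_is_amd_like(vendor);
}

/*
 * Models the initial CPU mask of a legacy IRQ descriptor: all CPUs on
 * AMD/Hygon (true), only the boot CPU otherwise (false).
 */
static bool legacy_mask_all_cpus(unsigned int vendor)
{
    return vendor_is_amd_like(vendor);
}
```

With these definitions, should_check_spurious(3, VENDOR_INTEL) is false while
should_check_spurious(3, VENDOR_AMD) is true, which mirrors the early return
added to bogus_8259A_irq() together with the removal of the
!smp_processor_id() condition from do_IRQ().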
---
 xen/arch/x86/i8259.c | 29 +++++++++++++++++++++++++++--
 xen/arch/x86/irq.c   |  1 -
 2 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/i8259.c b/xen/arch/x86/i8259.c
index ed9f55abe51e..0935cdf07b65 100644
--- a/xen/arch/x86/i8259.c
+++ b/xen/arch/x86/i8259.c
@@ -37,6 +37,15 @@ static bool _mask_and_ack_8259A_irq(unsigned int irq);
 
 bool bogus_8259A_irq(unsigned int irq)
 {
+    if ( smp_processor_id() &&
+         !(boot_cpu_data.x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) )
+        /*
+         * For AMD/Hygon do spurious PIC interrupt detection on all CPUs, as it
+         * has been observed that during unknown circumstances spurious PIC
+         * interrupts have been delivered to CPUs different than the BSP.
+         */
+        return false;
+
     return !_mask_and_ack_8259A_irq(irq);
 }
 
@@ -222,7 +231,8 @@ static bool _mask_and_ack_8259A_irq(unsigned int irq)
         is_real_irq = false;
         /* Report spurious IRQ, once per IRQ line. */
         if (!(spurious_irq_mask & irqmask)) {
-            printk("spurious 8259A interrupt: IRQ%d.\n", irq);
+            printk("cpu%u: spurious 8259A interrupt: IRQ%u\n",
+                   smp_processor_id(), irq);
             spurious_irq_mask |= irqmask;
         }
         /*
@@ -349,7 +359,22 @@ void __init init_IRQ(void)
             continue;
         desc->handler = &i8259A_irq_type;
         per_cpu(vector_irq, cpu)[LEGACY_VECTOR(irq)] = irq;
-        cpumask_copy(desc->arch.cpu_mask, cpumask_of(cpu));
+
+        /*
+         * The interrupt affinity logic never targets interrupts to offline
+         * CPUs, hence it's safe to use cpumask_all here.
+         *
+         * Legacy PIC interrupts are only targeted to CPU0, but depending on
+         * the platform they can be distributed to any online CPU in hardware.
+         * Note this behavior has only been observed on AMD hardware. In order
+         * to cope install all active legacy vectors on all CPUs.
+         *
+         * IO-APIC will change the destination mask if/when taking ownership of
+         * the interrupt.
+         */
+        cpumask_copy(desc->arch.cpu_mask, boot_cpu_data.x86_vendor &
+                                          (X86_VENDOR_AMD | X86_VENDOR_HYGON) ?
+                                          &cpumask_all : cpumask_of(cpu));
         desc->arch.vector = LEGACY_VECTOR(irq);
     }
     
diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index f42ad539dcd5..a2f9374f5deb 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1920,7 +1920,6 @@ void do_IRQ(struct cpu_user_regs *regs)
                 kind = "";
             if ( !(vector >= FIRST_LEGACY_VECTOR &&
                    vector <= LAST_LEGACY_VECTOR &&
-                   !smp_processor_id() &&
                    bogus_8259A_irq(vector - FIRST_LEGACY_VECTOR)) )
             {
                 printk("CPU%u: No irq handler for vector %02x (IRQ %d%s)\n",
-- 
2.42.0




 

