
Re: NetBSD dom0 PVH: hardware interrupts stalls


  • To: Manuel Bouyer <bouyer@xxxxxxxxxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Tue, 17 Nov 2020 16:58:07 +0100
  • Cc: <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Tue, 17 Nov 2020 15:58:28 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Tue, Nov 17, 2020 at 04:09:49PM +0100, Manuel Bouyer wrote:
> Hello,
> So, after fixing an issue in the NetBSD kernel related to PV clock
> interrupts, I'm back with physical interrupt issues.
> At some point during initialisation, the dom0 kernel stops receiving
> interrupts for its disk controller. The disk controller is:
> [   1.0000030] mfii0 at pci6 dev 0 function 0: "PERC H740P Adapter ", 
> firmware 51.13.0-3485, 8192MB cache
> (XEN) d0: bind: m_gsi=34 g_gsi=34
> [   1.0000030] allocated pic ioapic2 type level pin 2 level 6 to cpu0 slot 2 
> idt entry 103
> [   1.0000030] mfii0: interrupting at ioapic2 pin 2
> 
> Entering the NetBSD kernel debugger and looking at interrupt counters,
> I see that some interrupts did trigger on ioapic2 pin 2, as well as on
> some other hardware controllers.
> I printed the controller's status when the command timed out, and
> the controller says that there is an interrupt pending. So I guess that
> the command was executed, but the dom0 kernel didn't get interrupted.
> 
> At this point I can't say whether other hardware controllers' interrupts
> are working (because of the lockdown I don't have physical access
> to the hardware).
> 
> What's strange is that some Xen console activity seems to be enough to
> resume interrupt delivery. Hitting ^A 3 times is enough to get some
> progress on the dom0's disk controller, and hitting 'v' is usually
> enough to get the dom0 to multiuser. Once there the system looks stable,
> and I can log in over the network. But I/O may stall again on reboot,
> maybe because the dom0 kernel is back to using synchronous console output.

Hm, certainly weird.

> Any idea what to look at from here?

I have attached a patch below that dumps the vIO-APIC info as part of
the 'i' debug key output. Can you paste the whole output of the 'i'
debug key when the system stalls?

Can you confirm that you properly EOI the vectors on the local APIC? (I
don't have a patch to dump the emulated lapic ISR right now, but could
provide one if needed; a rough sketch is below.)
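
Something along these lines could print the in-service registers of
each vCPU's emulated lapic. This is an untested sketch, not an actual
patch: the helper name vlapic_dump_isr and its placement in
xen/arch/x86/hvm/vlapic.c (which already has the needed includes) are
assumptions.

static void vlapic_dump_isr(const struct domain *d)
{
    const struct vcpu *v;

    for_each_vcpu ( d, v )
    {
        const struct vlapic *vlapic = vcpu_vlapic(v);
        unsigned int i;

        /* The 256-bit ISR is spread over 8 32-bit registers, 0x10 apart. */
        printk("d%dv%d lapic ISR:", d->domain_id, v->vcpu_id);
        for ( i = 0; i < 8; i++ )
            printk(" %08x", vlapic_get_reg(vlapic, APIC_ISR + i * 0x10));
        printk("\n");
    }
}

A bit still set there for the disk controller's vector after the dom0
kernel believes it has EOIed it would point at a missing or misrouted
EOI blocking further delivery.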

Roger.
---8<---
diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
index 67d4a6237f..fd0d75db80 100644
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -30,6 +30,7 @@
 #include <xen/lib.h>
 #include <xen/errno.h>
 #include <xen/sched.h>
+#include <xen/softirq.h>
 #include <xen/nospec.h>
 #include <public/hvm/ioreq.h>
 #include <asm/hvm/io.h>
@@ -720,3 +721,34 @@ void vioapic_deinit(struct domain *d)
 
     vioapic_free(d, d->arch.hvm.nr_vioapics);
 }
+
+void vioapic_dump(void)
+{
+    const struct domain *d = hardware_domain;
+    unsigned int i;
+
+    if ( !has_vioapic(d) )
+        return;
+
+    printk("vIO-APIC dom%u state:\n", d->domain_id);
+    for ( i = 0; i < d->arch.hvm.nr_vioapics; i++ )
+    {
+        const struct hvm_vioapic *vioapic = domain_vioapic(d, i);
+        unsigned int j;
+
+        for ( j = 0; j < vioapic->nr_pins; j++ )
+        {
+            const union vioapic_redir_entry *ent = &vioapic->redirtbl[j];
+
+            printk("ioapic %u pin %u gsi %u vector %#x\n"
+                   "  delivery mode %u dest mode %u delivery status %u\n"
+                   "  polarity %u IRR %u trig mode %u mask %u dest id %u\n",
+                   i, j, vioapic->base_gsi + j, ent->fields.vector,
+                   ent->fields.delivery_mode, ent->fields.dest_mode,
+                   ent->fields.delivery_status, ent->fields.polarity,
+                   ent->fields.remote_irr, ent->fields.trig_mode,
+                   ent->fields.mask, ent->fields.dest_id);
+            process_pending_softirqs();
+        }
+    }
+}
diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index 8d1f9a9fc6..bd208efc58 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -24,6 +24,7 @@
 #include <asm/msi.h>
 #include <asm/current.h>
 #include <asm/flushtlb.h>
+#include <asm/hvm/vioapic.h>
 #include <asm/mach-generic/mach_apic.h>
 #include <irq_vectors.h>
 #include <public/physdev.h>
@@ -441,8 +442,15 @@ int __init init_irq_data(void)
     set_bit(HYPERCALL_VECTOR, used_vectors);
 #endif
     
-    /* IRQ_MOVE_CLEANUP_VECTOR used for clean up vectors */
-    set_bit(IRQ_MOVE_CLEANUP_VECTOR, used_vectors);
+    /*
+     * Mark vectors up to the cleanup one as used, to prevent an infinite loop
+     * in irq_move_cleanup_interrupt.
+     */
+    BUILD_BUG_ON(IRQ_MOVE_CLEANUP_VECTOR < FIRST_DYNAMIC_VECTOR);
+    for ( vector = FIRST_DYNAMIC_VECTOR;
+          vector <= IRQ_MOVE_CLEANUP_VECTOR;
+          vector++ )
+        set_bit(vector, used_vectors);
 
     return 0;
 }
@@ -727,10 +735,6 @@ void irq_move_cleanup_interrupt(struct cpu_user_regs *regs)
 {
     unsigned vector, me;
 
-    /* This interrupt should not nest inside others. */
-    BUILD_BUG_ON(APIC_PRIO_CLASS(IRQ_MOVE_CLEANUP_VECTOR) !=
-                 APIC_PRIO_CLASS(FIRST_DYNAMIC_VECTOR));
-
     ack_APIC_irq();
 
     me = smp_processor_id();
@@ -764,6 +768,8 @@ void irq_move_cleanup_interrupt(struct cpu_user_regs *regs)
              cpumask_test_cpu(me, desc->arch.cpu_mask) )
             goto unlock;
 
+        BUG_ON(vector <= IRQ_MOVE_CLEANUP_VECTOR);
+
         irr = apic_read(APIC_IRR + (vector / 32 * 0x10));
         /*
          * Check if the vector that needs to be cleanedup is
@@ -2524,6 +2530,7 @@ static void dump_irqs(unsigned char key)
             printk("   %#02x -> %ps()\n", i, direct_apic_vector[i]);
 
     dump_ioapic_irq_info();
+    vioapic_dump();
 }
 
 static int __init setup_dump_irqs(void)
diff --git a/xen/include/asm-x86/hvm/vioapic.h b/xen/include/asm-x86/hvm/vioapic.h
index d6f4e12d54..8a3ad18b20 100644
--- a/xen/include/asm-x86/hvm/vioapic.h
+++ b/xen/include/asm-x86/hvm/vioapic.h
@@ -70,4 +70,6 @@ int vioapic_get_mask(const struct domain *d, unsigned int gsi);
 int vioapic_get_vector(const struct domain *d, unsigned int gsi);
 int vioapic_get_trigger_mode(const struct domain *d, unsigned int gsi);
 
+void vioapic_dump(void);
+
 #endif /* __ASM_X86_HVM_VIOAPIC_H__ */
