
Re: NetBSD dom0 PVH: hardware interrupts stalls


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Mon, 23 Nov 2020 10:49:18 +0100
  • Cc: Manuel Bouyer <bouyer@xxxxxxxxxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Mon, 23 Nov 2020 09:49:51 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Fri, Nov 20, 2020 at 09:54:42AM +0100, Jan Beulich wrote:
> On 20.11.2020 09:28, Roger Pau Monné wrote:
> > On Fri, Nov 20, 2020 at 09:09:51AM +0100, Jan Beulich wrote:
> >> On 19.11.2020 18:57, Manuel Bouyer wrote:
> >>> I added an ASSERT() after the printf to get a stack trace, and got:
> >>> db{0}> call ioapic_dump_raw^M
> >>> Register dump of ioapic0^M
> >>> [  13.0193374] 00 08000000 00170011 08000000(XEN) vioapic.c:141:d0v0 
> >>> apic_mem_readl:undefined ioregsel 3
> >>> (XEN) vioapic.c:512:vioapic_irq_positive_edge: vioapic_deliver 2
> >>> (XEN) Assertion '!print' failed at vioapic.c:512
> >>> (XEN) ----[ Xen-4.15-unstable  x86_64  debug=y   Tainted:   C   ]----
> >>> (XEN) CPU:    0
> >>> (XEN) RIP:    e008:[<ffff82d0402c4164>] 
> >>> vioapic_irq_positive_edge+0x14e/0x150
> >>> (XEN) RFLAGS: 0000000000010202   CONTEXT: hypervisor (d0v0)
> >>> (XEN) rax: ffff82d0405c806c   rbx: ffff830836650580   rcx: 
> >>> 0000000000000000
> >>> (XEN) rdx: ffff8300688bffff   rsi: 000000000000000a   rdi: 
> >>> ffff82d0404b36b8
> >>> (XEN) rbp: ffff8300688bfde0   rsp: ffff8300688bfdc0   r8:  
> >>> 0000000000000004
> >>> (XEN) r9:  0000000000000032   r10: 0000000000000000   r11: 
> >>> 00000000fffffffd
> >>> (XEN) r12: ffff8308366dc000   r13: 0000000000000022   r14: 
> >>> ffff8308366dc31c
> >>> (XEN) r15: ffff8308366d1d80   cr0: 0000000080050033   cr4: 
> >>> 00000000003526e0
> >>> (XEN) cr3: 00000008366c9000   cr2: 0000000000000000
> >>> (XEN) fsb: 0000000000000000   gsb: 0000000000000000   gss: 
> >>> 0000000000000000
> >>> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
> >>> (XEN) Xen code around <ffff82d0402c4164> 
> >>> (vioapic_irq_positive_edge+0x14e/0x150):
> >>> (XEN)  3d 10 be 1d 00 00 74 c2 <0f> 0b 55 48 89 e5 41 57 41 56 41 55 41 
> >>> 54 53 48
> >>> (XEN) Xen stack trace from rsp=ffff8300688bfdc0:
> >>> (XEN)    0000000200000086 ffff8308366dc000 0000000000000022 
> >>> 0000000000000000
> >>> (XEN)    ffff8300688bfe08 ffff82d0402bcc33 ffff8308366dc000 
> >>> 0000000000000022
> >>> (XEN)    0000000000000001 ffff8300688bfe40 ffff82d0402bd18f 
> >>> ffff830835a7eb98
> >>> (XEN)    ffff8308366dc000 ffff830835a7eb40 ffff8300688bfe68 
> >>> 0100100100100100
> >>> (XEN)    ffff8300688bfea0 ffff82d04026f6e1 ffff830835a7eb30 
> >>> ffff8308366dc0f4
> >>> (XEN)    ffff830835a7eb40 ffff8300688bfe68 ffff8300688bfe68 
> >>> ffff82d0405cec80
> >>> (XEN)    ffffffffffffffff ffff82d0405cec80 0000000000000000 
> >>> ffff82d0405d6c80
> >>> (XEN)    ffff8300688bfed8 ffff82d04022b6fa ffff83083663f000 
> >>> ffff83083663f000
> >>> (XEN)    0000000000000000 0000000000000000 0000000a7c62165b 
> >>> ffff8300688bfee8
> >>> (XEN)    ffff82d04022b798 ffff8300688bfe08 ffff82d0402a4bcb 
> >>> 0000000000000000
> >>> (XEN)    0000000000000206 ffff8316da86e61c ffff8316da86e600 
> >>> ffff938031fd47c0
> >>> (XEN)    0000000000000003 0000000000000400 ff889e8da08f928a 
> >>> 0000000000000000
> >>> (XEN)    0000000000000002 0000000000000100 000000000000b86e 
> >>> ffff93803237f010
> >>> (XEN)    0000000000000000 ffff8316da86e61c 0000beef0000beef 
> >>> ffffffff80555918
> >>> (XEN)    000000bf0000beef 0000000000000046 ffff938031fd4790 
> >>> 000000000000beef
> >>> (XEN)    000000000000beef 000000000000beef 000000000000beef 
> >>> 000000000000beef
> >>> (XEN)    0000e01000000000 ffff83083663f000 0000000000000000 
> >>> 00000000003526e0
> >>> (XEN)    0000000000000000 0000000000000000 0000060100000001 
> >>> 0000000000000000
> >>> (XEN) Xen call trace:
> >>> (XEN)    [<ffff82d0402c4164>] R vioapic_irq_positive_edge+0x14e/0x150
> >>> (XEN)    [<ffff82d0402bcc33>] F arch/x86/hvm/irq.c#assert_gsi+0x5e/0x7b
> >>> (XEN)    [<ffff82d0402bd18f>] F hvm_gsi_assert+0x62/0x77
> >>> (XEN)    [<ffff82d04026f6e1>] F 
> >>> drivers/passthrough/io.c#dpci_softirq+0x261/0x29e
> >>> (XEN)    [<ffff82d04022b6fa>] F common/softirq.c#__do_softirq+0x8a/0xbf
> >>> (XEN)    [<ffff82d04022b798>] F do_softirq+0x13/0x15
> >>> (XEN)    [<ffff82d0402a4bcb>] F vmx_asm_do_vmentry+0x2b/0x30
> >>> (XEN) 
> >>> (XEN) 
> >>> (XEN) ****************************************
> >>> (XEN) Panic on CPU 0:
> >>> (XEN) Assertion '!print' failed at vioapic.c:512
> >>> (XEN) ****************************************
> >>
> >> Right, this was the expected path after what you've sent prior to this.
> >> Which turned my attention back to the 'i' debug key output you had sent
> >> the other day. There we have
> >>
> >> (XEN)    IRQ:  34 vec:51 IO-APIC-level   status=010 aff:{0}/{0-7} 
> >> in-flight=1 d0: 34(-MM)
> >>
> >> i.e. at that point we're waiting for Dom0 to signal it's done handling
> >> the IRQ. There is, however, a timer associated with this. Yet that's
> >> actually to prevent the system getting stuck, i.e. the "in-flight"
> >> state ought to clear 1ms later (when that timer expires), and hence
> >> ought to be pretty unlikely to catch when non-zero _and_ something's
> >> actually stuck.
> > 
> > I somehow assumed the interrupt was in-flight because the printing to
> > the Xen console caused one to be injected, and thus dom0 didn't have
> > time to Ack it yet.
> 
> By "injected" you mean from Xen into Dom0, or by the hardware for Xen
> to handle? (I ask because I think I saw you use the term also for the
> latter case, in some context.) If the former, then something would
> need to have caused Xen to inject it, while in the latter case there
> would need to have been a reason that it didn't get delivered earlier.

Sorry, I wrote this in a hurry and didn't realize it could be
misleading. I meant injected by the hardware into Xen, which would
then in turn be injected by Xen into dom0.

I would expect softirqs to be running normally (you have already
asked about this, and Manuel showed that the watchdog is not
triggering).

Roger.



 

