
Re: [Xen-devel] Xen-unstable 4.8: Host crash when shutting down guest with pci device passed through using MSI-X interrupts.



On 2016-07-18 22:57, Andrew Cooper wrote:
On 18/07/2016 20:26, Sander Eikelenboom wrote:
Monday, July 18, 2016, 7:48:20 PM, you wrote:

On 18/07/16 11:21, linux@xxxxxxxxxxxxxx wrote:
Hi Jan,

It seems that since your patch series starting with commit:
2016-06-22 x86/vMSI-X: defer intercept handler registration
74c6dc2d0ac4dcab0c6243cdf6ed550c1532b798

the shutdown of a guest which has a PCI device passed through that
uses MSI-X interrupts causes a host crash; see the splat below.
Somehow the host also doesn't reboot after 5 seconds as it is supposed
to (I don't have no-reboot on the command line).

--
Sander


(XEN) [2016-07-16 16:03:17.069] ----[ Xen-4.8-unstable  x86_64  debug=y  Not tainted ]----
(XEN) [2016-07-16 16:03:17.069] CPU:    0
(XEN) [2016-07-16 16:03:17.069] RIP:    e008:[<ffff82d0801e39de>] msixtbl_pt_unregister+0x7b/0xd9
(XEN) [2016-07-16 16:03:17.069] RFLAGS: 0000000000010082   CONTEXT: hypervisor (d0v0)
(XEN) [2016-07-16 16:03:17.069] rax: ffff83055c678e40   rbx: ffff83055c685500   rcx: 0000000000000001
(XEN) [2016-07-16 16:03:17.069] rdx: 0000000000000000   rsi: 0000000000001ab0   rdi: ffff8305313b85a0
(XEN) [2016-07-16 16:03:17.069] rbp: ffff83009fd07c78   rsp: ffff83009fd07c68   r8:  ffff8305356dfff0
(XEN) [2016-07-16 16:03:17.069] r9:  ffff8305356df480   r10: ffff830503420c50   r11: 0000000000000282
(XEN) [2016-07-16 16:03:17.069] r12: ffff8305313b8000   r13: ffff83009fd07e48   r14: ffff8305313b8000
(XEN) [2016-07-16 16:03:17.069] r15: ffff8305356df4a8   cr0: 0000000080050033   cr4: 00000000000006e0
(XEN) [2016-07-16 16:03:17.069] cr3: 000000053639f000   cr2: 0000000000000000
(XEN) [2016-07-16 16:03:17.069] ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
(XEN) [2016-07-16 16:03:17.069] Xen code around <ffff82d0801e39de> (msixtbl_pt_unregister+0x7b/0xd9):
(XEN) [2016-07-16 16:03:17.069] 39 42 18 74 19 48 89 ca <48> 8b 0a 0f 18 09 48 39 fa 75 ec 48 8d 7b 24 e8
(XEN) [2016-07-16 16:03:17.069] Xen stack trace from rsp=ffff83009fd07c68:
(XEN) [2016-07-16 16:03:17.069]    0000000000000000 ffff8305356df480 ffff83009fd07ce8 ffff82d08014c394
(XEN) [2016-07-16 16:03:17.069]    0000000000000001 ffff8305356df480 0000000000000293 ffff8305313b80cc
(XEN) [2016-07-16 16:03:17.069]    000000568012ffe5 ffff8305313b8000 ffff83009fd07cd8 ffff83009fd07e38
(XEN) [2016-07-16 16:03:17.070]    0000000000000000 ffff83054e5fc000 00007fc25a33e004 ffff8305313b8000
(XEN) [2016-07-16 16:03:17.070]    ffff83009fd07da8 ffff82d0801629c8 0000000000000000 ffff83053b1191f0
(XEN) [2016-07-16 16:03:17.070]    0000000000000246 ffff83009fd07d28 ffff82d0801300ae 000000000000000e
(XEN) [2016-07-16 16:03:17.070]    ffff83009fd07d78 ffff82d080171497 ffff83009fd07d78 000000020001d17b
(XEN) [2016-07-16 16:03:17.070]    ffff83009fd07d68 0000000000000000 ffff83009fd07d68 ffff82d080130280
(XEN) [2016-07-16 16:03:17.070]    ffff83009fd07d78 ffff82d08014d0aa 0000000000000202 0000000000000000
(XEN) [2016-07-16 16:03:17.070]    ffff8305313b8000 ffff88005716d320 0000000000305000 00007fc25a33e004
(XEN) [2016-07-16 16:03:17.070]    ffff83009fd07ef8 ffff82d080104b2c 0000000000000206 0000000000000002
(XEN) [2016-07-16 16:03:17.070]    ffff83009fd07df8 ffff82d08018c9db 0000000000000cfe 0000000000000002
(XEN) [2016-07-16 16:03:17.070]    0000000000000002 ffff83054e5fc000 ffff83009fd07e48 ffff82d08019c119
(XEN) [2016-07-16 16:03:17.070]    ffff83009fd07e38 0000000080121177 ffff83009fd07e38 0000000000000cfe
(XEN) [2016-07-16 16:03:17.070]    ffff83009fd07f18 0000000000000206 0000000c00000030 000056082bb90013
(XEN) [2016-07-16 16:03:17.070]    0000000200000056 00007fc200000013 0000305600000000 000056082b87465d
(XEN) [2016-07-16 16:03:17.070]    00007ffe268206e0 00007fc25606b31f 0000000000000000 000056082b8746cf
(XEN) [2016-07-16 16:03:17.070]    0000000000001000 fee5600026820730 00007ffe26820740 000056082b8797be
(XEN) [2016-07-16 16:03:17.070]    00000000fee56000 0000430026820772 00007ffe26820740 0000000000003056
(XEN) [2016-07-16 16:03:17.070]    00007ffe268206e0 ffff83009ff8a000 00007ffe26820580 ffff88005716d320
(XEN) [2016-07-16 16:03:17.070] Xen call trace:
(XEN) [2016-07-16 16:03:17.070]    [<ffff82d0801e39de>] msixtbl_pt_unregister+0x7b/0xd9
(XEN) [2016-07-16 16:03:17.070]    [<ffff82d08014c394>] pt_irq_destroy_bind+0x2be/0x3f0
(XEN) [2016-07-16 16:03:17.070]    [<ffff82d0801629c8>] arch_do_domctl+0xc77/0x2414
(XEN) [2016-07-16 16:03:17.070]    [<ffff82d080104b2c>] do_domctl+0x19db/0x1d26
(XEN) [2016-07-16 16:03:17.070]    [<ffff82d0802426bd>] lstar_enter+0xdd/0x137
(XEN) [2016-07-16 16:03:17.070]
(XEN) [2016-07-16 16:03:17.070] Pagetable walk from 0000000000000000:
(XEN) [2016-07-16 16:03:17.070]  L4[0x000] = 0000000000000000 ffffffffffffffff
(XEN) [2016-07-16 16:03:18.147]
(XEN) [2016-07-16 16:03:18.155] ****************************************
(XEN) [2016-07-16 16:03:18.175] Panic on CPU 0:
(XEN) [2016-07-16 16:03:18.187] FATAL PAGE FAULT
(XEN) [2016-07-16 16:03:18.200] [error_code=0000]
(XEN) [2016-07-16 16:03:18.214] Faulting linear address: 0000000000000000
(XEN) [2016-07-16 16:03:18.233] ****************************************
(XEN) [2016-07-16 16:03:18.252]
(XEN) [2016-07-16 16:03:18.261] Reboot in five seconds...

Can you paste the disassembly of msixtbl_pt_unregister() please? That
is a dereference of %rdx which is NULL at this point, but I need to
figure out which pointer it is supposed to be.
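
As a side note on reading the splat: the panic reports error_code=0000
together with a faulting linear address of 0, which fits a read through
a NULL pointer in hypervisor context. Below is a minimal sketch of
decoding the architectural x86 page-fault error code bits. The helper
name describe_pfec() and the macro names are made up for illustration
and are not taken from the Xen tree.

```c
#include <stdio.h>
#include <stddef.h>

/* Architectural x86 page-fault error code bits (names are illustrative). */
#define PFEC_PRESENT  (1u << 0) /* clear: page not present; set: protection violation */
#define PFEC_WRITE    (1u << 1) /* clear: read access; set: write access */
#define PFEC_USER     (1u << 2) /* clear: supervisor mode; set: user mode */
#define PFEC_INSN     (1u << 4) /* set: fault was an instruction fetch */

/* Hypothetical helper: write a one-line description of 'ec' into 'buf'. */
static int describe_pfec(unsigned int ec, char *buf, size_t len)
{
    const char *mode   = (ec & PFEC_USER) ? "user" : "supervisor";
    const char *access = (ec & PFEC_INSN)  ? "instruction fetch"
                       : (ec & PFEC_WRITE) ? "write" : "read";
    const char *cause  = (ec & PFEC_PRESENT) ? "protection violation"
                                             : "page not present";

    return snprintf(buf, len, "%s %s, %s", mode, access, cause);
}
```

For the value in the log, describe_pfec(0x0000, buf, sizeof(buf)) fills
buf with "supervisor read, page not present".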
Hi Andrew,

<snip>

Thanks.  What has happened is that the msixtbl linked list is still
uninitialised at this point.  The only way I can see for this to happen
is that msixtbl_init() hasn't been called, or hasn't passed its first if
condition.  The INIT_LIST_HEAD() visible in the context of the 2nd hunk
of the identified changeset is the line of code which changes the list
head from zeroed to initialised, and I don't see anywhere which re-zeros
it later.

This alone suggests that the VM in question isn't actually using MSI-X
interrupts, even if the device passed through is capable.
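
To illustrate the difference between a zeroed and an initialised list
head, here is a small self-contained sketch in the style of the
Linux/Xen list.h helpers. It is not the real header: init_list_head(),
list_add_tail_sketch() and list_count() are stand-in names chosen for
this example.

```c
#include <stddef.h>

/* Minimal circular doubly linked list head, modelled on list.h. */
struct list_head {
    struct list_head *next, *prev;
};

/* An initialised, empty list points at itself in both directions. */
static void init_list_head(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* Append 'n' just before the head, i.e. at the tail of the list. */
static void list_add_tail_sketch(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

/*
 * Count the entries on the list. A head that is still all zeroes is
 * reported as -1 instead of being walked: an unguarded walk would load
 * h->next == NULL and then dereference it, which is the kind of fault
 * seen in the splat.
 */
static int list_count(const struct list_head *h)
{
    const struct list_head *pos;
    int n = 0;

    if (h->next == NULL)
        return -1;

    for (pos = h->next; pos != h; pos = pos->next)
        n++;

    return n;
}
```

An initialised but empty head terminates the walk immediately because
head->next == head, whereas a zeroed head never satisfies that
condition before NULL is dereferenced.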

Hmm, I didn't actually check this before, but you seem to be right
(below is the lspci output from within the guest).


Following the style of the identified changeset,

andrewcoop@andrewcoop:/local/xen.git/xen$ git diff
diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
index e418b98..c533719 100644
--- a/xen/arch/x86/hvm/vmsi.c
+++ b/xen/arch/x86/hvm/vmsi.c
@@ -519,7 +519,7 @@ void msixtbl_pt_unregister(struct domain *d, struct pirq *pirq)
     ASSERT(pcidevs_locked());
     ASSERT(spin_is_locked(&d->event_lock));

-    if ( !has_vlapic(d) )
+    if ( !d->arch.hvm_domain.msixtbl_list.next )
         return;

     irq_desc = pirq_spin_lock_irq_desc(pirq, NULL);

should resolve your issue, although I am very tempted to replace the
opencoded list logic with a msixtbl_initialised() predicate instead.
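
A rough sketch of what such a predicate could look like follows. Only
the name msixtbl_initialised() comes from the mail above; the structure
definitions are heavily reduced, hypothetical stand-ins for Xen's
struct domain, and the unregister function is a placeholder that only
shows where the guard would sit.

```c
#include <stdbool.h>
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

/* Hypothetical, reduced stand-ins for the real Xen structures. */
struct hvm_domain_sketch {
    struct list_head msixtbl_list;
};

struct domain_sketch {
    struct hvm_domain_sketch hvm;
};

/* True once the list head has been set up (assumed: zeroed before init). */
static bool msixtbl_initialised(const struct domain_sketch *d)
{
    return d->hvm.msixtbl_list.next != NULL;
}

/*
 * Placeholder for the unregister path: returns false when it bails out
 * early because the list was never initialised, true otherwise.
 */
static bool msixtbl_pt_unregister_sketch(struct domain_sketch *d)
{
    if (!msixtbl_initialised(d))
        return false;

    /* The real function would look up and drop the entry here. */
    return true;
}
```

Keeping the check behind a named predicate means callers do not need to
know how "not initialised" is represented in the list head.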

~Andrew

It does resolve the issue, thanks!

--
Sander

00:05.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Turks PRO [Radeon HD 6570/7570/8550] (prog-if 00 [VGA controller])
        Subsystem: PC Partner Limited / Sapphire Technology Turks PRO [Radeon HD 6570/7570/8550]
        Physical Slot: 5
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 68
        Region 0: Memory at e0000000 (64-bit, prefetchable) [size=256M]
        Region 2: Memory at f3060000 (64-bit, non-prefetchable) [size=128K]
        Region 4: I/O ports at c100 [size=256]
        Expansion ROM at f3080000 [disabled] [size=128K]
        Capabilities: [50] Power Management version 3
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00
                DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 128 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
                LnkCap: Port #1, Speed 5GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
                LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
                         EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
        Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
                Address: 00000000fee57000  Data: 4300
        Kernel driver in use: radeon

00:06.0 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Turks/Whistler HDMI Audio [Radeon HD 6000 Series]
        Subsystem: PC Partner Limited / Sapphire Technology Turks/Whistler HDMI Audio [Radeon HD 6000 Series]
        Physical Slot: 6
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin B routed to IRQ 79
        Region 0: Memory at f30b0000 (64-bit, non-prefetchable) [size=16K]
        Capabilities: [50] Power Management version 3
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00
                DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 128 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
                LnkCap: Port #1, Speed 5GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
                LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
                         EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
        Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
                Address: 00000000fee56000  Data: 4300
        Kernel driver in use: snd_hda_intel
