Re: [Xen-devel] Xen-unstable: xen panic RIP: dpci_softirq
Tuesday, November 18, 2014, 9:56:33 PM, you wrote:

>>
>> Uhmm i thought i had these switched off (due to problems earlier and then
>> forgot
>> about them .. however looking at the earlier reports these lines were also
>> in
>> those reports).
>>
>> The xen-syms and these last runs are all with a pristine xen tree cloned
>> today (staging
>> branch), so the qemu-xen and seabios defined with that were also freshly
>> cloned
>> and had a new default seabios config. (just to rule out anything stale in my
>> tree)
>>
>> If you don't see those messages .. perhaps your seabios and qemu trees (and
>> at least the
>> seabios config) are not the most recent (they don't get updated
>> automatically
>> when you just do a git pull on the main tree) ?
>>
>> In /tools/firmware/seabios-dir/.config i have:
>> CONFIG_USB=y
>> CONFIG_USB_UHCI=y
>> CONFIG_USB_OHCI=y
>> CONFIG_USB_EHCI=y
>> CONFIG_USB_XHCI=y
>> CONFIG_USB_MSC=y
>> CONFIG_USB_UAS=y
>> CONFIG_USB_HUB=y
>> CONFIG_USB_KEYBOARD=y
>> CONFIG_USB_MOUSE=y
>>

> I seem to have the same thing. Perhaps it is my XHCI controller being wonky.

>> And this is all just from a:
>> - git clone git://xenbits.xen.org/xen.git -b staging
>> - make clean && ./configure && make -j6 && make -j6 install

> Aye.
> .. snip..

>> > 1) test_and_[set|clear]_bit sometimes return unexpected values.
>> > [But this might be invalid as the addition of the ffff8303faaf25a8
>> > might be correct - as the second dpci the softirq is processing
>> > could be the MSI one]
>>
>> Would there be an easy way to stress test this function separately in some
>> debugging function to see if it indeed is returning unexpected values ?

> Sadly no. But you got me looking in the right direction when you mentioned
> 'timeout'.

>>
>> > 2) INIT_LIST_HEAD operations on the same CPU are not honored.
>>
>> Just curious, have you also tested the patches on AMD hardware ?

> Yes. To reproduce this the first thing I did was to get an AMD box.

>>
>>
>> >> When i look at the combination of (2) and (3), It seems it could be an
>> >> interaction between the two passed through devices and/or different IRQ
>> >> types.
>>
>> > Could be - as in it is causing this issue to show up faster than
>> > expected. Or it is the one that triggers more than one dpci happening
>> > at the same time.
>>
>> Well that didn't seem to be it (see separate amendment i mailed previously)

> Right, the current theory I have is that the interrupts are not being
> Acked within 8 milliseconds and we reset the 'state' - and at the same
> time we get an interrupt and schedule it - while we are still processing
> the same interrupt. This would explain why the 'test_and_clear_bit'
> got the wrong value.

> In regards to the list poison - following this thread of logic - with
> the 'state = 0' set we open the floodgates for any CPU to put the same
> 'struct hvm_pirq_dpci' on its list.

> We do reset the 'state' on _every_ GSI that is mapped to a guest - so
> we also reset the 'state' for the MSI one (XHCI). Anyhow in your case:

> CPUX:                               CPUY:
> pt_irq_time_out:
> state = 0;
> [out of timer code, the             raise_softirq
>  pirq_dpci is on the dpci_list]     [adds the pirq_dpci as state == 0]
> softirq_dpci                        softirq_dpci:
>                                     list_del
>                                     [entries poison]
> list_del <= BOOM
>
> Is what I believe is happening.

> The INTX device - once I put a load on it - does not trigger
> any pt_irq_time_out, so that would explain why I cannot hit this.
> But I believe your card hits these "hiccups".
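
The interleaving in the quoted CPUX/CPUY diagram can be replayed in plain C. What follows is only a minimal single-threaded sketch of that theory, not Xen's code: `fake_dpci`, `schedule_dpci`, `list_del_checked`, `simulate_timeout_race` and the two poison constants are hypothetical stand-ins, and the two "CPUs" are simply two list heads walked in a fixed order. It assumes a bit in `state` is the only thing that stops the same entry from being queued twice.

```c
#include <stddef.h>

struct list_node { struct list_node *next, *prev; };

/* Illustrative poison markers (hypothetical values, not copied from Xen). */
#define POISON_NEXT ((struct list_node *)0x100)
#define POISON_PREV ((struct list_node *)0x200)

/* Hypothetical stand-in for 'struct hvm_pirq_dpci'. */
struct fake_dpci {
    struct list_node node;
    unsigned int state;              /* bit 0 set: already scheduled */
};

static void list_init(struct list_node *head)
{
    head->next = head->prev = head;
}

static void list_add_tail(struct list_node *n, struct list_node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Unlink and poison. Returns -1 instead of dereferencing a poisoned entry,
 * which is the point where real code would take a page fault. */
static int list_del_checked(struct list_node *n)
{
    if (n->next == POISON_NEXT || n->prev == POISON_PREV)
        return -1;
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->next = POISON_NEXT;
    n->prev = POISON_PREV;
    return 0;
}

/* Queue the entry on a per-CPU list only if it is not scheduled yet. */
static int schedule_dpci(struct fake_dpci *d, struct list_node *cpu_list)
{
    if (d->state & 1u)
        return 0;                    /* guard held: nothing queued */
    d->state |= 1u;
    list_add_tail(&d->node, cpu_list);
    return 1;
}

/* Returns 1 if a list_del ran into a poisoned entry, 0 otherwise. */
int simulate_timeout_race(int timeout_resets_state)
{
    struct list_node cpux_list, cpuy_list;
    struct fake_dpci d = { .state = 0 };
    int hit_poison = 0;

    list_init(&cpux_list);
    list_init(&cpuy_list);

    int on_x = schedule_dpci(&d, &cpux_list);   /* CPUX queues the entry */
    if (timeout_resets_state)
        d.state = 0;                            /* timeout: state = 0 while queued */
    int on_y = schedule_dpci(&d, &cpuy_list);   /* CPUY: guard no longer holds */

    /* CPUY's softirq runs first and poisons the entry ... */
    if (on_y && list_del_checked(&d.node) != 0)
        hit_poison = 1;
    /* ... then CPUX's softirq deletes the same entry again. */
    if (on_x && list_del_checked(&d.node) != 0)
        hit_poison = 1;

    return hit_poison;
}
```

In this sketch `simulate_timeout_race(1)` reports the double delete, while `simulate_timeout_race(0)` does not, because the state bit keeps CPUY from queuing the entry a second time.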
Hi Konrad,

I just tested your 5 patches and as a result i still got an(other) host crash:
(complete serial log attached)

(XEN) [2014-11-18 21:55:41.591] ----[ Xen-4.5.0-rc x86_64 debug=y Not tainted ]----
(XEN) [2014-11-18 21:55:41.591] CPU: 0
(XEN) [2014-11-18 21:55:41.591] ----[ Xen-4.5.0-rc x86_64 debug=y Not tainted ]----
(XEN) [2014-11-18 21:55:41.591] RIP: e008:[<ffff82d08012c7e7>]CPU: 2
(XEN) [2014-11-18 21:55:41.591] RIP: e008:[<ffff82d08014a461>] hvm_do_IRQ_dpci+0xbd/0x13c
(XEN) [2014-11-18 21:55:41.591] RFLAGS: 0000000000010006 _spin_unlock+0x1f/0x30CONTEXT: hypervisor
(XEN) [2014-11-18 21:55:41.591]
(XEN) [2014-11-18 21:55:41.591] RFLAGS: 0000000000010246 rax: 0000000000000000 rbx: ffff8303773450a8 rcx: 0000000000000001
(XEN) [2014-11-18 21:55:41.591] CONTEXT: hypervisor
(XEN) [2014-11-18 21:55:41.591] rdx: 0000000000000000 rsi: ffff83054ef4ef98 rdi: 0000000012aa5400
(XEN) [2014-11-18 21:55:41.591] rax: ffff82d080328da0 rbx: ffff8305186c5d80 rcx: 0000000000000000
(XEN) [2014-11-18 21:55:41.591] rbp: ffff83054ef47c88 rsp: ffff83054ef47c78 r8: ffff8305186c58d0
(XEN) [2014-11-18 21:55:41.591] r9: 000000000000002f r10: 00000000000000d0 r11: ffffffff829084b0
(XEN) [2014-11-18 21:55:41.591] rdx: ffff82d0802e0000 rsi: ffff83050aead2a8 rdi: 00000000000000b8
(XEN) [2014-11-18 21:55:41.591] rbp: ffff82d0802e7df8 rsp: ffff82d0802e7df8 r8: ffff82d0802e7d28
(XEN) [2014-11-18 21:55:41.591] r9: 0000000000000040 r10: 0000000000000000 r11: ffffffffffffffc0
(XEN) [2014-11-18 21:55:41.591] r12: ffff8305186c5d80 r13: ffff8303773450a8 r14: ffff8303773450b8
(XEN) [2014-11-18 21:55:41.591] r15: ffff8305186c5b00 cr0: 000000008005003b cr4: 00000000000006f0
(XEN) [2014-11-18 21:55:41.591] r12: ffff830515b5b000 r13: 0000000000000000 r14: ffff830377345080
(XEN) [2014-11-18 21:55:41.591] cr3: 000000054a215000 cr2: 00000000000000b8
(XEN) [2014-11-18 21:55:41.591] r15: 000000000000002f cr0: 000000008005003b cr4: 00000000000006f0
(XEN) [2014-11-18 21:55:41.591] ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
(XEN) [2014-11-18 21:55:41.591] cr3: 000000054a215000 cr2: 0000000000000160
(XEN) [2014-11-18 21:55:41.591] ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
(XEN) [2014-11-18 21:55:41.591] Xen stack trace from rsp=ffff82d0802e7df8:
(XEN) [2014-11-18 21:55:41.591] ffff82d0802e7e48Xen stack trace from rsp=ffff83054ef47c78:
(XEN) [2014-11-18 21:55:41.591] ffff82d08014a395 ffff83009fd2d060 ffff83054ef47c88 ffff8303773450b8
(XEN) [2014-11-18 21:55:41.591] ffffc900141f2b20 ffff82d080328f80 ffff830377345140 ffff82d08014a26e ffff8303773450a8 ffff83054ef47d18
(XEN) [2014-11-18 21:55:41.591] ffff82d080172060 000000943f43e518 ffff88002b227e18 ffff82d0802e7e78
(XEN) [2014-11-18 21:55:41.591] ffff82d08012f2c3 0000000000000286
(XEN) [2014-11-18 21:55:41.591] 0000000100000031 ffff82d08018b20f ffff82d080328f80 ffff83050b0bb5e0 ffff83054ef47cf8 ffff82d080178846 ffff8303773450e0
(XEN) [2014-11-18 21:55:41.591]
(XEN) [2014-11-18 21:55:41.591] ffff82d0802e7ec8 ffff82d08012f3c3 ffff82d0802e7ef8 0000000000000000 ffff82d08022d5a1
(XEN) [2014-11-18 21:55:41.591] 000000943f65d8b4 ffff83055d002f24 0000000000000000 0000002f9ff88000
(XEN) [2014-11-18 21:55:41.591] ffff82d0802fff80 ffff83054ef47d28 000000000055d126 ffff83054ef12000 ffff82d0802fff80 ffffffffffffffff
(XEN) [2014-11-18 21:55:41.591] ffff830515b5b0b8 ffff82d0802e0000 ffff88002b227e18
(XEN) [2014-11-18 21:55:41.591] ffff82d0802e7ef8 ffff82d0802fff80 ffff82d08012be31 ffff8303773450a8 ffff830515b5b000
(XEN) [2014-11-18 21:55:41.591] 0200200200200200 ffff83009fd2d000
(XEN) [2014-11-18 21:55:41.591] 00007cfab10b82b7 0000000000000001 ffff82d080233122 0200200200200200 ffff830515b5b000
(XEN) [2014-11-18 21:55:41.591] 0000000000000001 ffff88005925a1e8
(XEN) [2014-11-18 21:55:41.591] ffff8303773450a8 ffff82d0802fff80 ffff82d0802e7f08 ffff82d08012be89 ffff83054ef47dd8 00007d2f7fd180c7 ffff830515b5b0b8 ffff82d080232cd1
(XEN) [2014-11-18 21:55:41.591] ffff88002b227e18 ffff88005925a1e8
(XEN) [2014-11-18 21:55:41.591] 0000000000000001 0000000000000001
(XEN) [2014-11-18 21:55:41.591] ffff88002b227bb8 ffffffff829084b0 ffff88005f6d35a8 0000000000000000 0000000000000000
(XEN) [2014-11-18 21:55:41.591] 00000000000000d0 000000943f4e172d 0000000000000000 ffff830377345150 0000000000005776 ffffffff81c10cc0
(XEN) [2014-11-18 21:55:41.591] 000000943f43e300 0000000000000000
(XEN) [2014-11-18 21:55:41.591] 0000000000000001 0000000000000001 0000000000000000 ffff83054ef4ef98
(XEN) [2014-11-18 21:55:41.591] ffff830515b5b0bc 000000b900000000 ffff82d08012c69f ffff88005925a180 ffff88005f6d3500 000000000000e008 000000fa00000000
(XEN) [2014-11-18 21:55:41.591] 0000000000000246
(XEN) [2014-11-18 21:55:41.591] ffffffff810eab63 ffff83054ef47dd0 000000000000e033 0000000000000000 0000000000000286 ffff830377345110 ffff88002b227b68
(XEN) [2014-11-18 21:55:41.591]
(XEN) [2014-11-18 21:55:41.591] 000000000000e02b ffff83054ef47ec8 ffff82d08014962d 000000000000beef 0000000000000100 ffff82d080328da0
(XEN) [2014-11-18 21:55:41.591] 000000000000beef 000000000000beef
(XEN) [2014-11-18 21:55:41.591] 000000000000beef 0000000000000000 ffff83009fd2d000 ffff830512b6c068 0000000000000000 ffff83054ef4e540
(XEN) [2014-11-18 21:55:41.591] ffff83054ef4e400 0000000000000000
(XEN) [2014-11-18 21:55:41.591] Xen call trace:
(XEN) [2014-11-18 21:55:41.591] [<ffff82d08012c7e7>] _spin_unlock+0x1f/0x30
(XEN) [2014-11-18 21:55:41.591] ffff830515b5b0b8
(XEN) [2014-11-18 21:55:41.591] 0000000100000000 ffff83054ef47e88 [<ffff82d08014a395>] pt_irq_time_out+0x127/0x136
(XEN) [2014-11-18 21:55:41.591] [<ffff82d08012f2c3>] execute_timer+0x4e/0x6c
(XEN) [2014-11-18 21:55:41.591] [<ffff82d08012f3c3>] timer_softirq_action+0xe2/0x220
(XEN) [2014-11-18 21:55:41.591] [<ffff82d08012be31>] __do_softirq+0x81/0x8c
(XEN) [2014-11-18 21:55:41.591] [<ffff82d08012be89>] do_softirq+0x13/0x15
(XEN) [2014-11-18 21:55:41.591] [<ffff82d080232cd1>] process_softirqs+0x21/0x30
(XEN) [2014-11-18 21:55:41.591]
(XEN) [2014-11-18 21:55:41.591] ffff83054ef47e88 ffff83054ef47e88Pagetable walk from 00000000000000b8:
(XEN) [2014-11-18 21:55:41.591]
(XEN) [2014-11-18 21:55:41.591] ffff8303773450a8 L4[0x000] = 0000000000000000 ffffffffffffffff
(XEN) [2014-11-18 21:55:41.591] 0000000000000082 ffff8303773450a8
(XEN) [2014-11-18 21:55:43.260] ****************************************
(XEN) [2014-11-18 21:55:43.280] ffff830377345150Panic on CPU 0:
(XEN) [2014-11-18 21:55:43.297] FATAL PAGE FAULT
(XEN) [2014-11-18 21:55:43.310] [error_code=0000]
(XEN) [2014-11-18 21:55:43.323] Faulting linear address: 00000000000000b8
(XEN) [2014-11-18 21:55:43.343] ****************************************
(XEN) [2014-11-18 21:55:43.362]
(XEN) [2014-11-18 21:55:43.371] Reboot in five seconds...
(XEN) [2014-11-18 21:55:43.386]
(XEN) [2014-11-18 21:55:43.395] ffff830515b5b000 0000000000000001 ffff830377345080 000000000000002f
(XEN) [2014-11-18 21:55:43.422] ffff83054ef47f08 ffff82d0801721a3 ffff83054ef47e88 ffff83054ef47e88
(XEN) [2014-11-18 21:55:43.449] 00000ecc00000004 ffff82d080300080 ffff82d0802fff80 ffffffffffffffff
(XEN) [2014-11-18 21:55:43.476] ffff83054ef40000 0000000000000001 ffff83054ef47ef8 ffff82d08012be31
(XEN) [2014-11-18 21:55:43.503] ffff83009ff88000 ffffffff83081590 ffffffff8221c520 ffffffff8221cc20
(XEN) [2014-11-18 21:55:43.530] Xen call trace:
(XEN) [2014-11-18 21:55:43.543] [<ffff82d08014a461>] hvm_do_IRQ_dpci+0xbd/0x13c
(XEN) [2014-11-18 21:55:43.565] [<ffff82d080172060>] do_IRQ+0x49c/0x624
(XEN) [2014-11-18 21:55:43.584] [<ffff82d080233122>] common_interrupt+0x62/0x70
(XEN) [2014-11-18 21:55:43.606] [<ffff82d08012c69f>] _spin_lock+0x1a/0x54
(XEN) [2014-11-18 21:55:43.626] [<ffff82d08014962d>] dpci_softirq+0x241/0x3ad
(XEN) [2014-11-18 21:55:43.648] [<ffff82d08012be31>] __do_softirq+0x81/0x8c
(XEN) [2014-11-18 21:55:43.669] [<ffff82d08012be89>] do_softirq+0x13/0x15
(XEN) [2014-11-18 21:55:43.689] [<ffff82d080232cd1>] process_softirqs+0x21/0x30
(XEN) [2014-11-18 21:55:43.711]
(XEN) [2014-11-18 21:55:43.720] Pagetable walk from 0000000000000160:
(XEN) [2014-11-18 21:55:43.738] L4[0x000] = 0000000000000000 ffffffffffffffff
(XEN) [2014-11-18 21:55:43.759]
(XEN) [2014-11-18 21:55:43.768] ****************************************
(XEN) [2014-11-18 21:55:43.787] Panic on CPU 2:
(XEN) [2014-11-18 21:55:43.800] FATAL PAGE FAULT
(XEN) [2014-11-18 21:55:43.813] [error_code=0002]
(XEN) [2014-11-18 21:55:43.826] Faulting linear address: 0000000000000160
(XEN) [2014-11-18 21:55:43.845] ****************************************
(XEN) [2014-11-18 21:55:43.865]
(XEN) [2014-11-18 21:55:43.873] Reboot in five seconds...

# addr2line -e xen-syms ffff82d08012c7e7
/usr/src/new/xen-unstable-vanilla/xen/include/asm/spinlock.h:18
# addr2line -e xen-syms ffff82d08014a461
/usr/src/new/xen-unstable-vanilla/xen/include/asm/atomic.h:172
# addr2line -e xen-syms ffff82d080172060
/usr/src/new/xen-unstable-vanilla/xen/arch/x86/irq.c:1175
# addr2line -e xen-syms ffff82d080233122
/usr/src/new/xen-unstable-vanilla/xen/arch/x86/x86_64/entry.S:487
# addr2line -e xen-syms ffff82d08012c69f
/usr/src/new/xen-unstable-vanilla/xen/common/spinlock.c:126
# addr2line -e xen-syms ffff82d08014962d
/usr/src/new/xen-unstable-vanilla/xen/drivers/passthrough/io.c:835
# addr2line -e xen-syms ffff82d08014a395
/usr/src/new/xen-unstable-vanilla/xen/drivers/passthrough/io.c:339
# addr2line -e xen-syms ffff82d08012f2c3
/usr/src/new/xen-unstable-vanilla/xen/common/timer.c:426

Attachment:
serial.log

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel