
Re: [Xen-devel] Questions about GPLPV stability tests



On Mon, Dec 26, 2011 at 06:45:32PM +0000, Roderick Colenbrander wrote:
> I have finally got some results (the tests are still running). Let me
> first summarize what tests were running.

Thanks for being so diligent in providing full details. It helps when
reading this after the holidays.
> 
> In total I had 4 machines. All systems had at least Xen 4.1.2 + Linux
> 2.6.32.48 (and one had a Xen 4.1.3 snapshot). I divided the systems
> into 2 groups and each group ran a different test. There are 2
> differences between the groups:
> 1) one was the use of blktap vs not using blktap. The main reason for
> keeping the blktap systems, even though it adds noise, is that it
> kept some of the tests close to what they previously were.
> 2) the non-blktap systems used a different cpu pinning configuration.
> Previously Dom0 was using 0-3 and DomU 4-7, but due to hyperthreading
> both VMs used the same cores (0 and 4 are on the same physical core).
> 
> Now to the initial results.
> - so far each machine has been up for 10 days.
> - one machine failed early on due to a PCI device assignment issue. I
> suspect that at some point due to a race condition, ownership of the
> PCI device wasn't transferred correctly back from DomU to Dom0 on VM
> shutdown. Xm and Xl respectively fail with 'Error: failed to assign
> device 03:00.0: maybe it has already been assigned to other domain, or
> maybe it doesn't exist.' From a quick glance at the logs it looks like

Let's ignore that one. It is harmless and I should probably remove it.

> on shutdown of a previous VM, the 'unmap bdf' never happened. Not sure
> why, but I guess due to a potential toolchain bug.
> 
> - One of the non-blktap machines had a kernel Oops. It happened on VM
> shutdown and I wouldn't be surprised if it was similar to the crashes
> we wanted to debug. The system is in a usable state, but this
> may have been due to the use of Linux 2.6.32.48 or the different CPU
> pinning configuration. Some of the serial output:
> 
> Thu Dec 22 23:17:38 2011 - 1232   0: 87    113   365   blkif-backend
>      525325sec
> 
> Thu Dec 22 23:30:07 2011 - 1259   0: 12    6     222   xenbus
>      526065sec
> Thu Dec 22 23:30:07 2011 - 1250   0: 62    42    1461  ahci
>      526065sec
> Thu Dec 22 23:30:07 2011 - 1249   0: 37    28    75    eth0-rx-0
>      526065sec
> Thu Dec 22 23:30:07 2011 - 1248   0: 71    305   933   eth0-tx-0
>      526065sec
> Thu Dec 22 23:30:07 2011 - 1243   0: 6     3     134
> evtchn:xenstored     526065sec
> Thu Dec 22 23:30:07 2011 - 1241   0: 6     3     87582
> evtchn:xenstored     526065sec
> Thu Dec 22 23:30:07 2011 - 1240   0: 256   261   54162 evtchn:qemu-dm
>      526065sec
> Thu Dec 22 23:30:07 2011 - 1239   0: 244   251   7219  evtchn:qemu-dm
>      526065sec
> Thu Dec 22 23:30:07 2011 - 1238   0: 289   273   5931  evtchn:qemu-dm
>      526065sec
> Thu Dec 22 23:30:07 2011 - 1237   0: 248   245   4279  evtchn:qemu-dm
>      526065sec
> Thu Dec 22 23:30:07 2011 - 1236   0: 12    7     315   blkif-backend
>      526065sec
> Thu Dec 22 23:30:07 2011 - 1234   0: 7     4     43    veth1
>      526065sec
> Thu Dec 22 23:30:07 2011 - 1232 CPU0 going backwards (-3304)!
> Thu Dec 22 23:30:07 2011 -   19 CPU0 going backwards (-212)!
> Thu Dec 22 23:30:07 2011 -   18   0: 35    17    35    ehci_hcd:usb1,
> uhci_hcd:usb8 526065sec
> Thu Dec 22 23:30:07 2011 -   16 CPU0 going backwards (-176)!
> Thu Dec 22 23:30:07 2011 -   12 CPU0 going backwards (-4)!
> Thu Dec 22 23:30:07 2011 -    4 CPU0 going backwards (-1)!
> Thu Dec 22 23:30:12 2011 - 1250   0: 384   213   1461  ahci
>      526070sec
> Thu Dec 22 23:30:12 2011 - 1249   0: 14    21    75    eth0-rx-0
>      526070sec
> Thu Dec 22 23:30:12 2011 - 1240   0: 260   260   54162 evtchn:qemu-dm
>      526070sec
> Thu Dec 22 23:30:12 2011 - 1239   0: 279   265   7219  evtchn:qemu-dm
>      526070sec
> Thu Dec 22 23:30:13 2011 - 1238   0: 271   272   5931  evtchn:qemu-dm
>      526070sec
> Thu Dec 22 23:30:13 2011 - 1237   0: 261   253   4279  evtchn:qemu-dm
>      526070sec
> Thu Dec 22 23:30:13 2011 - 1236   0: 145   76    315   blkif-backend
>      526070sec
> Thu Dec 22 23:30:18 2011 - 1250   0: 372   293   1461  ahci
>      526075sec
> Thu Dec 22 23:30:18 2011 - 1249   0: 26    24    75    eth0-rx-0
>      526075sec
> Thu Dec 22 23:30:18 2011 - 1248   0: 16    86    933   eth0-tx-0
>      526075sec
> Thu Dec 22 23:30:18 2011 - 1240   0: 234   247   54162 evtchn:qemu-dm
>      526075sec
> Thu Dec 22 23:30:18 2011 - 1239   0: 234   249   7219  evtchn:qemu-dm
>      526075sec
> Thu Dec 22 23:30:18 2011 - 1238   0: 241   256   5931  evtchn:qemu-dm
>      526075sec
> Thu Dec 22 23:30:18 2011 - 1237   0: 224   239   4279  evtchn:qemu-dm
>      526075sec
> Thu Dec 22 23:30:18 2011 - 1236   0: 100   88    315   blkif-backend
>      526075sec
> Thu Dec 22 23:30:23 2011 - 1249   0: 16    20    75    eth0-rx-0
>      526080sec
> Thu Dec 22 23:30:23 2011 - 1248   0: 7     46    933   eth0-tx-0
>      526080sec
> Thu Dec 22 23:30:23 2011 - 1240   0: 8     128   54162 evtchn:qemu-dm
>      526080sec
> Thu Dec 22 23:30:23 2011 - 1239   0: 16    133   7219  evtchn:qemu-dm
>      526080sec
> Thu Dec 22 23:30:23 2011 - 1238   0: 28    142   5931  evtchn:qemu-dm
>      526080sec
> Thu Dec 22 23:30:23 2011 - 1233 CPU0 going backwards (-12648)!
> Thu Dec 22 23:30:23 2011 -   19 CPU0 going backwards (-212)!
> Thu Dec 22 23:30:23 2011 -   18   0: 35    17    42    ehci_hcd:usb1,
> uhci_hcd:usb8 526080sec
> Thu Dec 22 23:30:23 2011 -   16 CPU0 going backwards (-176)!
> Thu Dec 22 23:30:23 2011 -   12 CPU0 going backwards (-4)!
> Thu Dec 22 23:30:23 2011 -    4 CPU0 going backwards (-1)!
> 
> [533804.785589] BUG: unable to handle kernel NULL pointer dereference
> at 0000000000000008
> [533804.793913] IP: [<ffffffff814bdba9>] _spin_lock+0x5/0x15
> [533804.799556] PGD 0
> [533804.801896] Oops: 0002 [#1] SMP
> [533804.805448] last sysfs file:
> /sys/devices/pci0000:00/0000:00:03.0/0000:03:00.0/class
> [533804.813736] CPU 0
> [533804.816066] Modules linked in: dm_snapshot dm_mod ipmi_si ipmi_devintf
> [533804.822914] Pid: 21632, comm: qemu-dm Not tainted 2.6.32.48 #1 X8STi
> [533804.829595] RIP: e030:[<ffffffff814bdba9>]  [<ffffffff814bdba9>]
> _spin_lock+0x5/0x15
> [533804.837873] RSP: e02b:ffff88005f187c50  EFLAGS: 00010202
> [533804.843513] RAX: 0000000000000100 RBX: ffff88007d6923c0 RCX:
> 0000000000000003
> [533804.851192] RDX: ffff88007deb3180 RSI: ffff88007d6923c0 RDI:
> 0000000000000008
> [533804.858871] RBP: ffff88007e260128 R08: 0000000000000000 R09:
> 0000000000000000
> [533804.866552] R10: ffff88007121eb40 R11: ffffffff811b862b R12:
> ffff88007fac1e40
> [533804.874541] R13: ffff88007e3c3340 R14: ffff88007e3c3340 R15:
> ffff88007fdfbc00
> [533804.882243] FS:  00007f5cdcefe6f0(0000) GS:ffff880028038000(0000)
> knlGS:0000000000000000
> [533804.890874] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [533804.896948] CR2: 0000000000000008 CR3: 0000000001001000 CR4:
> 0000000000002660
> [533804.904627] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [533804.912306] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> 0000000000000400
> [533804.919985] Process qemu-dm (pid: 21632, threadinfo
> ffff88005f186000, task ffff880074f4e270)
> [533804.928971] Stack:
> [533804.931312]  ffffffff810a9ad0 ffff88007deb3180 ffff88007e260100
> ffff88007e260100
> [533804.938762] <0> ffffffff8124222d 0000000000000008 0000000000000008
> ffff88007deb3180
> [533804.946900] <0> ffffffff810b245e ffff88007fac1e40 ffff88007deb3180
> 0000000000000000
> [533804.955465] Call Trace:
> [533804.958244]  [<ffffffff810a9ad0>] ? mmu_notifier_unregister+0x27/0xa5
> [533804.965019]  [<ffffffff8124222d>] ? gntdev_release+0xc3/0xd1
> [533804.971007]  [<ffffffff810b245e>] ? __fput+0x100/0x1af
> [533804.976477]  [<ffffffff810af998>] ? filp_close+0x5b/0x62
> [533804.982119]  [<ffffffff81042989>] ? put_files_struct+0x64/0xc1
> [533804.988280]  [<ffffffff810441f0>] ? do_exit+0x209/0x68d
> [533804.993836]  [<ffffffff810441f8>] ? do_exit+0x211/0x68d
> [533804.999390]  [<ffffffff810446e9>] ? do_group_exit+0x75/0x9c
> [533805.005294]  [<ffffffff8104cf1d>] ? get_signal_to_deliver+0x30d/0x338
> [533805.012063]  [<ffffffff81010f7a>] ? do_notify_resume+0x8a/0x6b1
> [533805.018310]  [<ffffffff810bdb3a>] ? poll_select_copy_remaining+0xd0/0xf3
> [533805.025339]  [<ffffffff81011c8e>] ? int_signal+0x12/0x17
> [533805.030980] Code: 00 00 00 01 74 05 e8 67 1c d2 ff 48 89 d0 5e c3
> ff 14 25 20 2d 6a 81 f0 81 2f 00 00 00 01 74 05 e8 4d 1c d2 ff c3 b8
> 00 01 00 00 <f0> 66 0f c1 07 38 e0 74 06 f3 90 8a 07 eb f6 c3 f0 81 2f
> 00 00
> [533805.050454] RIP  [<ffffffff814bdba9>] _spin_lock+0x5/0x15
> [533805.056182]  RSP <ffff88005f187c50>
> [533805.059997] CR2: 0000000000000008
> [533805.063638] ---[ end trace 85ee7cbec9ce72ac ]---
> [533805.068584] Fixing recursive fault but reboot is needed!
> 
> If I had to guess, some of the gnttab code in qemu triggered a bug in
> the gntdev code? I have some experience with gnttab/gntdev, but don't
> know the inner parts of it very well.

It certainly looks that way. Or rather, it looks like we hit a race - the
structure mmu_notifier_unregister tried to operate on had already been
de-allocated.
[edit: Perhaps this is related to a TLB flush fix:
 7899891c7d161752f29abcc9bc0a9c6c3a3af26c xen mmu: fix a race window
causing leave_mm BUG()]

That would explain the hang you got. The qemu-dm is stuck waiting for
gntdev to respond (you should be able to verify this by attaching gdb to
qemu-dm and looking at the backtrace - it should be stuck at some ioctl
call). And the kernel sees that this particular process is not doing
anything. Also, gntdev is left in a bad state (but that should be OK
for the other processes) - so I am not sure how that "hangs" your box.
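
For reference, a minimal sketch of what I mean - assuming gdb is
available in dom0 and the qemu-dm process is still around; the PID and
the exact frames will of course differ on your box:

  # find the qemu-dm instance for the affected guest and attach to it
  pgrep -l qemu-dm
  gdb -p <pid-of-qemu-dm>
  (gdb) thread apply all bt
  # if it is blocked in a gntdev ioctl you should see something like
  # "ioctl () from /lib/libc.so.6" near the top of one of the stacks
  (gdb) detach
  (gdb) quit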

The interrupts being disabled does not seem to happen with this crash?
Is the read_interrupts code you are running still spewing data? Or
does it stop as well?
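
I do not know exactly what your read_interrupts script looks like, so
take this just as an illustration - even a crude loop like the one
below, watching the raw counters in /proc/interrupts, would tell us
whether anything is still moving after the Oops:

  # crude sketch: dump /proc/interrupts with a timestamp every 5 seconds
  # so the counters can be diffed to see whether they still increase
  while true; do
      date
      cat /proc/interrupts
      sleep 5
  done > /root/interrupts.log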

But let's fix one thing at a time.

Looking at the code in 2.6.32, it is the version that went upstream
as git commit ab31523c2fcac557226bac72cbdf5fafe01f9a26, and it has
none of the features/bug-fixes that went in on top of it in the
upstream version:

ab31523 xen/gntdev: allow usermode to map granted pages
8d3eaea xen/gntdev: add VM_PFNMAP to vma
9329e76 xen: gntdev: move use of GNTMAP_contains_pte next to the map_op
ba5d101 xen/gntdev: stop using "token" argument
f0a70c8 xen/gntdev: Fix circular locking dependency
a12b4eb xen gntdev: use gnttab_map_refs and gnttab_unmap_refs
ef91082 xen-gntdev: Change page limit to be global instead of per-open
a879211 xen-gntdev: Use find_vma rather than iterating our vma list manually
68b025c xen-gntdev: Add reference counting to maps
aab8f11 xen-gntdev: Support mapping in HVM domains
bdc612d xen/gntalloc,gntdev: Add unmap notify ioctl
90b6f30 xen-gntdev: Fix memory leak when mmap fails
0ea22f0 xen-gntdev: Fix unmap notify on PV domains
84e4075 xen-gntdev: Use map->vma for checking map validity
b57c186 xen-gntdev: Avoid unmapping ranges twice
12996fc xen-gntdev: Avoid double-mapping memory
9960be9 xen-gntdev: prevent using UNMAP_NOTIFY_CLEAR_BYTE on read-only
mappings
77c35ac xen-gntdev: Fix incorrect use of zero handle
f4ee4af xen-gntdev: Add cast to pointer
38eaeb0 xen: gntdev: fix build warning
d79647a xen/gntdev,gntalloc: Remove unneeded VM flags
ca47cea xen-gntdev: Use ballooned pages for grant mappings
12f0258 xen-gntdev: return -EFAULT on copy_to_user failure
a93e20a xen-gntdev: unlock on error path in gntdev_mmap()
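
If you want to double-check which of those your tree actually carries,
something along these lines should show it - assuming the driver lives
at the upstream drivers/xen/gntdev.c path in the tree you built dom0
from:

  # list the commits that touched the gntdev driver in this kernel tree
  git log --oneline -- drivers/xen/gntdev.c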

The big question is whether the bug you are hitting is fixed by one of those
git commits, or by an unrelated fix (like the vmalloc_sync_all or the
kmap_atomic one), which are:
461ae488ecb125b140d7ea29ceeedbcce9327003 mm: sync vmalloc address space page
tables in alloc_vm_area()
2cd1c8d4dc7ecca9e9431e2dabe41ae9c7d89e51 x86/paravirt: PTE updates in
k(un)map_atomic need to be synchronous, regardless of lazy_mmu mode

But the errors one gets without those two git commits are quite
different from what you are seeing. So I doubt it is one of those.

The CPU time going backwards is disturbing. It does imply that the Xen
hypervisor has stopped updating the timer information. Or maybe it has
not, but the gntdev has crashed and left all the interrupts disabled.

So four questions:
 1). Is the read_intr printing any data after the crash?
 2). If you attach gdb to qemu-dm, can you see where it is stuck?
 3). Is the console active at all? If not, can you SSH in? (Some
     network cards switch to polling so you can still log in to a
     machine even if the interrupts are turned off - the atl1c can do
     it and I think the r8169 as well, but I am not sure.)
 4). If you had configured your dom0 console to use the serial port
     directly instead of going through the Xen hypervisor console (so
     console=ttyS0 on Linux, and no com1=XXX and console=com1 on the
     Xen hypervisor line), could you get a back-trace of where the
     Linux kernel is at using Alt-SysRq? (See the sketch below.)
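
A rough sketch of what I have in mind for 4) - grub2-style entries,
with the paths, memory size and baud rate only as placeholders, and
assuming CONFIG_MAGIC_SYSRQ is enabled in your dom0 kernel:

  # Xen line: no com1=... and no console=com1, so dom0 keeps the UART
  multiboot /boot/xen.gz dom0_mem=...
  # dom0 Linux line: let the kernel drive ttyS0 directly
  module /boot/vmlinuz-2.6.32.48 console=ttyS0,115200 ...

Then, on the hung box, send a break over the serial line followed by
't' (the serial-console equivalent of Alt-SysRq-t) to dump the task
back-traces, and 'l' for a back-trace of what each CPU is currently
executing.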

Thanks and sorry for taking so long. Just coming back from holidays.

