Re: [Xen-devel] [SPAM] Re: kernel BUG at arch/x86/xen/mmu.c:1860! - ideas.
Hi guys,

This thread has gone quiet for a while and I was wondering if a solution had been found?

I'm currently running the packaged version of Xen 4.0.1 in Debian Squeeze and everything runs well, except for the random crashing when using LVM. I use LVM for the disk partitions, and use live snapshots as part of our backup routine. That is, create snapshot -> mount snapshot -> rsync -> umount snapshot -> remove snapshot.

Cheers,
Dave Hunter.

On Mon, 2011-03-28 at 20:29 +0800, Teck Choon Giam wrote:
> On Mon, Mar 28, 2011 at 7:37 PM, Andreas Olsowski
> <andreas.olsowski@xxxxxxxxxxx> wrote:
> >
> >> - turn on CONFIG_DEBUG_PAGEALLOC
> >> - turn on CONFIG_DEBUG_LIST
> >> - turn on CONFIG_DEBUG_KMEMLEAK
> >> - turn on CONFIG_JBD_DEBUG, CONFIG_JBD2_DEBUG
> >> - turn on CONFIG_SLUB_DEBUG_ON
> >
> > After i enabled those options (i don't use SLUB, i use SLAB) i no longer
> > encounter any errors.
> >
> > I completed 1000 loops of snapshot/mount/umount/removesnapshot.
>
> Did you try with just CONFIG_DEBUG_PAGEALLOC=y and leave the rest
> of your config unchanged? All my testing narrows down to
> CONFIG_DEBUG_PAGEALLOC=y to prevent this BUG.
>
> >
> >
> > Without those options in 2.6.32.35 i hit a different bug earlier today:
> >
> > But you really have to be patient to see some output, because lvremove will
> > hang quite a while:
> > (a "while" being roughly the time it takes for: wait 5 min for
> > error, leave office, get coffee, smoke cigarette, go to restroom, return to
> > office, finally see error)
> >
> > kernel: BUG: unable to handle kernel paging request
> > ...
> > kernel: RIP [<ffffffff8100f2bf>] xen_set_pmd+0x2f/0xb0
> > syslog/dmesg output is attached as crash.2.6.32.35-xen_01 or available at:
> > http://pastebin.com/Ad8MhUzD
>
> I hit this before:
>
> # grep 'xen_set_pmd' /var/log/messages*
> /var/log/messages:Mar 27 09:31:14 xen05 kernel: IP:
> [<ffffffff8100e2d4>] xen_set_pmd+0x16/0x2b
> /var/log/messages:Mar 27 09:31:14 xen05 kernel: RIP:
> e030:[<ffffffff8100e2d4>] [<ffffffff8100e2d4>] xen_set_pmd+0x16/0x2b
> /var/log/messages:Mar 27 09:31:14 xen05 kernel: RIP
> [<ffffffff8100e2d4>] xen_set_pmd+0x16/0x2b
> /var/log/messages:Mar 27 09:06:10 xen05 kernel: IP:
> [<ffffffff8100e2d4>] xen_set_pmd+0x16/0x2b
> /var/log/messages:Mar 27 09:06:10 xen05 kernel: RIP:
> e030:[<ffffffff8100e2d4>] [<ffffffff8100e2d4>] xen_set_pmd+0x16/0x2b
> /var/log/messages:Mar 27 09:06:10 xen05 kernel: RIP
> [<ffffffff8100e2d4>] xen_set_pmd+0x16/0x2b
> /var/log/messages:Mar 27 15:18:57 xen05 kernel: IP:
> [<ffffffff8100e2d4>] xen_set_pmd+0x16/0x2b
> /var/log/messages:Mar 27 15:18:57 xen05 kernel: RIP:
> e030:[<ffffffff8100e2d4>] [<ffffffff8100e2d4>] xen_set_pmd+0x16/0x2b
> /var/log/messages:Mar 27 15:18:57 xen05 kernel: RIP
> [<ffffffff8100e2d4>] xen_set_pmd+0x16/0x2b
> /var/log/messages.1:Mar 23 11:00:16 xen05 kernel: IP:
> [<ffffffff8100e2d4>] xen_set_pmd+0x16/0x2b
> /var/log/messages.1:Mar 23 11:00:16 xen05 kernel: RIP:
> e030:[<ffffffff8100e2d4>] [<ffffffff8100e2d4>] xen_set_pmd+0x16/0x2b
> /var/log/messages.1:Mar 23 11:00:17 xen05 kernel: RIP
> [<ffffffff8100e2d4>] xen_set_pmd+0x16/0x2b
>
> But unable to reproduce when CONFIG_DEBUG_PAGEALLOC=y.
>
> >
> > After that happened i did a kernel recompile without rebooting the machine
> > first and encountered system_call_fastpath as last call once more as shown
> > in crash.2.6.32.35-xen_02 or http://pastebin.com/kB38W5mp
>
> I hit this at least once but unable to when CONFIG_DEBUG_PAGEALLOC=y:
>
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: ------------[ cut here
> ]------------
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: kernel BUG at
> arch/x86/xen/mmu.c:1872!
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: invalid opcode: 0000 [#1] SMP
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: last sysfs file:
> /sys/block/sdd/dev
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: CPU 2
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: Modules linked in:
> ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
> xt_state nf_conntrack ipt_REJECT xt_tcpudp xt_physdev iptable_filter
> ip_tables x_tables bridge stp be2iscsi iscsi_tcp bnx2i cnic uio ipv6
> cxgb3i cxgb3 mdio libiscsi_tcp libiscsi scsi_transport_iscsi
> dm_multipath scsi_dh video backlight output sbs sbshc power_meter
> hwmon battery acpi_memhotplug xen_acpi_memhotplug ac parport_pc lp
> parport tg3 libphy sg ide_cd_mod cdrom serio_raw button tpm_tis tpm
> tpm_bios i2c_i801 i2c_core shpchp iTCO_wdt pcspkr dm_snapshot dm_zero
> dm_mirror dm_region_hash dm_log dm_mod ata_piix libata sd_mod scsi_mod
> raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: Pid: 5874, comm:
> lvcreate Not tainted 2.6.32.35-4.xen.pvops.choon.centos5 #1 PowerEdge
> 860
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: RIP:
> e030:[<ffffffff8100cb5b>] [<ffffffff8100cb5b>]
> pin_pagetable_pfn+0x53/0x59
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: RSP:
> e02b:ffff8800303d1c28 EFLAGS: 00010282
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: RAX: 00000000ffffffea
> RBX: 000000000003032d RCX: 0000000000000181
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: RDX: 00000000deadbeef
> RSI: 00000000deadbeef RDI: 00000000deadbeef
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: RBP: ffff8800303d1c48
> R08: 0000000000000968 R09: ffff880000000000
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: R10: 00000000deadbeef
> R11: ffff8800303d1d08 R12: 0000000000000003
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: R13: 000000000003032d
> R14: ffff880030360000 R15: 00007fd324a00000
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: FS:
> 00007fd327d2e710(0000) GS:ffff880028089000(0000)
> knlGS:0000000000000000
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: CS: e033 DS: 0000 ES:
> 0000 CR0: 000000008005003b
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: CR2: 00000000004612f0
> CR3: 000000003a025000 CR4: 0000000000002660
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: DR0: 0000000000000000
> DR1: 0000000000000000 DR2: 0000000000000000
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: DR3: 0000000000000000
> DR6: 00000000ffff0ff0 DR7: 0000000000000400
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: Process lvcreate (pid:
> 5874, threadinfo ffff8800303d0000, task ffff880030360000)
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: Stack:
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: 0000000000000000
> 00000000002027a9 000000013eb43318 000000000003032d
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: <0> ffff8800303d1c68
> ffffffff8100e07c ffff880032be05c0 ffff880032aa9928
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: <0> ffff8800303d1c78
> ffffffff8100e0af ffff8800303d1cb8 ffffffff810a4433
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: Call Trace:
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: [<ffffffff8100e07c>]
> xen_alloc_ptpage+0x64/0x69
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: [<ffffffff8100e0af>]
> xen_alloc_pte+0xe/0x10
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: [<ffffffff810a4433>]
> __pte_alloc+0x70/0xce
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: [<ffffffff810a45d1>]
> handle_mm_fault+0x140/0x8b9
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: [<ffffffff810a50c9>]
> __get_user_pages+0x37f/0x479
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: [<ffffffff810a76ca>]
> __mlock_vma_pages_range+0xc0/0x16f
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: [<ffffffff8131c03f>]
> ? _spin_unlock_irqrestore+0x11/0x13
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: [<ffffffff810a78db>]
> mlock_fixup+0x162/0x199
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: [<ffffffff810a7989>]
> do_mlockall+0x77/0x8d
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: [<ffffffff81139016>]
> ? security_capable+0x27/0x29
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: [<ffffffff810a7ce2>]
> sys_mlockall+0x8f/0xb9
> /var/log/messages:Mar 27 17:04:39 xen05 kernel: [<ffffffff81012ac2>]
> system_call_fastpath+0x16/0x1b
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: Code: 48 b8 ff ff ff
> ff ff ff ff 7f 48 21 c2 48 89 55 e8 48 8d 7d e0 be 01 00 00 00 31 d2
> 41 ba f0 7f 00 00 e8 e9 c7 ff ff 85 c0 74 04 <0f> 0b eb fe c9 c3 55 40
> f6 c7 01 48 89 e5 53 48 89 fb 74 5b 48
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: RIP
> [<ffffffff8100cb5b>] pin_pagetable_pfn+0x53/0x59
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: RSP <ffff8800303d1c28>
> /var/log/messages-Mar 27 17:04:39 xen05 kernel: ---[ end trace
> bf36c55d2ecd52e5 ]---
>
> >
> >
> > Maybe this helps, but i think, if anything, this makes it worse as the debug
> > options actually suppressed the problem that needs to be debugged.
>
> True. At least now we know/narrow down to just related to
> CONFIG_DEBUG_PAGEALLOC. Maybe Konrad or Jeremy can have a closer look
> in the related codes... ...
>
> Thanks.
>
> Kindest regards,
> Giam Teck Choon
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel
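
For anyone trying to reproduce this, the snapshot cycle described in this thread (create snapshot -> mount snapshot -> rsync -> umount snapshot -> remove snapshot, repeated in a loop) could be sketched roughly as below. This is only an illustration and not the script anyone in the thread actually ran: the volume group, volume names, mount point, destination and snapshot size are all hypothetical, and the helper defaults to a dry run that only prints the commands.

```python
import subprocess


def snapshot_cycle_commands(vg, lv, snap, mountpoint, dest, size="1G"):
    """Return the commands for one snapshot -> mount -> rsync -> umount -> remove cycle.

    All names passed in are placeholders chosen by the caller; nothing here
    is taken from the hosts discussed in the thread.
    """
    snap_dev = f"/dev/{vg}/{snap}"
    return [
        # Create a snapshot of the origin volume.
        ["lvcreate", "--snapshot", "--size", size, "--name", snap, f"/dev/{vg}/{lv}"],
        # Mount the snapshot read-only.
        ["mount", "-o", "ro", snap_dev, mountpoint],
        # Copy the snapshot contents to the backup destination.
        ["rsync", "-a", f"{mountpoint}/", dest],
        # Unmount and drop the snapshot again.
        ["umount", mountpoint],
        ["lvremove", "-f", snap_dev],
    ]


def run_cycles(count, dry_run=True, **kwargs):
    """Run (or, by default, only print) `count` cycles; return how many completed."""
    done = 0
    for _ in range(count):
        for cmd in snapshot_cycle_commands(**kwargs):
            if dry_run:
                print(" ".join(cmd))
            else:
                # check=True stops the loop at the first failing command.
                subprocess.run(cmd, check=True)
        done += 1
    return done


if __name__ == "__main__":
    # Hypothetical names; pass dry_run=False (as root) to really execute.
    run_cycles(1, vg="vg0", lv="data", snap="data_snap",
               mountpoint="/mnt/snap", dest="/backup/data/")
```

Running it with a large count and dry_run=False on a test box would approximate the 1000-loop test mentioned above, with the kernel log watched for the BUG while lvcreate/lvremove run.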