Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
Hello Wei and all other interested people,

I saw this thread from around May; it went silent after your post on May 31. Has there been any progress on this problem? I am running into this issue as well with the openSUSE 12.3 distribution, using their 3.7.10-1.16-xen kernel and Xen version 4.2.1_12-1.12.10. On the net I see some discussion of people hitting this issue, but not that much. For example, one of the symptoms is that a guest crashes when running zypper install or zypper update if the Internet connection is fast enough. OpenSUSE 3.4.x kernels run fine as guests on top of the openSUSE 12.3 Xen distribution, but apparently the issue is present from 3.7.10 onward.

I have already spent quite some time getting a grip on the issue. I filed a bug at bugzilla.novell.com but got no response; see https://bugzilla.novell.com/show_bug.cgi?id=826374 for details. Apparently, hitting this bug (i.e. making it all the way to the crash) requires hardware that is not too slow; put differently, it is easy to find hardware on which the issue cannot be reproduced. In one of my recent experiments I changed the SLAB allocator to SLUB, which provides more detailed kernel logging.
Here is the log output after the first detected issue regarding xennet:

2013-07-03T23:51:16.560229+02:00 domUA kernel: [ 97.562370] netfront: Too many frags
2013-07-03T23:51:17.228143+02:00 domUA kernel: [ 98.230466] netfront: Too many frags
2013-07-03T23:51:17.596074+02:00 domUA kernel: [ 98.597300] netfront: Too many frags
2013-07-03T23:51:18.740215+02:00 domUA kernel: [ 99.743080] net_ratelimit: 2 callbacks suppressed
2013-07-03T23:51:18.740242+02:00 domUA kernel: [ 99.743084] netfront: Too many frags
2013-07-03T23:51:19.104100+02:00 domUA kernel: [ 100.104281] netfront: Too many frags
2013-07-03T23:51:19.760134+02:00 domUA kernel: [ 100.760594] netfront: Too many frags
2013-07-03T23:51:21.820154+02:00 domUA kernel: [ 102.821202] netfront: Too many frags
2013-07-03T23:51:22.192188+02:00 domUA kernel: [ 103.192655] netfront: Too many frags
2013-07-03T23:51:26.060144+02:00 domUA kernel: [ 107.062447] netfront: Too many frags
2013-07-03T23:51:26.412116+02:00 domUA kernel: [ 107.415165] netfront: Too many frags
2013-07-03T23:51:27.092147+02:00 domUA kernel: [ 108.094615] netfront: Too many frags
2013-07-03T23:51:27.492112+02:00 domUA kernel: [ 108.494255] netfront: Too many frags
2013-07-03T23:51:27.520194+02:00 domUA kernel: [ 108.522445] =============================================================================
2013-07-03T23:51:27.520206+02:00 domUA kernel: [ 108.522448] BUG kmalloc-1024 (Tainted: G W ): Redzone overwritten
2013-07-03T23:51:27.520209+02:00 domUA kernel: [ 108.522450] -----------------------------------------------------------------------------
2013-07-03T23:51:27.520212+02:00 domUA kernel: [ 108.522450]
2013-07-03T23:51:27.520215+02:00 domUA kernel: [ 108.522452] Disabling lock debugging due to kernel taint
2013-07-03T23:51:27.520217+02:00 domUA kernel: [ 108.522454] INFO: 0xffff8800f66068f8-0xffff8800f66068ff. First byte 0x0 instead of 0xcc
2013-07-03T23:51:27.520220+02:00 domUA kernel: [ 108.522461] INFO: Allocated in __alloc_skb+0x88/0x260 age=11 cpu=0 pid=1325
2013-07-03T23:51:27.520223+02:00 domUA kernel: [ 108.522466] set_track+0x6c/0x190
2013-07-03T23:51:27.520225+02:00 domUA kernel: [ 108.522470] alloc_debug_processing+0x83/0x109
2013-07-03T23:51:27.520228+02:00 domUA kernel: [ 108.522472] __slab_alloc.constprop.48+0x523/0x593
2013-07-03T23:51:27.520231+02:00 domUA kernel: [ 108.522474] __kmalloc_track_caller+0xb4/0x200
2013-07-03T23:51:27.520233+02:00 domUA kernel: [ 108.522477] __kmalloc_reserve+0x3c/0xa0
2013-07-03T23:51:27.520236+02:00 domUA kernel: [ 108.522478] __alloc_skb+0x88/0x260
2013-07-03T23:51:27.520239+02:00 domUA kernel: [ 108.522483] network_alloc_rx_buffers+0x76/0x5f0 [xennet]
2013-07-03T23:51:27.520241+02:00 domUA kernel: [ 108.522486] netif_poll+0xcf4/0xf30 [xennet]
2013-07-03T23:51:27.520243+02:00 domUA kernel: [ 108.522489] net_rx_action+0xf0/0x2e0
2013-07-03T23:51:27.520246+02:00 domUA kernel: [ 108.522493] __do_softirq+0x127/0x280
2013-07-03T23:51:27.520248+02:00 domUA kernel: [ 108.522496] call_softirq+0x1c/0x30
2013-07-03T23:51:27.520251+02:00 domUA kernel: [ 108.522499] do_softirq+0x56/0xd0
2013-07-03T23:51:27.520253+02:00 domUA kernel: [ 108.522501] irq_exit+0x52/0xd0
2013-07-03T23:51:27.520256+02:00 domUA kernel: [ 108.522503] evtchn_do_upcall+0x281/0x2e7
2013-07-03T23:51:27.520258+02:00 domUA kernel: [ 108.522505] do_hypervisor_callback+0x1e/0x30
2013-07-03T23:51:27.520261+02:00 domUA kernel: [ 108.522507] 0x7f45f0a2f1e0
2013-07-03T23:51:27.520263+02:00 domUA kernel: [ 108.522509] INFO: Freed in skb_free_head+0x5c/0x70 age=14 cpu=0 pid=1325
2013-07-03T23:51:27.520266+02:00 domUA kernel: [ 108.522512] set_track+0x6c/0x190
2013-07-03T23:51:27.520269+02:00 domUA kernel: [ 108.522513] free_debug_processing+0x151/0x201
2013-07-03T23:51:27.520271+02:00 domUA kernel: [ 108.522515] __slab_free+0x47/0x499
2013-07-03T23:51:27.520274+02:00 domUA kernel: [ 108.522517] kfree+0x1df/0x230
2013-07-03T23:51:27.520276+02:00 domUA kernel: [ 108.522519] skb_free_head+0x5c/0x70
2013-07-03T23:51:27.520279+02:00 domUA kernel: [ 108.522521] skb_release_data+0xea/0xf0
2013-07-03T23:51:27.520281+02:00 domUA kernel: [ 108.522522] __kfree_skb+0x1e/0xb0
2013-07-03T23:51:27.520284+02:00 domUA kernel: [ 108.522524] kfree_skb+0x80/0xc0
2013-07-03T23:51:27.520286+02:00 domUA kernel: [ 108.522527] netif_poll+0x824/0xf30 [xennet]
2013-07-03T23:51:27.520289+02:00 domUA kernel: [ 108.522529] net_rx_action+0xf0/0x2e0
2013-07-03T23:51:27.520291+02:00 domUA kernel: [ 108.522530] __do_softirq+0x127/0x280
2013-07-03T23:51:27.520294+02:00 domUA kernel: [ 108.522532] call_softirq+0x1c/0x30
2013-07-03T23:51:27.520296+02:00 domUA kernel: [ 108.522534] do_softirq+0x56/0xd0
2013-07-03T23:51:27.520299+02:00 domUA kernel: [ 108.522536] irq_exit+0x52/0xd0
2013-07-03T23:51:27.520302+02:00 domUA kernel: [ 108.522538] evtchn_do_upcall+0x281/0x2e7
2013-07-03T23:51:27.520304+02:00 domUA kernel: [ 108.522539] do_hypervisor_callback+0x1e/0x30
2013-07-03T23:51:27.520307+02:00 domUA kernel: [ 108.522541] INFO: Slab 0xffff8800ffd78100 objects=12 used=7 fp=0xffff8800f66074d0 flags=0x400000000000408
2013-07-03T23:51:27.520310+02:00 domUA kernel: [ 108.522543] INFO: Object 0xffff8800f66064f8 @offset=9464 fp=0x0000018800000000
2013-07-03T23:51:27.520312+02:00 domUA kernel: [ 108.522543]
2013-07-03T23:51:27.520315+02:00 domUA kernel: [ 108.522546] Bytes b4 ffff8800f66064e8: 4a 40 ff ff 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a J@......ZZZZZZZZ
2013-07-03T23:51:27.520318+02:00 domUA kernel: [ 108.522548] Object ffff8800f66064f8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
2013-07-03T23:51:27.520320+02:00 domUA kernel: [ 108.522549] Object ffff8800f6606508: 00 16 3e 29 7e 3c 00 25 90 69 ea 4e 08 00 45 08 ..>)~<.%.i.N..E.
2013-07-03T23:51:27.520323+02:00 domUA kernel: [ 108.522551] Object ffff8800f6606518: fe bc 46 d7 40 00 40 06 d3 69 0a 57 06 91 0a 57 ..F.@.@..i.W...W
2013-07-03T23:51:27.520326+02:00 domUA kernel: [ 108.522553] Object ffff8800f6606528: 06 b4 9b 86 00 16 57 4d 5e bd 89 4c 40 ad 80 10 ......WM^..L@...
2013-07-03T23:51:27.520329+02:00 domUA kernel: [ 108.522554] Object ffff8800f6606538: 00 a6 20 a2 00 00 01 01 08 0a 01 eb 40 a7 ff ff .. .........@...
2013-07-03T23:51:27.520332+02:00 domUA kernel: [ 108.522556] Object ffff8800f6606548: 44 fa 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b D.kkkkkkkkkkkkkk
2013-07-03T23:51:27.520335+02:00 domUA kernel: [ 108.522557] Object ffff8800f6606558: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
2013-07-03T23:51:27.520337+02:00 domUA kernel: [ 108.522559] Object ffff8800f6606568: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk

(Skipping some of the object dumping...)

2013-07-03T23:51:27.520583+02:00 domUA kernel: [ 108.522644] Object ffff8800f66068d8: 00 d7 e4 ff 00 88 ff ff 00 00 00 00 00 10 00 00 ................
2013-07-03T23:51:27.520586+02:00 domUA kernel: [ 108.522646] Object ffff8800f66068e8: 00 92 dd ff 00 88 ff ff 00 00 00 00 88 01 00 00 ................
2013-07-03T23:51:27.520588+02:00 domUA kernel: [ 108.522647] Redzone ffff8800f66068f8: 00 92 dd ff 00 88 ff ff ........
2013-07-03T23:51:27.520591+02:00 domUA kernel: [ 108.522649] Padding ffff8800f6606a38: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ
2013-07-03T23:51:27.520594+02:00 domUA kernel: [ 108.522651] Pid: 1325, comm: sshd Tainted: G B W 3.7.10-1.16-dbg-xen #3
2013-07-03T23:51:27.520597+02:00 domUA kernel: [ 108.522652] Call Trace:
2013-07-03T23:51:27.520599+02:00 domUA kernel: [ 108.522658] [<ffffffff8000b097>] try_stack_unwind+0x87/0x1c0
2013-07-03T23:51:27.520602+02:00 domUA kernel: [ 108.522662] [<ffffffff80008fa5>] dump_trace+0xd5/0x250
2013-07-03T23:51:27.520605+02:00 domUA kernel: [ 108.522665] [<ffffffff8000b22c>] show_trace_log_lvl+0x5c/0x80
2013-07-03T23:51:27.520608+02:00 domUA kernel: [ 108.522668] [<ffffffff8000b265>] show_trace+0x15/0x20
2013-07-03T23:51:27.520610+02:00 domUA kernel: [ 108.522672] [<ffffffff80553a69>] dump_stack+0x77/0x80
2013-07-03T23:51:27.520612+02:00 domUA kernel: [ 108.522676] [<ffffffff801491b1>] print_trailer+0x131/0x140
2013-07-03T23:51:27.520615+02:00 domUA kernel: [ 108.522680] [<ffffffff80149709>] check_bytes_and_report+0xc9/0x120
2013-07-03T23:51:27.520617+02:00 domUA kernel: [ 108.522683] [<ffffffff8014a7f6>] check_object+0x56/0x240
2013-07-03T23:51:27.520620+02:00 domUA kernel: [ 108.522687] [<ffffffff805575b6>] free_debug_processing+0xc4/0x201
2013-07-03T23:51:27.520622+02:00 domUA kernel: [ 108.522690] [<ffffffff8055773a>] __slab_free+0x47/0x499
2013-07-03T23:51:27.520625+02:00 domUA kernel: [ 108.522694] [<ffffffff8014beff>] kfree+0x1df/0x230
2013-07-03T23:51:27.520627+02:00 domUA kernel: [ 108.522697] [<ffffffff8044a8cc>] skb_free_head+0x5c/0x70
2013-07-03T23:51:27.520630+02:00 domUA kernel: [ 108.522701] [<ffffffff8044a9ca>] skb_release_data+0xea/0xf0
2013-07-03T23:51:27.520632+02:00 domUA kernel: [ 108.522704] [<ffffffff8044a9ee>] __kfree_skb+0x1e/0xb0
2013-07-03T23:51:27.520635+02:00 domUA kernel: [ 108.522709] [<ffffffff8049fa2a>] tcp_recvmsg+0x99a/0xd50
2013-07-03T23:51:27.520637+02:00 domUA kernel: [ 108.522714] [<ffffffff804c796d>] inet_recvmsg+0xed/0x110
2013-07-03T23:51:27.520640+02:00 domUA kernel: [ 108.522718] [<ffffffff80440be8>] sock_aio_read+0x158/0x190
2013-07-03T23:51:27.520642+02:00 domUA kernel: [ 108.522722] [<ffffffff8015cb68>] do_sync_read+0x98/0xf0
2013-07-03T23:51:27.520645+02:00 domUA kernel: [ 108.522726] [<ffffffff8015d32d>] vfs_read+0xbd/0x180
2013-07-03T23:51:27.520647+02:00 domUA kernel: [ 108.522729] [<ffffffff8015d442>] sys_read+0x52/0xa0
2013-07-03T23:51:27.520650+02:00 domUA kernel: [ 108.522733] [<ffffffff8056ab3b>] system_call_fastpath+0x1a/0x1f
2013-07-03T23:51:27.520652+02:00 domUA kernel: [ 108.522736] [<00007f45ef74c960>] 0x7f45ef74c95f
2013-07-03T23:51:27.520655+02:00 domUA kernel: [ 108.522738] FIX kmalloc-1024: Restoring 0xffff8800f66068f8-0xffff8800f66068ff=0xcc
2013-07-03T23:51:27.520657+02:00 domUA kernel: [ 108.522738]
2013-07-03T23:51:27.679444+02:00 domUA kernel: [ 108.671750] =============================================================================
2013-07-03T23:51:27.679454+02:00 domUA kernel: [ 108.671753] BUG kmalloc-1024 (Tainted: G B W ): Redzone overwritten
2013-07-03T23:51:27.679456+02:00 domUA kernel: [ 108.671754] -----------------------------------------------------------------------------
2013-07-03T23:51:27.679458+02:00 domUA kernel: [ 108.671754]
2013-07-03T23:51:27.679460+02:00 domUA kernel: [ 108.671757] INFO: 0xffff8800f66068f8-0xffff8800f66068ff. First byte 0xcc instead of 0xbb
2013-07-03T23:51:27.679462+02:00 domUA kernel: [ 108.671762] INFO: Allocated in __alloc_skb+0x88/0x260 age=48 cpu=0 pid=1325
2013-07-03T23:51:27.679464+02:00 domUA kernel: [ 108.671765] set_track+0x6c/0x190
2013-07-03T23:51:27.679466+02:00 domUA kernel: [ 108.671767] alloc_debug_processing+0x83/0x109
2013-07-03T23:51:27.679468+02:00 domUA kernel: [ 108.671769] __slab_alloc.constprop.48+0x523/0x593
2013-07-03T23:51:27.679469+02:00 domUA kernel: [ 108.671771] __kmalloc_track_caller+0xb4/0x200
2013-07-03T23:51:27.679471+02:00 domUA kernel: [ 108.671773] __kmalloc_reserve+0x3c/0xa0
2013-07-03T23:51:27.679473+02:00 domUA kernel: [ 108.671775] __alloc_skb+0x88/0x260
2013-07-03T23:51:27.679475+02:00 domUA kernel: [ 108.671778] network_alloc_rx_buffers+0x76/0x5f0 [xennet]
2013-07-03T23:51:27.679476+02:00 domUA kernel: [ 108.671781] netif_poll+0xcf4/0xf30 [xennet]
2013-07-03T23:51:27.679478+02:00 domUA kernel: [ 108.671783] net_rx_action+0xf0/0x2e0

I noticed that after turning on all this debugging, a real panic no longer occurs. The above happens while copying a file with scp from dom0 to the guest (scp bigfile domu:/dev/null).
In my lab, I am currently experimenting with a SuperMicro-based system on which Xen reports the following characteristics:

[ASCII-art "Xen 4.2.1_12-1.12.10" banner omitted]
(XEN) Xen version 4.2.1_12-1.12.10 (abuild@) (gcc (SUSE Linux) 4.7.2 20130108 [gcc-4_7-branch revision 195012]) Wed May 29 20:31:49 UTC 2013
(XEN) Latest ChangeSet: 25952
(XEN) Bootloader: GNU GRUB 0.97
(XEN) Command line: dom0_mem=2048M,max:2048M loglvl=all guest_loglvl=all
(XEN) Video information:
(XEN) VGA is text mode 80x25, font 8x16
(XEN) VBE/DDC methods: V2; EDID transfer time: 1 seconds
(XEN) Disc information:
(XEN) Found 4 MBR signatures
(XEN) Found 4 EDD information structures
(XEN) Xen-e820 RAM map:
(XEN) 0000000000000000 - 0000000000096400 (usable)
(XEN) 0000000000096400 - 00000000000a0000 (reserved)
(XEN) 00000000000e0000 - 0000000000100000 (reserved)
(XEN) 0000000000100000 - 00000000bf780000 (usable)
(XEN) 00000000bf78e000 - 00000000bf790000 type 9
(XEN) 00000000bf790000 - 00000000bf79e000 (ACPI data)
(XEN) 00000000bf79e000 - 00000000bf7d0000 (ACPI NVS)
(XEN) 00000000bf7d0000 - 00000000bf7e0000 (reserved)
(XEN) 00000000bf7ec000 - 00000000c0000000 (reserved)
(XEN) 00000000e0000000 - 00000000f0000000 (reserved)
(XEN) 00000000fee00000 - 00000000fee01000 (reserved)
(XEN) 00000000ffc00000 - 0000000100000000 (reserved)
(XEN) 0000000100000000 - 0000000340000000 (usable)

(Skipping ACPI and SRAT)

(XEN) System RAM: 12279MB (12573784kB)
(XEN) NUMA: Allocated memnodemap from 33e38a000 - 33e38e000
(XEN) NUMA: Using 8 for the hash shift.
(XEN) Domain heap initialised DMA width 30 bits
(XEN) found SMP MP-table at 000ff780
(XEN) DMI present.
(XEN) Enabling APIC mode: Phys. Using 2 I/O APICs
(XEN) ACPI: HPET id: 0x8086a301 base: 0xfed00000
(XEN) Failed to get Error Log Address Range.
(XEN) Using ACPI (MADT) for SMP configuration information
(XEN) SMP: Allowing 24 CPUs (8 hotplug CPUs)
(XEN) IRQ limits: 48 GSI, 3040 MSI/MSI-X
(XEN) Using scheduler: SMP Credit Scheduler (credit)
(XEN) Detected 2400.115 MHz processor.
(XEN) Initing memory sharing.
(XEN) mce_intel.c:1238: MCA Capability: BCAST 1 SER 0 CMCI 1 firstbank 0 extended MCE MSR 0
(XEN) Intel machine check reporting enabled
(XEN) PCI: MCFG configuration 0: base e0000000 segment 0000 buses 00 - ff
(XEN) PCI: MCFG area at e0000000 reserved in E820
(XEN) PCI: Using MCFG for segment 0000 bus 00-ff
(XEN) Intel VT-d iommu 0 supported page sizes: 4kB.
(XEN) Intel VT-d Snoop Control enabled.
(XEN) Intel VT-d Dom0 DMA Passthrough not enabled.
(XEN) Intel VT-d Queued Invalidation enabled.
(XEN) Intel VT-d Interrupt Remapping enabled.
(XEN) Intel VT-d Shared EPT tables not enabled.
(XEN) I/O virtualisation enabled
(XEN) - Dom0 mode: Relaxed
(XEN) Interrupt remapping enabled
(XEN) Enabled directed EOI with ioapic_ack_old on!
(XEN) ENABLING IO-APIC IRQs
(XEN) ..TIMER: vector=0xF0 apic1=0 pin1=2 apic2=-1 pin2=-1
(XEN) Platform timer is 14.318MHz HPET
(XEN) Allocated console ring of 128 KiB.
(XEN) VMX: Supported advanced features:
(XEN) - APIC MMIO access virtualisation
(XEN) - APIC TPR shadow
(XEN) - Extended Page Tables (EPT)
(XEN) - Virtual-Processor Identifiers (VPID)
(XEN) - Virtual NMI
(XEN) - MSR direct-access bitmap
(XEN) - Unrestricted Guest
(XEN) HVM: ASIDs enabled.
(XEN) HVM: VMX enabled
(XEN) HVM: Hardware Assisted Paging (HAP) detected
(XEN) HVM: HAP page sizes: 4kB, 2MB, 1GB
(XEN) Brought up 16 CPUs
(XEN) ACPI sleep modes: S3
(XEN) mcheck_poll: Machine check polling timer started.
(XEN) *** LOADING DOMAIN 0 ***
(XEN) Xen kernel: 64-bit, lsb, compat32
(XEN) Dom0 kernel: 64-bit, lsb, paddr 0x2000 -> 0xa65000
(XEN) PHYSICAL MEMORY ARRANGEMENT:
(XEN) Dom0 alloc.: 0000000336000000->0000000337000000 (516915 pages to be allocated)
(XEN) Init. ramdisk: 000000033f333000->0000000340000000
(XEN) VIRTUAL MEMORY ARRANGEMENT:
(XEN) Loaded kernel: ffffffff80002000->ffffffff80a65000
(XEN) Init. ramdisk: 0000000000000000->0000000000000000
(XEN) Phys-Mach map: ffffea0000000000->ffffea0000400000
(XEN) Start info: ffffffff80a65000->ffffffff80a654b4
(XEN) Page tables: ffffffff80a66000->ffffffff80a6f000
(XEN) Boot stack: ffffffff80a6f000->ffffffff80a70000
(XEN) TOTAL: ffffffff80000000->ffffffff80c00000
(XEN) ENTRY ADDRESS: ffffffff80002000
(XEN) Dom0 has maximum 16 VCPUs
(XEN) Scrubbing Free RAM: .....................................................................................................done.
(XEN) Initial low memory virq threshold set at 0x4000 pages.
(XEN) Std. Loglevel: All
(XEN) Guest Loglevel: All
(XEN) Xen is relinquishing VGA console.
(XEN) ACPI: RSDP 000FACE0, 0024 (r2 ACPIAM)
(XEN) ACPI: XSDT BF790100, 008C (r1 SMCI 20110827 MSFT 97)
(XEN) ACPI: FACP BF790290, 00F4 (r4 082711 FACP1638 20110827 MSFT 97)
(XEN) ACPI: DSDT BF7906A0, 6563 (r2 10600 10600000 0 INTL 20051117)
(XEN) ACPI: FACS BF79E000, 0040
(XEN) ACPI: APIC BF790390, 011E (r2 082711 APIC1638 20110827 MSFT 97)
(XEN) ACPI: MCFG BF7904B0, 003C (r1 082711 OEMMCFG 20110827 MSFT 97)
(XEN) ACPI: SLIT BF7904F0, 0030 (r1 082711 OEMSLIT 20110827 MSFT 97)
(XEN) ACPI: OEMB BF79E040, 0085 (r1 082711 OEMB1638 20110827 MSFT 97)
(XEN) ACPI: SRAT BF79A6A0, 01D0 (r2 082711 OEMSRAT 1 INTL 1)
(XEN) ACPI: HPET BF79A870, 0038 (r1 082711 OEMHPET 20110827 MSFT 97)
(XEN) ACPI: DMAR BF79E0D0, 0130 (r1 AMI OEMDMAR 1 MSFT 97)
(XEN) ACPI: SSDT BF7A1B30, 0363 (r1 DpgPmm CpuPm 12 INTL 20051117)
(XEN) ACPI: EINJ BF79A8B0, 0130 (r1 AMIER AMI_EINJ 20110827 MSFT 97)
(XEN) ACPI: BERT BF79AA40, 0030 (r1 AMIER AMI_BERT 20110827 MSFT 97)
(XEN) ACPI: ERST BF79AA70, 01B0 (r1 AMIER AMI_ERST 20110827 MSFT 97)
(XEN) ACPI: HEST BF79AC20, 00A8 (r1 AMIER ABC_HEST 20110827 MSFT 97)
(XEN) System RAM: 12279MB (12573784kB)
(XEN) SRAT: PXM 0 -> APIC 0 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 2 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 18 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 20 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 1 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 3 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 19 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 21 -> Node 0

I am happy to assist with more kernel probing. I can even set up access to this machine for someone.
Best regards,
Dion Kant

On 05/17/2013 10:59 AM, Wei Liu wrote:
> Moving discussion to Xen-devel
>
> On Thu, May 16, 2013 at 10:29:56PM +0300, Eugene Istomin wrote:
>> Hello,
>>
>> I tried to use 3.9.2 kernel with xen 4.2.2/4.3rc1 and in both variants leads
>> to this error in network-intensive load (such as iperf, 100 nginx parallel
>> requests to 1M files and so on):
>>
> It would be more helpful if you can provide info on your configurations
> (Dom0 and DomU), your workload, how to reproduce the bug.
>
> I run iperf and NFS to test Xen network, but never see any crash like
> this myself.
>
>> BUG: unable to handle kernel paging request at ffff8800795a3000
>> [ 60.246945] IP: [<ffffffffa001a75c>] netif_poll+0x49c/0xe80 [xennet]
>> [ 60.246975] PGD a8a067 PUD a9a067 PMD 7fc27067 PTE 80100000795a3065
>> [ 60.247004] Oops: 0003 [#1] SMP
>> [ 60.247020] Modules linked in: af_packet hwmon domctl crc32_pclmul
>> crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw
>> aes_x86_64 joydev xts gf128mul autofs4 scsi_dh_emc scsi_dh_alua
>> scsi_dh_rdac scsi_dh_hp_sw scsi_dh xenblk cdrom xennet ata_generic
>> ata_piix
>> [ 60.247144] CPU 0
>> [ 60.247154] Pid: 0, comm: swapper/0 Not tainted 3.9.2-1.g04040b9-xen
>> #1
>> [ 60.247179] RIP: e030:[<ffffffffa001a75c>] [<ffffffffa001a75c>]
>> netif_poll+0x49c/0xe80 [xennet]
>> ...
> Could you provide full stack trace? AFAICT there is no netif_poll in Xen
> netfront/back.
>
> Presumably this is Dom0 log?

(from the domctl module)

>> We have a couple of production hypervisors on 3.4 kernels with high-
>> throughput internal network (VM-to-VM in one Dom0), iperf on them is
>> working well:
>> [ 3] 0.0- 2.0 sec 3357 MBytes 14080 Mbits/sec
>> [ 3] 2.0- 4.0 sec 2880 MBytes 12077 Mbits/sec
>> [ 3] 4.0- 6.0 sec 2909 MBytes 12202 Mbits/sec
>> [ 3] 6.0- 8.0 sec 2552 MBytes 10702 Mbits/sec
>> [ 3] 8.0-10.0 sec 3616 MBytes 15166 Mbits/sec
>> [ 3] 10.0-12.0 sec 3415 MBytes 14324 Mbits/sec
>>
>> Seems like a kernel bug; is this related to one of these fixes in linux-next,
>> or do I need to create a new bug report?
>>
>> 1) https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=1aaf6d3d3d1e95f4be07e32dd84aa1c93855fbbd
>> 2) https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=9ecd1a75d977e2e8c48139c7d3efed183f898d94
>> 3) https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=2810e5b9a7731ca5fce22bfbe12c96e16ac44b6f
>> 4) https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=03393fd5cc2b6cdeec32b704ecba64dbb0feae3c
>> 5) https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=59ccb4ebbc35e36a3c143f2d1355deb75c2e628f
>>
> I don't think these patches can fix your problem at first glance.
>
> Wei.

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxx
http://lists.xen.org/xen-users