Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8



Hello Wei and all other interested people,

I came across this thread from around May. It went quiet after your
post on May 31.

Is there any progress on this problem?

I am running into this issue as well with the openSUSE 12.3
distribution. This is with their 3.7.10-1.16-xen kernel and Xen version
4.2.1_12-1.12.10. I see some reports on the net of people hitting this
issue, but not many. One symptom is that a guest crashes when running
zypper install or zypper update, provided the Internet connection is
fast enough.

openSUSE 3.4.X kernels run fine as guests on top of the openSUSE 12.3
Xen distribution, but the issue apparently appears with 3.7.10 and
later kernels.

I have already spent quite some time trying to get a grip on the issue.
I filed a bug at bugzilla.novell.com but have had no response. See
https://bugzilla.novell.com/show_bug.cgi?id=826374 for details.
Apparently, hitting this bug (i.e. making it all the way to the crash)
requires hardware that is reasonably fast. By this I mean it is easy to
find hardware on which the issue cannot be reproduced.

In one of my recent experiments I changed the SLAB allocator to SLUB,
which provides more detailed kernel logging. Here is the log output
starting at the first detected issue regarding xennet:

2013-07-03T23:51:16.560229+02:00 domUA kernel: [   97.562370] netfront:
Too many frags
2013-07-03T23:51:17.228143+02:00 domUA kernel: [   98.230466] netfront:
Too many frags
2013-07-03T23:51:17.596074+02:00 domUA kernel: [   98.597300] netfront:
Too many frags
2013-07-03T23:51:18.740215+02:00 domUA kernel: [   99.743080]
net_ratelimit: 2 callbacks suppressed
2013-07-03T23:51:18.740242+02:00 domUA kernel: [   99.743084] netfront:
Too many frags
2013-07-03T23:51:19.104100+02:00 domUA kernel: [  100.104281] netfront:
Too many frags
2013-07-03T23:51:19.760134+02:00 domUA kernel: [  100.760594] netfront:
Too many frags
2013-07-03T23:51:21.820154+02:00 domUA kernel: [  102.821202] netfront:
Too many frags
2013-07-03T23:51:22.192188+02:00 domUA kernel: [  103.192655] netfront:
Too many frags
2013-07-03T23:51:26.060144+02:00 domUA kernel: [  107.062447] netfront:
Too many frags
2013-07-03T23:51:26.412116+02:00 domUA kernel: [  107.415165] netfront:
Too many frags
2013-07-03T23:51:27.092147+02:00 domUA kernel: [  108.094615] netfront:
Too many frags
2013-07-03T23:51:27.492112+02:00 domUA kernel: [  108.494255] netfront:
Too many frags
2013-07-03T23:51:27.520194+02:00 domUA kernel: [  108.522445]
=============================================================================
2013-07-03T23:51:27.520206+02:00 domUA kernel: [  108.522448] BUG
kmalloc-1024 (Tainted: G        W   ): Redzone overwritten
2013-07-03T23:51:27.520209+02:00 domUA kernel: [  108.522450]
-----------------------------------------------------------------------------
2013-07-03T23:51:27.520212+02:00 domUA kernel: [  108.522450]
2013-07-03T23:51:27.520215+02:00 domUA kernel: [  108.522452] Disabling
lock debugging due to kernel taint
2013-07-03T23:51:27.520217+02:00 domUA kernel: [  108.522454] INFO:
0xffff8800f66068f8-0xffff8800f66068ff. First byte 0x0 instead of 0xcc
2013-07-03T23:51:27.520220+02:00 domUA kernel: [  108.522461] INFO:
Allocated in __alloc_skb+0x88/0x260 age=11 cpu=0 pid=1325
2013-07-03T23:51:27.520223+02:00 domUA kernel: [  108.522466]  
set_track+0x6c/0x190
2013-07-03T23:51:27.520225+02:00 domUA kernel: [  108.522470]  
alloc_debug_processing+0x83/0x109
2013-07-03T23:51:27.520228+02:00 domUA kernel: [  108.522472]  
__slab_alloc.constprop.48+0x523/0x593
2013-07-03T23:51:27.520231+02:00 domUA kernel: [  108.522474]  
__kmalloc_track_caller+0xb4/0x200
2013-07-03T23:51:27.520233+02:00 domUA kernel: [  108.522477]  
__kmalloc_reserve+0x3c/0xa0
2013-07-03T23:51:27.520236+02:00 domUA kernel: [  108.522478]  
__alloc_skb+0x88/0x260
2013-07-03T23:51:27.520239+02:00 domUA kernel: [  108.522483]  
network_alloc_rx_buffers+0x76/0x5f0 [xennet]
2013-07-03T23:51:27.520241+02:00 domUA kernel: [  108.522486]  
netif_poll+0xcf4/0xf30 [xennet]
2013-07-03T23:51:27.520243+02:00 domUA kernel: [  108.522489]  
net_rx_action+0xf0/0x2e0
2013-07-03T23:51:27.520246+02:00 domUA kernel: [  108.522493]  
__do_softirq+0x127/0x280
2013-07-03T23:51:27.520248+02:00 domUA kernel: [  108.522496]  
call_softirq+0x1c/0x30
2013-07-03T23:51:27.520251+02:00 domUA kernel: [  108.522499]  
do_softirq+0x56/0xd0
2013-07-03T23:51:27.520253+02:00 domUA kernel: [  108.522501]  
irq_exit+0x52/0xd0
2013-07-03T23:51:27.520256+02:00 domUA kernel: [  108.522503]  
evtchn_do_upcall+0x281/0x2e7
2013-07-03T23:51:27.520258+02:00 domUA kernel: [  108.522505]  
do_hypervisor_callback+0x1e/0x30
2013-07-03T23:51:27.520261+02:00 domUA kernel: [  108.522507]  
0x7f45f0a2f1e0
2013-07-03T23:51:27.520263+02:00 domUA kernel: [  108.522509] INFO:
Freed in skb_free_head+0x5c/0x70 age=14 cpu=0 pid=1325
2013-07-03T23:51:27.520266+02:00 domUA kernel: [  108.522512]  
set_track+0x6c/0x190
2013-07-03T23:51:27.520269+02:00 domUA kernel: [  108.522513]  
free_debug_processing+0x151/0x201
2013-07-03T23:51:27.520271+02:00 domUA kernel: [  108.522515]  
__slab_free+0x47/0x499
2013-07-03T23:51:27.520274+02:00 domUA kernel: [  108.522517]  
kfree+0x1df/0x230
2013-07-03T23:51:27.520276+02:00 domUA kernel: [  108.522519]  
skb_free_head+0x5c/0x70
2013-07-03T23:51:27.520279+02:00 domUA kernel: [  108.522521]  
skb_release_data+0xea/0xf0
2013-07-03T23:51:27.520281+02:00 domUA kernel: [  108.522522]  
__kfree_skb+0x1e/0xb0
2013-07-03T23:51:27.520284+02:00 domUA kernel: [  108.522524]  
kfree_skb+0x80/0xc0
2013-07-03T23:51:27.520286+02:00 domUA kernel: [  108.522527]  
netif_poll+0x824/0xf30 [xennet]
2013-07-03T23:51:27.520289+02:00 domUA kernel: [  108.522529]  
net_rx_action+0xf0/0x2e0
2013-07-03T23:51:27.520291+02:00 domUA kernel: [  108.522530]  
__do_softirq+0x127/0x280
2013-07-03T23:51:27.520294+02:00 domUA kernel: [  108.522532]  
call_softirq+0x1c/0x30
2013-07-03T23:51:27.520296+02:00 domUA kernel: [  108.522534]  
do_softirq+0x56/0xd0
2013-07-03T23:51:27.520299+02:00 domUA kernel: [  108.522536]  
irq_exit+0x52/0xd0
2013-07-03T23:51:27.520302+02:00 domUA kernel: [  108.522538]  
evtchn_do_upcall+0x281/0x2e7
2013-07-03T23:51:27.520304+02:00 domUA kernel: [  108.522539]  
do_hypervisor_callback+0x1e/0x30
2013-07-03T23:51:27.520307+02:00 domUA kernel: [  108.522541] INFO: Slab
0xffff8800ffd78100 objects=12 used=7 fp=0xffff8800f66074d0
flags=0x400000000000408
2013-07-03T23:51:27.520310+02:00 domUA kernel: [  108.522543] INFO:
Object 0xffff8800f66064f8 @offset=9464 fp=0x0000018800000000
2013-07-03T23:51:27.520312+02:00 domUA kernel: [  108.522543]
2013-07-03T23:51:27.520315+02:00 domUA kernel: [  108.522546] Bytes b4
ffff8800f66064e8: 4a 40 ff ff 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a 
J@......ZZZZZZZZ
2013-07-03T23:51:27.520318+02:00 domUA kernel: [  108.522548] Object
ffff8800f66064f8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 
kkkkkkkkkkkkkkkk
2013-07-03T23:51:27.520320+02:00 domUA kernel: [  108.522549] Object
ffff8800f6606508: 00 16 3e 29 7e 3c 00 25 90 69 ea 4e 08 00 45 08 
..>)~<.%.i.N..E.
2013-07-03T23:51:27.520323+02:00 domUA kernel: [  108.522551] Object
ffff8800f6606518: fe bc 46 d7 40 00 40 06 d3 69 0a 57 06 91 0a 57 
..F.@.@..i.W...W
2013-07-03T23:51:27.520326+02:00 domUA kernel: [  108.522553] Object
ffff8800f6606528: 06 b4 9b 86 00 16 57 4d 5e bd 89 4c 40 ad 80 10 
......WM^..L@...
2013-07-03T23:51:27.520329+02:00 domUA kernel: [  108.522554] Object
ffff8800f6606538: 00 a6 20 a2 00 00 01 01 08 0a 01 eb 40 a7 ff ff  ..
.........@...
2013-07-03T23:51:27.520332+02:00 domUA kernel: [  108.522556] Object
ffff8800f6606548: 44 fa 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 
D.kkkkkkkkkkkkkk
2013-07-03T23:51:27.520335+02:00 domUA kernel: [  108.522557] Object
ffff8800f6606558: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 
kkkkkkkkkkkkkkkk
2013-07-03T23:51:27.520337+02:00 domUA kernel: [  108.522559] Object
ffff8800f6606568: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 
kkkkkkkkkkkkkkkk

Skipping some of the object dumping.......

2013-07-03T23:51:27.520583+02:00 domUA kernel: [  108.522644] Object
ffff8800f66068d8: 00 d7 e4 ff 00 88 ff ff 00 00 00 00 00 10 00 00 
................
2013-07-03T23:51:27.520586+02:00 domUA kernel: [  108.522646] Object
ffff8800f66068e8: 00 92 dd ff 00 88 ff ff 00 00 00 00 88 01 00 00 
................
2013-07-03T23:51:27.520588+02:00 domUA kernel: [  108.522647] Redzone
ffff8800f66068f8: 00 92 dd ff 00 88 ff ff                          ........
2013-07-03T23:51:27.520591+02:00 domUA kernel: [  108.522649] Padding
ffff8800f6606a38: 5a 5a 5a 5a 5a 5a 5a 5a                          ZZZZZZZZ
2013-07-03T23:51:27.520594+02:00 domUA kernel: [  108.522651] Pid: 1325,
comm: sshd Tainted: G    B   W    3.7.10-1.16-dbg-xen #3
2013-07-03T23:51:27.520597+02:00 domUA kernel: [  108.522652] Call Trace:
2013-07-03T23:51:27.520599+02:00 domUA kernel: [  108.522658] 
[<ffffffff8000b097>] try_stack_unwind+0x87/0x1c0
2013-07-03T23:51:27.520602+02:00 domUA kernel: [  108.522662] 
[<ffffffff80008fa5>] dump_trace+0xd5/0x250
2013-07-03T23:51:27.520605+02:00 domUA kernel: [  108.522665] 
[<ffffffff8000b22c>] show_trace_log_lvl+0x5c/0x80
2013-07-03T23:51:27.520608+02:00 domUA kernel: [  108.522668] 
[<ffffffff8000b265>] show_trace+0x15/0x20
2013-07-03T23:51:27.520610+02:00 domUA kernel: [  108.522672] 
[<ffffffff80553a69>] dump_stack+0x77/0x80
2013-07-03T23:51:27.520612+02:00 domUA kernel: [  108.522676] 
[<ffffffff801491b1>] print_trailer+0x131/0x140
2013-07-03T23:51:27.520615+02:00 domUA kernel: [  108.522680] 
[<ffffffff80149709>] check_bytes_and_report+0xc9/0x120
2013-07-03T23:51:27.520617+02:00 domUA kernel: [  108.522683] 
[<ffffffff8014a7f6>] check_object+0x56/0x240
2013-07-03T23:51:27.520620+02:00 domUA kernel: [  108.522687] 
[<ffffffff805575b6>] free_debug_processing+0xc4/0x201
2013-07-03T23:51:27.520622+02:00 domUA kernel: [  108.522690] 
[<ffffffff8055773a>] __slab_free+0x47/0x499
2013-07-03T23:51:27.520625+02:00 domUA kernel: [  108.522694] 
[<ffffffff8014beff>] kfree+0x1df/0x230
2013-07-03T23:51:27.520627+02:00 domUA kernel: [  108.522697] 
[<ffffffff8044a8cc>] skb_free_head+0x5c/0x70
2013-07-03T23:51:27.520630+02:00 domUA kernel: [  108.522701] 
[<ffffffff8044a9ca>] skb_release_data+0xea/0xf0
2013-07-03T23:51:27.520632+02:00 domUA kernel: [  108.522704] 
[<ffffffff8044a9ee>] __kfree_skb+0x1e/0xb0
2013-07-03T23:51:27.520635+02:00 domUA kernel: [  108.522709] 
[<ffffffff8049fa2a>] tcp_recvmsg+0x99a/0xd50
2013-07-03T23:51:27.520637+02:00 domUA kernel: [  108.522714] 
[<ffffffff804c796d>] inet_recvmsg+0xed/0x110
2013-07-03T23:51:27.520640+02:00 domUA kernel: [  108.522718] 
[<ffffffff80440be8>] sock_aio_read+0x158/0x190
2013-07-03T23:51:27.520642+02:00 domUA kernel: [  108.522722] 
[<ffffffff8015cb68>] do_sync_read+0x98/0xf0
2013-07-03T23:51:27.520645+02:00 domUA kernel: [  108.522726] 
[<ffffffff8015d32d>] vfs_read+0xbd/0x180
2013-07-03T23:51:27.520647+02:00 domUA kernel: [  108.522729] 
[<ffffffff8015d442>] sys_read+0x52/0xa0
2013-07-03T23:51:27.520650+02:00 domUA kernel: [  108.522733] 
[<ffffffff8056ab3b>] system_call_fastpath+0x1a/0x1f
2013-07-03T23:51:27.520652+02:00 domUA kernel: [  108.522736] 
[<00007f45ef74c960>] 0x7f45ef74c95f
2013-07-03T23:51:27.520655+02:00 domUA kernel: [  108.522738] FIX
kmalloc-1024: Restoring 0xffff8800f66068f8-0xffff8800f66068ff=0xcc
2013-07-03T23:51:27.520657+02:00 domUA kernel: [  108.522738]
2013-07-03T23:51:27.679444+02:00 domUA kernel: [  108.671750]
=============================================================================
2013-07-03T23:51:27.679454+02:00 domUA kernel: [  108.671753] BUG
kmalloc-1024 (Tainted: G    B   W   ): Redzone overwritten
2013-07-03T23:51:27.679456+02:00 domUA kernel: [  108.671754]
-----------------------------------------------------------------------------
2013-07-03T23:51:27.679458+02:00 domUA kernel: [  108.671754]
2013-07-03T23:51:27.679460+02:00 domUA kernel: [  108.671757] INFO:
0xffff8800f66068f8-0xffff8800f66068ff. First byte 0xcc instead of 0xbb
2013-07-03T23:51:27.679462+02:00 domUA kernel: [  108.671762] INFO:
Allocated in __alloc_skb+0x88/0x260 age=48 cpu=0 pid=1325
2013-07-03T23:51:27.679464+02:00 domUA kernel: [  108.671765]  
set_track+0x6c/0x190
2013-07-03T23:51:27.679466+02:00 domUA kernel: [  108.671767]  
alloc_debug_processing+0x83/0x109
2013-07-03T23:51:27.679468+02:00 domUA kernel: [  108.671769]  
__slab_alloc.constprop.48+0x523/0x593
2013-07-03T23:51:27.679469+02:00 domUA kernel: [  108.671771]  
__kmalloc_track_caller+0xb4/0x200
2013-07-03T23:51:27.679471+02:00 domUA kernel: [  108.671773]  
__kmalloc_reserve+0x3c/0xa0
2013-07-03T23:51:27.679473+02:00 domUA kernel: [  108.671775]  
__alloc_skb+0x88/0x260
2013-07-03T23:51:27.679475+02:00 domUA kernel: [  108.671778]  
network_alloc_rx_buffers+0x76/0x5f0 [xennet]
2013-07-03T23:51:27.679476+02:00 domUA kernel: [  108.671781]  
netif_poll+0xcf4/0xf30 [xennet]
2013-07-03T23:51:27.679478+02:00 domUA kernel: [  108.671783]  
net_rx_action+0xf0/0x2e0
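
As a cross-check of the first report above, here is a minimal sketch that
recomputes where the redzone of the reported object should sit and decodes
the bytes found there. All constants are copied from the log. The 8-byte
redzone length is taken from the reported range, and the helper names are
only illustrative.

```python
# Values copied from the first SLUB "Redzone overwritten" report in the log.
OBJECT_START = 0xffff8800f66064f8    # "INFO: Object 0xffff8800f66064f8 @offset=9464"
OBJECT_SIZE = 1024                   # the cache is kmalloc-1024
REDZONE_REPORTED = (0xffff8800f66068f8, 0xffff8800f66068ff)
REDZONE_BYTES = bytes.fromhex("0092ddff0088ffff")  # "Redzone ...: 00 92 dd ff 00 88 ff ff"


def redzone_range(object_start, object_size, redzone_len=8):
    """Return (first, last) address of the redzone that follows the object.

    redzone_len=8 is an assumption read off the reported address range.
    """
    first = object_start + object_size
    return first, first + redzone_len - 1


def as_le64(raw):
    """Interpret 8 raw bytes as a little-endian 64-bit value (x86-64 order)."""
    return int.from_bytes(raw, "little")


computed = redzone_range(OBJECT_START, OBJECT_SIZE)
overwrite = as_le64(REDZONE_BYTES)

print("computed redzone:", hex(computed[0]), "-", hex(computed[1]))
print("matches report:  ", computed == REDZONE_REPORTED)
print("redzone content: ", hex(overwrite))
```

The computed range agrees with the range in the report, so the damaged
bytes directly follow the 1024-byte object allocated in __alloc_skb. Read
as a little-endian value they give 0xffff8800ffdd9200, the same 8 bytes
that start the row at ffff8800f66068e8 just before the redzone. This may
hint that a pointer-sized field was written past the end of the object,
but that reading is an assumption and not something the log proves.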

I noticed that with all this debugging turned on, a real panic no
longer occurs.

The errors above occur while copying a file with scp from dom0 to the
guest (scp bigfile domu:/dev/null).

In my lab, I am currently experimenting with a SuperMicro based system
with Xen showing the following characteristics:

__  __            _  _    ____    _     _ ____     _   _ ____    _  ___ 
 \ \/ /___ _ __   | || |  |___ \  / |   / |___ \   / | / |___ \  / |/ _ \
  \  // _ \ '_ \  | || |_   __) | | |   | | __) |__| | | | __) | | | | | |
  /  \  __/ | | | |__   _| / __/ _| |   | |/ __/|__| |_| |/ __/ _| | |_| |
 /_/\_\___|_| |_|    |_|(_)_____(_)_|___|_|_____|  |_(_)_|_____(_)_|\___/
                                   |_____|                               
(XEN) Xen version 4.2.1_12-1.12.10 (abuild@) (gcc (SUSE Linux) 4.7.2
20130108 [gcc-4_7-branch revision 195012]) Wed May 29 20:31:49 UTC 2013
(XEN) Latest ChangeSet: 25952
(XEN) Bootloader: GNU GRUB 0.97
(XEN) Command line: dom0_mem=2048M,max:2048M loglvl=all guest_loglvl=all
(XEN) Video information:
(XEN)  VGA is text mode 80x25, font 8x16
(XEN)  VBE/DDC methods: V2; EDID transfer time: 1 seconds
(XEN) Disc information:
(XEN)  Found 4 MBR signatures
(XEN)  Found 4 EDD information structures
(XEN) Xen-e820 RAM map:
(XEN)  0000000000000000 - 0000000000096400 (usable)
(XEN)  0000000000096400 - 00000000000a0000 (reserved)
(XEN)  00000000000e0000 - 0000000000100000 (reserved)
(XEN)  0000000000100000 - 00000000bf780000 (usable)
(XEN)  00000000bf78e000 - 00000000bf790000 type 9
(XEN)  00000000bf790000 - 00000000bf79e000 (ACPI data)
(XEN)  00000000bf79e000 - 00000000bf7d0000 (ACPI NVS)
(XEN)  00000000bf7d0000 - 00000000bf7e0000 (reserved)
(XEN)  00000000bf7ec000 - 00000000c0000000 (reserved)
(XEN)  00000000e0000000 - 00000000f0000000 (reserved)
(XEN)  00000000fee00000 - 00000000fee01000 (reserved)
(XEN)  00000000ffc00000 - 0000000100000000 (reserved)
(XEN)  0000000100000000 - 0000000340000000 (usable)

Skipping ACPI and SRAT

(XEN) System RAM: 12279MB (12573784kB)

(XEN) NUMA: Allocated memnodemap from 33e38a000 - 33e38e000
(XEN) NUMA: Using 8 for the hash shift.
(XEN) Domain heap initialised DMA width 30 bits
(XEN) found SMP MP-table at 000ff780
(XEN) DMI present.

(XEN) Enabling APIC mode:  Phys.  Using 2 I/O APICs
(XEN) ACPI: HPET id: 0x8086a301 base: 0xfed00000
(XEN) Failed to get Error Log Address Range.
(XEN) Using ACPI (MADT) for SMP configuration information
(XEN) SMP: Allowing 24 CPUs (8 hotplug CPUs)
(XEN) IRQ limits: 48 GSI, 3040 MSI/MSI-X
(XEN) Using scheduler: SMP Credit Scheduler (credit)
(XEN) Detected 2400.115 MHz processor.
(XEN) Initing memory sharing.
(XEN) mce_intel.c:1238: MCA Capability: BCAST 1 SER 0 CMCI 1 firstbank 0
extended MCE MSR 0
(XEN) Intel machine check reporting enabled
(XEN) PCI: MCFG configuration 0: base e0000000 segment 0000 buses 00 - ff
(XEN) PCI: MCFG area at e0000000 reserved in E820
(XEN) PCI: Using MCFG for segment 0000 bus 00-ff
(XEN) Intel VT-d iommu 0 supported page sizes: 4kB.
(XEN) Intel VT-d Snoop Control enabled.
(XEN) Intel VT-d Dom0 DMA Passthrough not enabled.
(XEN) Intel VT-d Queued Invalidation enabled.
(XEN) Intel VT-d Interrupt Remapping enabled.
(XEN) Intel VT-d Shared EPT tables not enabled.
(XEN) I/O virtualisation enabled
(XEN)  - Dom0 mode: Relaxed
(XEN) Interrupt remapping enabled
(XEN) Enabled directed EOI with ioapic_ack_old on!
(XEN) ENABLING IO-APIC IRQs
(XEN) ..TIMER: vector=0xF0 apic1=0 pin1=2 apic2=-1 pin2=-1
(XEN) Platform timer is 14.318MHz HPET
(XEN) Allocated console ring of 128 KiB.
(XEN) VMX: Supported advanced features:
(XEN)  - APIC MMIO access virtualisation
(XEN)  - APIC TPR shadow
(XEN)  - Extended Page Tables (EPT)
(XEN)  - Virtual-Processor Identifiers (VPID)
(XEN)  - Virtual NMI
(XEN)  - MSR direct-access bitmap
(XEN)  - Unrestricted Guest
(XEN) HVM: ASIDs enabled.
(XEN) HVM: VMX enabled
(XEN) HVM: Hardware Assisted Paging (HAP) detected
(XEN) HVM: HAP page sizes: 4kB, 2MB, 1GB
(XEN) Brought up 16 CPUs
(XEN) ACPI sleep modes: S3
(XEN) mcheck_poll: Machine check polling timer started.
(XEN) *** LOADING DOMAIN 0 ***
(XEN)  Xen  kernel: 64-bit, lsb, compat32
(XEN)  Dom0 kernel: 64-bit, lsb, paddr 0x2000 -> 0xa65000
(XEN) PHYSICAL MEMORY ARRANGEMENT:
(XEN)  Dom0 alloc.:   0000000336000000->0000000337000000 (516915 pages
to be allocated)
(XEN)  Init. ramdisk: 000000033f333000->0000000340000000
(XEN) VIRTUAL MEMORY ARRANGEMENT:
(XEN)  Loaded kernel: ffffffff80002000->ffffffff80a65000
(XEN)  Init. ramdisk: 0000000000000000->0000000000000000
(XEN)  Phys-Mach map: ffffea0000000000->ffffea0000400000
(XEN)  Start info:    ffffffff80a65000->ffffffff80a654b4
(XEN)  Page tables:   ffffffff80a66000->ffffffff80a6f000
(XEN)  Boot stack:    ffffffff80a6f000->ffffffff80a70000
(XEN)  TOTAL:         ffffffff80000000->ffffffff80c00000
(XEN)  ENTRY ADDRESS: ffffffff80002000
(XEN) Dom0 has maximum 16 VCPUs
(XEN) Scrubbing Free RAM:
.....................................................................................................done.
(XEN) Initial low memory virq threshold set at 0x4000 pages.
(XEN) Std. Loglevel: All
(XEN) Guest Loglevel: All
(XEN) Xen is relinquishing VGA console.



(XEN) ACPI: RSDP 000FACE0, 0024 (r2 ACPIAM)
(XEN) ACPI: XSDT BF790100, 008C (r1 SMCI            20110827 MSFT       97)
(XEN) ACPI: FACP BF790290, 00F4 (r4 082711 FACP1638 20110827 MSFT       97)
(XEN) ACPI: DSDT BF7906A0, 6563 (r2  10600 10600000        0 INTL 20051117)
(XEN) ACPI: FACS BF79E000, 0040
(XEN) ACPI: APIC BF790390, 011E (r2 082711 APIC1638 20110827 MSFT       97)
(XEN) ACPI: MCFG BF7904B0, 003C (r1 082711 OEMMCFG  20110827 MSFT       97)
(XEN) ACPI: SLIT BF7904F0, 0030 (r1 082711 OEMSLIT  20110827 MSFT       97)
(XEN) ACPI: OEMB BF79E040, 0085 (r1 082711 OEMB1638 20110827 MSFT       97)
(XEN) ACPI: SRAT BF79A6A0, 01D0 (r2 082711 OEMSRAT         1 INTL        1)
(XEN) ACPI: HPET BF79A870, 0038 (r1 082711 OEMHPET  20110827 MSFT       97)
(XEN) ACPI: DMAR BF79E0D0, 0130 (r1    AMI  OEMDMAR        1 MSFT       97)
(XEN) ACPI: SSDT BF7A1B30, 0363 (r1 DpgPmm    CpuPm       12 INTL 20051117)
(XEN) ACPI: EINJ BF79A8B0, 0130 (r1  AMIER AMI_EINJ 20110827 MSFT       97)
(XEN) ACPI: BERT BF79AA40, 0030 (r1  AMIER AMI_BERT 20110827 MSFT       97)
(XEN) ACPI: ERST BF79AA70, 01B0 (r1  AMIER AMI_ERST 20110827 MSFT       97)
(XEN) ACPI: HEST BF79AC20, 00A8 (r1  AMIER ABC_HEST 20110827 MSFT       97)
(XEN) System RAM: 12279MB (12573784kB)
(XEN) SRAT: PXM 0 -> APIC 0 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 2 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 18 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 20 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 1 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 3 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 19 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 21 -> Node 0

I am happy to assist with further kernel probing. I can even set up
access to this machine for someone.

Best regards,

Dion Kant

On 05/17/2013 10:59 AM, Wei Liu wrote:
> Moving discussion to Xen-devel
>
> On Thu, May 16, 2013 at 10:29:56PM +0300, Eugene Istomin wrote:
>> Hello,
>>
>> I tried to use 3.9.2 kernel with xen 4.2.2/4.3rc1 and in both variants leads 
>> to this error in network-intensive load (such as iperf, 100 nginx parallel 
>> requests to 1M files and so on):
>>
> It would be more helpful if you can provide info on your configurations
> (Dom0 and DomU), your workload, how to reproduce the bug.
>
> I run iperf and NFS to test Xen network, but never see any crash like
> this myself.
>
>> BUG: unable to handle kernel paging request at ffff8800795a3000
>> [   60.246945] IP: [<ffffffffa001a75c>] netif_poll+0x49c/0xe80 [xennet]
>> [   60.246975] PGD a8a067 PUD a9a067 PMD 7fc27067 PTE 
>> 80100000795a3065
>> [   60.247004] Oops: 0003 [#1] SMP 
>> [   60.247020] Modules linked in: af_packet hwmon domctl crc32_pclmul 
>> crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw 
>> aes_x86_64 joydev xts gf128mul autofs4 scsi_dh_emc scsi_dh_alua 
>> scsi_dh_rdac scsi_dh_hp_sw scsi_dh xenblk cdrom xennet ata_generic 
>> ata_piix
>> [   60.247144] CPU 0 
>> [   60.247154] Pid: 0, comm: swapper/0 Not tainted 3.9.2-1.g04040b9-xen 
>> #1  
>> [   60.247179] RIP: e030:[<ffffffffa001a75c>]  [<ffffffffa001a75c>] 
>> netif_poll+0x49c/0xe80 [xennet]
>> ...
> Could you provide full stack trace? AFAICT there is no netif_poll in Xen
> netfront/back.
>
> Presumably this is Dom0 log? (from the domctl module)
>
>> We have couple of production hypervisors on 3.4 kernels with high-
>> throughput  internal network (VM-to-VM in one Dom0), iperf on them is 
>> working well:
>> [  3]  0.0- 2.0 sec  3357 MBytes  14080 Mbits/sec
>> [  3]  2.0- 4.0 sec  2880 MBytes  12077 Mbits/sec
>> [  3]  4.0- 6.0 sec  2909 MBytes  12202 Mbits/sec
>> [  3]  6.0- 8.0 sec  2552 MBytes  10702 Mbits/sec
>> [  3]  8.0-10.0 sec  3616 MBytes  15166 Mbits/sec
>> [  3] 10.0-12.0 sec  3415 MBytes  14324 Mbits/sec
>>
>>
>> Seems like a kernel bug, is this related to one of this fixes in linux-next 
>> or i need to create new bugreport?
>>
>> 1) 
>> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=1aaf6d3d3d1e95f4be07e32dd84aa1c93855fbbd
>> 2) 
>> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=9ecd1a75d977e2e8c48139c7d3efed183f898d94
>> 3) 
>> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=2810e5b9a7731ca5fce22bfbe12c96e16ac44b6f
>> 4) 
>> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=03393fd5cc2b6cdeec32b704ecba64dbb0feae3c
>> 5) 
>> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=59ccb4ebbc35e36a3c143f2d1355deb75c2e628f
>>
> I don't think these patches can fix your problem at first glance.
>
>
> Wei.
>
> _______________________________________________
> Xen-users mailing list
> Xen-users@xxxxxxxxxxxxx
> http://lists.xen.org/xen-users




 

