
Re: [Xen-devel] domU crash with kernel BUG at drivers/net/xen-netfront.c:305



On Thu, Jan 02, 2014 at 01:09:35PM +0800, annie li wrote:
> 
> On 2013/12/27 19:09, Vasily Evseenko wrote:
> >Hi,
> >
> >I've got domU crash (~ every 1-2 days under high network (tcp) load)
> >with message:
> >
> >-----
> >[2013-12-26 03:53:18] kernel BUG at drivers/net/xen-netfront.c:305!
> >[2013-12-26 03:53:18] invalid opcode: 0000 [#1] SMP
> >[2013-12-26 03:53:18] Modules linked in: ipt_REJECT iptable_filter
> >xt_set xt_REDIRECT iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4
> >nf_nat_ipv4 nf_nat ip_tables ip_set_hash_net ip_set_hash_ip ip_set
> >nfnetlink ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state
> >nf_conntrack ip6table_filter ip6_tables ipv6 ext3 jbd xen_netfront
> >coretemp hwmon crc32_pclmul crc32c_intel ghash_clmulni_intel
> >microcode pcspkr ext4 jbd2 mbcache aesni_intel ablk_helper cryptd
> >lrw gf128mul glue_helper aes_x86_64 xen_blkfront dm_mirror
> >dm_region_hash dm_log dm_mod
> >[2013-12-26 03:53:18] CPU: 0 PID: 15126 Comm: python Not tainted
> >3.10.25-11.x86_64 #1
> >[2013-12-26 03:53:18] task: ffff8801e5d68ac0 ti: ffff8801e7392000
> >task.ti: ffff8801e7392000
> >[2013-12-26 03:53:18] RIP: e030:[<ffffffffa015d637>]
> >[<ffffffffa015d637>] xennet_alloc_rx_buffers+0x347/0x360 [xen_netfront]
> >[2013-12-26 03:53:18] RSP: e02b:ffff8801f2e03ce0  EFLAGS: 00010282
> >[2013-12-26 03:53:18] RAX: 00000000000001d4 RBX: ffff8801e5438800 RCX:
> >0000000000000001
> >[2013-12-26 03:53:18] RDX: 000000000000002a RSI: 0000000000000000 RDI:
> >0000000000002200
> >[2013-12-26 03:53:18] RBP: ffff8801f2e03d40 R08: 0000000000000000 R09:
> >0000000000001000
> >[2013-12-26 03:53:18] R10: ffff8801000083c0 R11: dead000000200200 R12:
> >0000000000000220
> >[2013-12-26 03:53:18] R13: ffff8801e6eec0c0 R14: 000000000000002a R15:
> >000000000239642a
> >[2013-12-26 03:53:18] FS:  00007f4cf48d57e0(0000)
> >GS:ffff8801f2e00000(0000) knlGS:0000000000000000
> >[2013-12-26 03:53:18] CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
> >[2013-12-26 03:53:18] CR2: ffffffffff600400 CR3: 00000001e0db3000 CR4:
> >0000000000042660
> >[2013-12-26 03:53:18] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> >0000000000000000
> >[2013-12-26 03:53:18] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> >0000000000000400
> >[2013-12-26 03:53:18] Stack:
> >[2013-12-26 03:53:18]  ffff8801f2e03df0 02396417e5438000
> >ffff8801e5439d58 ffff8801e54394f0
> >[2013-12-26 03:53:18]  ffff8801e5438000 002affff00000013
> >ffff8801f2e03d40 ffff8801f2e03db0
> >[2013-12-26 03:53:18]  0000000000000010 ffff8800655e6ac0
> >ffff8801e5438800 ffff8801e511a000
> >[2013-12-26 03:53:18] Call Trace:
> >[2013-12-26 03:53:18]  <IRQ>
> >[2013-12-26 03:53:18]  [<ffffffffa015dc44>] xennet_poll+0x2f4/0x630
> >[xen_netfront]
> >[2013-12-26 03:53:18]  [<ffffffff810640a9>] ? raise_softirq_irqoff+0x9/0x50
> >[2013-12-26 03:53:18]  [<ffffffff8152050c>] ? dev_kfree_skb_irq+0x5c/0x70
> >[2013-12-26 03:53:18]  [<ffffffff810e4fb9>] ?
> >handle_irq_event_percpu+0xc9/0x210
> >[2013-12-26 03:53:18]  [<ffffffff81528022>] net_rx_action+0x112/0x290
> >[2013-12-26 03:53:18]  [<ffffffff810e514d>] ? handle_irq_event+0x4d/0x70
> >[2013-12-26 03:53:18]  [<ffffffff81063c97>] __do_softirq+0xf7/0x270
> >[2013-12-26 03:53:18]  [<ffffffff81600edc>] call_softirq+0x1c/0x30
> >[2013-12-26 03:53:18]  [<ffffffff81014505>] do_softirq+0x65/0xa0
> >[2013-12-26 03:53:18]  [<ffffffff810639c5>] irq_exit+0xc5/0xd0
> >[2013-12-26 03:53:18]  [<ffffffff81351e45>] xen_evtchn_do_upcall+0x35/0x50
> >[2013-12-26 03:53:18]  [<ffffffff81600f3e>]
> >xen_do_hypervisor_callback+0x1e/0x30
> >[2013-12-26 03:53:18]  <EOI>
> >[2013-12-26 03:53:18] Code: 8b 35 ee f9 bb e1 48 8d bb 08 0d 00 00 48 83
> >c6 64 e8 2e f2 f0 e0 8b 83 ec 0c 00 00 31 d2 89 c1 d1 e9 39 d1 76 9e e9
> >5a ff ff ff <0f> 0b eb fe 0f 0b 0f 1f 00 eb fb 66 66 66 66 66 2e 0f 1f
> >84 00
> >[2013-12-26 03:53:18] RIP  [<ffffffffa015d637>]
> >xennet_alloc_rx_buffers+0x347/0x360 [xen_netfront]
> >[2013-12-26 03:53:18]  RSP <ffff8801f2e03ce0>
> >------------
> >
> >The dom0 and domU kernels are vanilla 3.10.25.
> >The host server has 4 cores x 2 threads, mapped as: 4 - dom0,
> >2 - domU, 2 - domU.
> >I've tried Xen versions 4.2.3 and 4.3.1.
> >I've also tried disabling offloading on the domU (ethtool -K eth0
> >tx off tso off gso off) -- no effect.
> >
> >The domUs are under high TCP load (a lot of small TCP connections; web server).
> >Sometimes I get this on dom0:
> >---
> >[2013-12-26 00:16:30] (XEN) grant_table.c:289:d0 Increased maptrack size
> >to 2 frames
> >[2013-12-26 03:53:18] (XEN) grant_table.c:1858:d0 Bad grant reference
> >99221507
> >[2013-12-26 03:53:18] (XEN) grant_table.c:1858:d0 Bad grant reference
> >43646979
> >[2013-12-26 03:53:18] (XEN) grant_table.c:1858:d0 Bad grant reference
> >43646979
> >[2013-12-26 03:53:18] (XEN) grant_table.c:1858:d0 Bad grant reference
> >99221507
> >[2013-12-26 06:15:14] (XEN) grant_table.c:1858:d0 Bad grant reference
> >43646979
> >[2013-12-26 06:15:14] (XEN) grant_table.c:1858:d0 Bad grant reference
> >99221507
> >[2013-12-26 06:15:14] (XEN) grant_table.c:1858:d0 Bad grant reference
> >99221507
> >[2013-12-26 06:15:14] (XEN) grant_table.c:1858:d0 Bad grant reference
> >99221507
> >
> >---
> >
> >It seems the root of the problem is in the dom0 messages above. Is it a
> >hardware failure, or an overflow of some internal kernel structure?
> From the stack, this looks very likely to be the same issue as one
> which has already been fixed. There was a bug in how netback counted
> slots: responses could overlap requests in the ring, so grant copy
> picked up a wrong grant reference and threw the error you see from
> grant_table.c. See
> http://lists.xen.org/archives/html/xen-devel/2013-09/msg01143.html
> There was some back-and-forth work on this issue, but the fix has
> been in the tree since v3.12-rc4. Would you like to try a newer
> kernel version?
> 

If that patch fixes the bug, it sounds like it needs to be backported
to at least 3.10.x as well..

-- Pasi


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
