
Re: [Xen-devel] why we need call xennet_alloc_rx_buffers() in xennet_poll() in Netfront module?



On Mon, Jan 7, 2013 at 8:31 AM, Konrad Rzeszutek Wilk
<konrad.wilk@xxxxxxxxxx> wrote:
>
> On Tue, Nov 27, 2012 at 11:42:11PM -0500, David Xu wrote:
> > Hi all,
> >
> > Why do we call the xennet_alloc_rx_buffers function in xennet_poll of netfront?
>
> That is an easy answer - we need to allocate receive buffers :-)
> >
> > When I run the iperf benchmark to measure the TCP throughput between a
> > physical machine and a VM, the TCP server, which is a Xen VM, crashed.
> >
> > I traced the source code and found that the error comes from xennet_poll
> > => xennet_alloc_rx_buffers => __skb_dequeue => __skb_unlink
>
>
> Hm, that is rather strange that you would hit this. Is this problem
> still present with the latest released kernel?

Hi, I am seeing a crash in xennet_poll => xennet_alloc_rx_buffers on an
old 2.6.27 kernel under somewhat similar circumstances (iperf in a
domU). Was a specific root cause ever determined for the crash reported
below, or are there specific patches in the latest kernels that are
expected to fix it and that I could look at for back-porting?

> >
> > static inline struct sk_buff *__skb_dequeue(struct sk_buff_head *list)
> > {
> >         struct sk_buff *skb = skb_peek(list);
> >         if (skb)
> >                 __skb_unlink(skb, list);
> >         return skb;
> > }
> >
> > error is from __skb_unlink:
> >
> > static inline void __skb_unlink(struct sk_buff *skb, struct sk_buff_head *list)
> > {
> >         struct sk_buff *next, *prev;
> >
> >         list->qlen--;
> >         next       = skb->next;
> >         prev       = skb->prev;
> >         skb->next  = skb->prev = NULL;
> >         next->prev = prev;
> >         prev->next = next;
> > }
> >
> > At the line "next->prev = prev;" I found that the pointer "next" is NULL.
> >
> > Do you know why? Thanks.
> >
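
A note on why "next" should not normally be NULL here. As far as I
understand it, sk_buff_head is a circular doubly linked list in which
the head itself acts as the sentinel node, so every skb that is really
on the queue has non-NULL next and prev (they point at a neighbour or
back at the head). __skb_unlink() clears both pointers on the skb it
removes, so a NULL next suggests the skb had already been unlinked, or
that the queue was modified concurrently. That is an inference, not a
confirmed root cause. The user-space sketch below models the idea with
simplified stand-in types; struct skb, struct skb_head, the embedded
anchor node, and checked_unlink() are hypothetical names used for
illustration and are not kernel API.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct sk_buff: only the link fields are kept. */
struct skb {
	struct skb *next;
	struct skb *prev;
};

/*
 * Simplified stand-in for struct sk_buff_head. The kernel casts the head
 * to an skb pointer; here an embedded anchor node plays the sentinel role
 * so the sketch stays within well-defined C.
 */
struct skb_head {
	struct skb anchor;
	unsigned int qlen;
};

/* An empty queue is a sentinel that points at itself in both directions. */
static void head_init(struct skb_head *list)
{
	list->anchor.next = list->anchor.prev = &list->anchor;
	list->qlen = 0;
}

/* Append skb just before the sentinel, i.e. at the tail of the queue. */
static void queue_tail(struct skb_head *list, struct skb *skb)
{
	struct skb *last = list->anchor.prev;

	skb->next = &list->anchor;
	skb->prev = last;
	last->next = skb;
	list->anchor.prev = skb;
	list->qlen++;
}

/*
 * Hypothetical checked variant of the unlink step: it refuses to touch
 * an skb whose links are already NULL instead of dereferencing them.
 * Returns 0 on success, -1 if the skb does not appear to be queued.
 */
static int checked_unlink(struct skb_head *list, struct skb *skb)
{
	struct skb *next = skb->next;
	struct skb *prev = skb->prev;

	if (!next || !prev)
		return -1;

	list->qlen--;
	skb->next = skb->prev = NULL;
	next->prev = prev;
	prev->next = next;
	return 0;
}
```

With two buffers queued, the first checked_unlink() on a buffer returns
0, and a second call on the same buffer returns -1 rather than
faulting. The unchecked kernel version would write through the NULL
next pointer at that point.
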
> > [  100.973027] BUG: unable to handle kernel NULL pointer dereference at
> > 0000000000000008
> > [  100.973040] IP: [<ffffffff81455f16>] xennet_alloc_rx_buffers+0x166/0x350
> > [  100.973050] PGD 1cc98067 PUD 1d74c067 PMD 0
> > [  100.973051] Oops: 0002 [#1] SMP
> > [  100.973051] CPU 1
> > [  100.973051] Modules linked in:
> > [  100.973051]
> > [  100.973051] Pid: 9, comm: ksoftirqd/1 Not tainted 3.2.23 #131
> > [  100.973051] RIP: e030:[<ffffffff81455f16>]  [<ffffffff81455f16>]
> > xennet_alloc_rx_buffers+0x166/0x350
> > [  100.973051] RSP: e02b:ffff88001e8f1c10  EFLAGS: 00010206
> > [  100.973051] RAX: 0000000000000000 RBX: ffff88001da98000 RCX:
> > 00000000000012b0
> > [  100.973051] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
> > 0000000000000256
> > [  100.973051] RBP: ffff88001e8f1c60 R08: ffffc90000000000 R09:
> > 0000000000017a41
> > [  100.973051] R10: 0000000000000002 R11: 0000000000017298 R12:
> > ffff880019a7b700
> > [  100.973051] R13: 0000000000000256 R14: 0000000000012092 R15:
> > 0000000000000092
> > [  100.973051] FS:  00007f7ace91f700(0000) GS:ffff88001fd00000(0000)
> > knlGS:0000000000000000
> > [  100.973051] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> > [  100.973051] CR2: 0000000000000008 CR3: 000000001da64000 CR4:
> > 0000000000002660
> > [  100.973051] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> > 0000000000000000
> > [  100.973051] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> > 0000000000000400
> > [  100.973051] Process ksoftirqd/1 (pid: 9, threadinfo ffff88001e8f0000,
> > task ffff88001e8d7000)
> > [  100.973051] Stack:
> > [  100.973051]  0000000000000091 ffff88001da99c80 ffff88001da99400
> > 0000000100012091
> > [  100.973051]  ffff88001e8f1c60 000000000000002d ffff880019a8ac4e
> > ffff88001fd1a590
> > [  100.973051]  0000160000000000 ffff880000000000 ffff88001e8f1db0
> > ffffffff8145699a
> > [  100.973051] Call Trace:
> > [  100.973051]  [<ffffffff8145699a>] xennet_poll+0x7ca/0xe80
> > [  100.973051]  [<ffffffff814e3e51>] net_rx_action+0x151/0x2b0
> > [  100.973051]  [<ffffffff8106090d>] __do_softirq+0xbd/0x250
> > [  100.973051]  [<ffffffff81060b67>] run_ksoftirqd+0xc7/0x170
> > [  100.973051]  [<ffffffff81060aa0>] ? __do_softirq+0x250/0x250
> > [  100.973051]  [<ffffffff8107b0ac>] kthread+0x8c/0xa0
> > [  100.973051]  [<ffffffff8167ca04>] kernel_thread_helper+0x4/0x10
> > [  100.973051]  [<ffffffff81672d21>] ? retint_restore_args+0x13/0x13
> > [  100.973051]  [<ffffffff8167ca00>] ? gs_change+0x13/0x13
> > [  100.973051] Code: 0f 84 19 01 00 00 83 ab 10 14 00 00 01 45 0f b6 fe 49
> > 8b 14 24 49 8b 44 24 08 49 c7 04 24 00 00 00 00 49 c7 44 24 08 00 00 00 00
> > <48> 89 42 08 48 89 10 41 0f b6 d7 49 89 5c 24 20 48 8d 82 b8 01
> > [  100.973051] RIP  [<ffffffff81455f16>] xennet_alloc_rx_buffers+0x166/0x350
> > [  100.973051]  RSP <ffff88001e8f1c10>
> > [  100.973051] CR2: 0000000000000008
> > [  100.973259] ---[ end trace b0530821c3527d70 ]---
> > [  100.973263] Kernel panic - not syncing: Fatal exception in interrupt
> > [  100.973267] Pid: 9, comm: ksoftirqd/1 Tainted: G      D      3.2.23 #131
> > [  100.973270] Call Trace:
> > [  100.973273]  [<ffffffff816674ae>] panic+0x91/0x1a2
> > [  100.973278]  [<ffffffff8100adb2>] ? check_events+0x12/0x20
> > [  100.973282]  [<ffffffff81673b0a>] oops_end+0xea/0xf0
> > [  100.973286]  [<ffffffff81666e6b>] no_context+0x214/0x223
> > [  100.973291]  [<ffffffff8113cf94>] ? kmem_cache_free+0x104/0x110
> > [  100.973295]  [<ffffffff8166704b>] __bad_area_nosemaphore+0x1d1/0x1f0
> > [  100.973299]  [<ffffffff8166707d>] bad_area_nosemaphore+0x13/0x15
> > [  100.973304]  [<ffffffff816763fb>] do_page_fault+0x35b/0x4f0
> > [  100.973308]  [<ffffffff814d6044>] ? __netdev_alloc_skb+0x24/0x50
> > [  100.973313]  [<ffffffff8129f75a>] ? trace_hardirqs_off_thunk+0x3a/0x6c
> > [  100.973318]  [<ffffffff81672fa5>] page_fault+0x25/0x30
> > [  100.973322]  [<ffffffff81455f16>] ? xennet_alloc_rx_buffers+0x166/0x350
> > [  100.973326]  [<ffffffff8145699a>] xennet_poll+0x7ca/0xe80
> > [  100.973330]  [<ffffffff814e3e51>] net_rx_action+0x151/0x2b0
> > [  100.973334]  [<ffffffff8106090d>] __do_softirq+0xbd/0x250
> > [  100.973338]  [<ffffffff81060b67>] run_ksoftirqd+0xc7/0x170
> > [  100.973342]  [<ffffffff81060aa0>] ? __do_softirq+0x250/0x250
> > [  100.973346]  [<ffffffff8107b0ac>] kthread+0x8c/0xa0
> > [  100.973350]  [<ffffffff8167ca04>] kernel_thread_helper+0x4/0x10
> > [  100.973354]  [<ffffffff81672d21>] ? retint_restore_args+0x13/0x13
> > [  100.973358]  [<ffffffff8167ca00>] ? gs_change+0x13/0x13
> >
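
A note on the fault address in the oops above. CR2 is
0000000000000008, and the bytes at the faulting instruction
(<48> 89 42 08) look like a 64-bit store of RAX to 8(%rdx), with
RDX = 0 in the register dump. That matches "next->prev = prev" with
next == NULL, because next and prev are the first two pointer members
of struct sk_buff, which puts prev at offset 8 on x86_64. RAX (the old
prev) is also 0, which would be consistent with an skb whose links had
already been cleared, though that is only my reading of the dump. The
small sketch below does the address arithmetic; struct skb_links and
prev_store_address() are hypothetical names for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the first two members of struct sk_buff. */
struct skb_links {
	struct skb_links *next;
	struct skb_links *prev;
};

/*
 * Address touched by the store "next->prev = prev" for a given value
 * of next. No memory is accessed; this is only the arithmetic.
 */
static uintptr_t prev_store_address(uintptr_t next)
{
	return next + offsetof(struct skb_links, prev);
}
```

On a 64-bit build prev_store_address(0) evaluates to 8, the same value
that the oops reports in CR2.
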
> > Regards,
> > Cong
>
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@xxxxxxxxxxxxx
> > http://lists.xen.org/xen-devel