
Re: [Xen-devel] RFH: Kernel OOPS in xen_netbk_rx_action / xenvif_gop_skb



On Wed, Jun 18, 2014 at 06:48:31PM +0200, Philipp Hahn wrote:
[...]
> 
> (gdb) list *(xen_netbk_rx_action+0x18b)
> 0xffffffffa04287dc is in xen_netbk_rx_action
> (/var/build/temp/tmp.hW3dNilayw/pbuilder/linux-3.10.11/drivers/net/xen-netback/netback.c:611).
> 606                     meta->gso_size = skb_shinfo(skb)->gso_size;
> 607             else
> 608                     meta->gso_size = 0;
> 609
> 610             meta->size = 0;
> 611             meta->id = req->id;
> 612             npo->copy_off = 0;
> 613             npo->copy_gref = req->gref;
> 614
> 615             data = skb->data;
> 
> 
> After more debugging today I think something like this happens:
> 
> 1. The VM is receiving packets through bonding + bridge + netback +
> netfront.
> 
> 2. For some unknown reason at least one packet remains in the rx queue
> and is not delivered to the domU immediately by netback.
> 
> 3. The VM finishes shutting down.
> 
> 4. The shared ring between dom0 and domU is freed.
> 
> 5. then xen-netback continues processing the pending requests and tries
> to put the packet into the now already released shared ring.
> 
> 
> From reading the attached disassembly I guess that
>  AX = &meta
>  CX = &rx->sring
>  DX =~ rx.req_cons
>  CR2 = &req->id
> where
>  CX + DX * sizeof(union struct xen_netif_rx_{request,response})=8 = CR2
> 
> 
> Any additional ideas or insight is appreciated.
> 
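
The arithmetic in the register guess above can be checked in user space.
This is only a minimal sketch under stated assumptions: the types below are
simplified stand-ins rather than the kernel headers, the names rx_request,
rx_response, rx_slot, RX_RING_SIZE and slot_id_addr are hypothetical, and it
assumes CX holds the base of the slot array and DX the masked consumer index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the rx ring slot types (assumption, not the
 * kernel definitions). Both members are 8 bytes on common ABIs. */
struct rx_request  { uint16_t id; uint32_t gref; };
struct rx_response { uint16_t id; uint16_t offset; uint16_t flags; int16_t status; };
union rx_slot { struct rx_request req; struct rx_response rsp; };

/* Assumed number of slots; must be a power of two for the mask to work. */
#define RX_RING_SIZE 256u

/* Address of slot[idx].req.id, given the base address of the slot array.
 * This mirrors "CX + DX * 8 = CR2" from the quoted analysis. */
static uintptr_t slot_id_addr(uintptr_t ring_base, uint32_t idx)
{
    assert(ring_base != 0);
    return ring_base
         + (uintptr_t)(idx & (RX_RING_SIZE - 1)) * sizeof(union rx_slot)
         + offsetof(struct rx_request, id);
}
```

If the faulting address in CR2 matches this computation for the observed
CX and DX, that supports the reading that the fault is the load of req->id
from a ring page that is no longer mapped.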

I think your analysis makes sense. Netback does have its own internal queue,
and the kthread can certainly be scheduled away. There doesn't seem to be a
synchronisation point between a vif being disconnected and its internal queue
being processed. I attach a quick hack. If it works to some degree, then
we can try to work out a proper fix.
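
The idea behind the hack can be modelled outside the kernel. The following
is a rough user-space sketch, not the driver code: mock_vif, mock_pkt and
rx_action are hypothetical names, and the "delivered" counter merely stands
in for copying a packet into the shared ring.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins, not the kernel structures. */
struct mock_vif {
    bool mapped;    /* true while the shared ring is mapped */
    int  delivered; /* packets "copied" into the ring */
};

struct mock_pkt {
    struct mock_vif *vif; /* interface the packet is queued for */
};

/* Walk the internal queue. Packets whose vif no longer has a mapped ring
 * are dropped instead of being written to the ring.
 * Returns the number of dropped packets. */
static int rx_action(struct mock_pkt *queue, size_t n)
{
    int dropped = 0;

    for (size_t i = 0; i < n; i++) {
        struct mock_vif *vif = queue[i].vif;

        assert(vif != NULL);
        if (!vif->mapped) {
            dropped++;        /* ring is gone: do not touch it */
            continue;
        }
        vif->delivered++;     /* stands in for the copy into the ring */
    }
    return dropped;
}
```

Note that a plain flag like this only narrows the window. If the unmap runs
between the check and the copy, the same fault can still occur, which is why
a proper fix would need real synchronisation.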

> FYI: The host has only a single CPU and is running >=2 VMs so far.
> 
> >> There's one more patch that you can pick up from 3.10.y tree. I doubt it
> >> will make much difference though.
> 
> Which patch are you referring to?
> 

You can have a look at 3.10.y tree for all the patches between your
current version and the latest stable version.

Wei.

---8<---
From d2f428a93e6e296bc5f55e16f44ac1ad63a951a8 Mon Sep 17 00:00:00 2001
From: Wei Liu <wei.liu2@xxxxxxxxxx>
Date: Thu, 19 Jun 2014 15:07:47 +0100
Subject: [PATCH] quick hack

---
 drivers/net/xen-netback/common.h    |    1 +
 drivers/net/xen-netback/interface.c |    1 +
 drivers/net/xen-netback/netback.c   |    8 ++++++++
 3 files changed, 10 insertions(+)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index f2faa77..9239824 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -66,6 +66,7 @@ struct xenvif {
        /* The shared rings and indexes. */
        struct xen_netif_tx_back_ring tx;
        struct xen_netif_rx_back_ring rx;
+       bool mapped;
 
        /* Frontend feature information. */
        u8 can_sg:1;
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 540a796..5f11763 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -271,6 +271,7 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
        vif->dev = dev;
        INIT_LIST_HEAD(&vif->schedule_list);
        INIT_LIST_HEAD(&vif->notify_list);
+       vif->mapped = false;
 
        vif->credit_bytes = vif->remaining_credit = ~0UL;
        vif->credit_usec  = 0UL;
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 36efb41..f4f3693 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -720,6 +720,11 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk)
                vif = netdev_priv(skb->dev);
                nr_frags = skb_shinfo(skb)->nr_frags;
 
+               if (!vif->mapped) {
+                       dev_kfree_skb(skb);
+                       continue;
+               }
+
                sco = (struct skb_cb_overlay *)skb->cb;
                sco->meta_slots_used = netbk_gop_skb(skb, &npo);
 
@@ -1864,6 +1869,8 @@ static int xen_netbk_kthread(void *data)
 
 void xen_netbk_unmap_frontend_rings(struct xenvif *vif)
 {
+       vif->mapped = false;
+
        if (vif->tx.sring)
                xenbus_unmap_ring_vfree(xenvif_to_xenbus_device(vif),
                                        vif->tx.sring);
@@ -1899,6 +1906,7 @@ int xen_netbk_map_frontend_rings(struct xenvif *vif,
        BACK_RING_INIT(&vif->rx, rxs, PAGE_SIZE);
 
        vif->rx_req_cons_peek = 0;
+       vif->mapped = true;
 
        return 0;
 
-- 
1.7.10.4



