
Re: [Xen-devel] [PATCH RFC V2] xen/netback: Count ring slots properly when larger MTU sizes are used



On Thu, Aug 30, 2012 at 09:07:11AM +0100, Ian Campbell wrote:
> On Wed, 2012-08-29 at 13:21 +0100, Palagummi, Siva wrote:
> > This patch contains the modifications that are discussed in thread
> > http://lists.xen.org/archives/html/xen-devel/2012-08/msg01730.html

[...]

> > Instead of using max_required_rx_slots, I used the count that we
> > already have in hand to verify if we have enough room in the batch
> > queue for next skb. Please let me know if that is not appropriate.
> > Things worked fine in my environment. Under heavy load we now seem to
> > be consuming most of the slots in the queue, with no BUG_ON :-)
> 
> > From: Siva Palagummi <Siva.Palagummi@xxxxxx>
> > 
> > The count variable in xen_netbk_rx_action needs to be incremented
> > correctly to account for the extra slots required when skb_headlen
> > is greater than PAGE_SIZE, as happens with larger MTU values.
> > Without this change, BUG_ON(npo.meta_prod > ARRAY_SIZE(netbk->meta))
> > causes the netback thread to exit.
> > 
> > The fix is to stash the count already computed in
> > xen_netbk_count_skb_slots in skb_cb_overlay and use it directly in
> > xen_netbk_rx_action.

You don't need to stash the estimated value to use it in
xen_netbk_rx_action() - you have the actual number of slots consumed
in hand from the return value of netbk_gop_skb(). See below.

> > Also improved the check for filling the batch queue.
> > 
> > Also merged a change from a patch created for the
> > xen_netbk_count_skb_slots function, as per the thread
> > http://lists.xen.org/archives/html/xen-devel/2012-05/msg01864.html
> > 
> > The problem is seen with the Linux 3.2.2 kernel on an Intel 10Gbps
> > network.
> > 
> > 
> > Signed-off-by: Siva Palagummi <Siva.Palagummi@xxxxxx>
> > ---
> > diff -uprN a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> > --- a/drivers/net/xen-netback/netback.c	2012-01-25 19:39:32.000000000 -0500
> > +++ b/drivers/net/xen-netback/netback.c	2012-08-28 17:31:22.000000000 -0400
> > @@ -80,6 +80,11 @@ union page_ext {
> >     void *mapping;
> >  };
> >  
> > +struct skb_cb_overlay {
> > +   int meta_slots_used;
> > +   int count;
> > +};

We don't actually need a separate variable for the estimate. We could
rename meta_slots_used to meta_slots. It could hold the estimate
before netbk_gop_skb() is called and the actual number of slots
consumed after. That might be confusing, though, so maybe it's better
off left as two variables.
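
To illustrate, the single-variable version would look something like
this (just a sketch, not a requirement):

    struct skb_cb_overlay {
            /* estimate from xen_netbk_count_skb_slots(), later
             * overwritten with the actual number of slots that
             * netbk_gop_skb() consumed */
            int meta_slots;
    };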

> >  struct xen_netbk {
> >     wait_queue_head_t wq;
> >     struct task_struct *task;
> > @@ -324,9 +329,9 @@ unsigned int xen_netbk_count_skb_slots(s
> >  {
> >     unsigned int count;
> >     int i, copy_off;
> > +   struct skb_cb_overlay *sco;
> >  
> > -   count = DIV_ROUND_UP(
> > -                   offset_in_page(skb->data)+skb_headlen(skb), PAGE_SIZE);
> > +   count = DIV_ROUND_UP(skb_headlen(skb), PAGE_SIZE);
> 
> This hunk appears to be upstream already (see
> e26b203ede31fffd52571a5ba607a26c79dc5c0d). Which tree are you working
> against? You should either base patches on Linus' branch or on the
> net-next branch.
> 
> Other than this the patch looks good, thanks.

I don't think this patch completely addresses the problems in
calculating the number of slots required when large MTUs are used.

For example: suppose netback is handling an skb with a large linear
data area, say 8157 bytes, that begins 64 bytes from the start of a
page. Assume that GSO is disabled and there are no frags.
xen_netbk_count_skb_slots() will calculate that two slots are needed:

    count = DIV_ROUND_UP(skb_headlen(skb), PAGE_SIZE);

but netbk_gop_frag_copy() will actually use three. Let's walk through
the loop:

        data = skb->data;
        while (data < skb_tail_pointer(skb)) {
                unsigned int offset = offset_in_page(data);
                unsigned int len = PAGE_SIZE - offset;

                if (data + len > skb_tail_pointer(skb))
                        len = skb_tail_pointer(skb) - data;

                netbk_gop_frag_copy(vif, skb, npo,
                                    virt_to_page(data), len, offset, &head);
                data += len;
        }

The first pass will call netbk_gop_frag_copy() with len=4032,
offset=64, and head=1. After the call, head will be set to 0. Inside
netbk_gop_frag_copy(), start_new_rx_buffer() will be called with
offset=0, size=4032, head=1. We'll return false due to the checks for
"offset" and "!head".

The second pass will call netbk_gop_frag_copy() with len=4096,
offset=0, head=0. Inside, start_new_rx_buffer() will be called with
offset=4032, bytes=4096, head=0. We'll return true here, which we
shouldn't: it wastes the space left in the current buffer and pushes
the last bit of data into an extra slot.

The third pass will call netbk_gop_frag_copy() with len=29 and
offset=0. start_new_rx_buffer() will be called with offset=4096,
bytes=29, head=0. We'll start a new buffer for the last bit.

So you can see that we underestimate the buffers / meta slots required
to handle an skb with a large linear area, which is exactly what we
commonly see with large MTU sizes. This can lead to problems later on.
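
If it helps to see the arithmetic, here is a throwaway user-space
model of the walk above. It mimics the data loop plus the inner loop
of netbk_gop_frag_copy(), assumes PAGE_SIZE == MAX_BUFFER_OFFSET ==
4096, and only borrows the netback names - it's a sketch, not the
driver code:

    #include <stdio.h>
    #include <stdbool.h>

    #define PAGE_SIZE         4096u
    #define MAX_BUFFER_OFFSET PAGE_SIZE
    #define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

    static bool start_new_rx_buffer(unsigned offset, unsigned size,
                                    int head)
    {
            if (offset == MAX_BUFFER_OFFSET)
                    return true;
            if ((offset + size > MAX_BUFFER_OFFSET) &&
                (size <= MAX_BUFFER_OFFSET) && offset && !head)
                    return true;
            return false;
    }

    int main(void)
    {
            unsigned headlen = 8157, start = 64; /* offset in page */
            unsigned data = start, tail = start + headlen;
            unsigned copy_off = 0;
            unsigned slots = 1;  /* head meta slot, taken up front */
            int head = 1;

            while (data < tail) {
                    unsigned offset = data % PAGE_SIZE;
                    unsigned len = PAGE_SIZE - offset;
                    unsigned size, bytes;

                    if (data + len > tail)
                            len = tail - data;

                    /* inner loop of netbk_gop_frag_copy() */
                    for (size = len; size > 0; size -= bytes) {
                            bytes = PAGE_SIZE - offset;
                            if (bytes > size)
                                    bytes = size;
                            if (start_new_rx_buffer(copy_off, bytes,
                                                    head)) {
                                    slots++;   /* new meta slot */
                                    copy_off = 0;
                            }
                            if (copy_off + bytes > MAX_BUFFER_OFFSET)
                                    bytes = MAX_BUFFER_OFFSET - copy_off;
                            copy_off += bytes;
                            offset += bytes;
                            if (offset == PAGE_SIZE)
                                    offset = 0;
                            head = 0;
                    }
                    data += len;
            }

            printf("estimated %u slot(s), consumed %u\n",
                   DIV_ROUND_UP(headlen, PAGE_SIZE), slots);
            return 0;
    }

This prints "estimated 2 slot(s), consumed 3", matching the walk
above.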

I'm not as familiar with the new compound page frag handling code, but
I can imagine that the same problem could exist there. Since the frag
loop in xen_netbk_count_skb_slots() directly mirrors the code that
sets up the copy operations, though, at least the frag slots should be
estimated properly.

> >  
> >     copy_off = skb_headlen(skb) % PAGE_SIZE;
> >  
> > @@ -352,6 +357,8 @@ unsigned int xen_netbk_count_skb_slots(s
> >                     size -= bytes;
> >             }
> >     }
> > +   sco = (struct skb_cb_overlay *)skb->cb;
> > +   sco->count = count;
> >     return count;
> >  }
> >  
> > @@ -586,9 +593,6 @@ static void netbk_add_frag_responses(str
> >     }
> >  }
> >  
> > -struct skb_cb_overlay {
> > -   int meta_slots_used;
> > -};
> >  
> >  static void xen_netbk_rx_action(struct xen_netbk *netbk)
> >  {
> > @@ -621,12 +625,16 @@ static void xen_netbk_rx_action(struct x
> >             sco = (struct skb_cb_overlay *)skb->cb;
> >             sco->meta_slots_used = netbk_gop_skb(skb, &npo);
> >  
> > -           count += nr_frags + 1;
> > +           count += sco->count;

Why increment count by the /estimated/ count instead of the actual
number of slots used? We have the number of slots in the line just
above, in sco->meta_slots_used.
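
That is, the loop could simply do something like (sketch):

            sco = (struct skb_cb_overlay *)skb->cb;
            sco->meta_slots_used = netbk_gop_skb(skb, &npo);

            count += sco->meta_slots_used;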

> >             __skb_queue_tail(&rxq, skb);
> >  
> > +           skb = skb_peek(&netbk->rx_queue);
> > +           if (skb == NULL)
> > +                   break;
> > +           sco = (struct skb_cb_overlay *)skb->cb;
> >             /* Filled the batch queue? */
> > -           if (count + MAX_SKB_FRAGS >= XEN_NETIF_RX_RING_SIZE)
> > +           if (count + sco->count >= XEN_NETIF_RX_RING_SIZE)
> >                     break;
> >     }
> >  

This change I like.

We're working on a patch to improve buffer efficiency and fix the
miscalculation problem. Siva, I'd be happy to rebase and resubmit this
patch (with minor adjustments) as part of that work, unless you'd
rather handle it yourself.

Matt

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

