Re: [Xen-devel] [PATCH] xen-netback: fix occasional leak of grant ref mappings under memory pressure
> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxxx] On Behalf Of Paul Durrant
> Sent: 28 February 2019 11:22
> To: Wei Liu <wei.liu2@xxxxxxxxxx>
> Cc: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>; Wei Liu <wei.liu2@xxxxxxxxxx>; netdev@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxxx; davem@xxxxxxxxxxxxx
> Subject: Re: [Xen-devel] [PATCH] xen-netback: fix occasional leak of grant ref mappings under memory pressure
>
> > -----Original Message-----
> > From: Wei Liu [mailto:wei.liu2@xxxxxxxxxx]
> > Sent: 28 February 2019 11:02
> > To: Paul Durrant <Paul.Durrant@xxxxxxxxxx>
> > Cc: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>; xen-devel@xxxxxxxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Wei Liu <wei.liu2@xxxxxxxxxx>; davem@xxxxxxxxxxxxx
> > Subject: Re: [PATCH] xen-netback: fix occasional leak of grant ref mappings under memory pressure
> >
> > On Thu, Feb 28, 2019 at 09:46:57AM +0000, Paul Durrant wrote:
> > > > -----Original Message-----
> > > > From: Igor Druzhinin [mailto:igor.druzhinin@xxxxxxxxxx]
> > > > Sent: 28 February 2019 02:03
> > > > To: xen-devel@xxxxxxxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> > > > Cc: Wei Liu <wei.liu2@xxxxxxxxxx>; Paul Durrant <Paul.Durrant@xxxxxxxxxx>; davem@xxxxxxxxxxxxx; Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>
> > > > Subject: [PATCH] xen-netback: fix occasional leak of grant ref mappings under memory pressure
> > > >
> > > > Zero-copy callback flag is not yet set on frag list skb at the moment
> > > > xenvif_handle_frag_list() returns -ENOMEM. This eventually results in
> > > > leaking grant ref mappings since xenvif_zerocopy_callback() is never
> > > > called for these fragments. Those eventually build up and cause Xen
> > > > to kill Dom0 as the slots get reused for new mappings.
> > > >
> > > > That behavior is observed under certain workloads where sudden spikes
> > > > of page cache usage for writes coexist with active atomic skb
> > > > allocations.
> > > >
> > > > Signed-off-by: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>
> > > > ---
> > > >  drivers/net/xen-netback/netback.c | 3 +++
> > > >  1 file changed, 3 insertions(+)
> > > >
> > > > diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> > > > index 80aae3a..2023317 100644
> > > > --- a/drivers/net/xen-netback/netback.c
> > > > +++ b/drivers/net/xen-netback/netback.c
> > > > @@ -1146,9 +1146,12 @@ static int xenvif_tx_submit(struct xenvif_queue *queue)
> > > >
> > > >  		if (unlikely(skb_has_frag_list(skb))) {
> > > >  			if (xenvif_handle_frag_list(queue, skb)) {
> > > > +				struct sk_buff *nskb =
> > > > +						skb_shinfo(skb)->frag_list;
> > > >  				if (net_ratelimit())
> > > >  					netdev_err(queue->vif->dev,
> > > >  						   "Not enough memory to consolidate frag_list!\n");
> > > > +				xenvif_skb_zerocopy_prepare(queue, nskb);
> > > >  				xenvif_skb_zerocopy_prepare(queue, skb);
> > > >  				kfree_skb(skb);
> > > >  				continue;
> > >
> > > Whilst this fix will do the job, I think it would be better to get rid
> > > of the kfree_skb() from inside xenvif_handle_frag_list() and always
> > > deal with it here rather than having it happen in two different places.
> > > Something like the following...
> >
> > +1 for having only one place.
> > >
> > > ---8<---
> > > diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> > > index 80aae3a32c2a..093c7b860772 100644
> > > --- a/drivers/net/xen-netback/netback.c
> > > +++ b/drivers/net/xen-netback/netback.c
> > > @@ -1027,13 +1027,13 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue,
> > >  /* Consolidate skb with a frag_list into a brand new one with local pages on
> > >   * frags. Returns 0 or -ENOMEM if can't allocate new pages.
> > >   */
> > > -static int xenvif_handle_frag_list(struct xenvif_queue *queue, struct sk_buff *skb)
> > > +static int xenvif_handle_frag_list(struct xenvif_queue *queue, struct sk_buff *skb,
> > > +				   struct sk_buff *nskb)
> > >  {
> > >  	unsigned int offset = skb_headlen(skb);
> > >  	skb_frag_t frags[MAX_SKB_FRAGS];
> > >  	int i, f;
> > >  	struct ubuf_info *uarg;
> > > -	struct sk_buff *nskb = skb_shinfo(skb)->frag_list;
> > >
> > >  	queue->stats.tx_zerocopy_sent += 2;
> > >  	queue->stats.tx_frag_overflow++;
> > > @@ -1072,11 +1072,6 @@ static int xenvif_handle_frag_list(struct xenvif_queue *queue, struct sk_buff *s
> > >  		skb_frag_size_set(&frags[i], len);
> > >  	}
> > >
> > > -	/* Copied all the bits from the frag list -- free it. */
> > > -	skb_frag_list_init(skb);
> > > -	xenvif_skb_zerocopy_prepare(queue, nskb);
> > > -	kfree_skb(nskb);
> > > -
> > >  	/* Release all the original (foreign) frags. */
> > >  	for (f = 0; f < skb_shinfo(skb)->nr_frags; f++)
> > >  		skb_frag_unref(skb, f);
> > > @@ -1145,7 +1140,11 @@ static int xenvif_tx_submit(struct xenvif_queue *queue)
> > >  		xenvif_fill_frags(queue, skb);
> > >
> > >  		if (unlikely(skb_has_frag_list(skb))) {
> > > -			if (xenvif_handle_frag_list(queue, skb)) {
> > > +			struct sk_buff *nskb = skb_shinfo(skb)->frag_list;
> > > +
> > > +			xenvif_skb_zerocopy_prepare(queue, nskb);
> > > +
> > > +			if (xenvif_handle_frag_list(queue, skb, nskb)) {
> > >  				if (net_ratelimit())
> > >  					netdev_err(queue->vif->dev,
> > >  						   "Not enough memory to consolidate frag_list!\n");
> > > @@ -1153,6 +1152,10 @@ static int xenvif_tx_submit(struct xenvif_queue *queue)
> > >  				kfree_skb(skb);
> > >  				continue;
> > >  			}
> > > +
> > > +			/* Copied all the bits from the frag list. */
> > > +			skb_frag_list_init(skb);
> > > +			kfree(nskb);
> >
> > I think you want kfree_skb here?
>
> No. nskb is the frag list... it is unlinked from skb by the call to
> skb_frag_list_init() and then it can be freed on its own. The skb is what
> we need to retain, because that now contains all the data.

Sorry, I misread/misunderstood what you were getting at. Yes, I meant
kfree_skb(nskb).

  Paul

> Cheers,
>
>   Paul
>
> > Wei.

> > > 		}
> > >
> > > 		skb->dev = queue->vif->dev;
> > > ---8<---
> > >
> > > What do you think?
> > >
> > > Paul
> > >
> > > > --
> > > > 2.7.4
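Editor's note: the leak in Igor's commit message above comes down to xenvif_zerocopy_callback() only ever running for skbs that were put into zero-copy state via xenvif_skb_zerocopy_prepare(); an error path that frees the frag list skb without arming it first drops its grant ref mappings on the floor. The toy C program below models that lifecycle under stated assumptions -- fake_skb, arm_zerocopy() and mappings_in_use are invented stand-ins for illustration, not netback code:

  #include <stdbool.h>
  #include <stdio.h>
  #include <stdlib.h>

  static int mappings_in_use;              /* stand-in for outstanding grant refs */

  struct fake_skb {                        /* stand-in for struct sk_buff */
          bool zerocopy_armed;             /* stand-in for the zero-copy flag */
          struct fake_skb *frag_list;
  };

  /* models xenvif_skb_zerocopy_prepare(): mark the skb for zero-copy release */
  static void arm_zerocopy(struct fake_skb *skb)
  {
          skb->zerocopy_armed = true;
  }

  /* models kfree_skb(): the release "callback" only fires for armed skbs */
  static void free_skb(struct fake_skb *skb)
  {
          if (!skb)
                  return;
          free_skb(skb->frag_list);        /* a linked frag list is freed too */
          if (skb->zerocopy_armed)
                  mappings_in_use--;       /* callback runs, mapping released */
          /* if never armed, the mapping is simply forgotten: the leak */
          free(skb);
  }

  int main(void)
  {
          struct fake_skb *skb = calloc(1, sizeof(*skb));
          struct fake_skb *nskb = calloc(1, sizeof(*nskb));

          skb->frag_list = nskb;
          mappings_in_use = 2;             /* one mapping behind each skb */

          arm_zerocopy(skb);               /* the pre-fix error path armed only skb... */
          arm_zerocopy(nskb);              /* ...this line mirrors the one-line fix */
          free_skb(skb);

          printf("leaked mappings: %d\n", mappings_in_use); /* 0 with the fix, 1 without */
          return 0;
  }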
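Editor's note: on the closing kfree()/kfree_skb() exchange -- skb_frag_list_init() does nothing more than clear the parent's frag_list pointer, which is why, once the data has been consolidated into skb, nskb can be released on its own; kfree_skb() is still the right call (plain kfree() would skip the skb's own destructor and fragment cleanup). A minimal sketch of that ordering, again with illustrative toy_* names rather than kernel code:

  #include <assert.h>
  #include <stdlib.h>

  struct toy_skb {
          struct toy_skb *frag_list;
          char *data;
  };

  /* models skb_frag_list_init(): just severs the parent -> child link */
  static void toy_frag_list_init(struct toy_skb *skb)
  {
          skb->frag_list = NULL;
  }

  /* models kfree_skb(): releases the skb's resources, not just the struct */
  static void toy_free_skb(struct toy_skb *skb)
  {
          if (!skb)
                  return;
          toy_free_skb(skb->frag_list);    /* would also free a still-linked child... */
          free(skb->data);                 /* ...and the skb's own payload */
          free(skb);
  }

  int main(void)
  {
          struct toy_skb *skb = calloc(1, sizeof(*skb));
          struct toy_skb *nskb = calloc(1, sizeof(*nskb));

          skb->frag_list = nskb;

          /* After consolidation, all the data lives in skb, so: */
          toy_frag_list_init(skb);         /* 1. unlink nskb from skb */
          toy_free_skb(nskb);              /* 2. free the frag list on its own */

          assert(skb->frag_list == NULL);  /* skb can no longer reach the freed nskb */
          toy_free_skb(skb);               /* skb, now holding all the data, freed later */
          return 0;
  }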