Re: [Xen-devel] netback: Delayed copy alternative

To: Ian Campbell <Ian.Campbell@xxxxxxxxxx>
From: Zoltan Kiss <zoltan.kiss@xxxxxxxxxx>
Date: Wed, 20 Nov 2013 12:28:48 +0000
Cc: Wei Liu <wei.liu2@xxxxxxxxxx>, Jonathan Davies <Jonathan.Davies@xxxxxxxxxxxxx>, Paul Durrant <Paul.Durrant@xxxxxxxxxx>, David Vrabel <david.vrabel@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Malcolm Crossley <malcolm.crossley@xxxxxxxxxx>
Delivery-date: Wed, 20 Nov 2013 12:28:59 +0000
List-id: Xen developer discussion <xen-devel.lists.xen.org>

On 20/11/13 11:16, Ian Campbell wrote:

On Tue, 2013-11-19 at 16:42 +0000, Zoltan Kiss wrote:

After further discussions and investigations, it seems to be a viable
approach to drop the packets in the RX path of another VIF after a
timeout,


Since RX/TX in netback is a bit confusing (since it is inverted, but you
don't seem to be using it that way): A diagram:

I meant "RX path" to be the Dom0->DomU path.

domU (netfront) --> dom0 (netback) --> network stack --> bridge ,
                                                                 |
domU' (netfront) <- timeout & drop <- dom0 (netback) <- stack <-'

You are proposing dropping at "timeout & drop". Since the dom0->domU'
path is based on copying there should be no problem with an skb getting
stuck with domU' holding on to it. In effect you will be dropping
traffic from some internal queue before it hits the shared ring anyway.
You will be making sure that either the full skb fits on the ring or it
remains in the queue.

I propose to drop from the qdisc queue, as happens in the classic
kernel. I actually reimplemented the wake_queue timer.

When the stack calls xenvif_start_xmit, it checks whether the packet
fits into the ring, and drops it if not. If it fits, we are good, as it
will be copied shortly. If not, dropping is also good for us, because we
get back the pages.

Next, start_xmit checks whether there is room in the ring for a
max-sized packet, and stops queueing if there is not. That's bad,
because packets gather in qdisc indefinitely if the ring doesn't move.
The wake_queue timer is therefore set, and when it fires, it wakes the
queue. Then qdisc starts calling start_xmit again, and as the ring is
full, it drops the packet. And because we don't stop queueing when we
drop, it will keep calling start_xmit and dropping packets, eventually
draining qdisc.
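
To make the sequence above easier to follow, here is a toy user-space model of it. It is only a sketch and not netback code: ToyVif, run_qdisc, demo, MAX_PKT_SLOTS and all the slot counts are hypothetical names and values picked for illustration, and the guest is assumed never to consume its ring.

```python
from collections import deque

MAX_PKT_SLOTS = 4  # hypothetical worst-case ring slots for one packet


class ToyVif:
    """Toy stand-in for a vif whose guest never consumes its ring."""

    def __init__(self, ring_slots):
        self.free_slots = ring_slots
        self.queue_stopped = False
        self.accepted = 0
        self.dropped = 0

    def start_xmit(self, pkt_slots):
        # Packet does not fit: drop it, and do NOT stop the queue.
        if pkt_slots > self.free_slots:
            self.dropped += 1
            return
        # Packet fits: it takes ring slots and would be copied shortly.
        self.free_slots -= pkt_slots
        self.accepted += 1
        # No room left for a max-sized packet: stop queueing.
        if self.free_slots < MAX_PKT_SLOTS:
            self.queue_stopped = True  # the wake timer would be armed here

    def wake_timer_fired(self):
        # The timer only wakes the queue; it does not free any ring slots.
        self.queue_stopped = False


def run_qdisc(vif, qdisc):
    # The qdisc keeps handing packets over while the queue is awake.
    while qdisc and not vif.queue_stopped:
        vif.start_xmit(qdisc.popleft())


def demo():
    vif = ToyVif(ring_slots=8)
    qdisc = deque([3] * 6)          # six packets of three slots each
    run_qdisc(vif, qdisc)           # runs until the queue gets stopped
    backlog_while_stopped = len(qdisc)
    vif.wake_timer_fired()          # timeout: wake the queue again
    run_qdisc(vif, qdisc)           # ring is still full, so packets drop
    return vif.accepted, vif.dropped, backlog_while_stopped, len(qdisc)
```

In this model the backlog sits in the qdisc only while the queue is stopped; once the timer wakes it, every remaining packet is dropped because start_xmit never stops the queue on the drop path.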

What about any queueing which occurs in "network stack" (either
instance) or "bridge?" How can you cancel an skb out of those? Are you
intending that by dropping packets at "timeout & drop" they would
eventually make their way to the second netback and be droppable? How
convinced are you that this is viable?

I don't know if the stack does too much queueing on that path apart from
qdisc, I assume not. But even if it does, eventually those queues will
advance when the qdisc one gets drained out.

and don't care about the rest of the cases (packets get stuck
somewhere in the core stack, a driver, or in the queue of a Dom0
userspace socket. In the latter case, they get copied anyway, so it
shouldn't happen)


I think that is OK iff you are copying for dom0 delivery. If you are not
copying here then a dom0 process (including an anonymous one) which can
open a socket and receive traffic could block things indefinitely.

The more general case of an unprivileged or deprivileged (i.e. a process
which has dropped its root privs somehow) being able to interfere with
the traffic in a way which causes gridlock might need a little more
thought though.

deliver_skb() will copy the skbs sent to Dom0 stack:

https://lkml.org/lkml/2012/7/20/363

Does anyone have a counterargument?

Zoli

On 13/11/13 20:29, Zoltan Kiss wrote:

Hi,

I'm trying to forward port delayed copy to my new grant mapping patches.
One important problem I've faced is that classic used
gnttab_copy_grant_page to replace the granted page with a local copy and
unmap the grant. And this function has never been upstreamed as only
netback used it. Unfortunately upstreaming it is not a very easy task,
as the kernel's grant table infrastructure doesn't track at the moment
whether the page is DMA mapped or not. It is required because we
shouldn't proceed with the copy and replace if a device already mapped
the page for DMA.
David came up with an alternative idea: we do this delayed copy because
we don't want the guest's page to get stuck in Dom0 indefinitely. The
only realistic case for that is when the egress interface is another
guest's vif, and that guest (either due to a bug or as a malicious
attempt) doesn't empty its ring. I think it's a safe
assumption that Dom0 otherwise doesn't hold on to packets for too long.
Or if it does, then that's a bug we should fix instead of doing a copy
of the packet.
If we accept that only other vifs can hold the skb indefinitely, then
an easier solution would be to handle this problem on the RX side: the
RX thread can also check whether this skb has hung around for too long
and drop it. Actually, if I understand it correctly, xenvif_start_xmit
already checks whether the guest provided enough slots for us to do the
grant copy. What do you think about such an approach?
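
As a rough illustration of the "hung around for too long" check, the sketch below filters a queue by packet age. It is not taken from netback: drop_stale, the (enqueue_time, packet) tuples and the max_age threshold are all assumptions made up for this example.

```python
from collections import deque


def drop_stale(queue, now, max_age):
    """Split a queue of (enqueue_time, packet) pairs by age.

    Returns (kept, dropped): packets older than max_age are dropped,
    which in the real case would release the granted pages.
    """
    kept = deque()
    dropped = []
    for enq_time, pkt in queue:
        if now - enq_time > max_age:
            dropped.append(pkt)
        else:
            kept.append((enq_time, pkt))
    return kept, dropped
```

A caller would run this periodically from whatever context services the queue, passing the current time and a timeout of its choosing.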


