
Re: [Xen-devel] [PATCH V6 net-next 1/5] xen-netback: Factor queue-specific data into queue struct.

To: Ian Campbell <Ian.Campbell@xxxxxxxxxx>
From: Andrew Bennieston <andrew.bennieston@xxxxxxxxxx>
Date: Mon, 17 Mar 2014 11:53:35 +0000
Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx, paul.durrant@xxxxxxxxxx, wei.liu2@xxxxxxxxxx, david.vrabel@xxxxxxxxxx, netdev@xxxxxxxxxxxxxxx
Delivery-date: Mon, 17 Mar 2014 11:53:56 +0000
List-id: Xen developer discussion <xen-devel.lists.xen.org>

On 14/03/14 15:55, Ian Campbell wrote:

On Mon, 2014-03-03 at 11:47 +0000, Andrew J. Bennieston wrote:

From: "Andrew J. Bennieston" <andrew.bennieston@xxxxxxxxxx>

In preparation for multi-queue support in xen-netback, move the
queue-specific data from struct xenvif into struct xenvif_queue, and
update the rest of the code to use this.

Also[...]

Finally,[...]


This is already quite a big patch, and I don't think the commit log
covers everything it changes/refactors, does it?

It's always a good idea to break these things apart but in particular
separating the mechanical stuff (s/vif/queue/g) from the non-mechanical
stuff, since the mechanical stuff is essentially trivial to review and
getting it out the way makes the non-mechanical stuff much easier to
check (or even spot).


The vast majority of changes in this patch are s/vif/queue/g. The rest
are related changes, such as inserting loops over queues, and moving
queue-specific initialisation away from the vif-wide initialisation, so
that it can be done once per queue.

I consider these things to be logically related and definitely within
the purview of this single patch. Without doing this, it is difficult to
get a patch that results in something that even compiles, without
putting in a bunch of placeholder code that will be removed in the very
next patch.

When I split this feature into multiple patches, I took care to group
as little as possible into this first patch (and the same for netfront).
It is still a large patch, but by my count most of this is a simple
replacement of vif with queue...

A first-order approximation, searching for line pairs where the first
has 'vif' and the second has 'queue', yields:

➜  xen-netback git:(saturn) git show HEAD~4 | grep -A 1 vif | grep queue | wc -l

380

i.e. 760 (=380*2) lines out of the 2240 (~ 34%) are trivial replacements
of vif with queue, and this is not counting multi-line replacements, of
which there are many. What remains is mostly adding loops over these
queues. This could, in principle, be done in a second patch, but the
impact of this is small.


Signed-off-by: Andrew J. Bennieston <andrew.bennieston@xxxxxxxxxx>
Reviewed-by: Paul Durrant <paul.durrant@xxxxxxxxxx>
---
  drivers/net/xen-netback/common.h    |   85 ++++--
  drivers/net/xen-netback/interface.c |  329 ++++++++++++++--------
  drivers/net/xen-netback/netback.c   |  530 ++++++++++++++++++-----------------
  drivers/net/xen-netback/xenbus.c    |   87 ++++--
  4 files changed, 608 insertions(+), 423 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index ae413a2..4176539 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -108,17 +108,39 @@ struct xenvif_rx_meta {
   */
  #define MAX_GRANT_COPY_OPS (MAX_SKB_FRAGS * XEN_NETIF_RX_RING_SIZE)

-struct xenvif {
-       /* Unique identifier for this interface. */
-       domid_t          domid;
-       unsigned int     handle;
+/* Queue name is interface name with "-qNNN" appended */
+#define QUEUE_NAME_SIZE (IFNAMSIZ + 6)


One more than necessary? Or does IFNAMSIZ not include the NUL? (I can't
figure out if it does or not!)


interface.c contains the line:
snprintf(name, IFNAMSIZ - 1, "vif%u.%u", domid, handle);

This suggests that IFNAMSIZ counts the trailing NUL, so I can reduce
this count by 1 on that basis.
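
For what it's worth, the sizing argument can be checked with a small
userspace sketch. It assumes IFNAMSIZ is 16 and counts the terminator, so
the longest interface name is 15 characters; "-qNNN" adds 5 more, and one
byte is left for the NUL. The DEMO_* macros and demo_format_queue_name()
are hypothetical names for illustration, not part of the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: mirrors the kernel's IFNAMSIZ (16, including the NUL). */
#define DEMO_IFNAMSIZ 16
/* 15 name characters + 5 for "-qNNN" + 1 for the NUL = IFNAMSIZ + 5. */
#define DEMO_QUEUE_NAME_SIZE (DEMO_IFNAMSIZ + 5)

/* Format "<ifname>-q<id>" into buf. Returns the length snprintf() would
 * have written without the NUL, so a result >= len means truncation.
 */
static int demo_format_queue_name(char *buf, size_t len,
                                  const char *ifname, unsigned int id)
{
        assert(len > 0);
        return snprintf(buf, len, "%s-q%u", ifname, id);
}
```

A queue id of more than three digits would not fit alongside a full-length
name; snprintf() then truncates rather than overflowing, and the return
value shows that it happened.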

[...]
-       /* This array is allocated seperately as it is large */
-       struct gnttab_copy *grant_copy_op;
+       struct gnttab_copy grant_copy_op[MAX_GRANT_COPY_OPS];


Is this deliberate? It seems like a retrograde step reverting parts of
ac3d5ac27735 "xen-netback: fix guest-receive-side array sizes" from Paul
(at least you are nuking a speeling erorr)


Yes, this was deliberate. These arrays were moved out to avoid problems
with kmalloc for the struct net_device (which contains the struct xenvif
in its netdev_priv space). Since the queues are now allocated via
vzalloc, there is no need to do separate allocations (with the
requirement to also separately free on every error/teardown path) so I
moved these back into the main queue structure.
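
As a rough userspace illustration of that point: with the large array
embedded in the per-queue structure, one zeroed allocation covers every
queue and its array, and one free releases everything. calloc() stands in
for vzalloc() here, and the demo_* names and DEMO_MAX_COPY_OPS value are
made up for the sketch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical size; stands in for MAX_GRANT_COPY_OPS. */
#define DEMO_MAX_COPY_OPS 64

struct demo_copy_op {
        unsigned long src, dst;
};

/* The large array lives directly inside the per-queue structure. */
struct demo_queue {
        unsigned int id;
        struct demo_copy_op copy_op[DEMO_MAX_COPY_OPS];
};

/* One zeroed allocation for all queues, arrays included. The caller
 * releases it with a single free(); there is no per-array cleanup.
 */
static struct demo_queue *demo_alloc_queues(unsigned int num)
{
        assert(num > 0);
        return calloc(num, sizeof(struct demo_queue));
}
```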


How does this series interact with Zoltan's foreign mapping one? Badly I
should imagine, are you going to rebase?


I'm working on the rebase right now.

+       /* First, check if there is only one queue to optimise the
+        * single-queue or old frontend scenario.
+        */
+       if (vif->num_queues == 1) {
+               queue_index = 0;
+       } else {
+               /* Use skb_get_hash to obtain an L4 hash if available */
+               hash = skb_get_hash(skb);
+               queue_index = (u16) (((u64)hash * vif->num_queues) >> 32);


No modulo num_queues here?

Is the multiply and shift from some best practice somewhere? Or else
what is it doing?


It seems to be what a bunch of other net drivers do in this scenario. I
guess the reasoning is it'll be faster than a mod num_queues.
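
To spell out what the expression does: hash / 2^32 is a fraction in
[0, 1), so multiplying by num_queues and keeping the integer part gives an
index in [0, num_queues). The result can never reach num_queues, which is
why no modulo is needed. A standalone sketch, with demo_pick_queue() as a
hypothetical name:

```c
#include <assert.h>
#include <stdint.h>

/* Scale a 32-bit hash into [0, num_queues) with a multiply and a shift
 * instead of a division. Since hash < 2^32, the 64-bit product is
 * strictly less than num_queues * 2^32, so the shifted value is at
 * most num_queues - 1.
 */
static uint16_t demo_pick_queue(uint32_t hash, unsigned int num_queues)
{
        assert(num_queues > 0 && num_queues <= UINT16_MAX);
        return (uint16_t)(((uint64_t)hash * num_queues) >> 32);
}
```

Note that this selects on the high bits of the hash, whereas
hash % num_queues would select on the low bits.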

+       /* Obtain the queue to be used to transmit this packet */
+       index = skb_get_queue_mapping(skb);
+       if (index >= vif->num_queues)
+               index = 0; /* Fall back to queue 0 if out of range */


Is this actually allowed to happen?

Even if yes, not modulo num_queue so spread it around a bit?


This probably isn't allowed to happen. I figured it didn't hurt to be a
little defensive with the code here, and falling back to queue 0 is a
fairly safe thing to do.

  static void xenvif_up(struct xenvif *vif)
  {
-       napi_enable(&vif->napi);
-       enable_irq(vif->tx_irq);
-       if (vif->tx_irq != vif->rx_irq)
-               enable_irq(vif->rx_irq);
-       xenvif_check_rx_xenvif(vif);
+       struct xenvif_queue *queue = NULL;
+       unsigned int queue_index;
+
+       for (queue_index = 0; queue_index < vif->num_queues; ++queue_index) {


This vif->num_queues -- is it the same as dev->num_tx_queues? Or are
there differing concepts of queue around?


It should be the same as dev->real_num_tx_queues, which may be less than
dev->num_tx_queues.

+               queue = &vif->queues[queue_index];
+               napi_enable(&queue->napi);
+               enable_irq(queue->tx_irq);
+               if (queue->tx_irq != queue->rx_irq)
+                       enable_irq(queue->rx_irq);
+               xenvif_check_rx_xenvif(queue);
+       }
  }

  static void xenvif_down(struct xenvif *vif)
  {
-       napi_disable(&vif->napi);
-       disable_irq(vif->tx_irq);
-       if (vif->tx_irq != vif->rx_irq)
-               disable_irq(vif->rx_irq);
-       del_timer_sync(&vif->credit_timeout);
+       struct xenvif_queue *queue = NULL;
+       unsigned int queue_index;


Why unsigned?

Why not? You can't have a negative number of queues. Zero indicates "I
don't have any set up yet". I'm not expecting people to have 4 billion
or so queues, but equally I can't see a valid use for negative values
here.

@@ -496,9 +497,30 @@ static void connect(struct backend_info *be)
                return;
        }

-       xen_net_read_rate(dev, &be->vif->credit_bytes,
-                         &be->vif->credit_usec);
-       be->vif->remaining_credit = be->vif->credit_bytes;
+       xen_net_read_rate(dev, &credit_bytes, &credit_usec);
+       read_xenbus_vif_flags(be);
+
+       be->vif->num_queues = 1;
+       be->vif->queues = vzalloc(be->vif->num_queues *
+                       sizeof(struct xenvif_queue));
+
+       for (queue_index = 0; queue_index < be->vif->num_queues; ++queue_index) {
+               queue = &be->vif->queues[queue_index];
+               queue->vif = be->vif;
+               queue->id = queue_index;
+               snprintf(queue->name, sizeof(queue->name), "%s-q%u",
+                               be->vif->dev->name, queue->id);
+
+               xenvif_init_queue(queue);
+
+               queue->remaining_credit = credit_bytes;
+
+               err = connect_rings(be, queue);
+               if (err)
+                       goto err;
+       }
+
+       xenvif_carrier_on(be->vif);

        unregister_hotplug_status_watch(be);
        err = xenbus_watch_pathfmt(dev, &be->hotplug_status_watch,
@@ -507,18 +529,24 @@ static void connect(struct backend_info *be)
        if (!err)
                be->have_hotplug_status_watch = 1;

-       netif_wake_queue(be->vif->dev);
+       netif_tx_wake_all_queues(be->vif->dev);
+
+       return;
+
+err:
+       vfree(be->vif->queues);
+       be->vif->queues = NULL;
+       be->vif->num_queues = 0;
+       return;


Do you not need to unwind the setup already done on the previous queues
before the failure?



Err... yes. I was sure that code existed at some point, but I can't find
it now. Oops!
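
The shape of the missing unwind is probably something like the userspace
sketch below: on failure at queue i, walk back over queues i-1..0 and
disconnect each before freeing the array. demo_connect() and
demo_disconnect() are hypothetical stand-ins for connect_rings() and
whatever teardown the real code needs; this is not the actual fix.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_queue {
        unsigned int id;
        int connected;
};

static int demo_teardown_calls; /* counts unwinds, for illustration */

/* Stand-in for connect_rings(): fails for the queue numbered fail_at. */
static int demo_connect(struct demo_queue *q, unsigned int fail_at)
{
        if (q->id == fail_at)
                return -1;
        q->connected = 1;
        return 0;
}

/* Stand-in for the per-queue teardown. */
static void demo_disconnect(struct demo_queue *q)
{
        q->connected = 0;
        demo_teardown_calls++;
}

/* Connect num queues. On failure, disconnect the queues that were
 * already set up, in reverse order, then free the array.
 * Returns NULL on failure.
 */
static struct demo_queue *demo_connect_all(unsigned int num,
                                           unsigned int fail_at)
{
        struct demo_queue *queues;
        unsigned int i;

        assert(num > 0);
        queues = calloc(num, sizeof(*queues));
        if (!queues)
                return NULL;

        for (i = 0; i < num; ++i) {
                queues[i].id = i;
                if (demo_connect(&queues[i], fail_at))
                        goto err;
        }
        return queues;

err:
        /* i is the queue that failed; everything below it is connected. */
        while (i-- > 0)
                demo_disconnect(&queues[i]);
        free(queues);
        return NULL;
}
```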


-Andrew

  }


-static int connect_rings(struct backend_info *be)
+static int connect_rings(struct backend_info *be, struct xenvif_queue *queue)
  {
-       struct xenvif *vif = be->vif;
        struct xenbus_device *dev = be->dev;
        unsigned long tx_ring_ref, rx_ring_ref;
-       unsigned int tx_evtchn, rx_evtchn, rx_copy;
+       unsigned int tx_evtchn, rx_evtchn;
        int err;
-       int val;

        err = xenbus_gather(XBT_NIL, dev->otherend,
                            "tx-ring-ref", "%lu", &tx_ring_ref,
@@ -546,6 +574,27 @@ static int connect_rings(struct backend_info *be)
                rx_evtchn = tx_evtchn;
        }

+       /* Map the shared frame, irq etc. */
+       err = xenvif_connect(queue, tx_ring_ref, rx_ring_ref,
+                            tx_evtchn, rx_evtchn);
+       if (err) {
+               xenbus_dev_fatal(dev, err,
+                                "mapping shared-frames %lu/%lu port tx %u rx %u",
+                                tx_ring_ref, rx_ring_ref,
+                                tx_evtchn, rx_evtchn);
+               return err;
+       }
+
+       return 0;
+}
+


