
Re: [Xen-devel] [PATCH RFC 0/4]: xen-net{back,front}: Multiple transmit and receive queues



Subject: Re: [PATCH RFC 0/4]: xen-net{back,front}: Multiple transmit and receive queues
To: Paul Durrant <Paul.Durrant@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
Cc: Ian Campbell <Ian.Campbell@xxxxxxxxxx>, Wei Liu <wei.liu2@xxxxxxxxxx>

On 16/01/14 10:04, Paul Durrant wrote:
-----Original Message-----
From: Andrew J. Bennieston [mailto:andrew.bennieston@xxxxxxxxxx]
Sent: 15 January 2014 16:23
To: xen-devel@xxxxxxxxxxxxxxxxxxxx
Cc: Ian Campbell; Wei Liu; Paul Durrant
Subject: [PATCH RFC 0/4]: xen-net{back,front}: Multiple transmit and receive queues

This patch series implements multiple transmit and receive queues
(i.e.  multiple shared rings) for the xen virtual network interfaces.

The series is split up as follows:
- Patches 1 and 3 factor out the queue-specific data for netback and
  netfront respectively, and modify the rest of the code to use these as
  appropriate.
- Patches 2 and 4 introduce new XenStore keys to negotiate and use
  multiple shared rings and event channels, and code to connect these as
  appropriate.

All other transmit and receive processing remains unchanged, i.e.
there is a kthread per queue and a NAPI context per queue.
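
For illustration, the backend per-queue state ends up shaped roughly like
the sketch below; the field and type names here are illustrative rather
than quoted from the patches:

/* Sketch only: illustrative backend per-queue state; names are not
 * taken from the patches themselves.
 */
struct xenvif_queue_sketch {
	unsigned int id;                  /* queue index, 0 .. num_queues - 1 */
	struct xenvif *vif;               /* back-pointer to the owning vif */

	struct xen_netif_tx_back_ring tx; /* per-queue shared TX ring */
	struct xen_netif_rx_back_ring rx; /* per-queue shared RX ring */

	unsigned int tx_evtchn;           /* per-queue event channels */
	unsigned int rx_evtchn;

	struct task_struct *task;         /* one kthread per queue */
	struct napi_struct napi;          /* one NAPI context per queue */
};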

The performance of these patches has been analysed in detail, with
results available at:

http://wiki.xenproject.org/wiki/Xen-netback_and_xen-netfront_multi-queue_performance_testing


Nice numbers!

To summarise:
* Using multiple queues allows a VM to transmit at line rate on a
  10 Gbit/s NIC, compared with a maximum aggregate throughput of
  6 Gbit/s with a single queue.
* For intra-host VM-to-VM traffic, eight queues provide 171% of the
  throughput of a single queue; almost 12 Gbit/s instead of 6 Gbit/s.
* There is a corresponding increase in total CPU usage, i.e. this is a
  scaling out over available resources, not an efficiency improvement.
* Results depend on the availability of sufficient CPUs, as well as the
  distribution of interrupts and the distribution of TCP streams across
  the queues.

One open issue is how to deal with the tx_credit data for rate
limiting.  This used to exist on a per-VIF basis, and these patches
move it to per-queue to avoid contention on concurrent access to the
tx_credit data from multiple threads. This has the side effect of
breaking the tx_credit accounting across the VIF as a whole. I cannot
see a situation in which people would want to use both rate limiting
and a high-performance multi-queue mode, but if this is problematic
then it can be brought back to the VIF level, with appropriate
protection.  Obviously, it continues to work identically in the case
where there is only one queue.
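
For concreteness, the per-queue credit state amounts to something like the
sketch below (field names illustrative, not quoted from the patches); this
is the small block that would otherwise need locking if it stayed per-VIF
and were touched by several queues' threads:

/* Sketch only: per-queue rate-limiting state; names are illustrative. */
struct queue_credit_sketch {
	unsigned long credit_bytes;       /* bytes permitted per replenish period */
	unsigned long credit_usec;        /* length of the replenish period */
	unsigned long remaining_credit;   /* decremented as packets are sent */
	struct timer_list credit_timeout; /* refills remaining_credit */
};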

Queue selection is currently achieved via an L4 hash on the packet (i.e.
TCP src/dst port, IP src/dst address) and is not negotiated between the
frontend and backend, since only one option exists. Future patches to
support other frontends (particularly Windows) will need to add a way to
negotiate not only the choice of hash algorithm, but also any parameters
the frontend wants to supply for it.
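
As a rough sketch of the current L4-hash selection (not the patch code;
skb_get_hash(), or the equivalent flow-hash helper, folds the TCP/IP
4-tuple into a flow hash):

/* Sketch only: map an L4 flow hash onto one of the available queues. */
static u16 sketch_select_queue(struct net_device *dev, struct sk_buff *skb)
{
	unsigned int num_queues = dev->real_num_tx_queues;
	u32 hash = skb_get_hash(skb);   /* hash over IP src/dst and TCP ports */

	return (u16)(hash % num_queues);
}

Packets belonging to the same TCP stream therefore always land on the same
queue, which preserves per-stream ordering.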


Yes, Windows RSS stipulates a Toeplitz hash and specifies a hash key
and mapping table. There's further awkwardness in the need to pass the
actual hash value to the frontend too - but we could use an 'extra'
seg for that, analogous to passing the GSO mss value through.

Yes, I was hoping we might be able to play tricks like that when it came
to implementing Toeplitz support.
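
For reference, the Toeplitz computation itself is small; a standalone
sketch (not Xen code) is below. It slides a 32-bit window along the RSS
secret key, XORing the window into the result for every set bit of the
input tuple:

#include <stdint.h>
#include <stddef.h>

/* Sketch only: the Toeplitz hash required by Windows RSS; not Xen code.
 * 'key' is the RSS secret key (40 bytes for the usual tuples, assumed
 * key_len >= 4) and 'input' is the concatenated tuple, e.g. src IP,
 * dst IP, src port, dst port.
 */
static uint32_t toeplitz_hash(const uint8_t *key, size_t key_len,
                              const uint8_t *input, size_t input_len)
{
	/* 32-bit window over the key, advanced one bit per input bit. */
	uint32_t window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
			  ((uint32_t)key[2] << 8)  |  (uint32_t)key[3];
	uint32_t result = 0;
	size_t i;
	int bit;

	for (i = 0; i < input_len; i++) {
		for (bit = 7; bit >= 0; bit--) {
			if (input[i] & (1u << bit))
				result ^= window;
			/* Shift left, pulling the next key bit into the window. */
			window <<= 1;
			if (4 + i < key_len && (key[4 + i] & (1u << bit)))
				window |= 1;
		}
	}
	return result;
}

Microsoft's RSS documentation includes verification vectors for this hash,
so an implementation can be checked against them.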

Andrew


    Paul

Queue-specific XenStore entries for ring references and event channels
are stored hierarchically, i.e. under .../queue-N/..., where N runs from
0 up to one less than the requested number of queues. If only one queue
is requested, the driver falls back to the flat structure, where the ring
references and event channels are written at the same level as the other
vif information.
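
For example, with two queues the frontend area might contain entries along
these lines (key names are illustrative of the scheme rather than quoted
from the patches):

  /local/domain/<domid>/device/vif/<handle>/queue-0/tx-ring-ref
  /local/domain/<domid>/device/vif/<handle>/queue-0/rx-ring-ref
  /local/domain/<domid>/device/vif/<handle>/queue-0/event-channel-tx
  /local/domain/<domid>/device/vif/<handle>/queue-0/event-channel-rx
  /local/domain/<domid>/device/vif/<handle>/queue-1/tx-ring-ref
  /local/domain/<domid>/device/vif/<handle>/queue-1/rx-ring-ref
  /local/domain/<domid>/device/vif/<handle>/queue-1/event-channel-tx
  /local/domain/<domid>/device/vif/<handle>/queue-1/event-channel-rx

With a single queue, the same keys appear directly under .../vif/<handle>/
alongside the other entries, exactly as before.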

-- Andrew J. Bennieston


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

