Subject: Re: [PATCH RFC 0/4]: xen-net{back,front}: Multiple transmit and receive queues
To: Paul Durrant <Paul.Durrant@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
Cc: Ian Campbell <Ian.Campbell@xxxxxxxxxx>, Wei Liu <wei.liu2@xxxxxxxxxx>

On 16/01/14 10:04, Paul Durrant wrote:
>> -----Original Message-----
>> From: Andrew J. Bennieston [mailto:andrew.bennieston@xxxxxxxxxx]
>> Sent: 15 January 2014 16:23
>> To: xen-devel@xxxxxxxxxxxxxxxxxxxx
>> Cc: Ian Campbell; Wei Liu; Paul Durrant
>> Subject: [PATCH RFC 0/4]: xen-net{back,front}: Multiple transmit and
>> receive queues
>>
>> This patch series implements multiple transmit and receive queues
>> (i.e. multiple shared rings) for the xen virtual network interfaces.
>>
>> The series is split up as follows:
>> - Patches 1 and 3 factor out the queue-specific data for netback and
>>   netfront respectively, and modify the rest of the code to use these
>>   as appropriate.
>> - Patches 2 and 4 introduce new XenStore keys to negotiate and use
>>   multiple shared rings and event channels, and code to connect these
>>   as appropriate.
>>
>> All other transmit and receive processing remains unchanged, i.e.
>> there is a kthread per queue and a NAPI context per queue.
>>
>> The performance of these patches has been analysed in detail, with
>> results available at:
>>
>> http://wiki.xenproject.org/wiki/Xen-netback_and_xen-netfront_multi-queue_performance_testing
>>
>
> Nice numbers!
>
>> To summarise:
>> * Using multiple queues allows a VM to transmit at line rate on a
>>   10 Gbit/s NIC, compared with a maximum aggregate throughput of
>>   6 Gbit/s with a single queue.
>> * For intra-host VM-to-VM traffic, eight queues provide 171% of the
>>   throughput of a single queue; almost 12 Gbit/s instead of 6 Gbit/s.
>> * There is a corresponding increase in total CPU usage, i.e. this is
>>   a scaling out over available resources, not an efficiency
>>   improvement.
>> * Results depend on the availability of sufficient CPUs, as well as
>>   on the distribution of interrupts and of TCP streams across the
>>   queues.
>>
>> One open issue is how to deal with the tx_credit data for rate
>> limiting. This used to exist on a per-VIF basis, and these patches
>> move it to per-queue to avoid contention on concurrent access to the
>> tx_credit data from multiple threads. This has the side effect of
>> breaking the tx_credit accounting across the VIF as a whole. I cannot
>> see a situation in which people would want to use both rate limiting
>> and a high-performance multi-queue mode, but if this is problematic
>> then it can be brought back to the VIF level, with appropriate
>> protection. Obviously, it continues to work identically in the case
>> where there is only one queue.
>>
>> Queue selection is currently achieved via an L4 hash on the packet
>> (i.e. TCP src/dst port, IP src/dst address) and is not negotiated
>> between the frontend and backend, since only one option exists.
>> Future patches to support other frontends (particularly Windows) will
>> need to add some capability to negotiate not only the hash algorithm
>> selection, but also to allow the frontend to specify some parameters
>> for it.
>>
>
> Yes, Windows RSS stipulates a Toeplitz hash and specifies a hash key
> and mapping table. There's further awkwardness in the need to pass the
> actual hash value to the frontend too - but we could use an 'extra'
> seg for that, analogous to passing the GSO mss value through.

Yes, I was hoping we might be able to play tricks like that when it came
to implementing Toeplitz support.
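To make that concrete, here is a rough userspace-style sketch of
Toeplitz selection over the TCP/IPv4 4-tuple. It is only an illustration
of the algorithm being discussed: the key handling, indirection table
and sizes are made-up example values, not anything this series or the
eventual negotiation defines.

#include <stdint.h>
#include <stddef.h>

/*
 * Sketch of RSS-style Toeplitz hashing. 'key' must provide at least
 * len + 4 bytes of key material so that a full 32-bit window of key
 * bits is always available.
 */
static uint32_t toeplitz_hash(const uint8_t *key, const uint8_t *in,
                              size_t len)
{
    uint32_t hash = 0;
    /* 32-bit window of key bits, advanced one bit per input bit. */
    uint32_t window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
                      ((uint32_t)key[2] << 8)  |  (uint32_t)key[3];
    size_t i;
    int bit;

    for (i = 0; i < len; i++) {
        for (bit = 7; bit >= 0; bit--) {
            /* If this input bit is set, fold the current key window in. */
            if (in[i] & (1u << bit))
                hash ^= window;
            /* Slide the window left by one bit, pulling in the next key bit. */
            window = (window << 1) | ((key[i + 4] >> bit) & 1);
        }
    }
    return hash;
}

/*
 * Map a hash to a queue via an indirection (mapping) table, RSS-style;
 * table_size is assumed to be a power of two here.
 */
static unsigned int select_queue(uint32_t hash, const uint8_t *table,
                                 unsigned int table_size)
{
    return table[hash & (table_size - 1)];
}

The input would be the 12-byte concatenation of source/destination IPv4
address and TCP ports in network byte order, and the resulting 32-bit
hash is also the value that would have to be handed back to the
frontend, e.g. via the 'extra' seg idea above.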
Andrew

>
>   Paul
>
>> Queue-specific XenStore entries for ring references and event
>> channels are stored hierarchically, i.e. under .../queue-N/..., where
>> N varies from 0 to one less than the requested number of queues
>> (inclusive). If only one queue is requested, it falls back to the
>> flat structure, where the ring references and event channels are
>> written at the same level as the other vif information.
>>
>> --
>> Andrew J. Bennieston
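P.S. For anyone wiring a frontend up against the layout described above,
the per-queue writes end up looking roughly like the sketch below. This
is only illustrative: it uses the generic xenbus_printf() helper, and
the function name, key names and fixed-size path buffer are examples
rather than anything lifted verbatim from the patches.

#include <linux/kernel.h>
#include <xen/xenbus.h>

/*
 * Illustrative only: write one queue's ring references and event
 * channels, using <vif>/queue-N/... when more than one queue is in use
 * and the traditional flat keys otherwise.
 */
static int write_queue_keys(struct xenbus_device *dev,
                            struct xenbus_transaction xbt,
                            unsigned int queue_index,
                            unsigned int num_queues,
                            unsigned int tx_ring_ref,
                            unsigned int rx_ring_ref,
                            unsigned int evtchn_tx,
                            unsigned int evtchn_rx)
{
    char path[64];
    const char *dir = dev->nodename;
    int err;

    if (num_queues > 1) {
        /* Hierarchical layout: one subdirectory per queue. */
        snprintf(path, sizeof(path), "%s/queue-%u",
                 dev->nodename, queue_index);
        dir = path;
    }

    err = xenbus_printf(xbt, dir, "tx-ring-ref", "%u", tx_ring_ref);
    if (err)
        return err;
    err = xenbus_printf(xbt, dir, "rx-ring-ref", "%u", rx_ring_ref);
    if (err)
        return err;
    err = xenbus_printf(xbt, dir, "event-channel-tx", "%u", evtchn_tx);
    if (err)
        return err;
    return xenbus_printf(xbt, dir, "event-channel-rx", "%u", evtchn_rx);
}

A backend would read the same keys per queue, falling back to the flat
names when only a single queue has been negotiated, matching the
description above.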
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel