[Xen-devel] [RFC] Proposed XenStore Interactions for Multi-Queue VIFs
I'm posting this for an initial round of comments; I don't have any code
at present to implement this, and wanted to get some feedback before
getting started. All comments welcome :)

Andrew.

Proposed XenStore Interactions for Multi-Queue VIFs
========================================================================
Andrew J. Bennieston <andrew.bennieston@xxxxxxxxxx>  June 26 2013

Contents
--------
1. Rationale
2. Backend feature advertising
3. Frontend setup
  3.1 Selecting the number of queues and the hash algorithm
  3.2 Shared ring grant references and event channels
    3.2.1 Ring pages
    3.2.2 Event channels
4. Summary of main points

1. Rationale
---------------
Network throughput through a single VIF is limited by the processing
power available for a single netback kthread to perform work on the
ring. The single VIF throughput could be scaled up by implementing
multiple queues per VIF. Packets would be directed to one ring or
another by a hash of their headers. Initially, only TCP packets are
considered (all other packets will be presented on the first queue).

Multi-queue VIFs will be serviced by multiple shared ring structures
associated with a single virtual network interface. At present, the
connection of shared rings and event channels is performed by
negotiation between the frontend (domU) and backend (dom0) domains via
XenStore. This document details the proposed additions to this
negotiation that would be required in order to support the setup and
connection of multiple shared rings.

2. Backend feature advertising
------------------------------
The backend advertises the features it supports via keys of the form

    /local/domain/0/backend/vif/X/Y/feature-NNN = "1"

where X is the domain ID and Y is the virtual network device number.

In this proposal, a backend that wishes to support multi-queue VIFs
would add the key

    /local/domain/0/backend/vif/X/Y/feature-multi-queue = "1"

If this key exists and is set to "1", the frontend may request a
multi-queue configuration.
If the key is set to "0", or does not exist, the backend either does
not support this feature, or it has been disabled.

In addition to the feature flag, a backend which supports
feature-multi-queue would advertise a maximum number of queues, via the
key:

    /local/domain/0/backend/vif/X/Y/multi-queue-max-queues

This value is the maximum number of supported ring pairs; each queue
consists of a pair of rings supporting Tx (from guest) and Rx (to
guest). The number of rings in total is twice the value of
multi-queue-max-queues.

Finally, the backend advertises the list of hash algorithms it supports.
Hash algorithms define how network traffic is steered to different
queues, and it is assumed that the back- and frontends will use the same
hash algorithm with the same parameters. The available hash algorithms
are advertised by the backend via the key

    /local/domain/0/backend/vif/X/Y/multi-queue-hash-list = "alg1 alg2"

where "alg1 alg2" is a space-separated list of algorithms.

3. Frontend setup
-----------------
The frontend will be expected to look for the feature-multi-queue
XenStore key and, if present and non-zero, query the list of hash
algorithms and the maximum number of queues. It will then choose the
hash algorithm desired (or fall back to single-queue if the frontend and
backend do not have a hash algorithm in common) and set up a number of
XenStore keys to inform the backend of these choices. In single-queue
mode, there is no change from the existing mechanism.

3.1 Selecting the number of queues and the hash algorithm
---------------------------------------------------------
For multi-queue mode, the frontend requests the number of queues
required (between 1 and the maximum advertised by the backend):

    /local/domain/X/device/vif/Y/multi-queue-num-queues = "2"

If this key is not present, or is set to "1", single-queue mode is used.
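
To make the decision logic above concrete, here is a rough sketch in
Python of how a frontend might pick the queue count and hash algorithm
from the backend's advertised keys. It is only an illustration under
stated assumptions: the function name choose_configuration, the plain
dict standing in for a XenStore read of the backend directory, and the
"first preferred algorithm wins" policy are all hypothetical and not
part of the proposal. It follows the section 2 rule that only "1"
enables the feature, and the section 3 fallback to single-queue mode
when no hash algorithm is shared.

```python
from typing import Dict, List, Optional, Tuple


def choose_configuration(
    backend_keys: Dict[str, str],
    frontend_hashes: List[str],
    wanted_queues: int,
) -> Tuple[int, Optional[str]]:
    """Return (number of queues, hash algorithm).

    (1, None) means single-queue mode, i.e. the existing mechanism.
    backend_keys is a hypothetical stand-in for the backend's XenStore
    directory, keyed by the final path component.
    """
    # A missing key or "0" means the feature is unsupported or disabled.
    if backend_keys.get("feature-multi-queue", "0") != "1":
        return 1, None

    max_queues = int(backend_keys.get("multi-queue-max-queues", "1"))
    # The hash list is a space-separated string, e.g. "alg1 alg2".
    advertised = backend_keys.get("multi-queue-hash-list", "").split()

    # Assumed policy: take the frontend's most preferred algorithm that
    # the backend also advertises.
    common = [alg for alg in frontend_hashes if alg in advertised]

    # Clamp the request to the range 1..max advertised by the backend.
    num_queues = max(1, min(wanted_queues, max_queues))

    if not common or num_queues == 1:
        return 1, None
    return num_queues, common[0]
```

For example, with a backend advertising four queues and "alg1 alg2", a
frontend that prefers alg2 and asks for eight queues would end up with
four queues and alg2.
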
The frontend must also specify the desired hash algorithm as follows:

    /local/domain/X/device/vif/Y/multi-queue-hash = "alg1"

where "alg1" is one of the values from multi-queue-hash-list.

In addition to these keys, a number of hash-specific keys may be written
to provide parameters to be used by the hash algorithm. These are not
defined here in the general case, but may be used e.g. to communicate a
key or a mapping between hash value and queue number, for a specific
hash algorithm. The recommendation is that these are grouped together
under a key named something like multi-queue-hash-params-NNN where NNN
is the name of the hash algorithm specified in the multi-queue-hash key.

3.2 Shared ring grant references and event channels
---------------------------------------------------

3.2.1 Ring pages
----------------
It is the responsibility of the frontend to allocate one page for each
ring (i.e. two pages for each queue) and provide a grant reference to
each page, so that the backend may map them. In the single-queue case,
this is done as usual with the tx-ring-ref and rx-ring-ref keys.

For multi-queue, a hierarchical structure is proposed. This serves two
purposes: it cleanly separates grant references between queues, and it
allows additional mechanisms (e.g. split event channels, multi-page
rings) to replicate their XenStore keys for each queue without name
collisions. For each queue, the frontend should set up the following
keys:

    /local/domain/X/device/vif/Y/queue-N/tx-ring-ref
    /local/domain/X/device/vif/Y/queue-N/rx-ring-ref

where X is the domain ID, Y is the device ID and N is the queue number
(beginning at zero).

3.2.2 Event channels
--------------------
The upstream netback and netfront code supports
feature-split-event-channels, allowing one channel per ring (instead of
one channel per VIF).
When multiple queues are used, the frontend must write either:

    /local/domain/X/device/vif/Y/queue-N/event-channel = "M"

to use a single event channel (number M) for that queue, or

    /local/domain/X/device/vif/Y/queue-N/tx-event-channel = "M"
    /local/domain/X/device/vif/Y/queue-N/rx-event-channel = "L"

to use split event channels (numbers M and L) for that queue.

4. Summary of main points
-------------------------
- Each queue has two rings (one for Tx, one for Rx).
  - An unbalanced set of rings (e.g. more Rx than Tx) would still leave
    a bottleneck on the side with fewer rings, so for simplicity we
    require matched pairs.
- The frontend may only use hash algorithms that the backend advertises;
  if there are no algorithms in common, frontend initialisation fails.
- The backend must supply at least one fast hash algorithm for Linux
  guests.
  - Note that when Windows frontend support is added, the Toeplitz
    algorithm must be supported by the backend. This is relatively
    expensive to compute, however.
- Event channels are on a per-queue basis.
  - Split event channels may be used for some (or all) queues, again on
    a per-queue basis, selected by the presence of tx-event-channel,
    rx-event-channel keys in each queue's keyspace.
  - Single event channel (per queue) is selected by the presence of the
    event-channel key in the queue's keyspace.
  - There is no plan to support a single event channel for all queues,
    at present. This may be considered in the future to reduce the
    demand for event channels, which are a limited resource.
- Hash-specific configuration will reside in a hash-specific sub-key,
  likely named something along the lines of multi-queue-hash-params-NNN
  where NNN is the name of the hash algorithm. The contents will depend
  on the algorithm selected and are not specified here.
- All other configuration applies to the VIF as a whole, whether
  single- or multi-queue.
  - Again, there is the option to move keys into the queue hierarchy to
    allow per-queue configuration at a later date.
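
As an illustration of the per-queue key layout from section 3.2, the
following Python sketch builds the set of XenStore paths a frontend
would write for its queues. Nothing here is existing code: the Queue
class, the function name frontend_queue_keys, and the choice to
represent grant references and event channel numbers as decimal strings
are assumptions made for the example. A queue with no separate Rx
channel gets the single event-channel key; otherwise it gets the split
tx-event-channel and rx-event-channel keys.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Queue:
    """Hypothetical per-queue state held by the frontend."""
    tx_ring_ref: int                 # grant reference for the Tx ring page
    rx_ring_ref: int                 # grant reference for the Rx ring page
    tx_evtchn: int                   # event channel (shared if rx_evtchn is None)
    rx_evtchn: Optional[int] = None  # set only when using split channels


def frontend_queue_keys(
    domid: int, devid: int, queues: List[Queue]
) -> Dict[str, str]:
    """Map each per-queue XenStore path to the value to be written."""
    keys: Dict[str, str] = {}
    for n, q in enumerate(queues):  # queue numbers begin at zero
        base = f"/local/domain/{domid}/device/vif/{devid}/queue-{n}"
        keys[f"{base}/tx-ring-ref"] = str(q.tx_ring_ref)
        keys[f"{base}/rx-ring-ref"] = str(q.rx_ring_ref)
        if q.rx_evtchn is None:
            # Single event channel for this queue.
            keys[f"{base}/event-channel"] = str(q.tx_evtchn)
        else:
            # Split event channels, chosen per queue.
            keys[f"{base}/tx-event-channel"] = str(q.tx_evtchn)
            keys[f"{base}/rx-event-channel"] = str(q.rx_evtchn)
    return keys
```

Because the choice is made per queue, a call such as
frontend_queue_keys(5, 0, [Queue(8, 9, 20), Queue(10, 11, 21, 22)])
yields a single event-channel key under queue-0 and split keys under
queue-1.
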