
Re: [Xen-devel] [PATCH v2 2/3] public/io/netif.h: document control ring and toeplitz hashing



> -----Original Message-----
> From: Paul Durrant [mailto:paul.durrant@xxxxxxxxxx]
> Sent: 06 January 2016 13:07
> To: xen-devel@xxxxxxxxxxxxxxxxxxxx
> Cc: Paul Durrant; Ian Campbell; Ian Jackson; Jan Beulich; Keir (Xen.org); Tim
> (Xen.org)
> Subject: [PATCH v2 2/3] public/io/netif.h: document control ring and toeplitz
> hashing
> 
> This patch documents a new shared (variable message length) ring between

Sorry, I should have dropped the 'variable message length' bit here. I'll send 
v3 to fix this.

  Paul

> frontend and backend that can be used to pass bulk out-of-band data, such
> as that required to implement toeplitz hashing in the backend that is
> configurable by the frontend.
> 
> The patch then goes on to document the messages passed over the control
> ring that can be used to configure toeplitz hashing.
> 
> Signed-off-by: Paul Durrant <paul.durrant@xxxxxxxxxx>
> Cc: Ian Campbell <ian.campbell@xxxxxxxxxx>
> Cc: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>
> Cc: Jan Beulich <jbeulich@xxxxxxxx>
> Cc: Keir Fraser <keir@xxxxxxx>
> Cc: Tim Deegan <tim@xxxxxxx>
> ---
> 
> v2:
>  - Use a balanced fixed-size message ring for the control ring
>    (bulk data now passed by grant reference).
> ---
>  xen/include/public/io/netif.h | 264 ++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 264 insertions(+)
> 
> diff --git a/xen/include/public/io/netif.h b/xen/include/public/io/netif.h
> index 1790ea0..ace74f3 100644
> --- a/xen/include/public/io/netif.h
> +++ b/xen/include/public/io/netif.h
> @@ -151,6 +151,270 @@
>   */
> 
>  /*
> + * Control ring
> + * ============
> + *
> + * Some features, such as toeplitz hashing (detailed below), require a
> + * significant amount of out-of-band data to be passed from frontend to
> + * backend. Use of xenstore is not suitable for large quantities of data
> + * because of quota limitations and so a dedicated 'control ring' is used.
> + * The ability of the backend to use a control ring is advertised by
> + * setting:
> + *
> + * /local/domain/X/backend/<domid>/<vif>/feature-control-ring = "1"
> + *
> + * The frontend provides a control ring to the backend by setting:
> + *
> + * /local/domain/<domid>/device/vif/<vif>/ctrl-ring-ref = <gref>
> + * /local/domain/<domid>/device/vif/<vif>/event-channel-ctrl = <port>
> + *
> + * where <gref> is the grant reference of the shared page used to
> + * implement the control ring and <port> is an event channel to be used
> + * as a mailbox interrupt, before the frontend moves into the connected
> + * state.
> + *
> + * The control ring uses a fixed request/response message size and is
> + * balanced (i.e. one request to one response), so operationally it is much
> + * the same as a transmit or receive ring.
> + */
> +
> +/*
> + * Toeplitz hash types
> + * ===================
> + *
> + * For the purposes of the definitions below, 'Packet[]' is an array of
> + * octets containing an IP packet without options, 'Array[X..Y]' means a
> + * sub-array of 'Array' containing bytes X thru Y inclusive, and '+' is
> + * used to indicate concatenation of arrays.
> + */
> +
> +/*
> + * A hash calculated over an IP version 4 header as follows:
> + *
> + * Buffer[0..7] = Packet[12..15] + Packet[16..19]
> + * Result = ToeplitzHash(Buffer, 8)
> + */
> +#define _NETIF_CTRL_TOEPLITZ_FLAG_IPV4     0
> +#define NETIF_CTRL_TOEPLITZ_FLAG_IPV4      (1 << _NETIF_CTRL_TOEPLITZ_FLAG_IPV4)
> +
> +/*
> + * A hash calculated over an IP version 4 header and TCP header as
> + * follows:
> + *
> + * Buffer[0..11] = Packet[12..15] + Packet[16..19] +
> + *                  Packet[20..21] + Packet[22..23]
> + * Result = ToeplitzHash(Buffer, 12)
> + */
> +#define _NETIF_CTRL_TOEPLITZ_FLAG_IPV4_TCP 1
> +#define NETIF_CTRL_TOEPLITZ_FLAG_IPV4_TCP  (1 << _NETIF_CTRL_TOEPLITZ_FLAG_IPV4_TCP)
> +
> +/*
> + * A hash calculated over an IP version 6 header as follows:
> + *
> + * Buffer[0..31] = Packet[8..23] + Packet[24..39]
> + * Result = ToeplitzHash(Buffer, 32)
> + */
> +#define _NETIF_CTRL_TOEPLITZ_FLAG_IPV6     2
> +#define NETIF_CTRL_TOEPLITZ_FLAG_IPV6      (1 << _NETIF_CTRL_TOEPLITZ_FLAG_IPV6)
> +
> +/*
> + * A hash calculated over an IP version 6 header and TCP header as
> + * follows:
> + *
> + * Buffer[0..35] = Packet[8..23] + Packet[24..39] +
> + *                  Packet[40..41] + Packet[42..43]
> + * Result = ToeplitzHash(Buffer, 36)
> + */
> +#define _NETIF_CTRL_TOEPLITZ_FLAG_IPV6_TCP 3
> +#define NETIF_CTRL_TOEPLITZ_FLAG_IPV6_TCP  (1 << _NETIF_CTRL_TOEPLITZ_FLAG_IPV6_TCP)
> +
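
To illustrate the hash type definitions above, here is a minimal sketch of how a backend might assemble the hash input for an IPv4 packet and pick the 'maximal' enabled type. The helper name build_ipv4_hash_input and its is_tcp parameter are hypothetical, not part of the patch, and the flag values are written out from the bit positions 0..3 defined above. Using the plain IPv4 hash for a TCP packet when only the IPV4 flag is set is my reading of the text, not something the patch spells out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Flag values taken from the bit positions defined in the patch */
#define NETIF_CTRL_TOEPLITZ_FLAG_IPV4     (1u << 0)
#define NETIF_CTRL_TOEPLITZ_FLAG_IPV4_TCP (1u << 1)

/*
 * Hypothetical helper: copy the octets to be hashed from an IPv4 packet
 * without options (so the TCP header starts at octet 20) into buf.
 * Returns the number of octets copied, or 0 if no enabled type applies.
 */
static size_t build_ipv4_hash_input(const uint8_t *pkt, int is_tcp,
                                    uint32_t flags, uint8_t buf[12])
{
    if (is_tcp && (flags & NETIF_CTRL_TOEPLITZ_FLAG_IPV4_TCP)) {
        memcpy(buf, pkt + 12, 8);     /* source + destination address */
        memcpy(buf + 8, pkt + 20, 4); /* source + destination port */
        return 12;
    }

    if (flags & NETIF_CTRL_TOEPLITZ_FLAG_IPV4) {
        memcpy(buf, pkt + 12, 8);     /* source + destination address */
        return 8;
    }

    return 0; /* hashing disabled for this packet */
}
```

The returned length is what would be passed as the second argument of ToeplitzHash() in the comments above.
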
> +/*
> + * Control requests (netif_ctrl_request_t)
> + * =======================================
> + *
> + * All requests have the following format:
> + *
> + *    0     1     2     3     4     5     6     7  octet
> + * +-----+-----+-----+-----+-----+-----+-----+-----+
> + * |    id     |   type    |         data[0]       |
> + * +-----+-----+-----+-----+-----+-----+-----+-----+
> + * |         data[1]       |
> + * +-----+-----+-----+-----+
> + *
> + * id: the request identifier, echoed in response.
> + * type: the type of request (see below)
> + * data[]: any data associated with the request (determined by type)
> + */
> +
> +struct netif_ctrl_request {
> +    uint16_t id;
> +    uint16_t type;
> +
> +#define NETIF_CTRL_TYPE_INVALID              0
> +#define NETIF_CTRL_TYPE_GET_TOEPLITZ_FLAGS   1
> +#define NETIF_CTRL_TYPE_SET_TOEPLITZ_FLAGS   2
> +#define NETIF_CTRL_TYPE_SET_TOEPLITZ_KEY     3
> +#define NETIF_CTRL_TYPE_SET_TOEPLITZ_MAPPING 4
> +
> +    uint32_t data[2];
> +};
> +typedef struct netif_ctrl_request netif_ctrl_request_t;
> +
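
As a rough illustration of the request layout above, a frontend might fill in requests along these lines. The helper names make_get_flags and make_set_flags are hypothetical; the structure and type values are copied from the patch so that the fragment stands alone.

```c
#include <stdint.h>
#include <string.h>

#define NETIF_CTRL_TYPE_GET_TOEPLITZ_FLAGS 1
#define NETIF_CTRL_TYPE_SET_TOEPLITZ_FLAGS 2

/* Copied from the patch so the sketch is self-contained */
struct netif_ctrl_request {
    uint16_t id;
    uint16_t type;
    uint32_t data[2];
};

/* Hypothetical helper: query supported hash types; data[] stays 0 */
static struct netif_ctrl_request make_get_flags(uint16_t id)
{
    struct netif_ctrl_request req;

    memset(&req, 0, sizeof(req));
    req.id = id;
    req.type = NETIF_CTRL_TYPE_GET_TOEPLITZ_FLAGS;
    return req;
}

/* Hypothetical helper: select hash types; data[1] stays 0 */
static struct netif_ctrl_request make_set_flags(uint16_t id, uint32_t flags)
{
    struct netif_ctrl_request req;

    memset(&req, 0, sizeof(req));
    req.id = id;
    req.type = NETIF_CTRL_TYPE_SET_TOEPLITZ_FLAGS;
    req.data[0] = flags; /* bitwise OR of NETIF_CTRL_TOEPLITZ_FLAG_* */
    return req;
}
```

Since the id is echoed in the response, the frontend can use it to match each response to its outstanding request.
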
> +/*
> + * type = NETIF_CTRL_TYPE_GET_TOEPLITZ_FLAGS:
> + *
> + * This is sent by the frontend to query the types of toeplitz
> + * hash supported by the backend. No data is required, so both elements
> + * of the data[] field are set to 0.
> + *
> + * type = NETIF_CTRL_TYPE_SET_TOEPLITZ_FLAGS:
> + *
> + * This is sent by the frontend to set the types of toeplitz hash that
> + * the backend should calculate. Note that the 'maximal' type of hash
> + * should always be chosen. For example, if the frontend sets both IPV4
> + * and IPV4_TCP hash types then the latter hash type should be calculated
> + * for any TCP packet and the former only calculated for non-TCP packets.
> + * The data[0] field is a bitwise OR of NETIF_CTRL_TOEPLITZ_FLAG_* values
> + * defined above. The data[1] field is set to 0.
> + *
> + * NOTE: Setting data[0] to 0 disables toeplitz hashing and the backend
> + *       is free to choose how it steers packets to queues (which is the
> + *       default state).
> + *
> + * type = NETIF_CTRL_TYPE_SET_TOEPLITZ_KEY:
> + *
> + * This is sent by the frontend to set the key used for the toeplitz hash
> + * that the backend calculates. The toeplitz algorithm is illustrated
> + * by the following pseudo-code:
> + *
> + * (Buffer[] and Key[] are treated as shift-registers where the MSB of
> + * Buffer/Key[0] is considered 'left-most' and the LSB of Buffer/Key[N-1]
> + * is the 'right-most').
> + *
> + * Value = 0
> + * For number of bits in Buffer[]
> + *    If (left-most bit of Buffer[] is 1)
> + *        Value ^= left-most 32 bits of Key[]
> + *    Key[] << 1
> + *    Buffer[] << 1
> + *
> + * The data[0] field is set to the size of key in octets. The data[1]
> + * field is set to a grant reference of a page containing the key. The
> + * reference must remain valid until the corresponding
> + * netif_ctrl_response_t has been processed.
> + *
> + * type = NETIF_CTRL_TYPE_SET_TOEPLITZ_MAPPING:
> + *
> + * This is sent by the frontend to set the mapping of toeplitz hash to
> + * queue number to be applied by the backend.
> + *
> + * The data[0] field is set to the order of the mapping. The data[1] field
> + * is set to a grant reference of a page containing the mapping. The
> + * reference must remain valid until the corresponding
> + * netif_ctrl_response_t has been processed.
> + *
> + * The format of the mapping is:
> + *
> + *    0     1     2     3     4     5     6     7  octet
> + * +-----+-----+-----+-----+-----+-----+-----+-----+
> + * |                    queue[0]                   |
> + * +-----+-----+-----+-----+-----+-----+-----+-----+
> + * |                    queue[1]                   |
> + * +-----+-----+-----+-----+-----+-----+-----+-----+
> + * |                    queue[2]                   |
> + * +-----+-----+-----+-----+-----+-----+-----+-----+
> + * |                    queue[3]                   |
> + *                         .
> + *                         .
> + * |                    queue[N-1]                 |
> + * +-----+-----+-----+-----+-----+-----+-----+-----+
> + *
> + * where each queue value is less than "multi-queue-num-queues" (see above)
> + * and N is 1 << data[0].
> + *
> + * NOTE: Before a specific mapping is set using this request, the backend
> + *       should map all toeplitz hash values to queue 0 (which is the only
> + *       queue guaranteed to exist in all cases).
> + */
> +
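
The pseudo-code above can be sketched in C roughly as follows. Rather than shifting the whole key, this keeps a 32-bit window holding the left-most 32 bits of the shifted key and feeds in one key bit per step. The names toeplitz_hash and hash_to_queue are hypothetical, and three things are assumptions on my part: key octets past the end of the key are treated as 0, the mapping is indexed by the low-order bits of the hash (the text does not say how a hash selects an entry), and each mapping entry is 8 octets wide as drawn in the diagram.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: toeplitz hash of buf[0..buf_len-1] using key */
static uint32_t toeplitz_hash(const uint8_t *key, size_t key_len,
                              const uint8_t *buf, size_t buf_len)
{
    uint32_t window = 0; /* left-most 32 bits of the shifted key */
    uint32_t value = 0;
    size_t next;         /* index of the next key octet to shift in */
    size_t i;
    int bit;

    /* Prime the window with the first four key octets */
    for (next = 0; next < 4; next++)
        window = (window << 8) | (next < key_len ? key[next] : 0);

    for (i = 0; i < buf_len; i++, next++) {
        /* Assumption: key octets beyond key_len read as 0 */
        uint8_t k = next < key_len ? key[next] : 0;

        for (bit = 7; bit >= 0; bit--) {
            /* If the left-most remaining bit of the buffer is 1 */
            if (buf[i] & (1u << bit))
                value ^= window;

            /* Key[] << 1: drop the top bit, pull in the next key bit */
            window = (window << 1) | ((k >> bit) & 1u);
        }
    }

    return value;
}

/*
 * Hypothetical helper: look up the queue for a hash value.
 * Assumption: the table of 1 << order entries is indexed by the
 * low-order bits of the hash; order is small enough to fit in a page.
 */
static uint32_t hash_to_queue(const uint64_t *mapping, unsigned int order,
                              uint32_t hash)
{
    return (uint32_t)mapping[hash & ((1u << order) - 1)];
}
```

A backend would call toeplitz_hash() on the Buffer[] built for the selected hash type and pass the result to hash_to_queue() to choose the queue.
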
> +/*
> + * Control responses (netif_ctrl_response_t)
> + * =========================================
> + *
> + * All responses have the following format:
> + *
> + *    0     1     2     3     4     5     6     7  octet
> + * +-----+-----+-----+-----+-----+-----+-----+-----+
> + * |    id     |   type    |         status        |
> + * +-----+-----+-----+-----+-----+-----+-----+-----+
> + * |         data          |
> + * +-----+-----+-----+-----+
> + *
> + * id: the corresponding request identifier
> + * type: the type of the corresponding request
> + * status: the status of request processing
> + * data: any data associated with the response (determined by type and
> + *       status)
> + */
> +
> +struct netif_ctrl_response {
> +    uint16_t id;
> +    uint16_t type;
> +    uint32_t status;
> +
> +#define NETIF_CTRL_STATUS_SUCCESS           0
> +#define NETIF_CTRL_STATUS_NOT_SUPPORTED     1
> +#define NETIF_CTRL_STATUS_INVALID_PARAMETER 2
> +#define NETIF_CTRL_STATUS_BUFFER_OVERFLOW   3
> +
> +    uint32_t data;
> +};
> +typedef struct netif_ctrl_response netif_ctrl_response_t;
> +
> +/*
> + * type = <unknown>
> + *
> + * The default response for any unrecognised request has the status field
> + * set to NETIF_CTRL_STATUS_NOT_SUPPORTED and the data field set to 0.
> + *
> + * type = NETIF_CTRL_TYPE_GET_TOEPLITZ_FLAGS:
> + *
> + * Since the request carries no data there is no reason for processing to
> + * fail, hence the status field is set to NETIF_CTRL_STATUS_SUCCESS and the
> + * data field is a bitwise OR of NETIF_CTRL_TOEPLITZ_FLAG_* values (defined
> + * above) indicating which hash types are supported by the backend.
> + * If no hashing is supported then the data field should be set to 0.
> + *
> + * type = NETIF_CTRL_TYPE_SET_TOEPLITZ_FLAGS:
> + *
> + * If the data[0] field in the request is invalid (i.e. contains unsupported
> + * hash types) then the status field is set to
> + * NETIF_CTRL_STATUS_INVALID_PARAMETER. Otherwise the request should succeed
> + * and hence the status field is set to NETIF_CTRL_STATUS_SUCCESS.
> + * The data field should be set to 0.
> + *
> + * type = NETIF_CTRL_TYPE_SET_TOEPLITZ_KEY:
> + *
> + * If the data[0] field in the request is an invalid key length (too big)
> + * then the status field is set to NETIF_CTRL_STATUS_BUFFER_OVERFLOW. If the
> + * data[1] field is an invalid grant reference then the status field is set
> + * to NETIF_CTRL_STATUS_INVALID_PARAMETER. Otherwise the request should
> + * succeed and hence the status field is set to NETIF_CTRL_STATUS_SUCCESS.
> + * The data field should be set to 0.
> + *
> + * type = NETIF_CTRL_TYPE_SET_TOEPLITZ_MAPPING:
> + *
> + * If the data[0] field in the request is an invalid mapping order (too big)
> + * then the status field is set to NETIF_CTRL_STATUS_BUFFER_OVERFLOW. If the
> + * data[1] field is an invalid grant reference then the status field is set
> + * to NETIF_CTRL_STATUS_INVALID_PARAMETER. Otherwise the request should
> + * succeed and hence the status field is set to NETIF_CTRL_STATUS_SUCCESS.
> + * The data field should be set to 0.
> + */
> +
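
The status rules above might be implemented in a backend along these lines. This is only a sketch: the helper names, the MAX_KEY_LEN limit and the gref_valid parameter are hypothetical, and checking the key length before the grant reference is an arbitrary choice since the text does not define which status wins when both are invalid.

```c
#include <stdint.h>

#define NETIF_CTRL_STATUS_SUCCESS           0
#define NETIF_CTRL_STATUS_INVALID_PARAMETER 2
#define NETIF_CTRL_STATUS_BUFFER_OVERFLOW   3

/* Hypothetical backend limit on the key size, in octets */
#define MAX_KEY_LEN 40

/* Status for a SET_TOEPLITZ_FLAGS request */
static uint32_t set_flags_status(uint32_t supported, uint32_t requested)
{
    /* Any requested hash type the backend does not support is invalid */
    if (requested & ~supported)
        return NETIF_CTRL_STATUS_INVALID_PARAMETER;

    return NETIF_CTRL_STATUS_SUCCESS;
}

/* Status for a SET_TOEPLITZ_KEY request */
static uint32_t set_key_status(uint32_t key_len, int gref_valid)
{
    if (key_len > MAX_KEY_LEN)
        return NETIF_CTRL_STATUS_BUFFER_OVERFLOW;

    if (!gref_valid)
        return NETIF_CTRL_STATUS_INVALID_PARAMETER;

    return NETIF_CTRL_STATUS_SUCCESS;
}
```

Note that a request with data[0] set to 0 passes the flags check for any backend, which matches the rule that 0 disables toeplitz hashing.
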
> +DEFINE_RING_TYPES(netif_ctrl, struct netif_ctrl_request, struct netif_ctrl_response);
> +
> +/*
>   * Guest transmit
>   * ==============
>   *
> --
> 2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

