
Re: [Xen-devel] [PATCH v4 2/3] public/io/netif.h: document control ring and toeplitz hashing



> -----Original Message-----
> From: Ian Campbell [mailto:ian.campbell@xxxxxxxxxx]
> Sent: 08 January 2016 17:22
> To: Paul Durrant; xen-devel@xxxxxxxxxxxxxxxxxxxx
> Cc: Ian Jackson; Jan Beulich; Keir (Xen.org); Tim (Xen.org)
> Subject: Re: [PATCH v4 2/3] public/io/netif.h: document control ring and
> toeplitz hashing
> 
> On Fri, 2016-01-08 at 17:07 +0000, Paul Durrant wrote:
> > > > > > +/*
> > > > > > + * Control responses (netif_ctrl_response_t)
> > > > > > + * =========================================
> > > > > > + *
> > > > > > + * All responses have the following format:
> > > > > > + *
> > > > > > + *    0     1     2     3     4     5     6     7  octet
> > > > > > + * +-----+-----+-----+-----+-----+-----+-----+-----+
> > > > > > + * |    id     |   pad     |         status        |
> > > > > > + * +-----+-----+-----+-----+-----+-----+-----+-----+
> > > > > > + * |         data          |
> > > > > > + * +-----+-----+-----+-----+
> > > > > > + *
> > > > > > + * id: the corresponding request identifier
> > > > > > + * pad: set to 0
> > > > > > + * status: the status of request processing
> > > > > > + * data: any data associated with the response (determined by type and
> > > > > > + *       status)
> > > > >
> > > > > type needs to be remembered via the id? Why not echo it in the pad
> > > > > field for convenience?
> > > > >
> > > >
> > > > I originally had that but thought it would be superfluous given that
> > > > the id is echoed. I can add it back if you think that it would be a
> > > > good idea. All my implementation did with it though was ASSERT that
> > > > it matched the req.
> > >
> > > Was it a coincidence that you had the req in hand given the id, or is
> > > it a fundamental property of any implementation of this protocol that
> > > you would have it handy?
> >
> > It's pretty fundamental. You need to get back at the grefs used for the
> > key and the mapping so any implementation is pretty certain to have the
> > req in hand.
> 
> Not all commands use a gref though, and future ones may not.
> 
> Given we are otherwise just wasting those 16 bits we may as well stick the
> type into them.
> 

Ok. Will revert to doing that.
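
For the avoidance of doubt, the response would then look something like
this (a sketch only; field names are illustrative, not final):

  struct netif_ctrl_response {
      uint16_t id;     /* copied from the corresponding request */
      uint16_t type;   /* echoed from the request, replacing 'pad' */
      uint32_t status; /* NETIF_CTRL_STATUS_* */
      uint32_t data;   /* determined by type and status */
  };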

> >
> > > > > > + *
> > > > > > + * type = NETIF_CTRL_MSG_SET_TOEPLITZ_FLAGS:
> > > > > > + *
> > > > > > + * If the data[0] field in the request is invalid (i.e. contains
> > > > > > + * unsupported hash types) then the status field is set to
> > > > > > + * NETIF_CTRL_STATUS_INVALID_PARAMETER. Otherwise the requset should
> > > > >
> > > > > "request"
> > > > >
> > > > > > + * succeed and hence the status field is set to
> > > > > > + * NETIF_CTRL_STATUS_SUCCESS. The data field should be set to 0.
> > > > > > + *
> > > > > > + * type = NETIF_CTRL_MSG_SET_TOEPLITZ_KEY:
> > > > > > + *
> > > > > > + * If the data[0] field in the request is an invalid key length
> > > > > > + * (too big)
> > > > >
> > > > > Can it not be too small as well? What is the behaviour if it is?
> > > > >
> > > >
> > > > I guess the way the algorithm is defined, a key of less than 32 bits
> > > > doesn't make much sense.
> > >
> > > What if I throw caution to the wind and set one byte anyway?
> > >
> >
> > At the moment, it goes into the 'left most' (in Microsoft's terminology)
> > byte and any remaining bytes are zero filled. From the spec it's not
> > clear if zero-filling is the right thing to do but since Windows always
> > sets a 40 byte key it's never been an issue.
> 
> I think we should either clearly define the behaviour, or set a minimum
> size.

Ok. I may as well set the minimum size at 40 then.
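
So the backend's length check would be along these lines (a sketch only;
TOEPLITZ_KEY_SIZE is an illustrative name for the 40-octet limit, and the
'too big' case uses the status from the text quoted below):

  if (len > TOEPLITZ_KEY_SIZE)       /* too big: > 40 octets */
      status = NETIF_CTRL_STATUS_BUFFER_OVERFLOW;
  else if (len < TOEPLITZ_KEY_SIZE)  /* enforcing the new minimum */
      status = NETIF_CTRL_STATUS_INVALID_PARAMETER;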

> 
> Does this zero extending apply to any multiple of something other than 32
> bits? Like if I give 9 octets does it effectively get padded to 12 with
> zeroes?
> 

No, the key is essentially treated as a shift register so it's just implemented 
as a byte array.
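
To illustrate, the computation is roughly this (a sketch only, assuming a
key of at least 4 octets; key bits beyond the supplied length behave as
zeroes, per the zero-filling described above):

  static uint32_t toeplitz_hash(const uint8_t *key, unsigned int keylen,
                                const uint8_t *buf, unsigned int buflen)
  {
      uint32_t window, hash = 0;
      unsigned int i, j;

      /* The initial 32-bit window is the first 4 octets of the key */
      window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
               ((uint32_t)key[2] << 8) | key[3];

      for (i = 0; i < buflen; i++) {
          uint8_t byte = buf[i];

          for (j = 0; j < 8; j++) {
              if (byte & 0x80)
                  hash ^= window;

              /* Shift the window left one bit, pulling in the next key
               * bit (or zero once we run off the end of the key) */
              window <<= 1;
              if (4 + i < keylen && (key[4 + i] & (0x80 >> j)))
                  window |= 1;

              byte <<= 1;
          }
      }

      return hash;
  }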

> >
> > > >
> > > > > > + * then the status field is set to NETIF_CTRL_STATUS_BUFFER_OVERFLOW.
> > > > > > + * If the data[1] field is an invalid grant reference then the status
> > > > > > + * field is set to NETIF_CTRL_STATUS_INVALID_PARAMETER. Otherwise the
> > > > > > + * request should succeed and hence the status field is set to
> > > > > > + * NETIF_CTRL_STATUS_SUCCESS. The data field should be set to 0.
> > > > > > + *
> > > > > > + * type = NETIF_CTRL_MSG_SET_TOEPLITZ_MAPPING:
> > > > > > + *
> > > > > > + * If the data[0] field in the request is an invalid mapping order
> > > > > > + * (too big)
> > > > >
> > > > > or too small?
> > > >
> > > > An order 0 mapping would still contain a single entry and you can't
> > > > get any smaller than that :-)
> > >
> > > I missed the order, assumed it was size, but still, what if there are
> > > 16 queues and I pass order 0, the behaviour of queues 2..16 needs to
> > > be well defined.
> >
> > The hash is simply used as an index modulo mapping size (I'll make that
> > clear somewhere)
> 
> thanks.
> 
> >  so a mapping size less than the number of queues you have is ok but
> > clearly suboptimal.
> >
> > >
> > > Since this is a single page, what happens with >=512 queues? (and how
> > > far away is that possibility).
> > >
> >
> > Good question. An 8 byte queue number
> 
> 8 bytes is the hash match isn't it and Q number is the offset into the
> page, isn't it? Or have I misunderstood how something fits together?

No. Other way round :-) The hash is treated as an index into the table, modulo 
table size and the values in the table are the actual queue numbers.
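
I.e. the steering is just (illustrative):

  /* 'hash' is the Toeplitz value, 'mapping' holds queue numbers */
  queue = mapping[hash % mapping_size];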

> 
> > is clearly overkill if we can only steer to 512 of them. I could define a
> > limit on the number of queues at 256 and use byte array instead.
> > Realistically, it's going to be a while before we get to 256 cpus in a
> > Windows VM.
> 
> Them's brave words ;-)
> 

Getting to 32 has been hard enough. Breaking through the 64 cpu limit is going 
to be 'interesting' too, I suspect. (A lot of core Windows stuff uses native 
word size bit masks for cpu affinity so you have to start using 'processor 
groups' and it all gets a bit funky).

> We should maybe consider that a f/e OS other than Windows might, for some
> reason, implement Toeplitz and have a higher probability of supporting
> large #s of vcpus sooner?
> 
> Split data[0] into (total) size and offset (to set blocks of values)?

Ok. If I go for a 16-bit queue number, that's probably enough, and means that a 
4k page can hold 2k mappings. Then I'll do as you suggest and use a size, 
offset pair in place of the order field.
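Following your suggestion of splitting data[0], the request might then
carry something like this (purely illustrative at this stage, not final):

  data[0] bits 15-0:  total size of the mapping, in (16-bit) entries
  data[0] bits 31-16: offset of the first entry set by this request
  data[1]:            grant reference of the page holding the entries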

  Paul

> 
> >   Paul
> >
> > > Ian.
> >