
Re: [Xen-devel] [RFC] netif: staging grants for requests



On 01/09/2017 08:56 AM, Paul Durrant wrote:
>> -----Original Message-----
>> From: Joao Martins [mailto:joao.m.martins@xxxxxxxxxx]
>> Sent: 06 January 2017 20:09
>> To: Paul Durrant <Paul.Durrant@xxxxxxxxxx>
>> Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx; Andrew Cooper
>> <Andrew.Cooper3@xxxxxxxxxx>; Wei Liu <wei.liu2@xxxxxxxxxx>; Stefano
>> Stabellini <sstabellini@xxxxxxxxxx>
>> Subject: Re: [RFC] netif: staging grants for requests
>>
>> On 01/06/2017 09:33 AM, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Joao Martins [mailto:joao.m.martins@xxxxxxxxxx]
>>>> Sent: 14 December 2016 18:11
>>>> To: xen-devel@xxxxxxxxxxxxxxxxxxxx
>>>> Cc: David Vrabel <david.vrabel@xxxxxxxxxx>; Andrew Cooper
>>>> <Andrew.Cooper3@xxxxxxxxxx>; Wei Liu <wei.liu2@xxxxxxxxxx>; Paul
>> Durrant
>>>> <Paul.Durrant@xxxxxxxxxx>; Stefano Stabellini <sstabellini@xxxxxxxxxx>
>>>> Subject: [RFC] netif: staging grants for requests
>>>>
>>>> Hey,
>>>>
>>>> Back in the Xen hackathon '16 networking session a couple of ideas
>>>> were brought up. One of them was about exploring permanently mapped
>>>> grants between xen-netback/xen-netfront.
>>>>
>>>> I started experimenting and came up with a rough design document (in
>>>> pandoc) on what is being proposed. This is meant as a seed for
>>>> discussion and also a request for input on whether this is a good
>>>> direction. Of course, I am willing to try alternatives we come up with
>>>> beyond the contents of the spec, or any other suggested changes ;)
>>>>
>>>> Any comments or feedback is welcome!
>>>>
>>>
>>> Hi,
>> Hey!
>>
>>>
>>> Sorry for the delay... I've been OOTO for three weeks.
>> Thanks for the comments!
>>
>>> I like the general approach of pre-granting buffers for RX so that the
>>> backend can simply memcpy and tell the frontend which buffer a packet
>>> appears in
>> Cool,
>>
>>> but IIUC you are proposing use of a single pre-granted area for TX also,
>>> which would presumably require the frontend to always copy on the TX
>>> side? I wonder if we might go for a slightly different scheme...
>> I see.
>>
>>>
>>> The assumption is that the working set of TX buffers in the guest OS is
>>> fairly small (which is probably true for a small number of heavily used
>>> sockets and an OS that uses a slab allocator)...
>> Hmm, [speaking about Linux] maybe for the skb allocation cache. For the
>> remaining packet pages maybe not, for say a scatter-gather list...? But I
>> guess it would need to be validated whether this working set is indeed
>> kept small, as this seems like a very strong assumption given the variety
>> of possible workloads. Plus, wouldn't we leak info from these pages if
>> they were used not by the device but elsewhere in the guest stack?
> 
> Yes, potentially there is an information leak but I am assuming that the
> backend is also trusted by the frontend, which is pretty well baked into
> the protocol anyway.
I assumed the same - just thought it was worth clarifying.

> Also, if the working set (which is going to be OS/stack dependent) turned
> out to be a bit too large then the frontend can always fall back to a copy
> into a locally allocated buffer, as in your proposal, anyway.
Yeap.

>>> The guest TX code maintains a hash table of buffer addresses to grant
>>> refs. When a packet is sent the code looks to see if it has already
>>> granted the buffer and re-uses the existing ref if so, otherwise it
>>> grants the buffer and adds the new ref into the table.
>>>
>>> The backend also maintains a hash of grant refs to addresses and,
>>> whenever it sees a new ref, it grant maps it and adds the address into
>>> the table. Otherwise it does a hash lookup and thus has a buffer address
>>> it can immediately memcpy from.
>>>
>>> If the frontend wants the backend to release a grant ref (e.g. because
>>> it's starting to run out of grant table) then a control message can be
>>> used to ask for it back, at which point the backend removes the ref from
>>> its cache and unmaps it.
>> Wouldn't this be somewhat similar to the persistent grants in xen block
>> drivers?
> 
> Yes, it would, and I'd rather that protocol was also re-worked in this 
> fashion.
I guess then I could reuse part of my old series (persistent grants) in this
reworked fashion you suggest. I didn't go that route as I had the (apparently
wrong) impression that a persistent-grants-based approach was undesirable (as
I took it from past sessions).
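For the sake of discussion, the frontend-side cache Paul describes could be
sketched roughly as below. This is illustrative only, not actual netfront
code: the table layout, names, and the stand-in for grant allocation are all
made up, and a real version would call gnttab to grant the page on a miss.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_BUCKETS 64
typedef uint32_t grant_ref_t;

/* One cached (buffer address -> grant ref) mapping. */
struct gref_entry {
    unsigned long page_addr;   /* guest TX buffer (page-aligned) */
    grant_ref_t gref;          /* grant ref issued for that page */
    struct gref_entry *next;   /* hash-bucket chaining           */
};

static struct gref_entry *cache[CACHE_BUCKETS];
static grant_ref_t next_gref = 1;  /* stand-in for real gnttab allocation */

static unsigned int bucket(unsigned long addr)
{
    return (addr >> 12) % CACHE_BUCKETS;  /* hash on page frame number */
}

/*
 * On TX: look the buffer up; re-use the existing grant ref if the page
 * was granted before, otherwise grant it (here: fake it) and cache it.
 */
grant_ref_t tx_get_gref(unsigned long page_addr)
{
    unsigned int b = bucket(page_addr);
    struct gref_entry *e;

    for (e = cache[b]; e; e = e->next)
        if (e->page_addr == page_addr)
            return e->gref;            /* hit: re-use existing grant */

    e = malloc(sizeof(*e));
    e->page_addr = page_addr;
    e->gref = next_gref++;             /* really: grant the page here */
    e->next = cache[b];
    cache[b] = e;
    return e->gref;
}
```

The point of the scheme is visible in the lookup: as long as the guest's
allocator keeps recycling the same pages, the grant per page is issued once
and every later packet from that page is zero-cost on the grant side.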

>>> Using this scheme we allow a guest OS to still use either a zero-copy
>>> approach if it wishes to do so, or a static pre-grant... or something in
>>> between (e.g. pre-grant for headers, zero copy for bulk data).
>>>
>>> Does that sound reasonable?
>> Not sure yet, but it looks nice if we can indeed achieve the zero-copy
>> part. But I have two concerns. First, a backend could be forced to keep
>> removing refs because its cache is always full, with the frontend unable
>> to reuse these pages (subject to its own allocator behavior, in case the
>> assumption above isn't satisfied), nullifying the backend's effort in
>> maintaining its mapped-grefs table. The other concern is whether those
>> pages (assumed to be reused) might leak guest data to the backend (when
>> not used by netfront).
> 
> As I said, the protocol already requires the backend to be trusted by the
> frontend (since grants cannot be revoked, if for no other reason) so
> information leakage is not a particular concern. What I want to avoid is a
> protocol that denies any possibility of zero-copy, even in the best case,
> which is the way things currently are with persistent grants in blkif.
I'll be more reassured once I've checked/verified that this frontend
assumption holds for a significant percentage of the sent pages; but still,
you've got a valid point that the protocol shouldn't force the guest OS to
always do a copy. Hence I like having a control message to co-manage this
grefs table.
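The backend side of that control message could look something like the
sketch below. To be clear, everything here is hypothetical: the message type,
the handler name, and the flat cache are invented for illustration; a real
version would arrive over the netif control ring and call gnttab to unmap.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

typedef uint32_t grant_ref_t;

#define BACKEND_CACHE_SIZE 256

/* One mapped grant held by the backend. */
struct mapped_gref {
    grant_ref_t gref;
    void *vaddr;      /* where the grant is mapped in the backend */
    int in_use;
};

static struct mapped_gref backend_cache[BACKEND_CACHE_SIZE];

/* Stand-in for the real gnttab unmap operation. */
static void unmap_grant(struct mapped_gref *e)
{
    e->vaddr = NULL;
    e->in_use = 0;
}

/*
 * Handler for a (hypothetical) "release gref" control request from the
 * frontend: drop the ref from the cache and unmap it. Returns 0 on
 * success, -1 if the ref was not mapped (the two sides' views diverged).
 */
int ctrl_release_gref(grant_ref_t gref)
{
    for (int i = 0; i < BACKEND_CACHE_SIZE; i++) {
        if (backend_cache[i].in_use && backend_cache[i].gref == gref) {
            unmap_grant(&backend_cache[i]);
            return 0;
        }
    }
    return -1;
}

/* Test helper only: pretend a gref was previously grant-mapped. */
void fake_map(grant_ref_t gref)
{
    for (int i = 0; i < BACKEND_CACHE_SIZE; i++) {
        if (!backend_cache[i].in_use) {
            backend_cache[i].gref = gref;
            backend_cache[i].vaddr = (void *)0x1000;
            backend_cache[i].in_use = 1;
            return;
        }
    }
}
```

Having an explicit error path for an unknown gref seems important here,
since it is exactly the case where frontend and backend caches have fallen
out of sync that the control message is meant to manage.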

Joao

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

