
Re: [MirageOS-devel] towards a common Mirage 'FLOW' signature



On 21 Jun 2014, at 16:02, Anil Madhavapeddy <anil@xxxxxxxxxx> wrote:

> On 21 Jun 2014, at 11:36, David Scott <scott.dj@xxxxxxxxx> wrote:
>> 
>> I wonder whether we should extend flow to have all 4 of {read,write}{,_ack}  
>> so that we can control when data is acked/consumed. The TCP implementation 
>> would then only call the _ack function when it had received confirmation the 
>> data had been processed. If the TCP implementation needed to resend 
>> (possibly after a crash) it could call 'read' again and get back the same 
>> data it had before. So the result of 'read' would be an int64 stream offset 
>> * Cstruct.t, and 'read_ack' would mark an int64 offset as being consumed. 
>> This is what I'm doing in xenstore and shared-memory-ring: I don't know if 
>> anyone else wants this kind of behaviour. In the case where a VM sends a 
>> block write, which is then sent over NFS/TCP it would allow us to call 
>> write_ack on the flow to the guest when the TCP acks are received.
> 
> Yes, this definitely makes sense.  It's very nice to have a clean async 
> notification API for writes, as this could (for example) also eventually bind 
> to storage APIs like libaio for the Unix backend. 

agree.

also, in passing, wouldn't this make it a (the first?) end-to-end TCP 
implementation, in that acks are only sent after the data has been delivered 
to the *application* rather than merely received by the receiving stack... :)
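
for concreteness, i imagine the extended signature would look roughly like 
the sketch below. to be clear, this is not the existing V1.FLOW signature; 
the names (read_ack, write_ack) and the exact types are just my reading of 
David's proposal:

  (* sketch only: not the real V1.FLOW; names and types are guesses *)
  module type FLOW_WITH_ACK = sig
    type flow
    type error

    (* read returns data together with its stream offset; the same data
       may be returned again (e.g. after a crash) until it is acked *)
    val read : flow ->
      [ `Ok of int64 * Cstruct.t | `Eof | `Error of error ] Lwt.t

    (* mark everything up to this offset as consumed, so the upstream
       buffers can be freed / the TCP ack can go out *)
    val read_ack : flow -> int64 -> unit Lwt.t

    (* write returns the stream offset the data was queued at ... *)
    val write : flow -> Cstruct.t -> int64 Lwt.t

    (* ... and write_ack resolves once data up to that offset has been
       acknowledged downstream, e.g. once the TCP acks arrive *)
    val write_ack : flow -> int64 -> unit Lwt.t
  end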

> However, how would NFS/TCP work with FLOW for this?  That would require a 
> scatter/gather API (much like the RING requests/responses), whereas FLOW 
> works with streams.  Oh, unless you are referring to a FLOW of a single file 
> over NFS/TCP, rather than the underlying XDR RPCs from NFS.

not sure I understand. isn't this something that the NFS/TCP implementation 
handles when it passes things down?

> (Perhaps we should shift to calling it Flow in e-mail instead of FLOW to 
> AVOID SHOUTING)

PLUS ONE.

>> Separately, in the case of vchan the buffer size is set at ring setup time. 
>> If you connected a vchan ring to a TCP transmitter then the TCP transmitter, 
>> presumably with its higher-latency link, would try to keep its link full by 
>> buffering more. If the vchan ring size is smaller than the TCP window size 
>> (likely), TCP would have to copy into temporary buffers. If we knew we were 
>> going to need more buffered data then we could make the vchan ring larger 
>> and avoid the copying? Perhaps that wouldn't work due to alignment 
>> requirements. Anyway, this is more of a 'flow setup' issue than a 
>> during-the-flow issue. Perhaps a CHANNEL would be where we could close and 
>> re-open flows in order to adjust their parameters.
> 
> This is definitely a path-MTU style problem, but the real question is why you 
> would be running TCP directly over vchan, which is a lossless medium.  The 
> only use would be for congestion control, and we could consider just cherry 
> picking some of that logic straight out of the TCP stack rather than 
> suffering the overhead of marshalling into the TCP wire format just to go 
> over shared memory.

agree; but the comment about CHANNEL makes sense to me: a Channel would be 
an abstraction over a set of Flows, is that right?
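
i.e. something like the following sketch (to be clear, CHANNEL here is not 
the existing Channel module; the params record and 'renegotiate' are 
invented purely for illustration):

  (* sketch only: an invented interface, not the existing Channel *)
  module type CHANNEL = sig
    type t
    type flow

    (* flow setup parameters, e.g. the vchan ring size *)
    type params = { buffer_size : int }

    val create : params -> flow -> t

    (* close the underlying flow and re-open it with different
       parameters, e.g. a larger ring, returning a channel over
       the new flow *)
    val renegotiate : t -> params -> t Lwt.t

    val flow : t -> flow
  end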

-- 
Cheers,

R.





