[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Remove Cstruct: one step forward?



Dear Hannes, dear Anil,

I'm sorry for the delay of my reply.
I agree with keeping Xen&Qubes targets in mind and the need for io-page library 
around mirage-*-xen, but if io-page can be restricted to a mirage-*xen 
dependency, it can be replaced in a later move when non moving bytes with Ocaml 
5 can be assured.

The first impact, as has been pointed out, is that now it will force some 
copies, and I don't have any performance measurements yet: lots of dlmalloc 
allocations for small buffers vs. more bytes copies. I've tried to purge 
Cstruct of a small unikernel, but I've run out of free time :'(

As for a new intermediate layer that uses bytes instead of bigarrays and allows 
for cheap split/sub/shift, I'm sure this will make code updating and 
maintenance easier, but will also lead to a new dependency that will have to be 
understood by users, as was the case for Cstruct with mirage-crypto. So I'm in 
the 'bytes+offsets' team in the lower functions API, and trust the code review 
for the lowest part of the stack.

Best,
Pierre



Oct 2, 2024, 10:56 by hannes@xxxxxxxxxxx:

> Dear Anil, dear Pierre,
>
> I appreciate to move forward to use bytes/string where possible - mainly 
> guided by:
> - nicer tooling support (statmemprof) [lots of tooling doesn't deal with 
> externally allocated memory well (doesn't take it into account, GC doesn't 
> nicely work with us when it comes to high memory pressure, have a look at the 
> hacks for qubes-mirage-firewall, now 
> https://github.com/mirage/mirage-qubes/blob/v1.0.0/lib/misc.ml#L2)
> - finally we've time to think about "immutability" of data, and rethink the 
> allocation disciplines (which leads to increased performance) -- as Sudha 
> said in her FunOCaml (OCaml 5 GC) talk, the best thing to do is to allocate 
> less
> - Cstruct was a pretty nice abstraction, but has two layers of bounds checks, 
> is an external library that is not too easy to comprehend (I've heard several 
> people complaining about the mirage-crypto interface before)
> - OCaml standard library moved to have get_uint8, set_uint16_be etc. 
> accessors since 4.08, so we can just use them \o/
> - Bigarray allocation is expensive (calls to malloc, and esp. in MirageOS we 
> then use dlmalloc which is pretty slow)
>
> Now, there are certainly difficulties when porting code:
> - Cstruct.shift and Cstruct.sub were cheap, now a String.sub is expensive 
> (allocates the whole data thing)
> - In e.g. ethernet, we have `val write: t -> ?src:Macaddr.t -> Macaddr.t -> 
> Packet.proto -> ?size:int -> (Cstruct.t -> int) -> (unit, error) result Lwt.t`
>  And here, the "fill" function (Cstruct.t -> int) is around to allow the 
> lowest layer to allocate the buffer, and then pass the buffer to-be-send onto 
> the upper layer -- but ethernet shifts the start by 14 bytes to have space 
> for the ethernet header.
> --> Replacing this with "bytes -> int" won't pan out, since we'd need some 
> sort of "bytes -> off:int -> int" and ensure (by code review?) that "off" is 
> always used. A similar argument applies for the receiving side.
> ===> I'm still undecided whether we should give up on that defensive coding 
> practise, or have a fresh library that encapsulates the offset and the 
> underlying data (then again, shift would be cheap).
>
> What Anil says is for sure true, we have Xen supported (with the amazing 
> qubes-mirage-firewall, I guess still a very widely used unikernel), which 
> requires non-moving page-aligned data. All other backends don't need 
> page-alignment, neither non-moving since they copy the data anyways. And of 
> course we want to provide an IP stack that allows zero-copying on the receive 
> and send path.
>
> Now, a quick idea:
> - what is the difference between bytes and bigarray, in terms of memory 
> representation (of the underlying data)? I welcome any insight you may have.
> - if it is equal, can't we come up with a library that unifies the access to 
> the data? would something as simple (and scary) as "let data : bytes = 
> Obj.magic cstruct.underlying_bigarray.underlying_memory" work?
> - can we have a compile-time choice selecting the right type for the 
> underlying data (and thus the right functions are selected)?
>
> It is also related to bounds checks (and how to avoid them using types or 
> other optimizations), take a look into 
> https://discuss.ocaml.org/t/bounds-checks-for-string-and-bytes-when-retrieving-or-setting-subparts-thereof/
>  if you're interested.
>
>
>
> Best,
>
> Hannes
>
> On 01/10/2024 17:26, Anil Madhavapeddy wrote:
>
>> Dear Pierre,
>>
>> While this is a good step forward, it is worth considering how to deal with 
>> the requirements of some backends for non-moving IO pages to pass through to 
>> the kernel/unikernel. This interface will force a copy, but perhaps that's 
>> ok. It should be possible to make OCaml 5 bytes non-moving with some 
>> consideration for the garbage collector, but I haven't had a chance to think 
>> through the details.
>>
>> best,
>> Anil
>>
>> --
>> Anil Madhavapeddy, Professor of Planetary Computing
>> Computer Laboratory, University of Cambridge (cst.cam.ac.uk)
>>
>>> On Oct 1, 2024, at 4:02 PM, pierre.alain@xxxxxxx wrote:
>>>
>>> Dear all,
>>>
>>> Over the past few months, a large amount of work has been carried out to 
>>> reduce the use of Cstruct.t. This was mainly driven by performances, as 
>>> there were a lot of expensive allocations for small buffers (from 0 to a 
>>> couple of bytes). Indeed, there were big performance improvements with 
>>> mirage-crypto (which weren't limited to extracting Cstruct, but also 
>>> algorithm improvements and a deeper thinking about allocations, like 
>>> allocating a buffer once, and writing to it rather than allocating multiple 
>>> buffers and concatenating them).
>>>
>>> Now I'm wondering if this could be a time to extend this work of Cstruct 
>>> removal with another API breakage. You can see at 
>>> https://github.com/mirage/mirage-net/pull/25 a proposal for the new API. 
>>> It's still conservative in term of "where each buffers would be allocated". 
>>> Of course, if it is merged, there will be the need for lot of PR to make 
>>> dependent libraries compliant with the new interface.
>>>
>>> That's why I'm happy to ask your opinion on this possible step forward :)
>>> Best,
>>> Pierre
>>>
>>> --
>>> P.
>>>
>>
>>
>
>




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.