[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: Remove Cstruct: one step forward?
Dear Anil, dear Pierre,I appreciate to move forward to use bytes/string where possible - mainly guided by: - nicer tooling support (statmemprof) [lots of tooling doesn't deal with externally allocated memory well (doesn't take it into account, GC doesn't nicely work with us when it comes to high memory pressure, have a look at the hacks for qubes-mirage-firewall, now https://github.com/mirage/mirage-qubes/blob/v1.0.0/lib/misc.ml#L2) - finally we've time to think about "immutability" of data, and rethink the allocation disciplines (which leads to increased performance) -- as Sudha said in her FunOCaml (OCaml 5 GC) talk, the best thing to do is to allocate less - Cstruct was a pretty nice abstraction, but has two layers of bounds checks, is an external library that is not too easy to comprehend (I've heard several people complaining about the mirage-crypto interface before) - OCaml standard library moved to have get_uint8, set_uint16_be etc. accessors since 4.08, so we can just use them \o/ - Bigarray allocation is expensive (calls to malloc, and esp. in MirageOS we then use dlmalloc which is pretty slow) Now, there are certainly difficulties when porting code:- Cstruct.shift and Cstruct.sub were cheap, now a String.sub is expensive (allocates the whole data thing) - In e.g. ethernet, we have `val write: t -> ?src:Macaddr.t -> Macaddr.t -> Packet.proto -> ?size:int -> (Cstruct.t -> int) -> (unit, error) result Lwt.t` And here, the "fill" function (Cstruct.t -> int) is around to allow the lowest layer to allocate the buffer, and then pass the buffer to-be-send onto the upper layer -- but ethernet shifts the start by 14 bytes to have space for the ethernet header. --> Replacing this with "bytes -> int" won't pan out, since we'd need some sort of "bytes -> off:int -> int" and ensure (by code review?) that "off" is always used. A similar argument applies for the receiving side. ===> I'm still undecided whether we should give up on that defensive coding practise, or have a fresh library that encapsulates the offset and the underlying data (then again, shift would be cheap). What Anil says is for sure true, we have Xen supported (with the amazing qubes-mirage-firewall, I guess still a very widely used unikernel), which requires non-moving page-aligned data. All other backends don't need page-alignment, neither non-moving since they copy the data anyways. And of course we want to provide an IP stack that allows zero-copying on the receive and send path. Now, a quick idea:- what is the difference between bytes and bigarray, in terms of memory representation (of the underlying data)? I welcome any insight you may have. - if it is equal, can't we come up with a library that unifies the access to the data? would something as simple (and scary) as "let data : bytes = Obj.magic cstruct.underlying_bigarray.underlying_memory" work? - can we have a compile-time choice selecting the right type for the underlying data (and thus the right functions are selected)? It is also related to bounds checks (and how to avoid them using types or other optimizations), take a look into https://discuss.ocaml.org/t/bounds-checks-for-string-and-bytes-when-retrieving-or-setting-subparts-thereof/ if you're interested. Best, Hannes On 01/10/2024 17:26, Anil Madhavapeddy wrote: Dear Pierre, While this is a good step forward, it is worth considering how to deal with the requirements of some backends for non-moving IO pages to pass through to the kernel/unikernel. This interface will force a copy, but perhaps that's ok. It should be possible to make OCaml 5 bytes non-moving with some consideration for the garbage collector, but I haven't had a chance to think through the details. best, Anil -- Anil Madhavapeddy, Professor of Planetary Computing Computer Laboratory, University of Cambridge (cst.cam.ac.uk)On Oct 1, 2024, at 4:02 PM, pierre.alain@xxxxxxx wrote: Dear all, Over the past few months, a large amount of work has been carried out to reduce the use of Cstruct.t. This was mainly driven by performances, as there were a lot of expensive allocations for small buffers (from 0 to a couple of bytes). Indeed, there were big performance improvements with mirage-crypto (which weren't limited to extracting Cstruct, but also algorithm improvements and a deeper thinking about allocations, like allocating a buffer once, and writing to it rather than allocating multiple buffers and concatenating them). Now I'm wondering if this could be a time to extend this work of Cstruct removal with another API breakage. You can see at https://github.com/mirage/mirage-net/pull/25 a proposal for the new API. It's still conservative in term of "where each buffers would be allocated". Of course, if it is merged, there will be the need for lot of PR to make dependent libraries compliant with the new interface. That's why I'm happy to ask your opinion on this possible step forward :) Best, Pierre -- P.
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |