[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Remove Cstruct: one step forward?


  • To: mirageos-devel@xxxxxxxxxxxxxxxxxxxx
  • From: Hannes Mehnert <hannes@xxxxxxxxxxx>
  • Date: Wed, 2 Oct 2024 10:55:05 +0200
  • Autocrypt: addr=hannes@xxxxxxxxxxx; keydata= xsFLBEIw1AoBEADAtXwEV8F1DBpE9lnBTbHDNeZwDVp84MhxxIT5GUexGgbOWGSEWHhC3rYe FfGRUxF4M9P4fwxpxCS5YCvxoijWHeEf8nG5IkztVv5cw63E443XWHcCMc80YAwglZ2cSP4U GTNeKb9rqVPckk/PL348BYRawhzvZK+Bc+bUvbtPCfUXT1BWIxAR1dzsfpAQVNZ4bA06xOoP QJYVNgl/lWOmQgnSgb0dE2zsgddKTOj05ru7Q7LobB7WAUTRJVkZcXnrvI1SOt/WbPTyqF8l RBh94xCqFhv4SlqZVOTXxo9gw3LpDv/cYXRl/m7+/7Wljl3ziQ9cawA6O1mbw8nm7Sfa+TZl qo+5lXEenXG+MCbH0XnnL2I4BO6HSGDtKX6htTG2xs6w4r9mVxTGJuJcGrC0dxuz5j4jylt/ KOVn9IaRKzhj8ga7kWffMp+JYdrn43732weoFFJxm78mD2ij4UbJtNkQIIcTv8IBJajHy2P3 h1NuBIwwb7RmBav4oo0CKWoasIHFwjMSBpCzJ8QOHeO/F3TY3DZp7FTwViUgSXVJoewO9yFG ctX7MC27/F1IonU9/SJW0j+F3Vz32SfxUBrDnLYpO7/vwA8w+xmWLnl0iJN/8injz5+CigsP e7O66t4MtC9BVCuLu7a/ikH5nW0q6RyTW8of9eZIsuEyqF1ZPwAGKc0jSGFubmVzIE1laG5l cnQgPGhhbm5lc0BtZWhuZXJ0Lm9yZz7CwXQEEwECAB4FAkIw1A0CGwMGCwkIBwMCAxUCAwMW AgECHgECF4AACgkQvIlliN98KO5HYg//UD6gk4sFcNop/EQivcnpfPnHrrUddsBl9bovQSXb zIh5HY/8xhO5i87n5Aox9jYLcZwa6HJ3ElHMOa+n9AY4/+H8bd+BiHWTgEhEzcZqcYwyP2S2 0X/e/m/+1XYs5tldKNZb7ruYRv6rNyUAF1H8EtYNaJpmGtXYurkMhWhEgeP9YB7svmkUN+JO og91tNhN1Wd10/JfKIytNcpXmW6zij0f3MJw/kdwIsmfSUMPaiEli+eB7nU0uLZWf4C3MWTT NmwNznEya5K9McH1Wc/lO9+oB+zRXFBUM/v9YaiyPZo0JcwSRdVYKvKteyqnL/lnx7vtkOnA EC/bcmMvlWLI+Q4Vw2cr2FKcIpJVwswZ5snFqgDr4O5JB88aEAzPFzyWWeBlVqXc0DbDu8jD YmG3yp/xn5UJQSRy6eUcXICNjJyIwekUCznRmhtGwkGFCFEZH/s2fQ7nETxZcuiE4meRnVQE 9lOafI5D+dlsG3SlyN1x0YvrPismep7PwA6FX3cDyz2iUUj4xICLvRLU6kq892KuFmv75pop VAZjJMQqc8BG3oN2YkDcO4NEuOT9/r9muk/WH5Mqcs2BJEG6+yiQ13uMS5TxXiPFp3vKRlq0 MFnm7YRZr5aK6B/WGLOHnRRb2OdAzUgsj4Qiyqvh8Ab+x9wjLwGePxlA1akrF2hQItfOwUsE QjDUdAEQAOHG4vdGxU3eH5hYDLYRsQP6ofoU36pV8iFEtZRJ833L5p9GP2xFUGVDH8yTdkdf QR1prsCJXA7sE/gYBf3k9lGicJQmYNo3uW9Ngz787BhiQJyW/JXcutyTt9b/AZmfJaDo1p0C 8IEtoG7wt4+giFwAJ1brTJtyxlKOGcjWiKh1/dTh13muXSOPcCmhNs4Zm0YNjrhW9nIn1iik lpMRJCCxY1RNcU2VZXfTqq63UTaIrZ1lgYXWilnTdpXt5UEDYBw8Ee6tpPfQflC02e8hbDeD JEP9MTM9pmmPOwZQXP36hTryakKt1Kpw3hgC+Yx9q4wwaZ4XIiWUgopT5mlI+LhnzCgO05YN NcPrbsr6Js34gC3odNicD+C1jSdOXCqAPZZNiVx0PBjRv+LbBZhUkjQJxidvXmrp55pLm+Ua IVl3E/HpFY8kTaJBHP7jvLp+W4J9tP64Ijk5Y9F0z93JwMspG671xuomFsRxUtyO6vldd7qH 1yVzDX7Dd0fAzMDOPQJW6zLiixCmA0McaZdeBXapMJDDoZAPY4pCbRyJJXe0tfv9ufzJrM8Z JHylONdBiIKWw0JldXkUvIGafl1JDOHjP1XoDWrSDO8yFhBR3uWxJy9u1s7aKvonQb5IcYU1 nPu1Olg3doPugXyC0V05MIa68iKw+Kv8KtDDWyibndoTAAYpwsFfBBgBAgAJBQJCMNR1AhsM AAoJELyJZYjffCjuelUP/jlCsxLzu3fZpuORY2LsOQMd4nFHSZLUjauLxDUn8jE//32IIJ0v QV9ab4k7JCLOuYJTTd9aYD6rkITZIVhAcsR/FQZNgVOvGTj6tAmNyn385vMz0p4bLOOy5T0C KMLKzzS4Rt4XgtzvH2xDXSHfPsqS/t/5WFkO+aLgcPALldWGQPgRu5DNoCLr989gCGu5vmd4 XwMRBt/LmJGI0v0EypL3eRmlGaUw5k6N1hStu4EETzdikAzXP5KTuloEXq/caYeUs/SIb5zi XVC1ISW0CIwj5ATbMh8DMG4splXCsajtnJjsKJATBZIWV4XoNqtgV+pQn1ShmW36nUfVGqzX AQ+9i/M+CCkxBrb85Bk8I1CA1nBHNk5SQqER40VRp6vcmuxvIBGi6t8dDWsDQ2q3kd4RjjDZ kYjSie7176bb9t5MfUGjA9WckHuyi+vjy3+sC/nRzByhXf+8iZsO2no3xWZkGUWI8F2hhpzW VsXqvC27LZvJk53fJbpuSueN8a7JKfbKPDqoDSsRaEtcM7ig475tqA/ZCzv6mdqhEV5buoLu cpW7UgYzjNQQXeYZygGWc7FTV3dqLmF1MY2+RlydQbUDjcj1CJ+UmKyxgoLyf7ru0sznr7Tp K4WDnVeJdWX1mqoSupF/u5LON1vpzh3OIl5NNAuV68Hb5On/ALC+DwFX
  • Delivery-date: Wed, 02 Oct 2024 08:55:22 +0000
  • List-id: Developer list for MirageOS <mirageos-devel.lists.xenproject.org>

Dear Anil, dear Pierre,

I appreciate to move forward to use bytes/string where possible - mainly guided by: - nicer tooling support (statmemprof) [lots of tooling doesn't deal with externally allocated memory well (doesn't take it into account, GC doesn't nicely work with us when it comes to high memory pressure, have a look at the hacks for qubes-mirage-firewall, now https://github.com/mirage/mirage-qubes/blob/v1.0.0/lib/misc.ml#L2) - finally we've time to think about "immutability" of data, and rethink the allocation disciplines (which leads to increased performance) -- as Sudha said in her FunOCaml (OCaml 5 GC) talk, the best thing to do is to allocate less - Cstruct was a pretty nice abstraction, but has two layers of bounds checks, is an external library that is not too easy to comprehend (I've heard several people complaining about the mirage-crypto interface before) - OCaml standard library moved to have get_uint8, set_uint16_be etc. accessors since 4.08, so we can just use them \o/ - Bigarray allocation is expensive (calls to malloc, and esp. in MirageOS we then use dlmalloc which is pretty slow)

Now, there are certainly difficulties when porting code:
- Cstruct.shift and Cstruct.sub were cheap, now a String.sub is expensive (allocates the whole data thing) - In e.g. ethernet, we have `val write: t -> ?src:Macaddr.t -> Macaddr.t -> Packet.proto -> ?size:int -> (Cstruct.t -> int) -> (unit, error) result Lwt.t` And here, the "fill" function (Cstruct.t -> int) is around to allow the lowest layer to allocate the buffer, and then pass the buffer to-be-send onto the upper layer -- but ethernet shifts the start by 14 bytes to have space for the ethernet header. --> Replacing this with "bytes -> int" won't pan out, since we'd need some sort of "bytes -> off:int -> int" and ensure (by code review?) that "off" is always used. A similar argument applies for the receiving side. ===> I'm still undecided whether we should give up on that defensive coding practise, or have a fresh library that encapsulates the offset and the underlying data (then again, shift would be cheap).

What Anil says is for sure true, we have Xen supported (with the amazing qubes-mirage-firewall, I guess still a very widely used unikernel), which requires non-moving page-aligned data. All other backends don't need page-alignment, neither non-moving since they copy the data anyways. And of course we want to provide an IP stack that allows zero-copying on the receive and send path.

Now, a quick idea:
- what is the difference between bytes and bigarray, in terms of memory representation (of the underlying data)? I welcome any insight you may have. - if it is equal, can't we come up with a library that unifies the access to the data? would something as simple (and scary) as "let data : bytes = Obj.magic cstruct.underlying_bigarray.underlying_memory" work? - can we have a compile-time choice selecting the right type for the underlying data (and thus the right functions are selected)?

It is also related to bounds checks (and how to avoid them using types or other optimizations), take a look into https://discuss.ocaml.org/t/bounds-checks-for-string-and-bytes-when-retrieving-or-setting-subparts-thereof/ if you're interested.



Best,

Hannes

On 01/10/2024 17:26, Anil Madhavapeddy wrote:
Dear Pierre,

While this is a good step forward, it is worth considering how to deal with the 
requirements of some backends for non-moving IO pages to pass through to the 
kernel/unikernel. This interface will force a copy, but perhaps that's ok. It 
should be possible to make OCaml 5 bytes non-moving with some consideration for 
the garbage collector, but I haven't had a chance to think through the details.

best,
Anil

--
Anil Madhavapeddy, Professor of Planetary Computing
Computer Laboratory, University of Cambridge (cst.cam.ac.uk)

On Oct 1, 2024, at 4:02 PM, pierre.alain@xxxxxxx wrote:

Dear all,

Over the past few months, a large amount of work has been carried out to reduce 
the use of Cstruct.t. This was mainly driven by performances, as there were a 
lot of expensive allocations for small buffers (from 0 to a couple of bytes). 
Indeed, there were big performance improvements with mirage-crypto (which 
weren't limited to extracting Cstruct, but also algorithm improvements and a 
deeper thinking about allocations, like allocating a buffer once, and writing 
to it rather than allocating multiple buffers and concatenating them).

Now I'm wondering if this could be a time to extend this work of Cstruct removal with 
another API breakage. You can see at https://github.com/mirage/mirage-net/pull/25 a 
proposal for the new API. It's still conservative in term of "where each buffers 
would be allocated". Of course, if it is merged, there will be the need for lot of 
PR to make dependent libraries compliant with the new interface.

That's why I'm happy to ask your opinion on this possible step forward :)
Best,
Pierre

--
P.







 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.