Re: MirageOS biweekly calls - Nov 25th 10:00 CET - 12:00 CET at https://meet.jit.si/MirageOS
The notes from the meeting yesterday are below. Thanks to everyone participating.

We discussed and found Monday mornings to be a better fit for the meeting, thus the next meeting is on Nov 25th 10:00 - 12:00 CET. Please add your agenda items to https://pad.data.coop/To6IOSeNSOK9kFVlgo7XWw?both#

Best,

Hannes

- Topic: (Hannes, Pierre) cstruct and performance (see e.g. https://github.com/mirage/mirage-net/pull/25)
- Participants: Pierre, Sam, Hannes, Anh

### Minutes

- Anh is a PhD student (advised by others & Pierre) who works on opportunistic firewalling
  - using MirageOS as a firewall
  - also wants to work on an IDS in MirageOS, similar to suricata
- Pierre is an associate professor at University of Rennes, MirageOS core team
  - mainly works on the qubes-mirage-firewall
  - joined MirageOS since he uses QubesOS on his main laptop
- Hannes is working full-time on MirageOS
  - works in Robur Coop with other people
  - pushes MirageOS into production
  - works on various applications for MirageOS (VPN, dnsmasq, ...)
  - has done some performance investigations, and would like to improve the performance of MirageOS to convince more people to use it
- Sam has worked at Tarides for ~2 years
  - since last year works on MirageOS
  - mainly to get MirageOS working on OCaml 5
  - also to get MirageOS working on unikraft (replacing solo5)
    - since solo5 lacks a bit of maintenance, also performance (unikraft has batch IO), maybe some day also multicore
- Cstruct & performance
  - Cstructs are important for some backends (Xen) where non-moving memory areas are shared among domains
  - Cstructs are heavy to allocate (they use dlmalloc, which is expensive), and this works against the GC (only the finalizer is used to free the memory)
  - Some work (e.g. in mirage-crypto) has shown performance improvements (2.5x - 3x) from the Cstruct -> {string, bytes} swap
  - Sam: for some operations we would need to copy
  - Sam: for a packet receive / send ring, we need non-moving memory as well
  - Pierre: in the qubes-mirage-firewall we do a lot of copies anyway, e.g. NAT
  - Pierre: it is "probably"(TM) not too painful to move from Cstruct.t to bytes
  - Pierre: a big issue is the finalizer, it is unclear when it is called, and the memory is fragmented a lot
  - Hannes: API-wise, the mirage-net receive function takes a callback (Cstruct.t -> unit Lwt.t), where mirage-net allocates a buffer and passes it to the callback
  - Hannes: and the send function gets a `size` hint, and allocates a buffer to be filled
  - Hannes: what about ownership? Should the mirage-net receive, once the callback has finished, reclaim ownership and reuse the piece of memory?
  - Pierre: maybe an opaque type is the path to go?
  - Pierre: should the send be a write-only buffer, the receive a read-only buffer? -- Hannes: there's Cstruct_cap that uses phantom types for it
  - Sam: maybe moving to an abstract API would help to benchmark the two options, and test them on real workloads
  - Hannes: next to types (API), the question is about ownership (and who is responsible to allocate / free the memory)
  - Hannes: asking the question the other way around, from the application: what should be done for a packet that is received at the firewall?
  - Hannes: from my point of view, the perfect firewall should not copy: once a packet is received (given a ring buffer of received packets), this packet should be copied (and eventually modified, if NAT needs to be done) to an element of the send ring buffer -- there shouldn't be an allocation of the entire packet in the code
  - Pierre: we should avoid any allocations, and also all copies
  - Pierre: started to use a bridge firewall which doesn't copy, and avoiding copies is good for performance (e.g. for solo5); for xen we can't really avoid the copy (on xen, you either need to copy or you would need to reconfigure which VM you share the memory with)
  - Hannes: with the ring buffer approach, we can't really avoid the copying -- it would mean a lot of bureaucracy, and the ownership and lifetime of a buffer in the ring buffer isn't well-specified anymore
  - Hannes: given xen and uring, the ring buffer would need to contain non-moving memory, for solo5-hvt/spt it shouldn't matter -- but is there a difference between using bytes or bigarray for such a ring buffer?
  - Hannes also tried to write a library that has an abstract type t and is backed by either bytes or bigarray memory (and the implementation can be selected at compile time - so no functorisation, but you get a B.get_uint8 etc. directly), but the issue is that exposing the raw memory from bigarray makes the OCaml runtime unhappy -> segmentation fault
  - Florian mentioned that each OCaml value needs to have a tag/header, so we'd need to allocate a bit of memory before the page-aligned stuff, and put the header in there - so that may be a path to investigate
  - Pierre: not sure how Cstruct.split works, esp. with the header -> it creates a new OCaml value with the same starting buffer address, and a different offset and length
  - Hannes: I can see multiple paths to investigate:
    - virtio firewall using bytes/string vs cstruct
    - virtio firewall using a ring buffer (i.e. not allocating for each packet) vs allocating for each packet
  - Hannes: also we (well, Romain) figured that allocations of < 255 bytes are very cheap if you allocate bytes/string (it is in the minor heap) - so the high-performance MirageOS unikernel would allocate data in chunks of < 255 bytes
  - Hannes: Cstruct is more than the memory region: we have an offset and length as well, thus replacing Cstruct.t by Bytes.t removes some safety, since we don't carry around the offset anymore (and thus the ethernet layer can hardly pass on the payload (Cstruct.shift buf 14))
  - Hannes: what is the path forward? do we have a concrete application that we want to use for performance investigations?
  - Pierre: the qubes-mirage-firewall would be a great study, since there are users (who sometimes complain about the performance)
  - Pierre: a huge performance benefit in the qubes/xen setting would be segmentation offload
  - Pierre: started to measure the virtio (simple-fw) with no cstruct (see https://github.com/palainp/simple-fw/tree/no-cstruct (doesn't yet compile, needs some further work on mirage-tcpip))
  - Pierre: likes the idea to not trust the upper layer, an abstract type would be great
  - Hannes: the abstract type could as well be cstruct.t, and have an implementation that uses bytes instead of bigarray.t for switching ;)
  - Pierre: we had this in 2022, but the qubes-mirage-firewall fails to compile with it
  - Hannes: let's use that cstruct-backed-by-bytes branch (https://github.com/hannesm/ocaml-cstruct/tree/no-bigarray) and test the simple-fw with virtio on it :) [and for now, ignore the qubes-mirage-firewall compilation issues]

#### OCaml 5 and ocaml-solo5 - how should we move?
- OCaml 5 has a different GC whose memory profile is different
- Virgile tested that PR on the mirage website, redirecting every other flow to the OCaml 5 unikernel
  - The behaviour was different under lots of stress (with aborted connections) -- it would be interesting to see whether under normal conditions there's a difference?
- With OCaml 5.3, there have been various GC fixes, and there are big users like Coq/Frama-C, so maybe it is time to look into it again
- With OCaml 5, we need to call Gc.compact manually
- Pierre: it compiles fine for the qubes-mirage-firewall, but there are no long-term runs yet (only ~10 hours) - with a slightly improved memory bandwidth
- Pierre: also tested on dns-resolver, which died due to memory fragmentation (so we should move to bytes)
- Pierre: with OCaml 5, it uses more memory at startup
- Pierre plans to test simple-fw with bytes x bigarray on OCaml 4 x OCaml 5
- Sam: we should move forward to test it in real conditions
- Sam: we should merge and release, maybe something like the ocaml compiler with release candidates - so it is available, but you have to ask for it explicitly
- Hannes: maybe not even needed, since ocaml-solo5 depends on the OCaml version of your switch, and so if you're using OCaml 5, you'll get the ocaml-solo5 compatible with OCaml 5, and if you're using OCaml 4, you'll get the ocaml-solo5 compatible with OCaml 4
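
To illustrate the point from the Cstruct discussion that a Cstruct is more than its memory region, here is a minimal sketch of a bytes-backed view that still carries an offset and a length. The module name `View` and all of its functions are hypothetical and use only the OCaml standard library; this is not the ocaml-cstruct API, and not necessarily how the no-bigarray branch does it.

```ocaml
(* Hypothetical sketch: a view over Bytes.t that keeps offset and length,
   so a lower layer can hand on "the payload" without copying. *)
module View : sig
  type t
  val of_bytes : Bytes.t -> t
  val length : t -> int
  val shift : t -> int -> t          (* drop the first n bytes of the view *)
  val split : t -> int -> t * t      (* two views over the same buffer *)
  val get_uint8 : t -> int -> int
  val set_uint8 : t -> int -> int -> unit
end = struct
  type t = { buf : Bytes.t; off : int; len : int }

  let of_bytes buf = { buf; off = 0; len = Bytes.length buf }
  let length t = t.len

  let shift t n =
    if n < 0 || n > t.len then invalid_arg "View.shift";
    { t with off = t.off + n; len = t.len - n }

  let split t n =
    if n < 0 || n > t.len then invalid_arg "View.split";
    ({ t with len = n }, { t with off = t.off + n; len = t.len - n })

  (* Indices are relative to the view, and checked against its length. *)
  let check t i =
    if i < 0 || i >= t.len then invalid_arg "View: index out of bounds"

  let get_uint8 t i = check t i; Bytes.get_uint8 t.buf (t.off + i)
  let set_uint8 t i v = check t i; Bytes.set_uint8 t.buf (t.off + i) v
end

(* A 64-byte frame; byte 14 is the first byte after a 14-byte ethernet header. *)
let frame = Bytes.make 64 '\000'
let () = Bytes.set_uint8 frame 14 0x45

(* The ethernet layer passes on the payload as a view, without copying. *)
let payload = View.shift (View.of_bytes frame) 14
let first_payload_byte = View.get_uint8 payload 0
```

Both `shift` and `split` only allocate a small record; the underlying buffer is shared, so a write through one view is visible through the others.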
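
Sam's suggestion of an abstract API for benchmarking the two options could look roughly like the following sketch. The signature `BUF`, the two backend modules and the functor are assumptions made for illustration (standard library only). It uses a functor for brevity, whereas the library Hannes described selects the implementation at compile time without functorisation.

```ocaml
(* Hypothetical sketch: one signature, two backing stores. *)
module type BUF = sig
  type t
  val create : int -> t
  val length : t -> int
  val get_uint8 : t -> int -> int
  val set_uint8 : t -> int -> int -> unit
end

(* Backed by Bytes.t: allocated on the OCaml heap, may be moved by the GC. *)
module Bytes_buf : BUF = struct
  type t = Bytes.t
  let create n = Bytes.make n '\000'
  let length b = Bytes.length b
  let get_uint8 b i = Bytes.get_uint8 b i
  let set_uint8 b i v = Bytes.set_uint8 b i v
end

(* Backed by a Bigarray: allocated outside the OCaml heap, never moved. *)
module Bigarray_buf : BUF = struct
  open Bigarray
  type t = (int, int8_unsigned_elt, c_layout) Array1.t
  let create n =
    let b = Array1.create int8_unsigned c_layout n in
    Array1.fill b 0;
    b
  let length (b : t) = Array1.dim b
  let get_uint8 (b : t) i = Array1.get b i
  let set_uint8 (b : t) i v = Array1.set b i v
end

(* The same workload, written once against the abstract API. *)
module Fill_and_sum (B : BUF) = struct
  let run n =
    let b = B.create n in
    for i = 0 to n - 1 do
      B.set_uint8 b i (i land 0xff)
    done;
    let acc = ref 0 in
    for i = 0 to B.length b - 1 do
      acc := !acc + B.get_uint8 b i
    done;
    !acc
end

module On_bytes = Fill_and_sum (Bytes_buf)
module On_bigarray = Fill_and_sum (Bigarray_buf)

let sum_bytes = On_bytes.run 1500
let sum_bigarray = On_bigarray.run 1500
```

Timing `On_bytes.run` against `On_bigarray.run` on realistic packet sizes would give a first, rough comparison; it says nothing about finalizers or fragmentation, which need long-running tests.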