
Re: Receiving Network Packets with Mirage/kFreeBSD

On 16 Aug 2012, at 14:13, PALI Gabor Janos wrote:

> For your information, I have just added support for receiving network packets
> to the kFreeBSD backend of mirage-platform:

Excellent news -- sorry for the delay in responding, things have been a bit 
congested here, lately :-).

> - Plugged interfaces are stored in a linked list and have their ng_ether(4)
>  hook (called from ether_input()) activated and pointed to
>  netif_ether_input().  At the same time, there is a shared ring buffer
>  created for each of them in Mirage then passed to the C function
>  responsible for administering the list of plugged interfaces,
>  caml_plug_vif().

The pfil(9) KPI in FreeBSD defines 'PFIL_TYPE_IFNET' but doesn't implement it 
-- yet it sounds like that is what Mirage/kFreeBSD wants. It might be worth 
implementing it and seeing if ng_ether(4) can also register using the same 
mechanism.
> - Shared ring buffers are created as Io_pages by allocating page-aligned,
>  contiguous, multi-page memory areas via FreeBSD's contigmalloc(9).  These
>  are directly accessible in Mirage as character arrays.

For the kernel OCaml stack, is physically contiguous memory required? Normally 
the FreeBSD VM system will return virtually (and likely physically) contiguous 
memory for kernel memory allocations, but allowing it to use physically 
non-contiguous memory gives the VM system flexibility. That is, is there a 
reason not to just use malloc(9), which, for large allocation sizes, simply 
requests pages from the VM system rather than using the slab allocator?
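
To make the trade-off concrete, here is a userspace analogue of "page-aligned, multi-page region" using posix_memalign(); the kernel calls in question are contigmalloc(9) (physically contiguous, with alignment/boundary constraints) versus malloc(9) (no physical-contiguity guarantee). `alloc_io_pages` is an illustrative name, not Mirage's.

```c
#include <stdlib.h>
#include <stdint.h>
#include <unistd.h>

/* Userspace stand-in for the allocation question above: posix_memalign()
 * gives a page-aligned, virtually contiguous region, like malloc(9) for
 * large sizes; only contigmalloc(9) would add physical contiguity. */
static void *alloc_io_pages(size_t npages)
{
    long psz = sysconf(_SC_PAGESIZE);
    void *p = NULL;
    if (posix_memalign(&p, (size_t)psz, npages * (size_t)psz) != 0)
        return NULL;
    return p;   /* page-aligned; physical contiguity is NOT guaranteed */
}
```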

> - Each shared ring buffer is currently of size 33 pages, and operates with
>  2048-byte slots.  The buffers start with a header that maintains all the
>  required meta information, like next position, available items, size of
>  stored items.
> - Each packet arriving on any of the plugged interfaces is placed to the next
>  available slot of the corresponding shared ring buffer with m_copydata().

As we talked about briefly a couple of days ago on IRC, it would be great if we 
could avoid the mandatory data copy here. Allowing mbuf cluster memory to 
transparently flow into (and out of) the OCaml runtime, subject to the PL 
runtime itself, would perhaps allow the copy to be avoided where not required 
by Mirage, which helps with memory footprint, cache footprint, etc. In 
principle, at the point where the mbuf is snarfed by Mirage, it should have 
exclusive ownership of that meta-data, and often exclusive ownership of the 
memory pointed to by the mbuf -- although if there are attempts to write, you 
might at that point need to duplicate the data if it's a shared mbuf. E.g., if 
Mirage is doing loopback NFS to the NFS server, and mbufs are pointing at pages 
in the buffer cache, writing back to the buffer cache may be undesirable. :-)
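
The ownership rule above -- write in place only with exclusive ownership, duplicate otherwise -- can be sketched with a toy refcounted buffer. In the kernel this is the M_WRITABLE()/m_dup() pattern on mbufs; everything here (`shared_buf`, `buf_make_writable`) is a hypothetical userspace model of it.

```c
#include <stdlib.h>
#include <string.h>

/* Toy model of a shareable packet buffer, e.g. an mbuf pointing into
 * the buffer cache. Readers share it; writers must own it exclusively. */
struct shared_buf {
    int refs;                   /* number of holders */
    size_t len;
    unsigned char *data;
};

static struct shared_buf *buf_create(const void *src, size_t len)
{
    struct shared_buf *b = malloc(sizeof(*b));
    b->refs = 1;
    b->len = len;
    b->data = malloc(len);
    memcpy(b->data, src, len);
    return b;
}

static struct shared_buf *buf_ref(struct shared_buf *b)
{
    b->refs++;
    return b;
}

/* Return a buffer the caller may scribble on: the original if the caller
 * holds the only reference, otherwise a private copy -- so the copy
 * happens only where actually required, as argued above. */
static struct shared_buf *buf_make_writable(struct shared_buf *b)
{
    if (b->refs == 1)
        return b;               /* exclusive owner: write in place */
    b->refs--;
    return buf_create(b->data, b->len);
}
```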

> - In parallel with this in Mirage, the rx_poll function is run in loop that
>  polls for available packets in the shared ring buffer.
> - When rx_poll finds unprocessed packets then it runs the user-specified
>  function on them, e.g. print the size of the packet in basic/netif.  It is
>  implemented by passing a view on the Io_page, i.e. without copying.  After
>  the user function has finished, the packet is removed from the ring.
> - When no packets are available on the polled interface, rx_poll sleeps for a
>  millisecond.
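
The rx_poll loop described above (OCaml in Mirage) might look like this in C. The accessor names `ring_avail`/`ring_consume_one` and the bounded iteration count are illustrative assumptions, not the real interface.

```c
#include <stddef.h>
#include <unistd.h>

/* Sketch of rx_poll: drain the shared ring, running the user-specified
 * handler on a view of each packet; sleep for a millisecond when the
 * ring is empty. Bounded by max_iterations here for illustration only. */
typedef void (*pkt_handler_t)(const unsigned char *pkt, size_t len);

static void rx_poll(unsigned (*ring_avail)(void),
                    const unsigned char *(*ring_consume_one)(size_t *len),
                    pkt_handler_t handler,
                    int max_iterations)
{
    for (int i = 0; i < max_iterations; i++) {
        if (ring_avail() == 0) {
            usleep(1000);       /* no packets: sleep for a millisecond */
            continue;
        }
        while (ring_avail() > 0) {
            size_t len;
            const unsigned char *pkt = ring_consume_one(&len);
            handler(pkt, len);  /* user function sees a zero-copy view */
        }
    }
}
```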

What was the eventual conclusion on the ability to directly dispatch Mirage 
instances from the low-level interrupt thread? For the default network stack, 
we measured significant reductions in latency when switching to that as the 
default model, as well as efficiency improvements under load: packets are 
dropped before entering the NIC descriptor ring, rather than at the 
asynchronous dispatch point to a higher-level thread. Otherwise, you throw away 
all the cycles used to process the packet up until that point.



