
Re: Receiving Network Packets with Mirage/kFreeBSD

On 16 Aug 2012, at 14:13, PALI Gabor Janos wrote:

> For your information, I have just added support for receiving network packets
> to the kFreeBSD backend of mirage-platform:

Excellent news -- sorry for the delay in responding, things have been a bit 
congested here, lately :-).

> - Plugged interfaces are stored in a linked list and have their ng_ether(4)
>  hook (called from ether_input()) activated and pointed to
>  netif_ether_input().  At the same time, there is a shared ring buffer
>  created for each of them in Mirage then passed to the C function
>  responsible for administering the list of plugged interfaces,
>  caml_plug_vif().

The pfil(9) KPI in FreeBSD defines 'PFIL_TYPE_IFNET' but doesn't implement it 
-- yet it sounds like that is what Mirage/kFreeBSD wants. It might be worth 
implementing it and seeing if ng_ether(4) can also register using the same 
mechanism.
> - Shared ring buffers are created as Io_pages by allocating page-aligned,
>  contiguous, multi-page memory areas via FreeBSD's contigmalloc(9).  These
>  are directly accessible in Mirage as character arrays.

For the kernel OCaml stack, is physically contiguous memory required? Normally 
the FreeBSD VM system will return virtually (and likely physically) contiguous 
memory for kernel memory allocations, but allowing it to use physically 
non-contiguous memory gives the VM system flexibility. That is, is there a 
reason not to just use malloc(9), which, for large allocation sizes, simply 
requests pages from the VM system rather than using the slab allocator?
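
To make the trade-off concrete, here is a userspace analogue of "page-aligned, multi-page region" using posix_memalign(); the kernel calls in question are contigmalloc(9) (physically contiguous, with alignment/boundary constraints) versus malloc(9) (no physical-contiguity guarantee). `alloc_io_pages` is an illustrative name, not Mirage's.

```c
#include <stdlib.h>
#include <stdint.h>
#include <unistd.h>

/* Userspace stand-in for the allocation question above: posix_memalign()
 * gives a page-aligned, virtually contiguous region, like malloc(9) for
 * large sizes; only contigmalloc(9) would add physical contiguity. */
static void *alloc_io_pages(size_t npages)
{
    long psz = sysconf(_SC_PAGESIZE);
    void *p = NULL;
    if (posix_memalign(&p, (size_t)psz, npages * (size_t)psz) != 0)
        return NULL;
    return p;   /* page-aligned; physical contiguity is NOT guaranteed */
}
```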

> - Each shared ring buffer is currently of size 33 pages, and operates with
>  2048-byte slots.  The buffers start with a header that maintains all the
>  required meta information, like next position, available items, size of
>  stored items.
> - Each packet arriving on any of the plugged interfaces is placed to the next
>  available slot of the corresponding shared ring buffer with m_copydata().

As we talked about briefly a couple of days ago on IRC, it would be great if we 
could avoid the mandatory data copy here. Allowing mbuf cluster memory to 
transparently flow into (and out of) the OCaml runtime, subject to the PL 
runtime itself, would perhaps allow the copy to be avoided where not required 
by Mirage, which helps with memory footprint, cache footprint, etc. In 
principle, at the point where the mbuf is snarfed by Mirage, it should have 
exclusive ownership of that meta-data, and often exclusive ownership of the 
memory pointed to by the mbuf -- although if there are attempts to write, you 
might at that point need to duplicate the data if it's a shared mbuf. E.g., if 
Mirage is doing loopback NFS to the NFS server, and mbufs are pointing at pages 
in the buffer cache, writing back to the buffer cache may be undesirable. :-)
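
The ownership rule above -- write in place only with exclusive ownership, duplicate otherwise -- can be sketched with a toy refcounted buffer. In the kernel this is the M_WRITABLE()/m_dup() pattern on mbufs; everything here (`shared_buf`, `buf_make_writable`) is a hypothetical userspace model of it.

```c
#include <stdlib.h>
#include <string.h>

/* Toy model of a shareable packet buffer, e.g. an mbuf pointing into
 * the buffer cache. Readers share it; writers must own it exclusively. */
struct shared_buf {
    int refs;                   /* number of holders */
    size_t len;
    unsigned char *data;
};

static struct shared_buf *buf_create(const void *src, size_t len)
{
    struct shared_buf *b = malloc(sizeof(*b));
    b->refs = 1;
    b->len = len;
    b->data = malloc(len);
    memcpy(b->data, src, len);
    return b;
}

static struct shared_buf *buf_ref(struct shared_buf *b)
{
    b->refs++;
    return b;
}

/* Return a buffer the caller may scribble on: the original if the caller
 * holds the only reference, otherwise a private copy -- so the copy
 * happens only where actually required, as argued above. */
static struct shared_buf *buf_make_writable(struct shared_buf *b)
{
    if (b->refs == 1)
        return b;               /* exclusive owner: write in place */
    b->refs--;
    return buf_create(b->data, b->len);
}
```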

> - In parallel with this in Mirage, the rx_poll function is run in loop that
>  polls for available packets in the shared ring buffer.
> - When rx_poll finds unprocessed packets then it runs the user-specified
>  function on them, e.g. print the size of the packet in basic/netif.  It is
>  implemented by passing a view on the Io_page, i.e. without copying.  After
>  the user function has finished, the packet is removed from the ring.
> - When no packets are available on the polled interface, rx_poll sleeps for a
>  millisecond.
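
The rx_poll loop described above (OCaml in Mirage) might look like this in C. The accessor names `ring_avail`/`ring_consume_one` and the bounded iteration count are illustrative assumptions, not the real interface.

```c
#include <stddef.h>
#include <unistd.h>

/* Sketch of rx_poll: drain the shared ring, running the user-specified
 * handler on a view of each packet; sleep for a millisecond when the
 * ring is empty. Bounded by max_iterations here for illustration only. */
typedef void (*pkt_handler_t)(const unsigned char *pkt, size_t len);

static void rx_poll(unsigned (*ring_avail)(void),
                    const unsigned char *(*ring_consume_one)(size_t *len),
                    pkt_handler_t handler,
                    int max_iterations)
{
    for (int i = 0; i < max_iterations; i++) {
        if (ring_avail() == 0) {
            usleep(1000);       /* no packets: sleep for a millisecond */
            continue;
        }
        while (ring_avail() > 0) {
            size_t len;
            const unsigned char *pkt = ring_consume_one(&len);
            handler(pkt, len);  /* user function sees a zero-copy view */
        }
    }
}
```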

What was the eventual conclusion on the ability to directly dispatch Mirage 
instances from the low-level interrupt thread? For the default network stack, 
we measured significant reductions in latency when switching to that as the 
default model, as well as efficiency improvements under load: packets are 
dropped before entering the NIC descriptor ring, rather than at the 
asynchronous dispatch point to a higher-level thread. Otherwise, you throw away 
all the cycles used to process the packet up until that point.



