
Re: raw netif performance enhancement



On 26 Aug 2013, at 09:29, Charalampos Rotsos <cr409@xxxxxxxxxxxx> wrote:
> 
> A first thing I observed with Mirage was that beyond 700Mbps the VM crashes
> because it runs out of memory. I think this behaviour is related to the logic
> of the listen method, which spawns a new thread for each packet received. If
> the thread creation rate is lower than the packet processing rate, then the VM
> cannot fulfil the memory requirements and dies. This is easily solvable using
> a capped Lwt_stream which functions as a NIC rx queue. Using this trick I
> managed to solve the memory shortage problem.

That's right -- the default streams are pretty harmful by not imposing flow
control, but a capped stream is fine.  We should make that change in the listen
function and not allow the unbounded version at all.
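
For anyone following along, here is a rough sketch of the capped-stream idea
using Lwt_stream.create_bounded. It is not the netif listen code: the producer
loop, handle_packet and the capacity value are made-up stand-ins, and it needs
the lwt and lwt.unix libraries to build.

```ocaml
(* Sketch only: a bounded Lwt_stream used as an rx queue.
   Build with the lwt and lwt.unix packages. *)
open Lwt.Infix

(* Hypothetical per-packet handler; not a Mirage API. *)
let handle_packet processed (_pkt : string) =
  incr processed;
  Lwt.return_unit

let run ~capacity ~packets =
  (* At most [capacity] elements can be queued at any time. *)
  let rx_queue, rx_push = Lwt_stream.create_bounded capacity in
  let processed = ref 0 in
  (* Producer: stands in for the receive path. [push] returns a thread
     that stays pending while the queue is full, so the producer is
     throttled instead of allocating without bound. *)
  let rec produce i =
    if i >= packets then begin
      rx_push#close;
      Lwt.return_unit
    end else
      rx_push#push (Printf.sprintf "pkt-%d" i) >>= fun () ->
      produce (i + 1)
  in
  (* Consumer: one thread drains the queue, rather than one thread
     being spawned per packet. *)
  let consume = Lwt_stream.iter_s (handle_packet processed) rx_queue in
  Lwt.join [ produce 0; consume ] >|= fun () -> !processed

let () =
  let n = Lwt_main.run (run ~capacity:256 ~packets:10_000) in
  Printf.printf "processed %d packets\n" n
```

Note that a bounded stream allows only one thread to be blocked in push at a
time, so this shape assumes a single producer per queue.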

> Now my next problem is that the VM cannot switch more than 700Mbps, while the
> CPU utilisation is at most 60% during execution. I inserted some counters in
> the code and noticed that the tx ring tends to drop some packets (I compare
> the number of packets I inject into the vif with the number of packets
> observed on the end host of the iperf test). I have verified using tc and
> ifconfig that these packets are not lost in between the vifs during the
> switching process (you need to do a bit of configuration of the txqueuelen
> in order to ensure that packets are not dropped). My guess so far is that the
> performance bottleneck is on the TX ring of the vif. Are there any hints on
> how I could improve the performance or any ideas on the bottleneck point?

There are a couple of low-hanging-fruit fixes to drop CPU usage by 50% at least,
in ascending order of difficulty:

- OCaml 4.01.0 beta1 has compiler builtins that are automatically used by
  cstruct to eliminate all the intermediate allocation in accessing struct
  values. This works fine in the UNIX backend, but we need to patch
  mirage-platform to detect which version of OCaml it's being built for, and
  swap in the runtime libraries for that version.  This isn't too difficult,
  but I've been putting it off until we get the core fully stable under 4.00
  first.

- Investigate the netfront multiring/offload options in Xen.  There are patches
  flying around to remove the need for so much granting in Xen, which is a
  serious bottleneck with small packets.  If that's done, you relieve a lot of
  the CPU pressure from the interactions with Xen.
 
- As a slightly future-looking thing, Pierre is working on an amazing-looking
  inlining layer for OCaml that (once it works across modules) should
  dramatically improve Mirage performance by working across all the libraries
  we use:
  http://www.ocamlpro.com/blog/2013/07/11/inlining-progress-report.html
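
To make the first item a bit more concrete: the kind of access in question is
reading a header field directly out of a packet buffer. The sketch below is
not cstruct; it uses the standard library's Bytes accessors (OCaml 4.08 or
later) purely as an illustration, and the ethertype helper is a made-up name.
The offset comes from the Ethernet frame layout, where the EtherType is the
two bytes at offset 12.

```ocaml
(* Illustration only: read a big-endian 16-bit field in place, without
   first copying it out into an intermediate substring. *)
let ethertype (frame : Bytes.t) : int =
  Bytes.get_uint16_be frame 12

let () =
  (* A zeroed 14-byte Ethernet header with EtherType set to IPv4. *)
  let frame = Bytes.make 14 '\000' in
  Bytes.set_uint16_be frame 12 0x0800;
  Printf.printf "ethertype = 0x%04x\n" (ethertype frame)
```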

Btw, can your test case run as a single unikernel, as Balraj's TCP loopback
test does?  If so, committing it to mirage-skeleton would be useful so that
we can run it regularly.

-anil


 

