Re: [MirageOS-devel] Parallelizing writing to network devices
On 10/01/15 18:39, Thomas Leonard wrote:
On 18 December 2014 at 15:17, Masoud Koleini <masoud.koleini@xxxxxxxxxxxxxxxx> wrote:
On 18/12/14 13:19, Thomas Leonard wrote:
On 17 December 2014 at 18:05, Masoud Koleini <masoud.koleini@xxxxxxxxxxxxxxxx> wrote:

Thanks Thomas for the great tracing tool!

The following is a very simple unikernel with two interfaces, which redirects frames captured on the first interface to the second one:

https://github.com/koleini/parallelisation

The problem is that at a high packet rate (more than 80,000 pps), the switch stops receiving. The goal is to spot the problem and improve the throughput of Mirage netif.

The test environment consists of another VM running a traffic generator and sending frames of a specific pattern (UDP frames of size 100 bytes) over the bridge that connects to the first interface of the unikernel. The unikernel forwards frames by collecting a number of frames from the input queue and running the same number of threads that write them to the output interface.

Two trace files are uploaded to the repo. The first file is the output of this configuration. This trace shows that each netif write locks until the thread that writes on the front-end connection to the ring has returned (function write_already_locked).

Do these traces show it after it stopped? The second has a long sleep, while the first looks like it was in the middle of a run. If it had stopped in both cases, it suggests that the whole unikernel stopped (not just the listen thread), because there are no more timer interrupts and no sleep region.

Does "xl top" show the unikernel still using the CPU? Or is it waiting, or crashed? If you have a thread writing a string to the console once per second, does it continue after the unikernel stops accepting frames?

Yes, both are. It looks like I have more info in the traces with updated Mirage libraries. So, I updated the traces in the repo.
The unikernel is still working, as traces that periodically write info on the console are still working too.

I'm not sure, but it might be worth applying this fix and testing again:

https://github.com/mirage/mirage-net-xen/pull/16

(when Netif stopped to wait for space in the transmit ring, it would sometimes fail to notice when space became available)

Great! I found another issue in the Netif receive thread "poll_thread". In large transfers, the thread stops receiving events when all the free space on the ring is filled and then rx_poll deallocates all the grant table indices:

https://github.com/mirage/mirage-net-xen/issues/15

With the original configuration (netif unchanged), it looks like the reason is that the unikernel runs out of memory after some time, although the error message is shown in only a few experiments.

This is the main bottleneck for Mirage applications: waiting for a packet write to terminate is time-consuming and doesn't allow high-rate packet switching for network applications. Modifying netif by ignoring the thread that is waiting for the result of writing to the ring is also problematic. So, any idea how to do bulk packet writes on a network interface?

For the second trace, the return of the thread is ignored (commenting out "lwt () = th in" in write_already_locked). This considerably increases switching speed, but after some running time, it looks like a similar problem happens after garbage collection.

Thomas and Anil, any ideas from the given traces, and how is it possible to make the traces more informative?

Thanks.
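One way to picture the bulk-write question above is a toy model of a fixed-size transmit ring. The sketch below is not the mirage-net-xen API: the names ring, try_write, backend_ack and write_batch are hypothetical, and it uses only the OCaml standard library (no Lwt, no Xen). It assumes the sender fills whatever slots are free in one pass and keeps the remainder for later, rather than waiting on every single write or ignoring completion altogether.

```ocaml
(* Toy model of a transmit ring with a fixed number of slots.
   Hypothetical names throughout; this is not the Netif implementation. *)

type ring = {
  slots : int;                 (* capacity of the transmit ring *)
  in_flight : string Queue.t;  (* frames queued but not yet acknowledged *)
}

let create slots = { slots; in_flight = Queue.create () }

(* Try to put one frame on the ring. [false] means the ring is full and
   the caller must wait for the backend before retrying. *)
let try_write ring frame =
  if Queue.length ring.in_flight >= ring.slots then false
  else (Queue.add frame ring.in_flight; true)

(* Simulate the backend acknowledging up to [n] frames, which frees their
   slots. Returns how many frames were actually acknowledged. *)
let backend_ack ring n =
  let rec go k acked =
    if k = 0 || Queue.is_empty ring.in_flight then acked
    else (ignore (Queue.pop ring.in_flight); go (k - 1) (acked + 1))
  in
  go n 0

(* Bulk write: fill as many free slots as possible in one pass and return
   the frames that did not fit, so the backlog stays visible to the caller. *)
let write_batch ring frames =
  let rec go = function
    | [] -> []
    | (f :: rest) as pending ->
      if try_write ring f then go rest else pending
  in
  go frames

let () =
  let ring = create 4 in
  let frames = List.init 6 (fun i -> Printf.sprintf "frame-%d" i) in
  let left = write_batch ring frames in
  (* prints "queued=4 left=2" *)
  Printf.printf "queued=%d left=%d\n"
    (Queue.length ring.in_flight) (List.length left);
  let acked = backend_ack ring 2 in
  let left = write_batch ring left in
  (* prints "acked=2 queued=4 left=0" *)
  Printf.printf "acked=%d queued=%d left=%d\n"
    acked (Queue.length ring.in_flight) (List.length left)
```

In this model the number of frames in flight can never exceed the ring size, and anything that does not fit is handed back instead of being held by a waiting thread per frame. Whether that matches the memory behaviour seen in the traces is an assumption, not something the traces confirm.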
On 28/11/14 16:55, Thomas Leonard wrote:
On 28 November 2014 at 16:24, Anil Madhavapeddy <anil@xxxxxxxxxx> wrote:
On 28 Nov 2014, at 16:03, Masoud Koleini <masoud.koleini@xxxxxxxxxxxxxxxx> wrote:

Thanks Anil.

- graph the ring utilisation to see if it's always full (Thomas Leonard's profiling patches should help here)

Would you please point me to the profiling patches?

See: http://roscidus.com/blog/blog/2014/10/27/visualising-an-asynchronous-monad/

This message and any attachment are intended solely for the addressee and may contain confidential information. If you have received this message in error, please send it back to me, and immediately delete it. Please do not use, copy or disclose the information contained in this message or in any attachment. Any views or opinions expressed by the author of this email do not necessarily reflect the views of the University of Nottingham.

This message has been checked for viruses but the contents of an attachment may still contain software viruses which could damage your computer system, you are advised to perform your own checks. Email communications with the University of Nottingham may be monitored as permitted by UK legislation.

_______________________________________________
MirageOS-devel mailing list
MirageOS-devel@xxxxxxxxxxxxxxxxxxxx
http://lists.xenproject.org/cgi-bin/mailman/listinfo/mirageos-devel