Re: [MirageOS-devel] Parallelizing writing to network devices
On 17 December 2014 at 18:05, Masoud Koleini
<masoud.koleini@xxxxxxxxxxxxxxxx> wrote:
> Thanks Thomas for the great tracing tool!
>
> The following is a very simple unikernel with two interfaces, which
> redirects frames captured on the first interface to the second one:
>
> https://github.com/koleini/parallelisation
>
> The problem is that in a high packet rate (more than 80'000 pps), switch
> stops receiving. The goal is to spot the problem and enhance the throughput
> of Mirage netif.

I don't know if this is the problem, but in the code, I see you do:

  listen if1 if2 >> (forward_thread if2)

This ignores the result from listen, so if the listen thread later
fails then the error will be discarded. I'd try something like this:

  Lwt.choose [ listen if1 if2; forward_thread if2 ]

(and lose the "return" at the end of listen)

I really think the >> operator should be banned...

> Test environment consists of another vm running a traffic generator and
> sending frames of a specific pattern (UDP frames of size 100 bytes) over the
> bridge that connects to the first interface of the unikernel. Unikernel
> forwards frames by collecting a number of frames from input queue and
> running the same number of threads that write them to the output interface.
>
> Two trace files are uploaded to the repo. The first file is the output of
> this configuration. This trace shows that each netif write locks until the
> thread that writes on the front-end connection to the ring is returned
> (function write_already_locked.)
>
> For the second trace, the return of the thread is ignored (commenting out
> "lwt () = th in" in write_already_locked). This considerably increases
> switching speed, but after some running time, it looks that after garbage
> collection, similar problem happens.
>
> Thomas and Anil, any idea from given traces, and how it is possible to make
> the traces more informative?
>
> Thanks.
>
>
> On 28/11/14 16:55, Thomas Leonard wrote:
>>
>> On 28 November 2014 at 16:24, Anil Madhavapeddy <anil@xxxxxxxxxx> wrote:
>>>>
>>>> On 28 Nov 2014, at 16:03, Masoud Koleini
>>>> <masoud.koleini@xxxxxxxxxxxxxxxx> wrote:
>>>>
>>>> Thanks Anil.
>>>>
>>>>> - graph the ring utilisation to see if it's always full (Thomas
>>>>> Leonard's profiling patches should help here)
>>>>
>>>> Would you please point me out to the profiling patches?
>>>
>>> See:
>>> http://roscidus.com/blog/blog/2014/10/27/visualising-an-asynchronous-monad/
>>
>> The installation instructions here are for the previous version
>> (though they should still work). If you want to try the latest
>> version, the current Git mirage allows you to pass a ~tracing argument
>> to "register" in your config.ml, e.g.
>>
>>   let tracing = mprof_trace ~size:1000000 () in
>>   register "myunikernel" ~tracing [
>>     main $ ...
>>   ]
>>
>> This uses a newer version of the profiling API. You should generally
>> "opam pin" the #tracing2 branches rather than #tracing to use it.
>>
>> Note also that it doesn't currently record ring utilisation, so you'll
>> still need to do some work to get that. You could use the
>> MProf.Counter interface, in which case the GUI will display it as a
>> graph over the trace.
>>
>>>>> - try to reduce the parallelisation to see if some condition there
>>>>> alleviates the issue to track it down.
>>>>
>>>> Reducing the maximum number of threads running in parallel reduced CPU
>>>> utilization, and vm was functioning for a much longer time, but the same
>>>> problem occurred at the end.
>>>>
>>>> It might be more useful looking at the code. Please have a look at the
>>>> function "f_thread" in the file uploaded on the following repo:
>>>>
>>>> https://github.com/koleini/parallelisation
>>>
>>> That's a lot of code to try and distill down a test case.
>>> Try to cut it down significantly by building a minimal Ethernet traffic
>>> generator that outputs frames with a predictable pattern in the frame,
>>> and a receiver that will check that the pattern is received as expected.
>>>
>>> Then you can try out your parallel algorithm variations on the simple
>>> Ethernet sender/receiver and narrow down the problem without all the other
>>> concerns.
>>>
>>> Once the bug is tracked down, we can add the sender/receiver into
>>> mirage-skeleton and use it as a test case to ensure that this functionality
>>> never regresses in the future. Line rate Ethernet transmission has worked
>>> in the past, but we never added a test case to ensure it stays working.
>>>
>>> Anil
>>> _______________________________________________
>>> MirageOS-devel mailing list
>>> MirageOS-devel@xxxxxxxxxxxxxxxxxxxx
>>> http://lists.xenproject.org/cgi-bin/mailman/listinfo/mirageos-devel
>>
>>
>>
>
>
> This message and any attachment are intended solely for the addressee and
> may contain confidential information. If you have received this message in
> error, please send it back to me, and immediately delete it. Please do not
> use, copy or disclose the information contained in this message or in any
> attachment. Any views or opinions expressed by the author of this email do
> not necessarily reflect the views of the University of Nottingham.
>
> This message has been checked for viruses but the contents of an attachment
> may still contain software viruses which could damage your computer system,
> you are advised to perform your own checks. Email communications with the
> University of Nottingham may be monitored as permitted by UK legislation.
>

--
Dr Thomas Leonard http://0install.net/
GPG: 9242 9807 C985 3C07 44A6 8B9A AE07 8280 59A5 3CC1
GPG: DA98 25AE CAD0 8975 7CDA BD8E 0713 3F96 CA74 D8BA

_______________________________________________
MirageOS-devel mailing list
MirageOS-devel@xxxxxxxxxxxxxxxxxxxx
http://lists.xenproject.org/cgi-bin/mailman/listinfo/mirageos-devel
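
To illustrate the point in the reply about a listener's failure being discarded, here is a minimal sketch using only the core Lwt library. It is not the code from the repository: listen_thread, forward_thread, listen_detached and Listen_failed are hypothetical stand-ins, and it assumes the original listen starts its work and then returns immediately (which is what the "return" at the end of listen suggests). The first pattern drops the listener's promise and sequences the forwarder after it; the second passes both promises to Lwt.choose.

```ocaml
(* Requires the lwt library (core only, no Lwt_unix or Lwt_main needed).
   Every name here is a hypothetical stand-in for illustration. *)

exception Listen_failed

(* Stand-in for a long-running listener: pending until we reject it. *)
let (listen_thread : unit Lwt.t), wake_listen = Lwt.wait ()

(* Stand-in for a forwarding loop that never finishes. *)
let (forward_thread : unit Lwt.t), _wake_forward = Lwt.wait ()

(* Pattern 1: the listener's promise is dropped and we return at once,
   so nothing downstream can ever observe its failure. *)
let listen_detached () : unit Lwt.t =
  ignore (listen_thread : unit Lwt.t);
  Lwt.return ()

(* Equivalent to sequencing the forwarder after the detached listener. *)
let sequenced : unit Lwt.t =
  Lwt.bind (listen_detached ()) (fun () -> forward_thread)

(* Pattern 2: wait on both; the first one to resolve or fail decides. *)
let supervised : unit Lwt.t =
  Lwt.choose [ listen_thread; forward_thread ]

(* Render the current state of a promise without running a scheduler. *)
let describe (t : unit Lwt.t) : string =
  match Lwt.state t with
  | Lwt.Return () -> "returned"
  | Lwt.Fail Listen_failed -> "failed: Listen_failed"
  | Lwt.Fail _ -> "failed"
  | Lwt.Sleep -> "still pending"

let () =
  (* Simulate the listener failing some time after start-up. *)
  Lwt.wakeup_exn wake_listen Listen_failed;
  Printf.printf "sequenced:  %s\n" (describe sequenced);
  Printf.printf "supervised: %s\n" (describe supervised)
```

After the simulated failure, the sequenced promise is still pending, so the caller never learns that the listener died, while the promise returned by Lwt.choose has failed with Listen_failed and can be handled or logged by whoever waits on it.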